NSF Backs Bioinformatics Approach to Understanding Plant RNA Modifications
RNA molecules perform a variety of functions in cells, helping with everything from regulating genes to building proteins. In recent years, it has become clear that chemical modifications to RNA help guide these functions, but only a handful of these modifications have been identified in plants.
On July 24, Andrew Nelson, a faculty member at the Boyce Thompson Institute (BTI), and collaborators received a $2 million award from the National Science Foundation (NSF) to identify dozens of different types of RNA modifications in 15 diverse model and crop species and to infer their functional significance. Resources developed by the project will make it easier for plant scientists to use and build on the discoveries.
The project also places a strong focus on building undergraduate curricula that teach biology as a data-driven science.
“If these RNA modifications have the impact that we think they will,” Nelson explains, “researchers will be able to do some very targeted gene editing in their favorite species and potentially make more stress-tolerant crops, which is becoming increasingly important because of the effects of climate change.”
Nelson is joined in this effort by three project co-leaders: Rebecca Murphy, an associate professor of biology at Centenary College of Louisiana; Brian Gregory, an associate professor of biology at the University of Pennsylvania; and Eric Lyons, an associate professor of plant sciences at the University of Arizona.
The first step will be led by Nelson at BTI. His team will take more than a petabyte of publicly available RNA sequence data from at least 15 different species, including important crops like corn, rice, wheat and cotton, and map it back to each species' genome. For perspective, a petabyte is approximately the same amount of data it would take to stream a playlist of music for 2,600 years.
“The amount of publicly available RNA sequencing data for these 15 plants has tripled in the last two years,” says Nelson. “It’s an incredible resource.”
After the data are processed, Gregory’s team will run them through two different algorithms. The first algorithm, called HAMR, was developed by the Gregory lab. HAMR capitalizes on flaws in RNA sequencing technologies, and can identify up to 45 different modifications based on the pattern of mistakes. The second algorithm, called PEA, identifies two important RNA marks that HAMR cannot detect.
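To make the idea of reading modifications from "the pattern of mistakes" concrete, here is a toy Python sketch. It is not HAMR's implementation: the function names, the 1% background error rate, the significance cutoff, the minimum depth and the example pileup are all assumptions chosen for illustration. The sketch flags positions where reads disagree with the reference far more often than ordinary sequencing error would explain, and records which bases were substituted.

```python
from collections import Counter
from math import comb


def binomial_tail(n, k, p):
    """Probability of seeing k or more errors in n reads if each read errs with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def flag_candidate_sites(pileup, error_rate=0.01, alpha=0.001, min_depth=10):
    """Return positions whose mismatch count is unlikely under background error alone.

    pileup maps position -> (reference_base, string of bases observed in reads).
    All thresholds are illustrative assumptions, not published parameters.
    """
    candidates = []
    for pos, (ref, bases) in sorted(pileup.items()):
        depth = len(bases)
        if depth < min_depth:
            continue  # too few reads to judge
        mismatches = [b for b in bases if b != ref]
        p_value = binomial_tail(depth, len(mismatches), error_rate)
        if p_value < alpha:
            # The mix of substituted bases is the "pattern of mistakes"
            candidates.append((pos, ref, dict(Counter(mismatches)), p_value))
    return candidates


# Hypothetical pileup: only position 102 has an excess of mismatches.
example_pileup = {
    101: ("A", "A" * 40),
    102: ("A", "A" * 20 + "G" * 12 + "T" * 8),
    103: ("C", "C" * 39 + "T"),
}
candidates = flag_candidate_sites(example_pileup)
for pos, ref, pattern, p_value in candidates:
    print(pos, ref, pattern, f"p={p_value:.2e}")
```

In this sketch a single stray mismatch, as at position 103, is treated as ordinary sequencing noise, while the mixed G and T substitutions at position 102 are kept as a candidate site. Assigning a specific modification type to such a pattern would need a trained classifier, which is outside the scope of this example.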
Once the modifications have been identified, Nelson will develop a pipeline for identifying the context in which they occur. Do the RNA modifications show up only in roots? Are they present on the same gene in many related species? Do certain genes get modified by a specific mark only under drought conditions? By answering questions like these, he hopes to identify specific RNA modifications that underlie critical cellular processes.
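The three questions above can be framed as simple set queries over a table of detected sites. The Python sketch below is only an illustration under assumed inputs: the record layout, the ortholog group labels (`OG1`, `OG2`, `OG3`), the modification names and the sample data are hypothetical placeholders, not the project's actual pipeline or results.

```python
from collections import defaultdict

# Hypothetical records: (species, ortholog_group, tissue, condition, modification)
sites = [
    ("maize", "OG1", "root", "control", "m1A"),
    ("rice", "OG1", "root", "control", "m1A"),
    ("maize", "OG2", "leaf", "drought", "m3C"),
    ("maize", "OG2", "root", "drought", "m3C"),
    ("rice", "OG3", "leaf", "control", "m1G"),
    ("rice", "OG3", "leaf", "drought", "m1G"),
]


def summarize_contexts(records):
    """Collect the species, tissues and conditions in which each (gene group, mark) pair is seen."""
    contexts = defaultdict(
        lambda: {"species": set(), "tissues": set(), "conditions": set()}
    )
    for species, group, tissue, condition, modification in records:
        entry = contexts[(group, modification)]
        entry["species"].add(species)
        entry["tissues"].add(tissue)
        entry["conditions"].add(condition)
    return dict(contexts)


summary = summarize_contexts(sites)

# Do the modifications show up only in roots?
root_only = sorted(k for k, v in summary.items() if v["tissues"] == {"root"})
# Are they present on the same gene in more than one species?
conserved = sorted(k for k, v in summary.items() if len(v["species"]) > 1)
# Are they seen only under drought conditions?
drought_only = sorted(k for k, v in summary.items() if v["conditions"] == {"drought"})

print("root only:", root_only)
print("conserved:", conserved)
print("drought only:", drought_only)
```

A real analysis would also need to account for which tissues and conditions were actually sampled in each species, since a mark can only be called root-specific or drought-specific if the other contexts were measured at all.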
All of these data, as well as the workflow used to process them, will be made available to scientists and the public. This effort, along with additional data analysis and management, will be headed by Lyons.
“We are going to release our data as a curated list that researchers can use to generate hypotheses,” Lyons explains. “In addition, we will be releasing our code and workflows for others to replicate and reuse our work.”
Gregory emphasizes that the potential of the data generated by the project is vast. “Hopefully, this large-scale resource will allow us and others to focus on the RNA modification sites that are truly important to crop plant stress responses,” he says, “in turn allowing us to utilize the knowledge for future crop improvement.”
Undergraduate involvement will be a key element of the project. Murphy will introduce students to bioinformatics, RNA sequencing, and genomics through coursework at her primarily undergraduate institution. In the summer, a number of these students will travel to BTI to participate in immersive bioinformatics training as well as in vivo biomolecular work.
“Students will be able to hone their computational and data analysis skills while making real contributions to cutting-edge science,” says Murphy.
Teaching coding skills to undergraduates is imperative, Nelson adds: “Bioinformatics used to play a supporting role in plant biology. Now it is actually driving much of the discovery.”
Nelson stresses that collaborative funding opportunities such as those offered by the NSF make ambitious projects like this practical, adding, “This project wouldn’t be possible without three amazing collaborators. I think together we will probably uncover some very fundamental principles of RNA biology.”
The NSF grant (no. IOS-2023310), titled “TRTech-PGR: Identification and characterization of stress-responsive and evolutionary conserved epitranscriptomic modification sites in plant transcriptomes,” totals $2,022,004.