The development of high-throughput technologies has given rise to a wealth of information at the systems level, including the genome, transcriptome, proteome and metabolome. However, it remains a major challenge to digest this massive amount of information and use it in an intelligent and comprehensive manner. To address this challenge, Dr. Fei’s group has focused on developing computational tools and resources to analyze and integrate large-scale “omics” datasets, which help researchers understand how genes work together to comprise functioning cells and organisms.
Development of online databases to facilitate data distribution, analysis, mining and integration
- Tomato Functional Genomics Database
- Tomato Epigenome Database
- Cucurbit Genomics Database
- Kiwifruit Genome Database
- Whitefly Genomics Database
- Chinese Tomato Virome
- Pan-African Sweet Potato Virome
Development of computational tools for omics data analysis
- Plant MetGenMAP – A web-based tool for comprehensive mining and integration of gene expression and metabolite changes in the context of biochemical pathways.
- iAssembler – A de novo assembly package for transcriptome sequences generated using 454 or Sanger platforms.
- iTAK – A package to identify and classify plant transcription factors and protein kinases.
- VirusDetect – An automated pipeline for efficient virus discovery using deep sequencing of small RNAs.
Application of NGS technologies and bioinformatics in crop improvement
During the past several years, significant progress has been made in DNA sequencing technologies. As a result, several next-generation sequencing (NGS) platforms, such as Illumina HiSeq, have been widely adopted due to their high throughput and low cost. We are interested in using NGS technologies to investigate the genomes, epigenomes and transcriptomes of several economically important crops, including tomato, cucurbits, sweet potato, and fruit tree crops, to facilitate the understanding of the evolution and regulatory networks of important agronomic traits. We are also using NGS technologies to perform large-scale virus surveys of crops like sweet potato and tomato, in an effort to understand global virus diversity, distribution and evolution in important food crops.
Inferring gene regulatory networks
Living cells are the product of gene expression programs involving regulated transcription of thousands of genes. How a collection of transcriptional regulatory factors associates with genes during specific biological processes or under specific environmental conditions can be described as a gene regulatory network. We are interested in developing new algorithms to infer gene regulatory networks by integrating datasets from different sources, including gene expression data, metabolomics data, promoter sequences, and microRNA information.
- Recently developed gene editing tools like CRISPR/Cas enable plant scientists to figure out the functions of myriad plant genes. While these studies could eventually lead to the creation of crops with improved traits like increased disease resistance or higher yield, researchers first need a good way to keep track of the increasingly large amounts of […]
- The genome sequences of I. trifida and I. triloba can be used as robust references to facilitate sweetpotato breeding. The genomic resources developed in this study set the stage for increased rates of genetic gains for key traits such as yield, resistance to disease, and high beta-carotene.
- Back to our roots: Insights from genomes of a plant-associated fungus and its bacterial endosymbionts. In an article published this month in the journal New Phytologist, researchers at the Boyce Thompson Institute and the National Center for Genome Resources describe the genome sequences (DNA sequences) of the fungus Diversispora epigaea (formerly known as Glomus versiforme) and its endosymbionts – beneficial bacteria that live inside its cells. D. epigaea is a […]