Diversity analysis of wild tomato species in the Lycopersicum clade using transcriptomes
The genetic diversity of the domesticated tomato, S. lycopersicum, was drastically reduced by bottlenecks during domestication, and as a result useful alleles have been lost from the gene pool. Fortunately, wild tomato species retain high genetic variation and have therefore been used to restore genetic diversity in cultivated tomatoes. However, to fully understand the domesticated tomato’s genetic potential, the diversity of the wild tomato species must be analyzed further. RNA-seq data from the Sequence Read Archive (SRA) and from next-generation sequencing of various wild tomato tissue samples were used to analyze genetic diversity. Cleaned paired-end Illumina reads from S. arcanum, S. peruvianum, S. pimpinellifolium, S. corneliomulleri, S. chilense, and S. pennellii were mapped to the Heinz reference genome using the reference-based alignment program TopHat2. The accepted hits files for each accession were then merged using SAMtools. To assess how well the wild-species data from various tissues covered the Heinz genome, BEDTools was used to count, for each Heinz gene, the reads from each wild tomato species that mapped to it. Because the coverage of these samples was sufficient, a consensus sequence was generated for each species from the read mapping using SAMtools. The consensus sequences can then be compared using ClustalW to generate a phylogenetic tree. A manual was written to facilitate the analysis of future data as more wild tomato samples are submitted for next-generation sequencing.
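The per-gene read counting step can be illustrated with a small Python sketch. This is not the BEDTools implementation or the commands used in this project; it is a toy version of the same idea, assuming half-open genomic intervals and using hypothetical gene names, chromosome labels, and read positions invented for the example.

```python
from collections import defaultdict


def count_reads_per_gene(genes, reads):
    """Count reads that overlap each gene.

    genes: list of (chrom, start, end, gene_id), half-open intervals.
    reads: list of (chrom, start, end), half-open intervals.
    Returns a dict mapping gene_id to the number of overlapping reads.
    """
    # Group genes by chromosome so each read is only compared
    # against genes on the same chromosome.
    genes_by_chrom = defaultdict(list)
    for chrom, start, end, gene_id in genes:
        genes_by_chrom[chrom].append((start, end, gene_id))

    # Start every gene at zero so genes with no reads still appear.
    counts = {gene_id: 0 for _, _, _, gene_id in genes}

    for chrom, r_start, r_end in reads:
        for g_start, g_end, gene_id in genes_by_chrom.get(chrom, []):
            # Two half-open intervals overlap when each one starts
            # before the other ends.
            if r_start < g_end and g_start < r_end:
                counts[gene_id] += 1
    return counts


def fraction_of_genes_covered(counts, min_reads=1):
    """Fraction of genes supported by at least min_reads reads."""
    if not counts:
        return 0.0
    covered = sum(1 for n in counts.values() if n >= min_reads)
    return covered / len(counts)


# Hypothetical data, invented for illustration only.
example_genes = [
    ("ch01", 100, 500, "geneA"),
    ("ch01", 800, 1200, "geneB"),
    ("ch02", 50, 300, "geneC"),
]
example_reads = [
    ("ch01", 90, 190),   # overlaps geneA
    ("ch01", 450, 550),  # overlaps geneA
    ("ch01", 500, 600),  # starts exactly where geneA ends: no overlap
    ("ch02", 100, 200),  # overlaps geneC
]

example_counts = count_reads_per_gene(example_genes, example_reads)
example_fraction = fraction_of_genes_covered(example_counts)
```

The nested loop is adequate for a handful of intervals only. For real alignments against the full set of Heinz gene models, a dedicated tool such as BEDTools is the appropriate choice, since it is built for interval comparisons at genome scale.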
My Experience
This internship has helped me learn a great deal about my research interests as well as what it means to conduct bioinformatics research. I was initially overwhelmed by the amount of catching up I had to do in order to complete my project, but looking back, I am surprised by how much I have accomplished in this short ten-week internship. Not only am I more familiar with bioinformatics software and programming, but I also have a better idea of what I want to study in graduate school. Although this internship has helped me realize how much I miss working in the greenhouse and wet lab, the skills I have developed in this bioinformatics program will be applicable to my future research in plant genetics in graduate school. Most importantly, this internship has fueled my passion for plant research by revealing many interesting research topics.