Lukas Jander
Year: 2021
Faculty Advisor: Zhangjun Fei
Mentor: Shan Wu

De novo genome assembly of the wild tomato species Solanum neorickii and identification of structural variants between S. neorickii and cultivated tomato, S. lycopersicum

Project Summary:

Wild tomato species have been used in modern tomato breeding to introduce desirable traits and to construct introgression populations that facilitate rapid trait mapping. The small-green-fruited wild relative S. neorickii LA2133 was crossed with S. lycopersicum to develop backcross inbred lines, allowing identification of quantitative trait loci (QTLs), including a QTL controlling phenylalanine level in fruit. However, the specific genes and variants contributing to these traits are largely unknown because no S. neorickii genome sequence has been available. In this study, using PacBio HiFi high-accuracy long reads, we assembled the genome of S. neorickii LA2133 de novo into contigs with a total length of 856.83 Mb. By comparing the genome assemblies of S. neorickii LA2133 and S. lycopersicum Heinz 1706, we identified a total of 225,098 structural variants (SVs), specifically insertions and deletions (indels). Five indels were identified in and near the candidate gene encoding phenylalanine ammonia-lyase (PAL) within the previously identified phenylalanine QTL. None of the indels fell within the coding sequences, suggesting conserved gene function between S. neorickii and S. lycopersicum. A 303-bp insertion was found in S. lycopersicum approximately 4 kb upstream of PAL. In addition, one indel in the 5’-untranslated region and three indels in the intron were identified. These indels have the potential to affect gene expression and may underlie the observed higher expression of PAL in S. neorickii. This high-quality S. neorickii genome assembly and the SVs we identified are a valuable genomic resource for tomato breeding and for genetic and molecular studies.
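The step of placing indels relative to a candidate gene (upstream, 5’-untranslated region, intron, or coding sequence) can be illustrated with a minimal sketch. This is not the pipeline used in the project: the function names, the 5 kb upstream window, and every coordinate below are hypothetical toy values, chosen only so that the result has the same shape as the summary (one upstream indel, one in the 5’-UTR, three in an intron, none in coding sequence).

```python
from typing import List, Tuple

# (label, start, end); coordinates are 1-based and inclusive.
Feature = Tuple[str, int, int]


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """Return True if the closed intervals [a_start, a_end] and [b_start, b_end] share a base."""
    return a_start <= b_end and b_start <= a_end


def classify_indel(start: int, end: int, features: List[Feature],
                   gene_start: int, upstream_window: int = 5000) -> str:
    """Label an indel by the gene feature it overlaps (plus-strand gene assumed).

    CDS is checked first so that an indel spanning a CDS boundary is
    reported as coding, the most consequential category.
    """
    priority = ["CDS", "five_prime_UTR", "three_prime_UTR", "intron"]
    for label in priority:
        for name, f_start, f_end in features:
            if name == label and overlaps(start, end, f_start, f_end):
                return label
    # Hypothetical promoter-proximal window directly upstream of the gene.
    if overlaps(start, end, gene_start - upstream_window, gene_start - 1):
        return "upstream"
    return "intergenic"


# Toy gene model (hypothetical coordinates, not the real PAL locus).
toy_features: List[Feature] = [
    ("five_prime_UTR", 10001, 10100),
    ("CDS", 10101, 10500),
    ("intron", 10501, 11500),
    ("CDS", 11501, 13000),
    ("three_prime_UTR", 13001, 13200),
]

# Toy indels as (start, end) on the reference; the first sits about 4 kb upstream.
toy_indels = [(6000, 6302), (10050, 10052), (10600, 10610),
              (10900, 10901), (11200, 11230)]

labels = [classify_indel(s, e, toy_features, gene_start=10001) for s, e in toy_indels]
print(labels)
```

In practice the gene features would come from a genome annotation file and the indels from a whole-genome alignment or variant-calling step; both are replaced here by hard-coded lists to keep the sketch self-contained.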

My Experience:

Through my internship in a bioinformatics lab at BTI, I learned a variety of techniques for analyzing large genetic data sets and gained valuable experience working in a lab. Working in a formal lab setting gave me a better understanding of the research process and of what a scientific career may look like. Working with a mentor was a great experience: she was always willing to explain every process in detail and made sure I understood what I was doing. Overall, this internship has been a positive experience and has made me strongly consider pursuing a career in computer science or biology.