De Novo Discovery and Comparison of Transposable Element Families in S. lycopersicum and S. pimpinellifolium
Transposable elements (TEs) are DNA sequences that can change their position within a genome, either by moving or by copying themselves to a new location. Their discovery in the 1940s is credited to maize geneticist Barbara McClintock, whose proposals that TEs have functional roles were largely dismissed for decades. More recently, however, researchers have shown that TEs can have important functional effects. One example is an unusual retrotransposon, Rider, whose activity at the SUN gene of the domesticated tomato (Solanum lycopersicum) has resulted in altered fruit morphology. TEs may therefore have played an important role in the divergence of the domesticated tomato from its wild ancestor, Solanum pimpinellifolium. Identifying putative active TE families that are present in the S. lycopersicum genome but absent or less abundant in the S. pimpinellifolium genome may thus be of particular interest for tomato research.
This summer, I ran the REPET package, a de novo transposable element discovery pipeline, on the tomato genome. Its two main components, TEdenovo and TEannot, are dedicated to detecting and analyzing repeats in genomic sequences: TEdenovo returns a library of classified, non-redundant consensus sequences, and TEannot filters these results based on a similarity search against known TEs. I then compared the TE content of the domesticated species and its wild ancestor to identify TE families with characteristics that suggest recent activity. For the TE families of interest, I verified the presence or absence of individual elements by aligning flanking sequences from the two species. Finally, I compared the positions of polymorphic TE sites with the locations of known genes to find TEs that may be contributing to functional genetic variation.
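The comparison steps described above can be illustrated with a minimal Python sketch. It is not part of REPET and does not read REPET's output format. The tuple layout, the function names (`family_counts`, `enriched_families`, `genes_overlapping`), the thresholds, and the toy data are all assumptions made for illustration only.

```python
from collections import Counter


def family_counts(annotations):
    """Count annotated copies per TE family.

    `annotations` is an iterable of (family, chrom, start, end) tuples
    (a hypothetical layout, not a REPET file format).
    """
    return Counter(family for family, _chrom, _start, _end in annotations)


def enriched_families(counts_a, counts_b, min_ratio=2.0, min_copies=3):
    """Families with at least `min_copies` copies in genome A that are absent
    from genome B or at least `min_ratio` times more abundant in A.

    Both thresholds are arbitrary illustrative defaults.
    """
    enriched = []
    for family, n_a in counts_a.items():
        n_b = counts_b.get(family, 0)
        if n_a >= min_copies and (n_b == 0 or n_a / n_b >= min_ratio):
            enriched.append((family, n_a, n_b))
    # Most abundant families first, ties broken by name.
    return sorted(enriched, key=lambda row: (-row[1], row[0]))


def genes_overlapping(te, genes):
    """Names of genes whose interval overlaps a TE copy.

    Coordinates are treated as 1-based and inclusive; `genes` holds
    (name, chrom, start, end) tuples.
    """
    _family, chrom, start, end = te
    return [
        name
        for name, g_chrom, g_start, g_end in genes
        if g_chrom == chrom and g_start <= end and start <= g_end
    ]


# Toy annotations (invented values, not real tomato data).
lycopersicum = [
    ("famA", "chr1", 100, 600),
    ("famA", "chr1", 5000, 5500),
    ("famA", "chr2", 300, 800),
    ("famB", "chr1", 9000, 9400),
]
pimpinellifolium = [
    ("famA", "chr1", 100, 600),
    ("famB", "chr1", 9000, 9400),
]
genes = [("geneX", "chr1", 5200, 7000), ("geneY", "chr2", 2000, 3000)]

candidates = enriched_families(
    family_counts(lycopersicum), family_counts(pimpinellifolium)
)
print(candidates)
for te in lycopersicum:
    print(te, genes_overlapping(te, genes))
```

A simple copy-number ratio is only one possible proxy for recent activity. A real analysis would also need insertion sites confirmed by flanking-sequence alignment, as described above, before overlapping them with gene models.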
My Experience
This summer internship has been an amazing experience in which I have grown both as a person and as a researcher. In these past weeks, I have gained not only great new friends, but also a deeper appreciation for bioinformatics and the answers we can find by using computer science in conjunction with traditional biological research. I enjoyed my projects immensely: both the beginner project, which was to create a Catalyst-based web interface for Primer3 on the Sol Genomics website, and the main project, which dealt with the de novo identification of transposable elements in the tomato genome. I have learned so much from my mentor and others, and I now feel confident that my future career will involve a blend of computer science and biology.