Manigbas - Boyce Thompson Institute

Celine Manigbas

Year: 2018

School: Massachusetts College of Liberal Arts

“Analysis of the Asclepias syriaca Genome and Gene Families”

Project Summary:

Asclepias syriaca, known as the common milkweed, is found throughout the northeastern and southeastern United States. Cardenolides, a subclass of cardiac glycosides found in Asclepias, are steroidal toxins that are poisonous to insects and other animals when consumed. The larvae of the monarch butterfly, however, use Asclepias as their main food source and as protection. High-quality genomic information for A. syriaca is lacking, which limits exploration of the cardenolide biosynthetic pathway and comparative analysis against other species in the Apocynaceae family that do not produce cardiac glycosides. The genome of A. syriaca was recently sequenced by the Jander lab using PacBio with >300x coverage, generating longer reads than those used for the published A. syriaca genome, and assembled using the FALCON assembler. This assembly could provide more genomic information for annotation and gene prediction, and could support further genomic research on milkweeds, their evolution, and similar plants.

The assembly was error-corrected using Arrow and repeat-masked. RNA-seq data, mapped to the genome and error-corrected using Mikado and Portcullis, were used to train the ab initio gene predictors SNAP and AUGUSTUS. These predictors were run in the MAKER pipeline along with RNA and protein evidence to produce structural gene annotations. Blast2GO was used for functional annotation. Gene families of the published Asclepias syriaca genome, Catharanthus roseus, Rhazya stricta, Coffea canephora, Theobroma cacao, and Solanum lycopersicum were then identified using OrthoFinder. KinFin was used to associate functions with the orthogroups. Gene family expansion was identified using CAFE.
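
To make the last step more concrete, here is a minimal Python sketch of the kind of per-species gene count table that a gene family expansion analysis starts from. It is not the project's code and it does not reproduce CAFE's statistical model, which fits gene gain and loss rates along a species tree. The orthogroup IDs, gene IDs, species labels, and the `min_ratio` threshold are all hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical orthogroup membership: orthogroup ID -> list of (species, gene ID).
# Every identifier here is invented for illustration; none comes from the project.
orthogroups = {
    "OG0000001": [
        ("A_syriaca", "As_g1"), ("A_syriaca", "As_g2"), ("A_syriaca", "As_g3"),
        ("C_roseus", "Cr_g1"), ("C_canephora", "Cc_g1"),
    ],
    "OG0000002": [
        ("A_syriaca", "As_g4"), ("C_roseus", "Cr_g2"), ("C_canephora", "Cc_g2"),
    ],
}

SPECIES = ["A_syriaca", "C_roseus", "C_canephora"]


def count_table(groups, species):
    """Return {orthogroup: [gene count per species]} in the order of `species`."""
    table = {}
    for og, members in groups.items():
        counts = Counter(sp for sp, _gene in members)
        table[og] = [counts.get(sp, 0) for sp in species]
    return table


def larger_in(table, species, focal, min_ratio=2.0):
    """Naive screen (an assumption, not CAFE): list orthogroups where the focal
    species has at least `min_ratio` times the mean count of the other species."""
    idx = species.index(focal)
    hits = []
    for og, row in table.items():
        others = [c for i, c in enumerate(row) if i != idx]
        mean_other = sum(others) / len(others)
        if mean_other > 0 and row[idx] / mean_other >= min_ratio:
            hits.append(og)
    return hits


table = count_table(orthogroups, SPECIES)
candidates = larger_in(table, SPECIES, "A_syriaca")
```

A simple ratio screen like this only points at families worth a closer look. Deciding whether a family has truly expanded requires a phylogeny-aware method such as the one CAFE implements.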

My Experience:

This summer challenged me and taught me a lot about bioinformatics research. This REU was my first exposure to working with big data from plant genomes. Prior to this internship, I had very limited experience in bioinformatics. Along the way, I learned a lot about this ever-changing and advancing field by using different programs, and it was exciting to work with cutting-edge programs on research that is relevant to the real world. Before coming here, I was not clear on the future path I wanted to take or whether computational research was the route for me. After this experience, I realized that I am truly interested in going into the field of bioinformatics.