Interns 2018 BCBC Bioinformatics Course

About the Course

We are living in an era of massive data, and science is no exception. New sequencing technologies are filling hard disks with terabytes of information: billions of sequences that need to be analyzed properly. Sequence data is not the only thing growing. Gene expression levels and metabolite concentrations are now measured by the hundreds or thousands, which makes it difficult, if not impossible, to use familiar tools such as Excel. In this new world, bioinformatic skills are needed not only by computational biologists, but also by biologists and biochemists who find themselves analyzing many genes, proteins or metabolites at the same time.

With this perspective, we aim to further bioinformatic skills within the postdoc community at Boyce Thompson Institute through this course. We try to keep it as simple as we can, with just one idea: “Show useful tools to resolve common problems found during *omics data analysis”. For example: if I have two lists of hundreds of genes, how can I combine them and find the common ones? How can I analyze GO terms for my over-expressed genes? How can I download a chromosome region using JBrowse? There are dozens of examples.
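To make the gene-list question above concrete, here is a minimal Python sketch of one way to combine two lists and find the genes they share. It is only an illustration, not the method taught in the course; the function name `combine_gene_lists` and the gene IDs are hypothetical.

```python
def combine_gene_lists(list_a, list_b):
    """Return (common, combined) gene IDs, each sorted and de-duplicated."""
    # Strip stray whitespace and drop empty entries before comparing.
    set_a = {g.strip() for g in list_a if g.strip()}
    set_b = {g.strip() for g in list_b if g.strip()}
    common = sorted(set_a & set_b)    # genes present in both lists
    combined = sorted(set_a | set_b)  # genes present in either list
    return common, combined


# Hypothetical gene IDs, for illustration only.
overexpressed = ["geneA", "geneB", "geneC", "geneB"]
drought_response = ["geneC", "geneD", "geneA "]

common, combined = combine_gene_lists(overexpressed, drought_response)
print("Common genes:", common)
print("All genes:", combined)
```

Sets are used here because they ignore duplicates and make "in both" and "in either" one-line operations, which scales comfortably to lists of hundreds or thousands of genes.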

Before you arrive — Online Bioinformatics Tools

A variety of online tools can be used for bioinformatic analysis of small data sets. We cover these types of web tools in the slides below. Future lessons will focus on command-line and programming methods for analyzing large data sets. If you are overwhelmed by these slides, or the exercises therein, we will announce an opportunity for a pre-workshop discussion.

Topics covered:

  • web-based databases
  • web BLAST
  • genome browsers
  • sequence alignment
  • phylogeny
  • primer design

Estimated Time

  • Slides and exercises: 1:00 h

Materials

BEFORE 06/12/18 — Setting Up the BCBC Course Virtual Machine

The BTI Bioinformatics Course uses VirtualBox to virtualize a Linux operating system (OS) inside any computer. The virtual machine (VM) runs the Debian distribution of Linux for 64-bit computers. To get it running, you’ll need to download both the VirtualBox software and the virtual machine itself.

Steps:

1. Download and install VirtualBox following the package instructions, as well as the VirtualBox Extension Pack. For more info consult the user manual. Please download and install the latest VirtualBox version for your operating system.

2. Check if you have a 64-bit or 32-bit system (see these links for instructions: Windows or macOS) and download the BCBCBIC2018_debian.ova file from below. If you have a 32-bit system, please send us an email, or come to our office to find an alternative 64-bit laptop that you may use for the course. Some 64-bit Windows systems do not have VT-x/AMD-V acceleration activated, which prevents the virtual machine from running. In that case, you will need to enable VT-x/AMD-V in the BIOS of your computer.

3. Create a VM folder on your system and copy or move the .ova file into it.

4. Open the VirtualBox program.

6. Select the option File > Import Appliance.

7. Click “Open Appliance”. Select the .ova file and click “Continue”.

8. Enable “Reinitialize the MAC address of all the network cards” and click “Import”.

9. Sign up to have your virtual machine checked before the first class!

Sign up for a slot on our Virtual Machine Check-In Doodle poll.
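If Python is already installed on your computer, the checks from steps 2 and 3 can be roughly sketched as below. This is an optional convenience, not part of the official setup. The set of architecture names and the `VM` folder under the home directory are assumptions made for illustration. The script only reports what the operating system tells Python; it cannot tell whether VT-x/AMD-V is enabled in the BIOS.

```python
import platform
from pathlib import Path

# Machine names that commonly indicate a 64-bit CPU (an assumption; extend as needed).
SIXTY_FOUR_BIT_NAMES = {"x86_64", "amd64", "arm64", "aarch64"}


def is_64bit(machine_name):
    """Return True if the reported machine name looks like a 64-bit architecture."""
    return machine_name.lower() in SIXTY_FOUR_BIT_NAMES


def find_ova_files(vm_folder):
    """List .ova files directly inside vm_folder (empty list if the folder is missing)."""
    folder = Path(vm_folder)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.ova"))


if __name__ == "__main__":
    machine = platform.machine()
    print(f"Reported architecture: {machine!r}; 64-bit: {is_64bit(machine)}")
    # "VM" under the home directory is a hypothetical location for the folder from step 3.
    print("Appliance files found:", find_ova_files(Path.home() / "VM"))
```

If the script reports a 32-bit architecture, or you are unsure how to read its output, follow the Windows or macOS instructions linked in step 2 instead.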

Troubleshooting

  • I have enabled virtualization but VirtualBox still gives an error asking to “Enable vt-X” or similar.

Make sure you have disabled Hyper-V. This can be done by following these instructions.

  • I have downloaded both the VirtualBox software and the virtual machine file and enabled virtualization, but it still will not run.

Make sure you have a 64-bit machine and have followed the above steps precisely, especially enabling . Come to our office or email us for further troubleshooting if you cannot find the problem.

06/14/18 — UNIX Command-Line Intro, Part 1

Presenter: David

Topics covered

  • Terminal file system navigation
  • Wildcards, shortcuts and special characters
  • File permissions
  • UNIX compression commands
  • UNIX networking commands

Estimated Time

  • Lecture and exercises: 2:00 h

Materials

06/21/18 — UNIX Command-Line Intro, Part 2

Presenter: Adrian

Topics covered

  • Basic NGS file formats
  • Text file manipulation commands
  • Command-line pipelines
  • Introduction to bash scripts
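As a small illustration of the file formats listed above, FASTQ is a common NGS text format in which each read normally occupies four lines: an `@` header, the sequence, a `+` separator, and the quality string. The sketch below counts reads under that four-line assumption. It is written in Python rather than with the shell commands covered in this session, and the reads are made up.

```python
import io


def count_fastq_reads(handle):
    """Count records in FASTQ text, assuming exactly four lines per record."""
    n_lines = 0
    for line_number, line in enumerate(handle):
        # Every record should start with an '@' header line.
        if line_number % 4 == 0 and not line.startswith("@"):
            raise ValueError(f"line {line_number + 1}: expected '@' header")
        n_lines += 1
    if n_lines % 4 != 0:
        raise ValueError("file ends in the middle of a record")
    return n_lines // 4


# Two made-up reads, held in memory so the example needs no input file.
example = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nGGCA\n+\nIIII\n"
)
print(count_fastq_reads(example))
```

The same function accepts an open file object, for example one returned by `open("reads.fastq")`, where `reads.fastq` is a hypothetical file name.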

Estimated Time

  • Lecture and exercises: 2:00 h

Materials

06/28/18 — Next Gen Sequencing

Presenter: Suzy

Estimated Time

  • Lecture and examples: 2:00 h

Materials

07/05/18 — Introduction to R & Basic R Graphs

Presenter: Alex

Topics covered:

  • Brief introduction to R
  • Data types
  • R graphs

Estimated Time

  • Lecture and examples: 2:00 h

Materials

07/12/18 — Differential expression with edgeR

Presenter: Titima

Prerequisites

  • Make sure that you have the “gene_count_matrix.csv” file in the “Slch04_demo” directory on the Desktop of your VM. If you missed a previous session or could not complete the exercises, and so do not have the necessary files, please let us know before class.
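One quick way to confirm the prerequisite above is a short Python check run inside the VM. This is a sketch that assumes the Desktop directory sits directly under your home directory. The helper name `find_count_matrix` is hypothetical.

```python
from pathlib import Path


def find_count_matrix(desktop):
    """Return the path to the count matrix under desktop/Slch04_demo, or None if absent."""
    candidate = Path(desktop) / "Slch04_demo" / "gene_count_matrix.csv"
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    # Assumes the VM's Desktop directory sits directly under the home directory.
    found = find_count_matrix(Path.home() / "Desktop")
    if found is None:
        print("gene_count_matrix.csv not found; please contact us before class.")
    else:
        print(f"Found {found} ({found.stat().st_size} bytes)")
```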

Topics covered

  • General pipeline for differential expression analysis with an emphasis on edgeR
  • Data exploration

Estimated Time

  • Lecture and examples: 2:00 h

Materials

Contact:

Boyce Thompson Institute
533 Tower Rd.
Ithaca, NY 14853
607.254.1234
contact@btiscience.org