During this lab, we will acquaint ourselves with the software packages Trimmomatic, khmer and Jellyfish. Your objectives are:
- Familiarize yourself with the software, how to install and execute it, and optionally how to visualize results.
- Characterize sequence quality.
The Skewer manual: https://github.com/relipmoc/skewer
The JellyFish manual: http://www.genome.umd.edu/jellyfish.html
Step 1: Launch an AMI. For this exercise, we will use a c4.2xlarge with a 500 GB EBS volume. Remember to change the permissions of your key file: chmod 400 ~/Downloads/????.pem
(change ????.pem to whatever you named it)
Update Software
Install updates
Install other software: Note that you can install a large amount of software from the Ubuntu “App Store” using a single command. Some of this software we will not use for this tutorial, but...
Install Ruby: Ruby is a programming language like Python or Perl.
Install Brew: Brew is a piece of software that serves as a ‘package manager’. It makes installing software easy! You can use it for lots of things, but not everything. Knowing its limitations will come with time.
Install Bioinformatics Packages via Brew: These are the packages that we will use to do real work!!! YAY!!!
Install khmer
Download data: For this lab, we’ll be using files from Jack Gilbert’s Merlot wine study (http://mbio.asm.org/content/6/2/e02527-14.full). The details are not important right now, but this is a metagenomic sample from the root of the grape vine.
You are downloading from MG-RAST, which is a popular metagenomics analysis package. There are a lot of places to get raw data.
Do 2 different trimming levels – Phred=2 and Phred=30: One of these is very harsh, the other is probably more appropriate. Which one is which?
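To reason about the two thresholds, it may help to recall the standard Phred definition, Q = -10 log10(P), where P is the probability that a base call is wrong. The short Python sketch below is an illustration only and is not part of the lab's commands; the function name is made up for this example. It converts each threshold into an error probability so you can compare them yourself.

```python
def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred score q is wrong (P = 10^(-Q/10))."""
    return 10 ** (-q / 10)


# The two trimming thresholds used in this lab.
for q in (2, 30):
    p = phred_to_error_prob(q)
    print(f"Phred {q}: error probability = {p:.4f} (about 1 wrong call in {1 / p:.1f})")
```

A trimmer that cuts at a threshold removes bases whose quality falls below it, so the higher the threshold, the more sequence is discarded.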
Look at the output from this command, which should start with Input Read Pairs:
Interleave reads
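Interleaving means merging the forward and reverse read files into one file in which each read is immediately followed by its mate. The Python sketch below only illustrates that layout on made-up records. It is not the khmer script used in this lab, the function names are hypothetical, and it assumes plain 4-line FASTQ records with both files listing pairs in the same order.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def fastq_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield FASTQ records as lists of 4 lines: header, sequence, '+', qualities."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave(r1_lines: Iterable[str], r2_lines: Iterable[str]) -> List[str]:
    """Alternate records: read 1 of pair A, read 2 of pair A, read 1 of pair B, ..."""
    out: List[str] = []
    # zip stops at the shorter input, so unequal files are silently truncated here.
    for rec1, rec2 in zip(fastq_records(r1_lines), fastq_records(r2_lines)):
        out.extend(rec1)
        out.extend(rec2)
    return out


# Tiny made-up example with two read pairs.
r1 = ["@pair1/1", "ACGT", "+", "IIII", "@pair2/1", "GGCC", "+", "IIII"]
r2 = ["@pair1/2", "TTAA", "+", "IIII", "@pair2/2", "CCGG", "+", "IIII"]
for line in interleave(r1, r2):
    print(line)
```

A real tool should also check that the two headers in each pair match, which this sketch skips.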
Run Jellyfish
Look at the 2 histograms
Run FastQC on your data
Download the FastQC .zip file to your computer
Open up a new terminal window by pressing Command-T, then unzip the file as usual.
WON’T COVER THE STUFF BELOW, THOUGH YOU SHOULD TRY TO DO IT
Now look at the .histo file, which is a k-mer distribution. I want you to plot the distribution using R and RStudio.
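A Jellyfish histogram file has two columns: a k-mer multiplicity, and the number of distinct k-mers that occur that many times. The Python sketch below builds the same kind of table from two made-up reads so you can see where the numbers come from. It is an illustration under simplifying assumptions, not a replacement for Jellyfish: the function name is hypothetical, and it does not merge a k-mer with its reverse complement, which Jellyfish can optionally do.

```python
from collections import Counter
from typing import Iterable


def kmer_histogram(reads: Iterable[str], k: int) -> Counter:
    """Return {multiplicity: number of distinct k-mers seen that many times}."""
    kmer_counts: Counter = Counter()
    for read in reads:
        # Slide a window of length k along the read, one base at a time.
        for i in range(len(read) - k + 1):
            kmer_counts[read[i:i + k]] += 1
    # Count how many distinct k-mers share each multiplicity.
    return Counter(kmer_counts.values())


# Two made-up overlapping reads.
reads = ["ACGTACGT", "CGTACGTA"]
histo = kmer_histogram(reads, k=4)
for multiplicity in sorted(histo):
    print(multiplicity, histo[multiplicity])
```

In real data, k-mers containing sequencing errors tend to be rare, so they pile up at low multiplicity; this is why the two trimming levels can produce different-looking histograms.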
OPEN RSTUDIO: Search for RStudio online and install it locally. There are OSX and Windows versions.
Open up a new terminal window by pressing Command-T
Import and visualize the 2 histogram datasets: