Sequence analysis pipelines

Step-by-step instructions for the analytical pipelines we use to analyze specific data types (e.g. 2bRAD, tag-based RNA-Seq).

Pipeline | Version | Description
2bRAD | 1.0 (2012) | Original pipeline for analysis of 2bRAD sequence data. Includes links to software and step-by-step instructions beginning with raw sequence data and ending with a combined genotype matrix. This is the version used in early 2bRAD publications and 2bRAD workshops prior to 2016.
2bRAD | 2.0 (2016) | Updated pipeline for analysis of 2bRAD sequence data. This is the version we are using for all 2bRAD analysis as of July 2016.
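
To illustrate what a combined genotype matrix looks like, the sketch below merges per-sample genotype calls into a single locus-by-sample table. It is a minimal Python illustration under stated assumptions, not the pipeline's own code: the function name `combine_genotypes`, the sample and locus names, and the use of "NA" for missing calls are all hypothetical.

```python
def combine_genotypes(samples):
    """Merge per-sample genotype calls into one locus-by-sample matrix.

    `samples` maps a sample name to a dict of {locus: genotype}.
    Loci not genotyped in a sample are filled with "NA" (an assumed
    placeholder, not necessarily what the pipeline uses).
    """
    sample_names = sorted(samples)
    # Union of all loci seen in any sample, in a stable order.
    loci = sorted({locus for calls in samples.values() for locus in calls})
    header = ["locus"] + sample_names
    rows = [
        [locus] + [samples[name].get(locus, "NA") for name in sample_names]
        for locus in loci
    ]
    return [header] + rows


if __name__ == "__main__":
    # Made-up calls for two hypothetical samples.
    calls = {
        "sampleA": {"tag1_12": "AA", "tag2_30": "CT"},
        "sampleB": {"tag1_12": "AG"},
    }
    for row in combine_genotypes(calls):
        print("\t".join(row))
```

Each row of the result is one locus and each column one sample, so loci that were genotyped in only some samples remain visible as missing values instead of being dropped.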

Bioinformatics scripts

Efficient analysis of high-throughput DNA sequence data (tens to hundreds of millions of sequences) requires scripted tools to automate repetitive tasks. The following links provide a collection of custom scripts we use for sequence analysis. We provide these scripts, without restrictions or guarantees, in the hope that they will be useful to other groups applying similar methods.

Please note: many scripts rely on BioPerl and other software, and will check for the required software before running. The scripts were developed for Linux systems, and porting them to other operating systems would require modifications.
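
The kind of pre-run check described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the scripts themselves; the function names and the example tool list are assumptions made for the example. It looks for executables on the PATH and asks Perl to load a module (here `Bio::Seq`, which is part of BioPerl) to confirm that it is installed.

```python
import shutil
import subprocess


def missing_tools(tools):
    """Return the names in `tools` that are not found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


def perl_module_available(module):
    """Return True if `perl -M<module> -e 1` succeeds, i.e. Perl can load the module."""
    if shutil.which("perl") is None:
        return False
    result = subprocess.run(
        ["perl", f"-M{module}", "-e", "1"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical requirements, for illustration only.
    absent = missing_tools(["perl", "gzip"])
    if absent:
        print("Missing required software:", ", ".join(absent))
    elif not perl_module_available("Bio::Seq"):
        print("BioPerl does not appear to be installed.")
    else:
        print("All required software found.")
```

Running a check like this before starting a long analysis can help confirm that a new system has what the scripts need.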

All scripts are hosted on GitHub.

Category | Description
Sequence utilities | General purpose utilities for analysis of DNA sequences.
Sequence processing | Scripts for processing high-throughput DNA sequence data prior to analysis.
2bRAD utilities | Scripts and pipeline for genotyping using 2bRAD data.
RNA-Seq utilities | Scripts and pipeline for gene expression analysis of 3' tag-based RNA-Seq data.
Transcriptome utilities | Scripts for analysis of transcriptome assemblies.

Laboratory protocols

Step-by-step instructions for preparing DNA and RNA samples for sequencing on next-generation platforms. The protocols assume that standard molecular biology equipment is available and are intended for users familiar with basic molecular techniques. These methods were developed while working with Misha Matz at UT Austin.

Protocol | Description | Last updated
Tag-based RNA-Seq.pdf | Global gene expression profiling on the Illumina HiSeq platform. Publication | 26 Aug, 2016
2b-RAD genotyping.pdf | Genome-wide genotyping using type IIb restriction endonucleases on the Illumina HiSeq platform. Publication | 25 Aug, 2016
Normalized transcriptomes.pdf | Preparing normalized cDNA sequencing libraries suitable for assembling reference transcriptomes. | 28 Oct, 2016