Goby is a next-gen data management framework designed to facilitate the implementation of efficient data analysis pipelines. The program is distributed under the GNU General Public License (GPL).


If you have used Goby in your work, please cite:

Compression of structured high-throughput sequencing data, Campagne F, Dorff KC, Chambwe N, Robinson JT, Mesirov JP, PLoS ONE, in press [see also our pre-print of this manuscript at http://arxiv.org/abs/1211.6664]

Goby provides very efficient file formats to store next-generation sequencing data and intermediary analysis results. The file formats are described in more detail on the developer’s page. Goby 1.x files were compressed with GZip. In Goby 2.0, we introduced novel compression approaches that result in state-of-the-art compression of alignment data. Goby 2.0 files can compress to a few percent of the size of the corresponding BAM file and are often smaller than CRAM files. See what’s new in Goby 2.0.

Goby also provides utilities that implement common next-gen data computations. We design these utilities to be relatively easy to use, yet very efficient. You can see an example of this on our quick demo page.

Interested in trying Goby? Here is how we recommend proceeding:

  1. Download the software, including the version of BWA with native Goby support.
  2. Need help, want to send suggestions? Address emails to goby-framework@googlegroups.com. You can also search this forum for answers to similar questions. Subscribing to the forum is also the best way to be notified of new releases.
  3. Configure Goby on your computer.
  4. Follow the detailed walk-through of the demo.
  5. Take a look at the project tutorials, which discuss how to use Goby for different next-gen data analysis applications.
  6. Familiarize yourself with the various Goby modes (small utilities). Use java -jar goby.jar --help to display a list of modes. Help is context sensitive. Additional information can be found in the online reference manual.
  7. If you are a programmer interested in using Goby in your own projects, check out the developer tutorial and the project Java API pages. Goby versions up to and including 1.5 support Java programming only (you can, however, use Goby tools with any scripting language, such as Perl or Groovy). A Python API was introduced in version 1.6 to make it possible to parse Goby alignment files natively in Python. Goby 1.8 added a C++ API, and Goby 1.9 added a C API suitable for implementing native Goby support in aligners such as BWA or GSNAP.

Here are some tutorials that describe how to perform common NGS analysis tasks:

Import/export SAM/BAM alignments: Goby 2.0 provides very robust import/export capabilities for the SAM/BAM formats.

Differential expression analysis: Find genes, exons, or intron/other regions differentially expressed across groups of samples. Goby supports total count normalization as well as the upper quartile normalization method of Bullard et al. (BMC Bioinformatics, 2010). Fisher's exact tests and Student's t-tests can be performed directly with Goby and corrected for multiple hypothesis testing. Tables of results are exported in tab-delimited format for easy interoperability with other statistics packages (e.g., DESeq, edgeR).
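
The two normalization schemes named above can be sketched in a few lines of Python. This is an illustration of the general idea, not Goby's implementation: the function names, the per-million scale, and the choice to take the upper quartile over the non-zero counts of a single sample are assumptions made for the example.

```python
from statistics import quantiles


def total_count_normalize(counts, scale=1_000_000):
    """Rescale counts so they are expressed per `scale` counted reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counts")
    return {gene: count * scale / total for gene, count in counts.items()}


def upper_quartile_normalize(counts):
    """Divide each count by the 75th percentile of the non-zero counts."""
    nonzero = sorted(c for c in counts.values() if c > 0)
    if len(nonzero) < 2:
        raise ValueError("need at least two non-zero counts")
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is Q3.
    upper_quartile = quantiles(nonzero, n=4, method="inclusive")[2]
    return {gene: count / upper_quartile for gene, count in counts.items()}


# Hypothetical read counts for one sample.
sample = {"geneA": 10, "geneB": 20, "geneC": 30, "geneD": 40, "geneE": 50, "geneF": 0}
by_total = total_count_normalize(sample)
by_upper_quartile = upper_quartile_normalize(sample)
```

Scaling by the upper quartile rather than the total makes the normalization factor less dependent on a small number of very highly expressed genes, which is the motivation Bullard et al. give for the method.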

DNA methylation analysis: Analyze bisulfite-treated reads to estimate methylation rates throughout the genome, and view the results in their local genomic context with IGV.
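
As a rough sketch of the underlying arithmetic: bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C, so the methylation rate at a cytosine can be estimated as the fraction of covering reads that still show C. The function name and the input layout below are hypothetical and do not reflect Goby's file formats or API.

```python
def methylation_rate(c_reads, t_reads):
    """Fraction of reads at a cytosine that remained C (i.e. methylated)."""
    covered = c_reads + t_reads
    if covered == 0:
        return None  # no coverage, so the rate is undefined
    return c_reads / covered


# Hypothetical per-site tallies: position -> (reads showing C, reads showing T).
sites = {1042: (18, 2), 1077: (0, 25), 1103: (0, 0)}
rates = {pos: methylation_rate(c, t) for pos, (c, t) in sites.items()}
```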

Discover genomic variants: Find SNPs and mutations in samples, estimate allelic frequencies and methylation rates, and identify genomic positions where allele frequencies differ significantly between groups of samples.
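
To make the last step concrete, the sketch below estimates an allele frequency from read counts and compares two groups with a two-sided Fisher's exact test on a 2x2 table of reference versus alternate allele counts. Applying Fisher's exact test to pooled counts is an assumption chosen for illustration; the text above does not say which statistic Goby uses for variants, and all names and numbers are hypothetical.

```python
from math import comb


def allele_frequency(alt_count, ref_count):
    """Fraction of reads at a position that support the alternate allele."""
    depth = alt_count + ref_count
    return alt_count / depth if depth else None


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def probability(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = probability(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(
        p for p in (probability(x) for x in range(low, high + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical pooled counts at one position: (ref reads, alt reads) per group.
group_a = (40, 2)
group_b = (25, 18)
freq_a = allele_frequency(alt_count=group_a[1], ref_count=group_a[0])
freq_b = allele_frequency(alt_count=group_b[1], ref_count=group_b[0])
p_value = fisher_exact_two_sided(*group_a, *group_b)
```

When many positions are tested this way, the resulting p-values would still need a multiple-testing correction before positions are reported as significant.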

See the related pages listed on the right side of this page for an exhaustive list of pages about Goby. Looking for a web-based alignment and analysis tool? See GobyWeb, our grid-enabled web user interface.