The change log of the project is listed below. You can also view change logs for previous versions:

Goby 2.0 to 2.3.6 [July 2012 to Oct 2016]

Goby 1.0 to [Jan 2010 to Jun 2012]

Release 3.3.1 [Jan 2018]
– Fixed alignment concat where results could be truncated if several empty slices followed one another (e.g., if concat A,B,C and A and B are empty, goby ca could yield an empty alignment, completely omitting alignments in part C.)
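
The empty-slice case described above can be illustrated with a minimal sketch. It is not Goby's concat implementation: the function name `concat_slices` and the use of plain lists to stand in for alignment slices are assumptions made for illustration only. The point is that an empty slice should be skipped rather than treated as the end of the input.

```python
from typing import Iterable, List, Sequence


def concat_slices(slices: Iterable[Sequence[str]]) -> List[str]:
    """Concatenate slices in order; empty slices contribute nothing.

    Hypothetical helper: plain sequences stand in for alignment slices.
    """
    merged: List[str] = []
    for current in slices:
        if not current:
            # An empty slice must not end the loop:
            # later slices may still hold entries.
            continue
        merged.extend(current)
    return merged


# A and B are empty, C holds entries: the result must still contain C.
a, b, c = [], [], ["entry-1", "entry-2"]
result = concat_slices([a, b, c])
print(result)
```

A regression test for this kind of bug would typically place several empty inputs before a non-empty one, as in the example.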

Release 3.3.0 [Nov 2017]
– Substantially reduced memory utilization for discover-sequence-variant (all modes).
– Fixed a rare case where discover-sequence-variant could output the same base twice (when indels extended before the beginning of the read after equivalent indel region calculation). This fix improved indel performance when training models with variationanalysis 1.3.3+.
– Initial work to develop models for genomic segments (see .ssi format and concurrent work in variationanalysis). This is work in progress. Protobuf schema is in goby-io/protobuf/SegmentInformationRecords.proto. Models are developed in parallel with Keras (in goby3/python/dl) and DL4J (in variationanalysis).
– Updated genotyping model to state of the art (models/genotyping/1510204519948/, see evaluation results in the folder)

Release 3.2.6 [Jun 2017]
– Updated models for compatibility with latest code: genotyping model and somatic models are updated.
– Tested that models produced with variationanalysis (genotype and somatic) load in Goby and can be used with the modes to generate VCF.
– Various bug fixes to last-to-compact mode. Bugs were triggered by output from more recent versions of Last than tested previously.
– Discover-sequence-variations mode: fix VCF output for indels. Genotypes format mostly rewritten. The mode was previously writing incorrect indels. The latest code produces VCF files tested for compatibility with RTG vcfeval.
– Discover-sequence-variations mode: Add minimum-P and stringent-P options to Genotypes output format.
– Rewrote VCFToGenotypeMapMode to use HTSJDK VCF parser. This should enable using BCF files as input as well.
– Fix for count of indels. The first equivalent indel region did not increment the count. Counts on forward and reverse now match the number of supporting entries on each strand.
– Add supporting entry the first time an indel is created in a SampleCountInfo. The supporting entry was not set on the first one.
– Apply count fixer to remove bases matching ref from the list when the mandatory filter has determined the base should be removed. Previously, the base was only removed from the counts, but not from the list of bases. This was one possible cause of the indel performance problems we have been trying to fix for a while.
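
The indel count fixes in this release amount to keeping per-strand counts equal to the number of supporting entries on each strand. A minimal sketch of that invariant follows. It is not Goby's SampleCountInfo code: the class `IndelSupport`, the function `tally_indels`, and the tuple layout of an observation are hypothetical names chosen for illustration. Deriving the counts from the lists of supporting entries means the first observation of an indel is counted like any other.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class IndelSupport:
    """Supporting read indices for one indel, kept separately per strand."""
    forward: List[int] = field(default_factory=list)
    reverse: List[int] = field(default_factory=list)

    @property
    def forward_count(self) -> int:
        # Count is derived from the entries, so the two cannot disagree.
        return len(self.forward)

    @property
    def reverse_count(self) -> int:
        return len(self.reverse)


def tally_indels(
    observations: Iterable[Tuple[str, int, bool]]
) -> Dict[str, IndelSupport]:
    """Group (indel, read_index, is_reverse_strand) observations by indel."""
    support: Dict[str, IndelSupport] = defaultdict(IndelSupport)
    for indel, read_index, is_reverse in observations:
        entry = support[indel]  # created on first sight, then reused
        (entry.reverse if is_reverse else entry.forward).append(read_index)
    return dict(support)


tallied = tally_indels([
    ("A/--", 10, False),  # first observation is recorded and counted
    ("A/--", 11, True),
    ("A/--", 12, False),
])
print(tallied["A/--"].forward_count, tallied["A/--"].reverse_count)
```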

Release 3.2.5 [May 2017]
– Fix issue with toProto that prevented using more than one sample for genotyping with goby.
– alignment conversion to goby: ignore missing MD tags (it is possible only some reads are missing them and we still need to convert the other aligned reads).
– Upgrade goby to DL4J 0.8.0.
– fasta-to-compact: Do not use an assertion, but instead reset read index to zero and explain how to avoid the problem.
– SBI format: add distance from start of read and end of read. Will be mapped to a density in next genotype mapper. Should help variationanalysis models detect cases where end of alignment is fully contained within homopolymer region.

Release 3.2.4 [April 2017]
– Fix tally-reads mode.
– Some fixes to realignment of SNPs around indels.
– Improvements to barcode remover (to trim bases from 5′ end before removing barcode).
– Goby version now reports the commit that produced the distribution.
– Goby version, including commit, is now written to generated .sbi files.
– Introduce CommitPropertyHelper to record the specific commit that produced the version of Goby being used.

Release 3.2.3 [March 2017]
– Fix SNP bug in realignment around read insertion.
– Add queryPosition field to SBI output.
– Prevent the writing of sbi entries when AddTrueGenotypeHelper indicated the entry should not be added.

Release 3.2.2 [Feb 2017]

– Fix frequency of bases when indels are also present. Goby now correctly removes bases that
support the flanking sequence of the indel and does not double count them.
– Many changes to how we store varmaps introduced to support indels (vcf-to-varmap).
The serialization format is incompatible with previous versions, so make sure you regenerate
varmaps from VCF.
– Adjust VCF output for compatibility with REF/ALT conventions. This makes it possible to measure
performance with standard tools such as RTG vcfeval.
– Keep counts of indels separately for forward and reverse strand.
– vcf-to-varmap mode: improved semantics of the --chromosome-prefix option, which now allows removing (e.g., -chr)
or adding (+chr) a prefix to the chromosome name.
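
The chromosome prefix semantics for vcf-to-varmap listed above can be sketched as follows. This is an illustration only, not Goby's code: the function name `apply_chromosome_prefix` is hypothetical, and the handling of edge cases (leaving a name unchanged when the prefix to remove is absent, or when the prefix to add is already present) is an assumption rather than documented behavior.

```python
def apply_chromosome_prefix(name: str, spec: str) -> str:
    """Apply a '-prefix' (remove) or '+prefix' (add) spec to a chromosome name.

    Hypothetical helper; edge-case behavior is assumed, not taken from Goby.
    """
    if len(spec) < 2 or spec[0] not in "+-":
        raise ValueError(f"expected '+<prefix>' or '-<prefix>', got {spec!r}")
    operation, prefix = spec[0], spec[1:]
    if operation == "-":
        # Remove the prefix only when the name actually starts with it.
        return name[len(prefix):] if name.startswith(prefix) else name
    # '+': add the prefix unless it is already present (assumption).
    return name if name.startswith(prefix) else prefix + name


print(apply_chromosome_prefix("chr1", "-chr"))
print(apply_chromosome_prefix("1", "+chr"))
```

Normalizing names this way is what lets a VCF that uses `1`, `2`, ... be matched against a reference that uses `chr1`, `chr2`, ..., or the reverse.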

Release 3.2.1 [Jan 2017]
– fasta-to-compact: fix a bug introduced on 10/6/2016 which created negative read entries.
– Catch a number of exceptions that can be thrown by HTSJDK when processing BAM files. Exceptions
are caught so that an error on one alignment does not interrupt processing of an entire alignment.
Errors are shown in the log.
– vcf-to-genotype-map mode now supports (b)gzipped vcf input.
– vcf-to-genotype-map: fix bug that manifested itself when the vcf had a single genotype field.
– vcf-to-genotype-map: add chromosome-prefix argument to help import VCF where the chr prefix is missing.

Release 3.2 [Jan 2017]
– Remove memory leak when reading SAM/BAM files. This was the likely cause of the out-of-memory errors in
compression benchmarks (it had nothing to do with compression, but with the conversion of SAM/BAM to goby representation).
– Disabled tests that could not succeed anymore (because of choices we made in Goby 3, such as lack of auto-upgrade
for alignments produced with Goby 1 and 2.)
– BAM/CRAM support. Added an option to bypass the header check on SO:COORDINATE. Use
-x HTSJDKReaderImpl:force-sorted=true to force Goby to consider an alignment sorted.
– SBI format: add ability to add true labels while writing the file. Add support for downsampling sites without
– Genotype format: reorganization to support calling with deep learning models trained with variation analysis.

– Reorganize model prediction to facilitate installing new versions of the variationAnalysis jars.

Goby 3.1 is now compatible with variationanalysis 1.1.1.

– Replace models with versions trained with variationanalysis 1.1.1.

– Add somatic mutation models trained with whole genome data (ICGC GoldSet).

Release 3.1 [Dec 2016]

Release 3.0.0 [Oct 2016]

– Support reading BAM alignments directly with Goby APIs.

– Support probabilistic models for calling somatic variations, trained with deep learning.