We have released Goby 1.9.8.2. This version includes the vcf-subset and vcf-compare replacement tools I mentioned in my earlier VCF post.
The release also packs an option to call indels with Goby. We use the method of Krawitz et al. (Bioinformatics 2010) to find equivalent indel regions (EIR). This approach can reconcile distinct indel observations into canonical indel boundaries (an EIR). The genotype and compare-groups formats of the discover-sequence-variants mode will output each EIR at a frequency that sums over all the indel variations observed at the site that the EIR can explain. Of course, there is more to the Goby indel calling approach than the Krawitz method. For instance, the approach is integrated with the fast algorithm for local realignment around indels, so that indels that open when realigning the ends of reads contribute to the frequency of an EIR.
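To make the EIR idea concrete, here is a minimal sketch for deletions only. It is not Goby's code, nor a faithful port of the Krawitz et al. algorithm: the function names (deletion_eir, tally_eirs) and the (start, length, count) observation tuples are assumptions made up for this illustration. It only shows how shifting a deletion left and right through a repeat yields one canonical region, and how counts for equivalent observations add up.

```python
from collections import Counter


def deletion_eir(ref, start, length):
    """Return (leftmost_start, rightmost_start) for a deletion of
    ref[start:start + length]. Every start in that range removes
    different bases but produces the same resulting sequence.
    Hypothetical helper, 0-based coordinates."""
    left = start
    # Shifting left by one keeps the result unchanged when the base
    # entering the deletion equals the base leaving it.
    while left > 0 and ref[left - 1] == ref[left + length - 1]:
        left -= 1
    right = start
    # Same test in the other direction.
    while right + length < len(ref) and ref[right] == ref[right + length]:
        right += 1
    return left, right


def tally_eirs(ref, observed):
    """Sum read counts of observed deletions that share an EIR.
    observed: iterable of (start, length, count) tuples (assumed format)."""
    counts = Counter()
    for start, length, count in observed:
        left, right = deletion_eir(ref, start, length)
        counts[(left, right, length)] += count
    return counts


ref = "GATCACACACGT"
# Three aligners' views of the same 2-base deletion in the CA repeat.
observed = [(3, 2, 4), (5, 2, 3), (8, 2, 1)]
print(dict(tally_eirs(ref, observed)))  # {(3, 8, 2): 8}
```

The three observations start at different positions, yet all collapse to the same region and their counts are reported as a single frequency.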
Programmers will find that Goby represents observed indels at a site in a very similar way to base genotypes. Reading a base or indel frequency at a position in a sample is done with the same API (see the SampleCountInfo class). This makes it easy to support indels in different output formats.
The vcf-compare replacement (new in this release) can keep a random sample of the positions that differ between input files for each category of difference it tallies (e.g., missed one allele RA vs RR, missed two alleles AA vs RR, genotypes differ C/T vs A/T where R=G). This is quite useful for inspecting positions in a genome viewer to try to understand differences between calls made by two approaches.
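One way to keep a bounded random sample per category while streaming through the files is reservoir sampling. The sketch below is an assumption about how such sampling could be done, not a description of what vcf-compare does internally; the class name CategorySampler and its parameters are hypothetical.

```python
import random
from collections import defaultdict


class CategorySampler:
    """Keep up to k uniformly sampled positions per difference category
    (reservoir sampling). Hypothetical illustration only."""

    def __init__(self, k, seed=None):
        self.k = k
        self.rng = random.Random(seed)
        self.seen = defaultdict(int)      # positions tallied per category
        self.samples = defaultdict(list)  # retained positions per category

    def add(self, category, position):
        self.seen[category] += 1
        bucket = self.samples[category]
        if len(bucket) < self.k:
            # Reservoir not full yet: always keep the position.
            bucket.append(position)
        else:
            # Replace a retained position with probability k / seen.
            j = self.rng.randrange(self.seen[category])
            if j < self.k:
                bucket[j] = position


sampler = CategorySampler(k=5, seed=1)
for pos in range(1000, 1100):
    sampler.add("missed one allele", ("chr1", pos))
sampler.add("genotypes differ", ("chr2", 42))
sampler.add("genotypes differ", ("chr2", 77))
```

Each category keeps at most k positions regardless of how many differences were tallied, so the sample stays small enough to walk through by hand in a genome viewer.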
More details about this release are in the ChangeLog.
