This mode converts alignments to counts for annotations (e.g., gene transcript annotations or exons). It is implemented by edu.cornell.med.icb.goby.modes.CompactAlignmentToAnnotationCountsMode.java.

Mode Parameters

The following options are available in this mode:

| Flag | Arguments | Required | Description |
| --- | --- | --- | --- |
| n/a | input | yes | The compact alignment file(s) to read as input. Basenames are derived from these input files by removing the conventional Goby alignment extensions (e.g., .entries, .tmh, .stat, .header). |
| -o, --output | output | no | The tab-delimited output filename. When omitted, it is constructed from the input filename. Output filenames have the extension .ann-counts.tsv. |
| -s, --stats | stats | no | The file where group-comparison statistics are written, in tab-delimited format. Default value: comparison-stats.tsv |
| -a, --annotation | annotation | yes | The annotation file to read as input. |
| -r, --include-reference-names | include-reference-names | no | When provided, only write counts for the reference identifiers in this comma-separated list. For example, to write counts only for chromosomes 1 and 19 (when sequences are identified as 1 and 19), use: --include-reference-names 1,19 |
| -g, --groups | groups | no | Define groups for multi-group comparisons. The syntax of the groups argument is id-1=basename1,basename2/id-2=basename4,basename5, where id-1 is the id of the first group, defined to consist of samples basename1 and basename2. Each basename must refer to a basename provided as input on the command line (see input). Multiple groups are separated by forward slashes (/). |
| --include-annotation-types | include-annotation-types | no | Comma-delimited list of annotation types. When provided, write annotation counts only for the specified annotation types. By default, counts are written for genes, exons, and the other category. The other category covers intronic or intergenic regions not currently annotated as genes or exons by the given annotation. Default value: gene,exon,other |
| --compare | compare | no | Compare annotation counts between groups of samples. The compare flag must be followed by group ids separated by slashes. For instance, if groups group-A and group-B have been defined (see the --groups option), --compare group-A/group-B will run statistical tests for differences in count representation between the samples in groups A and B. |
| --normalization-methods | normalization-methods | no | Comma-separated list of the normalization methods to apply. Methods currently supported are normalization by aligned sample count (AC=aligned-count) and Bullard upper-quartile normalization (BUQ=bullard-upper-quartile). By default, both are evaluated. This option is available since Goby 1.5. Default value: aligned-count,bullard-upper-quartile |
| --parallel | n/a | no | Process basenames in parallel. Use this when you have many basenames to process, need the parallel speedup, and have enough memory to hold multiple basenames at once. You can tune the number of threads by setting the property pj.nt. For instance, -Dpj.nt=5 will use 5 parallel threads. When --parallel is specified, one thread per processing core of the machine is used unless pj.nt specifies otherwise. Default value: FALSE |
| --write-annotation-counts | write-annotation-counts | no | If true, the annotation counts files will be written. Default value: true |
| --omit-non-informative-columns | omit-non-informative-columns | no | If true, columns that are entirely non-informative will be omitted. Default value: false |
| -w, --use-weights | use-weights | no | Whether weights should be used to adjust read counts. When the flag is set to a string 'id' other than false, this mode will try to load a weights file associated with each input basename ('basename'.'id'-weight). If found, the weights are used to adjust the read count for annotations. This option is available since Goby 1.7. Default value: false |
| --adjust-gc-bias | adjust-gc-bias | no | When other than false, the identifier of a formula used to reweight counts (requires --use-weights gc). If false, no reweighting is done. This option is available since Goby 1.7. Default value: false |
| -t, --eval | eval | no | Names the statistics to evaluate. The complete list of valid statistics names is "samples,fold-change,fold-change-magnitude,log2-fold-change,group-averages,t-test,fisher,fisher-r,chi-square,Bonferroni,BH". This option is available since Goby 1.7. Default value: samples,fold-change,fold-change-magnitude,log2-fold-change,group-averages,t-test,fisher-r,BH |
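
The basename, groups, and compare conventions described above interact, so a small sketch may help when preparing arguments. The Python below is not Goby code and does not reflect how the mode parses its arguments internally; it only mirrors the documented syntax. The function names, the sample file names, and the assumption that the four listed extensions are the only ones to strip are all hypothetical.

```python
# Illustrative sketch only: mirrors the documented argument syntax, not Goby's parser.

# Extensions named as examples in the input option description (assumed complete here).
ALIGNMENT_EXTENSIONS = (".entries", ".tmh", ".stat", ".header")


def to_basename(filename):
    """Strip one conventional alignment extension, if present."""
    for ext in ALIGNMENT_EXTENSIONS:
        if filename.endswith(ext):
            return filename[: -len(ext)]
    return filename


def parse_groups(spec, basenames):
    """Parse 'id-1=b1,b2/id-2=b4,b5' into {group id: [basenames]}.

    Raises ValueError if a definition is malformed or names a basename
    that was not given as input.
    """
    known = set(basenames)
    groups = {}
    for definition in spec.split("/"):
        group_id, sep, members = definition.partition("=")
        if not sep or not group_id or not members:
            raise ValueError(f"malformed group definition: {definition!r}")
        samples = members.split(",")
        unknown = [s for s in samples if s not in known]
        if unknown:
            raise ValueError(f"group {group_id!r} uses unknown basenames: {unknown}")
        groups[group_id] = samples
    return groups


def parse_compare(spec, groups):
    """Parse 'group-A/group-B' and check that every id was defined by the groups spec."""
    ids = spec.split("/")
    missing = [g for g in ids if g not in groups]
    if missing:
        raise ValueError(f"compare refers to undefined groups: {missing}")
    return ids


# Hypothetical input files for three samples.
inputs = ["sampleA.entries", "sampleA.header", "sampleB.entries", "sampleC.tmh"]
basenames = sorted({to_basename(f) for f in inputs})
groups = parse_groups("group-A=sampleA,sampleB/group-B=sampleC", basenames)
compare = parse_compare("group-A/group-B", groups)
```

Checking the groups string against the input basenames before launching a run catches the most common mistake, a group member that does not match any input file.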