This mode compares the distribution of each feature used in a specified set of biomarker models. For each feature, the difference in signal distribution is quantified between two sample sets (e.g., training set vs. validation set). A P-value (Kolmogorov-Smirnov test) and a ratio of rank statistics are evaluated for each feature in each model processed.

The mode is implemented by the class org.bdval.DistributionDifferenceByFeatureMode (source file DistributionDifferenceByFeatureMode.java).
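
To make the Kolmogorov-Smirnov comparison concrete, the following is a minimal Python sketch of a two-sample test applied to one feature's signal in two sample sets. It is not the code used by this mode: the function name `ks_two_sample` is hypothetical, and the P-value uses the standard asymptotic series with a commonly used small-sample correction, which may differ from what the Java implementation computes. The ratio of rank statistics is not covered here.

```python
import math


def ks_two_sample(a, b):
    """Return (D, p) for a two-sample Kolmogorov-Smirnov test.

    D is the largest gap between the two empirical CDFs; p is an
    asymptotic approximation (an assumption of this sketch).
    """
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples, advancing past ties together so the
    # empirical CDFs are compared only at distinct values.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))

    # Effective sample size and small-sample correction.
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:
        # The series converges poorly here; the true value is ~1.
        return d, 1.0
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(1.0, max(0.0, p))


# Hypothetical signal values for one feature in two sample sets.
training = [1.0, 2.0, 3.0, 4.0, 5.0]
validation = [6.0, 7.0, 8.0, 9.0, 10.0]
d_stat, p_value = ks_two_sample(training, validation)
```

A small P-value suggests that the feature's signal is distributed differently in the two sets, which is the kind of shift this mode is meant to flag.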

Mode Parameters

The following options are available in this mode:

| Flag | Arguments | Required | Description |
|------|-----------|----------|-------------|
| --maqcii-properties-file | maqcii-properties-file | yes | The maqcii properties file, such as maqcii-c.properties |
| --model-conditions-file | model-conditions-file | yes | The model conditions file, such as ‘model-conditions.txt’ |
| --models-dir | models-dir | yes | The directory containing models (may be within sub-directories) |
| --model-list | model-list | no | The models to process (or ‘all’ to process all models). Comma separated, such as ‘DUDTR,YTNJM’. (default: all) |
| --model-exclude-list | model-exclude-list | no | The models to NOT process (or ‘none’ to process all models). Comma separated, such as ‘DUDTR,YTNJM’. (default: none) |
| --signal-quality-calc-class | signal-quality-calc-class | yes | Fully qualified classname for an AbstractSignalQualityCalculator class |
| --eval-dataset-root | eval-dataset-root | no | The eval-dataset-root directory, or specify ‘-’ to use the dataset-root directory specified in the model-conditions file. (default: -) |
| --properties-training-label | properties-training-label | no | The label used to denote training values in the properties file. (default: training) |
| --properties-validation-label | properties-validation-label | no | The label used to denote validation values in the properties file. (default: validation) |
| --extended-output | extended-output | yes | If true, extra output will be included. (default: false) |
| --merge-classes | merge-classes | yes | If true, all classes will be merged. (default: false) |
| --max-num-classes | max-num-classes | yes | The maximum number of classes (for the output file header). (default: 2) |
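
The way --model-list and --model-exclude-list might combine can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's actual logic: the function `select_models` is hypothetical, and it assumes that the exclude list is applied after the include list, which this page does not specify.

```python
def select_models(available, model_list="all", model_exclude_list="none"):
    """Filter model IDs using comma-separated include and exclude lists.

    Assumption: exclusion is applied after inclusion. The defaults
    mirror the documented ones ('all' and 'none').
    """
    def parse(value):
        # Split on commas and drop surrounding whitespace and empty items.
        return {item.strip() for item in value.split(",") if item.strip()}

    include = None if model_list.strip().lower() == "all" else parse(model_list)
    exclude = (
        set()
        if model_exclude_list.strip().lower() == "none"
        else parse(model_exclude_list)
    )
    return [
        model
        for model in available
        if (include is None or model in include) and model not in exclude
    ]


# 'ABCDE' is a made-up model ID used only for this example.
models = ["DUDTR", "YTNJM", "ABCDE"]
selected = select_models(models, model_list="DUDTR,YTNJM", model_exclude_list="YTNJM")
```

With the defaults, every available model is kept; in the call above, only DUDTR remains after YTNJM is excluded.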