This mode discovers biomarkers with the Support Vector Machine (SVM) weight approach. A support vector machine with a linear kernel is trained on the training set. Feature weights are then read from the trained model, and the n features whose weights have the largest absolute values are identified as the most important features (a.k.a. the biomarkers).
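
The ranking step described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the code in DiscoverWithSvmWeights: the class name SvmWeightRanking and the method topFeatures are hypothetical, and the weight vector is made-up data standing in for the weights of an already trained linear SVM.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper for illustration; not part of BDVal.
public class SvmWeightRanking {
    /**
     * Returns the indices of the n features with the largest absolute weight,
     * ordered from most to least important. Ties keep the lower index first.
     */
    public static int[] topFeatures(double[] weights, int n) {
        Integer[] indices = new Integer[weights.length];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = i;
        }
        // Sort feature indices by descending |weight| (the object sort is stable).
        Arrays.sort(indices, Comparator.comparingDouble((Integer i) -> -Math.abs(weights[i])));
        // Never return more features than exist, and treat a negative n as zero.
        int count = Math.max(0, Math.min(n, indices.length));
        int[] selected = new int[count];
        for (int k = 0; k < count; k++) {
            selected[k] = indices[k];
        }
        return selected;
    }

    public static void main(String[] args) {
        // Assumed weights of a trained linear SVM, one per feature (made-up values).
        double[] weights = {0.10, -2.50, 0.03, 1.20, -0.75};
        System.out.println(Arrays.toString(topFeatures(weights, 3))); // prints [1, 3, 4]
    }
}
```

The sign of a weight is ignored on purpose: a strongly negative weight separates the classes as well as a strongly positive one, so only the magnitude is used for ranking.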

This method is computationally efficient, but it selects features by minimizing the error rate on the training set. Cross-validation performed after feature selection may therefore overestimate performance, so the method should be used with a split plan that supports embedded feature selection.

This mode is implemented in org.bdval.DiscoverWithSvmWeights.java.

Mode Parameters

The following options are available in this mode:

Flag                    Arguments      Required   Description
(-n | --num-features)   num-features   no         Number of features to select. (default: 50)
--output-gene-list      n/a            no         Write features to the output in the tissueinfo gene list format.