This mode discovers biomarkers with the iterative Support Vector Machine (SVM) weight approach. A support vector machine with a linear kernel is trained on the training set. Feature weights are then read from the trained model, and the N − k features with the largest absolute weights are kept for the next round, where N is the number of features in the current round. k is chosen such that (N − k) / N = ratio, typically 50%. The process repeats until N − k is less than or equal to the desired number of biomarkers (n). When this condition is met, the n features with the largest absolute weights are written out.
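
The loop described above can be sketched as follows. This is a minimal illustration, not the BDVal implementation: the class IterativeWeightSelection, the WeightFunction interface (a stand-in for training a linear SVM and reading back one weight per feature), and the rounding rule (floor of N × ratio) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of iterative feature selection by absolute weight. */
class IterativeWeightSelection {

    /** Stand-in for "train a linear SVM on these features, return one weight per feature". */
    public interface WeightFunction {
        double[] weights(List<Integer> featureIndices);
    }

    /** Returns the `keep` entries of `features` with the largest |weight|, largest first. */
    static List<Integer> topByAbsoluteWeight(List<Integer> features, double[] w, int keep) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < features.size(); i++) {
            positions.add(i);
        }
        // Sort positions by decreasing absolute weight (stable, so ties keep their order).
        positions.sort(Comparator.comparingDouble((Integer i) -> -Math.abs(w[i])));
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < Math.min(keep, positions.size()); i++) {
            kept.add(features.get(positions.get(i)));
        }
        return kept;
    }

    /**
     * @param totalFeatures number of features before the first round
     * @param n             desired number of biomarkers
     * @param ratio         (N - k) / N, strictly between 0 and 1
     */
    public static List<Integer> select(int totalFeatures, int n, double ratio, WeightFunction model) {
        if (ratio <= 0.0 || ratio >= 1.0) {
            throw new IllegalArgumentException("ratio must be strictly between 0 and 1");
        }
        List<Integer> current = new ArrayList<>();
        for (int i = 0; i < totalFeatures; i++) {
            current.add(i);
        }
        while (true) {
            double[] w = model.weights(current);                  // weights from this round's model
            int next = (int) Math.floor(current.size() * ratio);  // N - k; floor is an assumption
            if (next <= n) {
                // Stopping condition met: report the n features with the largest |weight|.
                return topByAbsoluteWeight(current, w, n);
            }
            current = topByAbsoluteWeight(current, w, next);      // shrink and start over
        }
    }
}
```

A caller would supply a WeightFunction that trains a model on the listed features and returns its weights in the same order; select(N, n, ratio, model) then returns the indices of the selected features. Because the function is pluggable, the shrinking logic can be checked with synthetic weights before a real SVM is wired in.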

This method minimizes the error rate on the training set and may reduce the numerical instability of the SVM optimization process. It is also fairly computationally efficient.

It is implemented by org.bdval.DiscoverWithSvmWeightsIterative.java.

Mode Parameters

The following options are available in this mode:

Flag                    Arguments     Required  Description
(-n | --num-features)   num-features  no        Number of features to select. (default: 50)
--output-gene-list      n/a           no        Write features to the output in the tissueinfo gene list format.
(-r | --ratio)          ratio         no        Ratio of the new number of features to the original number of features, for each iteration. (default: 0.5)
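
To see how the number of features to select and the ratio interact, the sketch below lists the feature count on which a model is trained in each round. The class name ShrinkSchedule, the starting count of 1000 features, and the floor rounding are assumptions made for illustration; the rounding actually used by BDVal may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: feature counts per round for given option values. */
class ShrinkSchedule {

    /** Returns the number of features used for training in each round, in order. */
    static List<Integer> rounds(int totalFeatures, int numFeatures, double ratio) {
        if (ratio <= 0.0 || ratio >= 1.0) {
            throw new IllegalArgumentException("ratio must be strictly between 0 and 1");
        }
        List<Integer> sizes = new ArrayList<>();
        int current = totalFeatures;
        while (true) {
            sizes.add(current);
            int next = (int) Math.floor(current * ratio); // assumed rounding
            if (next <= numFeatures) {
                return sizes; // the final selection is taken from this round
            }
            current = next;
        }
    }

    public static void main(String[] args) {
        // Default option values (50 features, ratio 0.5) on a hypothetical 1000-feature input.
        System.out.println(rounds(1000, 50, 0.5));
    }
}
```

Under these assumptions, the default values applied to 1000 starting features lead to five training rounds, on 1000, 500, 250, 125, and 62 features, with the final 50 features taken from the last round. A ratio closer to 1 removes fewer features per round and therefore trains more models.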