This mode discovers biomarkers with the iterative Support Vector Machine (SVM) weight approach. A support vector machine with a linear kernel is trained on the training set. Feature weights are then read from the trained model, and the N − k features with the largest absolute weights are kept for the next round, where N is the number of features in the current round. k is chosen such that (N − k) / N = ratio, typically 50%. The process repeats until N − k is less than or equal to the desired number of biomarkers (n). When that condition is met, the n features with the largest absolute weights are written out.
This method minimizes the error rate on the training set and may reduce numerical instability in the SVM optimization process. It is also fairly computationally efficient.
It is implemented by org.bdval.DiscoverWithSvmWeightsIterative.java.
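The reduction schedule described above can be illustrated with a minimal sketch. This is not the BDVal implementation: the class and method names are hypothetical, rounding the per-round feature count down is an assumption, and SVM training is omitted (the weights are simply passed in as an array).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not part of org.bdval.
class IterativeWeightSelectionSketch {

    /**
     * Number of features kept after each round, starting from numFeatures
     * and ending with exactly n. Assumes 0 < ratio < 1.
     */
    static List<Integer> schedule(int numFeatures, int n, double ratio) {
        if (ratio <= 0.0 || ratio >= 1.0) {
            throw new IllegalArgumentException("ratio must be in (0, 1)");
        }
        List<Integer> counts = new ArrayList<>();
        int current = numFeatures;
        while (current > n) {
            // Assumption: the reduced count (N - k) is rounded down.
            int next = (int) Math.floor(current * ratio);
            if (next <= n) {
                // Stopping condition met: keep the n best features instead.
                next = n;
            }
            counts.add(next);
            current = next;
        }
        return counts;
    }

    /** Indices of the m weights with the largest absolute value, best first. */
    static int[] topByAbsoluteWeight(double[] weights, int m) {
        Integer[] idx = new Integer[weights.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        // Sort indices by descending |weight|.
        Arrays.sort(idx, Comparator.comparingDouble((Integer i) -> -Math.abs(weights[i])));
        int[] top = new int[Math.min(m, idx.length)];
        for (int i = 0; i < top.length; i++) {
            top[i] = idx[i];
        }
        return top;
    }

    public static void main(String[] args) {
        // 1000 features, 50 requested, ratio 0.5
        System.out.println(schedule(1000, 50, 0.5)); // [500, 250, 125, 62, 50]
        System.out.println(Arrays.toString(
                topByAbsoluteWeight(new double[] {0.1, -2.0, 0.5, 1.5}, 2))); // [1, 3]
    }
}
```

With the default ratio of 0.5, the number of training rounds grows only logarithmically with the initial number of features, which is consistent with the efficiency claim above.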
Mode Parameters
The following options are available in this mode:
| Flag | Arguments | Required | Description |
|---|---|---|---|
| `-n`, `--num-features` | num-features | no | Number of features to select. (default: 50) |
| `--output-gene-list` | n/a | no | Write features to the output in the tissueinfo gene list format. |
| `-r`, `--ratio` | ratio | no | The ratio of the new number of features to the original number of features, for each iteration. (default: 0.5) |

