Sequence programs, written in .sequence files, are run by the Sequence Mode to execute several modes in a specific order with a consistent set of parameters. A sequence file can define variables that are populated with runtime-specific values, which are then passed to the modes. BDVal uses sequence programs to perform the same set of operations consistently in each split of a validation plan.
There are three basic sections to a sequence file:
- variable definitions
- required option definitions
- the options that will get passed to the modes in the sequence.
Blank lines or lines that begin with the character ‘#’ are ignored.
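The three kinds of lines can be told apart with a few lines of code. The following is a minimal Python sketch, not BDVal's own parser: it assumes that variable definitions start with `def `, that required options start with `addoption ` and use a `kind:name:description` layout (both inferred from the example in this section), and that every other non-comment line holds options for a mode. The function name `classify_sequence_lines` is hypothetical.

```python
def classify_sequence_lines(text):
    """Split the text of a sequence file into its three kinds of lines.

    Assumed layout (inferred from the example in this section):
      def NAME=VALUE
      addoption KIND:NAME:DESCRIPTION
      anything else -> options passed to a mode
    """
    variables = {}          # NAME -> VALUE, from "def" lines
    required_options = []   # (KIND, NAME, DESCRIPTION), from "addoption" lines
    mode_lines = []         # remaining lines, in file order

    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and lines beginning with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        if line.startswith("def "):
            name, _, value = line[len("def "):].partition("=")
            variables[name.strip()] = value.strip()
        elif line.startswith("addoption "):
            # Split on the first two colons only, so the description
            # may itself contain colons.
            kind, name, description = line[len("addoption "):].split(":", 2)
            required_options.append((kind, name, description))
        else:
            mode_lines.append(line)

    return variables, required_options, mode_lines
```

Keeping the mode lines in file order matters, because the sequence runs its modes in the order in which they are written.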
The following is an example of a sequence file that will execute svm-weights, write-model and predict modes. The first mode performs feature selection. The second mode trains a model with the subset of features selected in the first step and writes it to disk. Finally, the third mode loads the model, applies it to the test set of the split plan and writes the sample predictions to disk in a predictions file.
The header of the sequence program defines two variables: “label” and “predictions-filename”. Three options are required in addition to the other options that sequence mode requires: “other-options”, “split-id”, and “num-features”. Any token surrounded by “%” characters is populated dynamically by the sequence mode: each token is replaced by the value of the command line argument --<token>. For instance, %dataset-name% is replaced with the value of --dataset-name provided on the command line when the sequence mode is run.
    def label=baseline-global-svm-weights-%model-id%
    def predictions-filename=%dataset-name%-%label%-prediction-table.txt
    #
    addoption required:other-options:Other DAVMode options can be provided here
    addoption required:split-id:id of split being processed
    addoption required:num-features:Number of features in the generated model
    #
    -m svm-weights --overwrite-output true -o %dataset-name%-%split-id%-%label%-features.txt --output-gene-list --gene-list full --gene-features-dir %gene-features-dir% --num-features %num-features% %other-options% --split-type training
    -m write-model --overwrite-output true --gene-list %label%|%dataset-name%-%split-id%-%label%-features.txt %other-options% --split-type training --model-prefix libSVM_%dataset-name%-%split-id%-%label%
    -m predict --overwrite-output false --model libSVM_%dataset-name%-%split-id%-%label%.model -o %predictions-filename% %other-options% --split-type test --true-labels %conditions%
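To make the %token% replacement concrete, here is a minimal Python sketch of the substitution step. It is an illustration under stated assumptions, not BDVal's implementation: the function name `expand` is hypothetical, the values "demo" and "ABCDE" are made up, and the sketch assumes that variables defined with `def` are looked up in the same table as command line arguments and expanded repeatedly, so that %predictions-filename% can refer to %label%, which in turn refers to %model-id%.

```python
import re

# A token is a name between two '%' characters, e.g. %dataset-name%.
TOKEN = re.compile(r"%([A-Za-z][A-Za-z0-9_-]*)%")


def expand(template, values, max_passes=10):
    """Replace each %token% in template with values[token].

    Tokens with no known value are left untouched. Expansion is repeated
    so that a value may itself contain tokens; max_passes guards against
    a value that refers to itself.
    """
    def lookup(match):
        return values.get(match.group(1), match.group(0))

    for _ in range(max_passes):
        expanded = TOKEN.sub(lookup, template)
        if expanded == template:   # nothing left to replace
            break
        template = expanded
    return template


# Hypothetical values: two command line arguments plus the "label"
# variable from the header of the example above.
values = {
    "dataset-name": "demo",
    "model-id": "ABCDE",
    "label": "baseline-global-svm-weights-%model-id%",
}
print(expand("%dataset-name%-%label%-prediction-table.txt", values))
# prints: demo-baseline-global-svm-weights-ABCDE-prediction-table.txt
```

Leaving unknown tokens in place, rather than raising an error, is a choice made for this sketch only; how the sequence mode itself reports a missing argument is not described here.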
To view the sequences implemented in BDVal, click on any link below and log in as “guest” with password <email-address>.