New: We have developed a user interface for BDVal (see BDVal for MPS) that makes it much easier to configure BDVal projects.

BDVal stands for Biomarker Discovery and VALidation [1-2]. It is an open source project for discovering and validating biomarkers in high-throughput datasets, including microarray and proteomics data. The program is distributed under the GNU General Public License (GPL). See the download page for the most recent distribution.

  • BDVal directly supports many kinds of classifiers: it can train Weka and LIBSVM classifiers. Compared with using these libraries directly, BDVal adds higher-level services for developing and evaluating models. See the FAQ for details.
  • BDVal supports various feature selection strategies and validation protocols. Feature selection: Support Vector Machine (SVM) weights, recursive feature elimination, genetic algorithm wrappers, t-test, fold change, and any sensible combination of these strategies. Validation: leave-one-out and stratified cross-validation with random repeats.
  • BDVal leverages biological information: gene lists and pathway information can be used for a priori feature selection or feature aggregation.
  • BDVal is a high-performance program: it takes advantage of multi-threaded machines transparently.
  • BDVal is highly portable: thanks to Java, it runs without recompilation on a laptop computer, a multi-processor SMP machine, or the Sun Grid Engine.
  • BDVal output is fully reproducible: all steps of discovery and validation are automated, and random seeds can be controlled. The program generates detailed validation statistics and detailed model information.
  • BDVal is robust: the program has been used in the MAQC-II community evaluation of biomarker discovery approaches.
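
To make the validation protocol mentioned above more concrete, here is a minimal sketch of stratified cross-validation with random repeats and a controlled random seed. It is written in plain Python for illustration only and is not BDVal's implementation; the function names (stratified_folds, repeated_stratified_cv) and their parameters are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed):
    """Split sample indices into k folds that preserve class proportions.

    Hypothetical helper for illustration; not part of BDVal.
    """
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the samples of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return [sorted(fold) for fold in folds]


def repeated_stratified_cv(labels, k, repeats, seed):
    """Yield (repeat, train_indices, test_indices) for each fold of each repeat."""
    for repeat in range(repeats):
        # Derive one seed per repeat so every repeat uses a different split.
        folds = stratified_folds(labels, k, seed + repeat)
        for test in folds:
            held_out = set(test)
            train = [i for i in range(len(labels)) if i not in held_out]
            yield repeat, train, test
```

Calling repeated_stratified_cv twice with the same labels, k, repeats, and seed yields identical splits, which is the property that makes a validation run repeatable.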

Interested in trying BDVal? Here is how we recommend proceeding:

  1. Download the software
  2. Follow the detailed walkthrough as described in the BDVal User Guide v1.1. The manual describes how to use the Ant scripts to automate the process of model development and validation. This is the recommended approach because it scales well for working with thousands of models built across tens of endpoints.
  3. Learn about other configuration options.
  4. Familiarize yourself with the various BDVal modes. These modes are used by the Ant scripts described in the user manual, but can also be run as standalone programs for integration into other analysis pipelines. Use java -jar bdval.jar -help to display a list of BDVal modes/utilities. Help is context sensitive.
  5. If you are a programmer interested in using BDVal in your own projects, look at the Developers Page and check the Java API pages. Version 1.1 of BDVal supports Java programming only. However, you can call BDVal modes from any scripting language, such as Perl or Groovy.
  6. Questions? Please consult the BDVal user forum for previous answers or ask new questions.
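
As an illustration of calling BDVal from a scripting language, the sketch below wraps the java -jar bdval.jar -help command from step 4 in Python. The only BDVal argument used is -help, which comes from this page; the helper names (bdval_command, run_bdval) and the assumption that bdval.jar sits in the working directory are hypothetical.

```python
import subprocess


def bdval_command(jar_path, *args):
    """Build the argument list for a BDVal invocation.

    Hypothetical helper; it only assembles 'java -jar <jar> <args...>'.
    """
    return ["java", "-jar", jar_path, *args]


def run_bdval(jar_path, *args):
    """Run BDVal and return its exit code and combined text output.

    Requires a 'java' executable on the PATH and a readable jar file.
    """
    completed = subprocess.run(
        bdval_command(jar_path, *args),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so help text is not lost
        text=True,
    )
    return completed.returncode, completed.stdout


# Example (assumes bdval.jar is in the current directory):
#     code, output = run_bdval("bdval.jar", "-help")
#     print(output)
```

Keeping the command construction separate from its execution makes it easy to log or inspect the exact command line before running it on a cluster.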

See the related pages listed on the right side of this page for an exhaustive list of pages about BDVal.

1. Introduction to the development and validation of predictive biomarker models from high-throughput data sets. Deng X, Campagne F. Methods Mol Biol. 2010;620:435-70.

2. BDVal: reproducible large-scale predictive model development and validation in high-throughput datasets. Dorff KC, Chambwe N, Srdanovic M, Campagne F. Bioinformatics. 2010 Oct 1;26(19):2472-3. Epub 2010 Aug 11.