This page describes the BWA_BAM plugin, as an example of a plugin that generates BAM output.

Config.xml

The config.xml file contains:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<alignerConfig>
    <name>BWA (BAM output)</name>
    <id>BWA_BAM</id>
    <help>BWA writing to BAM output. This aligner requests exclusive access to a server node and run parallel on this
        node against a single reads file.
    </help>
    <requires>
        <resource>
            <id>BWA_GOBY</id>
            <version-at-least>0.5.9.16</version-at-least>
        </resource>
        <resource>
            <id>SAMTOOLS</id>
            <version-at-least>0.1.14</version-at-least>
        </resource>
    </requires>
    <supportsColorSpace>true</supportsColorSpace>
    <supportsBisulfiteConvertedReads>false</supportsBisulfiteConvertedReads>
    <supportsGobyReads>true</supportsGobyReads>
    <supportsGobyAlignments>false</supportsGobyAlignments>
    <supportsPairedEndAlignments>true</supportsPairedEndAlignments>
    <supportsFastqReads>false</supportsFastqReads>
    <supportsFastaReads>false</supportsFastaReads>
    <supportsBAMAlignments>true</supportsBAMAlignments>
    <options>
       ...
    </options>
    <version>1.0</version>
</alignerConfig>

The plugin identifier must match the name of the directory where the plugin is defined. Please note that case matters in this comparison.
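
Because a case mismatch between the identifier and the directory name is easy to introduce, a quick check before deploying a plugin can help. The following is a minimal sketch and not part of GobyWeb: the helper name check_plugin_id is hypothetical, and it assumes that the plugin's own <id> is the first <id> element in config.xml, as in the file shown above.

```shell
# Hypothetical helper (not a GobyWeb function): succeeds when the first <id>
# element of config.xml equals the plugin directory name, case-sensitively.
check_plugin_id() {
    local plugin_dir dir_name plugin_id
    plugin_dir=${1%/}                 # drop a trailing slash, if any
    dir_name=${plugin_dir##*/}        # keep only the last path component
    # Assumption: the plugin <id> appears before any <resource> ids.
    plugin_id=$(sed -n 's:.*<id>\(.*\)</id>.*:\1:p' "${plugin_dir}/config.xml" | head -n 1)
    [ "${plugin_id}" = "${dir_name}" ]
}
```

It could be used as check_plugin_id path/to/BWA_BAM || echo "id does not match directory name", where the path is a placeholder for wherever the plugin directory lives.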

This plugin declares two required resources: SAMTOOLS, version 0.1.14 or later, and BWA_GOBY, version 0.5.9.16 or later.
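
This page does not say how GobyWeb compares version numbers internally. As an illustration of what an "at least" test on dotted versions involves, here is a sketch that relies on sort -V (version sort, available in GNU coreutils); the function name version_at_least is hypothetical.

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
version_at_least() {
    local actual=$1 required=$2 lowest
    # sort -V compares dotted versions field by field, numerically, so
    # 0.1.9 sorts before 0.1.14. The requirement is met when the required
    # version is the lowest of the two (or both are equal).
    lowest=$(printf '%s\n%s\n' "${required}" "${actual}" | sort -V | head -n 1)
    [ "${lowest}" = "${required}" ]
}

version_at_least 0.1.18 0.1.14 && echo "SAMTOOLS 0.1.18 satisfies >= 0.1.14"
```

A plain string comparison would get this wrong, since "0.1.9" sorts after "0.1.14" lexically.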

The various <supports…> elements describe the capabilities of the plugin. BAM aligner plugins must declare:

<supportsBAMAlignments>true</supportsBAMAlignments>

This indicates to GobyWeb that the alignments produced by the plugin are in the BAM format. By convention, this also means that the plugin is responsible for aligning the entire read file and producing one BAM and one BAM index file per GobyWeb sample.

The element <supportsColorSpace>true</supportsColorSpace> indicates that the plugin can align reads encoded in color-space.

Similarly, <supportsBisulfiteConvertedReads>false</supportsBisulfiteConvertedReads> indicates that the plugin cannot align bisulfite converted reads.

The other elements are described in the table below:

Type     Element                          Description
boolean  supportsBAMAlignments            Indicates whether the aligner can write alignments in the BAM format.
boolean  supportsBisulfiteConvertedReads  Indicates that the aligner can process bisulfite converted reads.
boolean  supportsColorSpace               Indicates whether this aligner supports reads in color-space.
boolean  supportsFastaReads               Indicates whether the aligner can read Fasta read files.
boolean  supportsFastqReads               Indicates whether the aligner can read Fastq read files.
boolean  supportsGobyAlignments           Indicates whether the aligner can write alignments in the Goby format.
boolean  supportsGobyReads                Indicates whether the aligner can read Goby read files.
boolean  supportsPairedEndAlignments      Indicates whether the aligner can perform paired-end alignment.

The configuration of BWA_BAM is appropriate for a plugin that aligns base-space or color-space reads, either single-end or paired-end, and writes BAM output.

script.sh

The script file is written in bash and defines one function called plugin_align:

function plugin_align {
    OUTPUT=$1
    BASENAME=$2
    ...
}

The function will be called with two arguments: an argument that can be used as a temporary filename, and the basename that should be used to store the sorted alignment.

The function can be called like this:

plugin_align tmp-filename basename

In that case, the function could write an intermediate SAM file to tmp-filename, and must write the sorted BAM output and its index to:

  • basename.bam and
  • basename.bam.bai
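
The calling convention can be sketched with a stand-in function. This is only an illustration of the contract described above: the plugin_align below creates empty placeholder files instead of running an aligner, and bam_outputs_exist is a hypothetical check, not something GobyWeb is known to provide.

```shell
# Stand-in for a real plugin_align: it only creates empty placeholder files
# where an aligner would write its results.
plugin_align() {
    OUTPUT=$1
    BASENAME=$2
    : > "${OUTPUT}"             # placeholder for the intermediate SAM file
    : > "${BASENAME}.bam"       # placeholder for the sorted BAM file
    : > "${BASENAME}.bam.bai"   # placeholder for the BAM index
}

# Hypothetical check a caller might run after plugin_align returns:
# succeeds only when both expected output files exist.
bam_outputs_exist() {
    [ -f "$1.bam" ] && [ -f "$1.bam.bai" ]
}
```

After plugin_align tmp-filename basename, the call bam_outputs_exist basename succeeds only if both basename.bam and basename.bam.bai were produced.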

After assigning its two arguments to OUTPUT and BASENAME, the script checks whether the sample is encoded in color-space and, if so, sets the color-space option expected by bwa (-c):

    COLOR_SPACE_OPTION=""
    if [ "${COLOR_SPACE}" == "true" ]; then
        COLOR_SPACE_OPTION="-c"
    fi

Plugins that declare <supportsColorSpace>false</supportsColorSpace> do not need to handle color-space reads, since GobyWeb will never invoke them on such read files.

    # set the number of threads to the number of cores available on the server:
    NUM_THREADS=`grep physical  /proc/cpuinfo |grep id|wc -l`
    PARALLEL_OPTION="-t ${NUM_THREADS}"

The previous lines count the "physical id" entries in /proc/cpuinfo, which on typical x86 Linux systems gives one entry per logical processor on the execution node, and set the bwa option -t accordingly.
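
Parsing /proc/cpuinfo only works on Linux, and only where that file has "physical id" lines. As a possible alternative, and not what the plugin itself does, the sketch below asks nproc (GNU coreutils) or getconf _NPROCESSORS_ONLN for the processor count and falls back to 1. The function name detect_num_threads is hypothetical.

```shell
# Hypothetical helper: prints the number of online processors, or 1 when
# it cannot be determined.
detect_num_threads() {
    local n=""
    if command -v nproc >/dev/null 2>&1; then
        n=$(nproc)
    elif command -v getconf >/dev/null 2>&1; then
        n=$(getconf _NPROCESSORS_ONLN 2>/dev/null)
    fi
    # Fall back to a single thread if the result is empty or not a number.
    case "${n}" in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "${n}"
}

NUM_THREADS=$(detect_num_threads)
PARALLEL_OPTION="-t ${NUM_THREADS}"
echo "${PARALLEL_OPTION}"
```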

if [ "${PAIRED_END_ALIGNMENT}" == "true" ]; then
 ...
else
 ...
fi

The previous if statement tests whether the reads file contains paired-end reads. GobyWeb initializes the variable PAIRED_END_ALIGNMENT from the metadata stored about the sample. For simplicity, we discuss only the alignment of single-end reads (when the value of ${PAIRED_END_ALIGNMENT} is false). The following instructions perform this alignment:

        # Single end alignment, native aligner
        SAI_FILE_0=${READS##*/}.sai
        nice ${RESOURCES_BWA_GOBY_EXEC_PATH} aln ${PARALLEL_OPTION}  ${COLOR_SPACE_OPTION} \
          -f ${SAI_FILE_0} -l ${INPUT_READ_LENGTH} ${ALIGNER_OPTIONS} ${INDEX_DIRECTORY}/${INDEX_PREFIX} ${READS}
        dieUponError "bwa aln step failed (single end), sub-task ${CURRENT_PART} of ${NUMBER_OF_PARTS}, failed"
        # aln worked, let's samse
        nice ${RESOURCES_BWA_GOBY_EXEC_PATH} samse ${COLOR_SPACE_OPTION} -f pre-sort-${TAG}  \
            ${INDEX_DIRECTORY}/${INDEX_PREFIX} ${SAI_FILE_0} ${READS}
        dieUponError "bwa samse step failed (single end), sub-task ${CURRENT_PART} of ${NUMBER_OF_PARTS}, failed"

The bash construct SAI_FILE_0=${READS##*/}.sai strips the READS path to retain only the filename of the reads file. This operation is used to construct a filename local to the current directory where the plugin script executes.
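
The effect of the ##*/ expansion can be seen in isolation. The reads path below is made up for the example; only the expansion itself comes from the plugin script.

```shell
# Hypothetical reads path, used only to demonstrate the expansion.
READS=/scratch/job-42/sample-A.compact-reads

# ${READS##*/} removes the longest prefix matching "*/", that is,
# everything up to and including the last slash.
SAI_FILE_0=${READS##*/}.sai

echo "${SAI_FILE_0}"   # prints sample-A.compact-reads.sai
```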

The second command runs the version of BWA with Goby support, which is imported as a resource. The variable RESOURCES_BWA_GOBY_EXEC_PATH is set by the plugin system following the pattern RESOURCES_<plugin-id>_<file-id>. In this case, plugin-id refers to the BWA_GOBY resource plugin, and file-id is EXEC_PATH, the id of the executable file defined by this resource.
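
The naming pattern can be spelled out as a small function. This is only an illustration of how the variable names used on this page are composed; resource_var_name is a hypothetical helper, not part of the plugin system.

```shell
# Hypothetical helper: prints the name of the variable that holds the path
# of file <file-id> from resource plugin <plugin-id>.
resource_var_name() {
    printf 'RESOURCES_%s_%s\n' "$1" "$2"
}

resource_var_name BWA_GOBY EXEC_PATH   # prints RESOURCES_BWA_GOBY_EXEC_PATH
resource_var_name SAMTOOLS EXEC_PATH   # prints RESOURCES_SAMTOOLS_EXEC_PATH
```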

The call to dieUponError that follows checks the exit status of the preceding command (by convention, a non-zero status indicates an error). If the command failed, it pushes a message to the GobyWeb user interface and terminates the script.
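
The source of dieUponError is not shown on this page. To make the mechanism concrete, here is a simplified sketch under stated assumptions: the function is named dieUponErrorSketch to make clear it is not the real one, and it writes the message to standard error instead of pushing it to the GobyWeb user interface.

```shell
# Simplified stand-in for dieUponError (assumption: the real function also
# reports the message to the GobyWeb user interface, which is omitted here).
dieUponErrorSketch() {
    local status=$?          # must be read first: exit status of the previous command
    if [ "${status}" -ne 0 ]; then
        echo "Job failed: $1" >&2
        exit "${status}"     # stop the script, propagating the failing status
    fi
}
```

Because the function reads $? on entry, it has to be called immediately after the command it checks; any command placed in between would overwrite the status.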

The last lines similarly run bwa samse and interrupt processing if this step fails for any reason.

Finally, the script converts the file produced by bwa to a sorted BAM file and indexes it. The function dieUponError is called again after each step, to check that the step succeeded and to report the error condition otherwise. The output is written to ${BASENAME}.bam and ${BASENAME}.bam.bai:

    nice ${RESOURCES_SAMTOOLS_EXEC_PATH}  view -uS ${OUTPUT}  | ${RESOURCES_SAMTOOLS_EXEC_PATH}  sort - ${BASENAME}
    dieUponError "samtools view|sort step failed, sub-task ${CURRENT_PART} of ${NUMBER_OF_PARTS}, failed"
    # sort worked. We index the BAM file. If this works, the return code will be 0, indicating no problem with plugin_align
    nice ${RESOURCES_SAMTOOLS_EXEC_PATH} index ${BASENAME}.bam
    dieUponError "samtools index step failed, sub-task ${CURRENT_PART} of ${NUMBER_OF_PARTS}, failed"
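
One point to keep in mind with the view | sort pipeline: by default, bash reports only the exit status of the last command of a pipeline, so a failure of samtools view alone would not be visible in $?. The sketch below demonstrates this behavior with false | true standing in for the two samtools commands. Whether the plugin script enables pipefail is not stated on this page.

```shell
# Default behavior: the pipeline status is that of the last command only.
status_default=0
( false | true ) || status_default=$?

# With pipefail (a bash option), a failure anywhere in the pipeline is reported.
status_pipefail=0
( set -o pipefail; false | true ) || status_pipefail=$?

echo "default: ${status_default}, pipefail: ${status_pipefail}"   # prints default: 0, pipefail: 1
```

Adding set -o pipefail before the pipeline is one way to let the status check that follows also catch a failure in the first command.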