To upload a new Sample (a collection of reads) to GobyWeb, use the menu item “Actions | Upload a new sample”:

Clicking on this link will bring up the sample upload page (numbers in red highlight features we discuss below):

 

The notable features of the “Upload a new sample” page are:

  1. Each item in GobyWeb has a Tag, which is a shorthand identifier for the item. You can ignore these or use them in your project notebook to help you keep notes about Samples, Alignments, etc. These are automatically generated by GobyWeb.
  2. This checkbox determines how the files uploaded in (4) are handled when you upload more than one file. Normally, if you upload more than one file on the “Upload a new sample” page, all of the files are concatenated into a single large compact-reads file, creating a single sample. If you check the “Create Multiple Samples” option, each uploaded file creates its own GobyWeb sample. In this case, sample names are determined by concatenating the name provided in (3) with the names of the uploaded files.
  3. Here you can specify the name and description of the sample you are uploading. Use text strings that are meaningful to you. It is best to use short strings that identify the sample and clearly label its group.
  4. This part of the form is where you upload the files for the new Sample. You can browse for one or more files (see the uploading section below for more details).
  5. Here you specify the Attributes of the Sample. Please be careful to specify the correct Platform and Organism as these will restrict the choices offered for alignment (i.e., you will not be able to align a mouse sample to a human genome). Platform and organism attributes can be edited after the files have been uploaded.
  6. Each of the items created in GobyWeb can optionally be shared with any other registered GobyWeb user. You can alter sharing settings later, or set them at the time you upload samples.
  7. After you have reviewed the sample name and attributes and finished uploading the sample file(s), click Create to finalize sample creation.
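
The effect of the “Create Multiple Samples” checkbox in (2) can be summarized with a small sketch. This is only an illustration of the behavior described above, not GobyWeb code: the function name `plan_samples` is hypothetical, and the `-` separator and the use of the full file name (with extension) are assumptions, since the text only says that the two names are concatenated.

```python
from pathlib import Path


def plan_samples(base_name, filenames, create_multiple=False, separator="-"):
    """Return a mapping of sample name -> list of files that make up the sample.

    The separator is an assumption; the manual only states that the name
    from (3) and the uploaded file name are concatenated.
    """
    if not create_multiple or len(filenames) <= 1:
        # Default behavior: all files are concatenated into one sample.
        return {base_name: list(filenames)}
    # Checkbox ticked and several files: one sample per uploaded file.
    return {f"{base_name}{separator}{Path(f).name}": [f] for f in filenames}


if __name__ == "__main__":
    files = ["lane1.fastq", "lane2.fastq"]
    print(plan_samples("liver", files))
    print(plan_samples("liver", files, create_multiple=True))
```

With the checkbox unticked, both files end up in a single sample named after (3); with it ticked, each file gets its own sample.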

Let’s discuss the uploading features in more detail. GobyWeb supports two upload modes. The first mode (shown above) works in any web browser but limits individual file uploads to 2 GB. Simply click Browse to locate the files you want to upload. The newer Java applet-based uploader removes this limit, but requires a browser that can run Java applets (we had issues with Chrome on macOS, but not with Safari or Firefox). The Java applet uploader is the default, but you can switch to the other uploader with the toggle control to the right of “Select Uploader to Use:”. The Java applet uploader looks like this:

Quality Encoding

If you are uploading files other than Goby compact-reads, you will need to specify how the quality scores are encoded. This is necessary because some files do not clearly specify the unit used to represent base quality scores (see Cock et al., NAR 2010). GobyWeb will convert the quality scores according to your instructions. Three choices are currently supported:
  • Illumina (use for FASTQ files produced with an Illumina pipeline version 1.3 or later).
  • Sanger (use for FASTQ files that use Phred encoding, for example files obtained from SRA).
  • Solexa (use for FASTQ files produced with an Illumina pipeline version prior to 1.3).
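
To make the difference between the three encodings concrete, here is a minimal Python sketch of how a quality string could be decoded to Phred scores under each choice. It follows the conventions described in Cock et al. (Sanger: Phred scores with ASCII offset 33; Illumina 1.3+: Phred scores with ASCII offset 64; Solexa: Solexa scores with ASCII offset 64). The function names are hypothetical, and this is not necessarily how GobyWeb performs the conversion internally.

```python
import math


def solexa_to_phred(q_solexa):
    """Convert a Solexa-scaled score to the equivalent Phred score."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def decode_qualities(quality_string, encoding):
    """Decode a FASTQ quality string to a list of integer Phred scores.

    `encoding` is one of the three choices offered on the upload page.
    """
    if encoding == "Sanger":
        return [ord(c) - 33 for c in quality_string]
    if encoding == "Illumina":
        return [ord(c) - 64 for c in quality_string]
    if encoding == "Solexa":
        # Solexa scores use a different scale and must be converted.
        return [round(solexa_to_phred(ord(c) - 64)) for c in quality_string]
    raise ValueError(f"unknown quality encoding: {encoding!r}")


if __name__ == "__main__":
    print(decode_qualities("I", "Sanger"))    # character 'I' is ASCII 73
    print(decode_qualities("h", "Illumina"))  # character 'h' is ASCII 104
    print(decode_qualities("h;", "Solexa"))
```

Choosing the wrong encoding shifts every score: the same character `h` decodes to 40 under the Illumina setting but to 71 under the Sanger setting, which is why the choice on the upload page matters.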

Samples being created

After clicking the Create button, you will be shown the “Show Sample” page:

The notable features of the “Show Sample” page are:

  1. You can see that the Sample was created.
  2. You can see the files that make up the Sample and the size of the .compact-reads file as stored in GobyWeb.
  3. Before the Sample can be aligned, GobyWeb inspects the files you’ve uploaded, checking read lengths and several other properties. You can click the (refresh) button to reload the current page while you wait for the Sample to become “Ready to Align”. Once it is, if your Sample included quality scores (such as from a FASTQ file), you will be presented with a plot of Phred quality scores for the reads in the Sample, which can be useful in assessing the overall quality of your input file.
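
As an illustration of what a quality plot like the one in (3) can summarize, the sketch below computes the mean Phred score at each read position from a set of quality strings. How GobyWeb actually builds its plot is not described here; this is one common summary, the function name `mean_quality_by_position` is hypothetical, and the default offset of 33 assumes Sanger-encoded qualities.

```python
def mean_quality_by_position(quality_strings, offset=33):
    """Return the mean Phred score at each read position.

    Reads may have different lengths; each position is averaged only
    over the reads that are long enough to cover it.
    """
    totals = []  # sum of scores seen at each position
    counts = []  # number of reads covering each position
    for quality in quality_strings:
        for i, char in enumerate(quality):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += ord(char) - offset
            counts[i] += 1
    return [total / count for total, count in zip(totals, counts)]


if __name__ == "__main__":
    # Two reads of different lengths, Sanger-encoded.
    print(mean_quality_by_position(["II", "5"]))
```

A curve that drops sharply toward the end of the reads is a typical sign that the later bases are less reliable.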