Count mapped reads (app: Index BAM and get stats)

Description: The Index BAM and get stats app is based on SAMtools idxstats, which produces a tab-delimited table listing each reference sequence, its length, the number of reads mapped to it, and the number of unmapped reads. Because the reference sequences in this case are the individual transcript sequences, the number of mapped reads is the read count for each transcript. Documentation:
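
For readers who want to reproduce the step outside the Discovery Environment, the sketch below shows the two standard SAMtools commands the app is presumably built on (`samtools index` followed by `samtools idxstats`). How the app actually invokes them is not documented here, so treat this as an assumption. The helper names `idxstats_commands` and `run_idxstats` and the file name `BAcon01bowtie.sorted.bam` are hypothetical.

```python
import shutil
import subprocess


def idxstats_commands(bam_path):
    """Return the two samtools command lines: index the BAM, then report stats."""
    bam = str(bam_path)
    return [
        ["samtools", "index", bam],     # writes an index file alongside the BAM
        ["samtools", "idxstats", bam],  # prints the tab-delimited table to stdout
    ]


def run_idxstats(bam_path):
    """Index a coordinate-sorted BAM file and return the idxstats table as text."""
    if shutil.which("samtools") is None:
        raise RuntimeError("samtools not found on PATH")
    index_cmd, stats_cmd = idxstats_commands(bam_path)
    subprocess.run(index_cmd, check=True)
    result = subprocess.run(stats_cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Hypothetical file name; only the command lines are built here, nothing is run.
for cmd in idxstats_commands("BAcon01bowtie.sorted.bam"):
    print(" ".join(cmd))
```

Calling `run_idxstats` requires a local SAMtools installation and a sorted BAM file; the loop at the end only prints the command lines.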

  1. Log in to the Discovery Environment.
  2. Open the Index BAM and get stats app (Public Applications > NGS > SAMtools > Index BAM and get stats).
    1. Change 'Analysis Name' to ReadCounts_con01, add a 'Description' (optional), and use the default 'output folder'.
  3. Click on the Inputs tab.
    1. Click the 'Select a BAM file to report on' field. Browse to the folder containing the outputs from the previous section (Reformat mapping outputs) (Sample data: Community Data > iplant_training > rna-seq_without_genome > M_count_mapped_reads > BAcon01bowtie.sorted). Submit each BAM file as a separate job, adjusting the 'Analysis Name' accordingly (e.g., ReadCounts_dehyd01 for BAdehyd01.sorted).
  4. Click on "Launch Analysis".
  5. Click on 'Analyses' from the DE workspace and monitor the 'Status' of the analysis (e.g., Idle, Submitted, Pending, Running, Completed, Failed).
    1. Once launched, an analysis will continue whether the user remains logged in or not.
    2. Email notifications report the progress of the analysis; they can be switched off under 'Preferences'.
    3. If the analysis fails or does not finish within the expected time, check these tips for troubleshooting. (Using the sample data, the analysis should complete in less than 5 minutes.)
    4. To re-run an analysis, click the analysis "App" in the 'Analyses' window.
  6. Access analysis results in one of two ways:
    1. In the 'Analyses' window click on the analysis "Name" to open the output folder.
    2. In the 'Data' window, click on your user name, then navigate to the folder that holds the output of the analysis. (Find the output for the sample data at Community Data > iplant_training > rna-seq_without_genome > M_count_mapped_reads > output_from_sample_data.)
  7. The idxstats.txt file in each output folder is the output from SAMtools idxstats. The prefix of the file name is derived from the input BAM file and includes a random number. Below is an example of the output from SAMtools idxstats. Each line consists of the reference sequence name, the sequence length, the number of mapped reads, and the number of unmapped reads.
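
The four-column layout can also be read programmatically. The following Python sketch is not part of the app; it only illustrates one way to turn an idxstats table into per-transcript counts. The function name `parse_idxstats` is hypothetical, and the rows in `SAMPLE` are made-up placeholders, not results from the sample data. The final `*` row reflects how SAMtools idxstats reports unmapped reads that have no reference assigned.

```python
def parse_idxstats(text):
    """Parse idxstats text into {reference_name: (length, mapped, unmapped)}."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, length, mapped, unmapped = line.split("\t")
        table[name] = (int(length), int(mapped), int(unmapped))
    return table


# Made-up placeholder rows, NOT output from the sample data.
SAMPLE = (
    "transcript_A\t1200\t350\t0\n"
    "transcript_B\t800\t42\t1\n"
    "*\t0\t0\t17\n"
)

stats = parse_idxstats(SAMPLE)
# Per-transcript counts: drop the "*" row, which holds reads with no reference.
counts = {name: mapped for name, (_, mapped, _) in stats.items() if name != "*"}
print(counts)
print("total mapped:", sum(counts.values()))
```

To use it on a real result, read the contents of an idxstats.txt file into a string and pass that string to `parse_idxstats` in place of `SAMPLE`.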

