Rationale and background:
Overview of rnaQUAST
rnaQUAST is a tool for evaluating RNA-Seq assemblies using a reference genome and a gene database. rnaQUAST can also estimate gene database coverage by raw reads and perform de novo quality assessment using third-party software (STAR, TopHat, GMAP, etc.). The detailed manual is available at http://spades.bioinf.spbau.ru/rnaquast/release1.1.0/manual.html
Requirements for running rnaQUAST on Atmosphere
- Atmosphere requirements
- A CyVerse username associated with an institutional email address (e.g. email@example.com)
- Atmosphere allocation
- Follow the section called "Adding apps and services to your account"
- Fill in the request sheet with the allocation needed
- Include the number of Atmosphere Units (AUs) that you will need
- It can be difficult to know a priori how many AUs are needed; a rough estimate is:
- (number of cores) x (real-time hours) x (days you need to run the instance) = AUs needed in a month
- Computational Knowledge
- Familiarity with the terminal/shell
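The AU estimate given under the Atmosphere requirements above can be sketched as a small Python function. This is only an illustration of the arithmetic: the function name `estimate_aus` is hypothetical, and it assumes that "real-time hours" means the hours per day that the instance is running.

```python
def estimate_aus(cores, hours_per_day, days):
    """Estimate Atmosphere Units (AUs) as cores x hours x days.

    Assumption: hours_per_day is the number of wall-clock hours per day
    that the instance is running.
    """
    if cores < 0 or hours_per_day < 0 or days < 0:
        raise ValueError("inputs must be non-negative")
    return cores * hours_per_day * days


# Example: a 4-CPU instance kept running 24 hours a day for 5 days.
print(estimate_aus(cores=4, hours_per_day=24, days=5))  # 480
```

Because the true run time is hard to predict, it may be reasonable to round the estimate up when filling in the request sheet.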
Protocol: How to use rnaQUAST 1.1.0 on Atmosphere
This tutorial will take users through the steps of:
- Launching the rnaQUAST 1.1.0 Atmosphere image
- Running rnaQUAST 1.1.0 on test data
Please work through the tutorial and add your comments at the bottom of this page, or send them by email to firstname.lastname@example.org. Thank you.
Part 1: Connect to an instance of an Atmosphere Image (Virtual Machine)
Step 1. Go to https://atmo.iplantcollaborative.org and log in with your CyVerse credentials.
Step 2. Click on the Launch New Instance button and search for rnaQUAST 1.1.0.
Step 3. Select the image rnaQUAST 1.1.0 and click Launch Instance. It will take 10-15 minutes for the cloud instance to be launched.
Note: Instances can be configured with different amounts of CPU, memory, and storage depending on user needs. This tutorial can be completed with the medium instance size, medium1 (4 CPUs, 8 GB memory, 80 GB root).
Part 2: Set up an rnaQUAST 1.1.0 run on test data using the Terminal window
Step 1. Open the Terminal. Enter the ssh details along with your instance's IP address to connect to the instance through the terminal.
Step 2. You will find the rnaQUAST v1.1.0 software in the "/opt" folder. All the dependencies for running rnaQUAST v1.1.0 are located in "/opt/rnaQUAST-1.1.0".
Step 3. Before you start using rnaQUAST 1.1.0, make sure that the following software is added to your PATH
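One way to check whether a given executable is on your PATH is the short Python sketch below, which uses the standard library function `shutil.which`. The helper name `missing_tools` is hypothetical, and the executable names in the example are assumptions based on the third-party tools mentioned in this tutorial; the actual names on your instance may differ.

```python
import shutil


def missing_tools(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Assumed executable names, for illustration only; adjust them to the
# tools actually installed on your instance.
tools = ["gmap", "blat", "STAR"]
for name in missing_tools(tools):
    print(f"not on PATH: {name}")
```

If the script prints nothing, every name in the list was found on the PATH.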
Step 4. The staged example data can be found in the "rnaQUAST-1.1.0/test_data" folder within the "/opt" folder. List its contents with the ls command:
- Trinity.fasta, spades.311.fasta and idba.fasta are test assemblies assembled with Trinity, SPAdes and IDBA, respectively
- Saccharomyces_cerevisiae.R64-1-1.75.gtf is the test reference annotation file
- Saccharomyces_cerevisiae.R64-1-1.75.dna.toplevel.fa is the test reference genome file. The files with the .bt2 extension are Bowtie 2 index files for the reference genome
- Paired_end1.fq and Paired_end2.fq are the test paired-end FASTQ read files
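As a complement to ls, the following Python sketch checks that the test files listed above are present. The file names are copied from the list above and the directory is the one given in Step 4; the helper name `missing_files` is hypothetical, and the sketch assumes the names match those on disk exactly, including case.

```python
from pathlib import Path

# File names taken from the list above (assumed to match the files on disk).
EXPECTED = [
    "Trinity.fasta",
    "spades.311.fasta",
    "idba.fasta",
    "Saccharomyces_cerevisiae.R64-1-1.75.gtf",
    "Saccharomyces_cerevisiae.R64-1-1.75.dna.toplevel.fa",
    "Paired_end1.fq",
    "Paired_end2.fq",
]


def missing_files(directory, expected=EXPECTED):
    """Return the names in `expected` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in expected if not (base / name).is_file()]


# Prints an empty list when every expected file is present.
print(missing_files("/opt/rnaQUAST-1.1.0/test_data"))
```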
a. Running rnaQUAST 1.1.0 with GMAP (default)
b. Running rnaQUAST 1.1.0 with BLAT
a. Running rnaQUAST 1.1.0 with the STAR aligner (default)
Software for de novo quality assessment:
a. Running rnaQUAST 1.1.0 with BUSCO
b. Running rnaQUAST 1.1.0 with GeneMarkS-T