Determine differential expression (app: DESeq)
Description: The DESeq statistics app identifies sequences that are differentially expressed between two sequence pools. The edgeR tool can be used as an alternative. Documentation: http://bioconductor.org/packages/release/bioc/vignettes/DESeq/inst/doc/DESeq.pdf.
- Log into the Discovery Environment: https://de.iplantcollaborative.org/de/.
- Open the DESeq app (Public Applications > NGS > Transcriptome Profiling > Misc RNASeq tools > DESeq).
- Change 'Analysis Name' to Determine_Differential_Expression, add a 'Description' (optional), and use the default 'output folder'.
- Click on the Inputs tab.
- Select the 'Tab-delimited input file' field. Enter the matrix created in Section O (Combine counts) (Sample data: Community Data > iplant_training > rna-seq_without_genome > P_determine_differential_expression > ccombine_result.txt).
- Click on the Experiment Design tab.
- Select the 'Comma-separated list of factors for the data columns in your file' field. For the sample data enter the factors as control,control,control,dehydrated,dehydrated,dehydrated.
- Select the 'Comma-separated list of library types for each factor listed above (must be "single-end" or "paired-end" for each entry)' field. For the sample data enter paired-end,paired-end,paired-end,paired-end,paired-end,paired-end.
- Select the 'Comma-separated pair of factors for comparison' field. For the sample data enter control,dehydrated.
- Click on the Statistical Options tab.
- Enter 0.01 for the 'Minimum false-discovery rate' and 0.2 for the 'Quantile for removing insignificant genes' (genes that fall in this lowest quantile are ignored as insignificant).
- Click on "Launch Analysis".
- Click on 'Analyses' from the DE workspace and monitor the 'Status' of the analysis (e.g., Idle, Submitted, Pending, Running, Completed, Failed).
- Once launched, an analysis will continue whether the user remains logged in or not.
- Email notifications report the progress of the analysis; they can be switched off under 'Preferences'.
- If the analysis fails or does not finish within the expected time, check these tips for troubleshooting. (With the sample data, the analysis should complete in less than 20 minutes.)
- To re-run an analysis, click the analysis "App" in the 'Analyses' window.
- Access analysis results in one of two ways:
  - In the 'Analyses' window, click on the analysis "Name" to open the output folder.
  - In the 'Data' window, click on your user name, then navigate to the folder that holds the output of the analysis. (Find the output for the sample data at Community Data > iplant_training > rna-seq_without_genome > P_determine_differential_expression > output_from_sample_data.)
- The output consists of five files: three graphs (the dispersions, the MA plot, and the p-values), a text file of all results, and a text file of the significant results.
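
The entries on the Experiment Design tab have to agree with one another: one factor per data column, one library type per factor, and a comparison pair made of two of the listed factors. The following is a minimal Python sketch for checking those entries before launching the analysis. The function `check_design` and its argument names are hypothetical helpers, not part of the DESeq app or the Discovery Environment, and the requirement that both comparison factors appear in the factor list is an assumption based on the field descriptions.

```python
# Hypothetical pre-launch check; not part of the DESeq app or the Discovery Environment.
ALLOWED_LIBRARY_TYPES = {"single-end", "paired-end"}


def check_design(n_data_columns, factors, library_types, comparison):
    """Return a list of problems found in the Experiment Design entries.

    n_data_columns: number of count columns in the tab-delimited input file
    factors, library_types, comparison: the comma-separated strings typed
    into the three Experiment Design fields
    """
    factor_list = [f.strip() for f in factors.split(",")]
    type_list = [t.strip() for t in library_types.split(",")]
    pair = [c.strip() for c in comparison.split(",")]
    problems = []

    # One factor is expected for each data column.
    if len(factor_list) != n_data_columns:
        problems.append(
            f"{len(factor_list)} factors given for {n_data_columns} data columns"
        )

    # One library type is expected for each factor entry.
    if len(type_list) != len(factor_list):
        problems.append(
            f"{len(type_list)} library types given for {len(factor_list)} factors"
        )

    # Each library type must be "single-end" or "paired-end".
    bad_types = sorted({t for t in type_list if t not in ALLOWED_LIBRARY_TYPES})
    if bad_types:
        problems.append(f"unknown library types: {bad_types}")

    # The comparison is a pair of distinct factors (assumed to come from the list).
    if len(pair) != 2:
        problems.append("comparison must name exactly two factors")
    elif pair[0] == pair[1]:
        problems.append("comparison factors must differ")
    else:
        missing = [name for name in pair if name not in factor_list]
        if missing:
            problems.append(f"comparison factors not in factor list: {missing}")

    return problems


if __name__ == "__main__":
    # Values from the sample data in the steps above.
    problems = check_design(
        6,
        "control,control,control,dehydrated,dehydrated,dehydrated",
        "paired-end,paired-end,paired-end,paired-end,paired-end,paired-end",
        "control,dehydrated",
    )
    print(problems or "design entries look consistent")
```

An empty list means the three fields are mutually consistent; it does not confirm that the factors are listed in the same order as the columns of the count matrix, which still has to be checked by eye.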
The MA Plot created by the DESeq App (DESeq_MAplot.png) will resemble the following figure:
Sample of significant results from the DESeq App (DESeq_results_significant.txt):