Annotate transcripts (app: Rename contigs 2.0)
Description: Rename contigs 2.0 takes two input files: the list produced in the previous step and the transcripts FASTA file produced in Section D (Rename transcripts). It matches each transcript name to the same name in the list file, then appends the Blat-matched name from the second column of the list to the end of the transcript name. The end result is a transcript FASTA file in which each original name that has a match is linked to its Blat match.
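The renaming logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the app's actual code. It assumes the list file is tab-delimited with the transcript name in column 1 and the Blat match in column 2, that the whole FASTA header after '>' is the transcript name, and that the two names are joined with an underscore. The function names and the separator are hypothetical.

```python
from typing import Dict, Iterable, Iterator


def load_name_map(list_lines: Iterable[str]) -> Dict[str, str]:
    """Map column 1 (transcript name) to column 2 (Blat-matched name)."""
    name_map: Dict[str, str] = {}
    for line in list_lines:
        # Assumption: the list file is tab-delimited.
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) >= 2 and fields[0] and fields[1]:
            # Assumption: if a name is listed twice, keep the first match.
            name_map.setdefault(fields[0], fields[1])
    return name_map


def annotate_headers(
    fasta_lines: Iterable[str], name_map: Dict[str, str], sep: str = "_"
) -> Iterator[str]:
    """Yield FASTA lines, appending the matched name to listed headers."""
    for line in fasta_lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            name = line[1:]  # assumption: the full header is the name
            match = name_map.get(name)
            if match is not None:
                line = f">{name}{sep}{match}"
        # Sequence lines and unmatched headers pass through unchanged.
        yield line
```

To try it on local copies of the two input files, pass open file handles to load_name_map and annotate_headers and write the yielded lines to a new file. Headers with no entry in the list are left as they are, which is why the output contains a mixture of changed and unchanged names.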
- Log into the Discovery Environment: https://de.iplantcollaborative.org/de/.
- Open the Rename contigs 2.0 app (Public Applications > NGS > Transcriptome Profiling > Misc RNASeq Tools > Rename contigs 2.0).
- Change 'Analysis Name' to Annotate_Transcripts, add a 'Description' (optional), and use the default 'output folder'.
- Click on the Settings tab.
- Click on the 'Sequence file' field. Browse to the folder that contains the renamed .cds file from Section D (Rename transcripts) (Sample data: Community Data > iplant_training > rna-seq_without_genome > J_annotate_transcripts > BA_transcripts_cds.fa). Select the file, then click on OK.
- Click on the 'List of names-old and new' field. Browse to cutWrapper_output.txt from Section I (Reformat Blat results) (Sample data: Community Data > iplant_training > rna-seq_without_genome > J_annotate_transcripts > cutWrapper_output.txt). Select the file, then click on OK.
- Click on the 'Output File Name' field. Enter 'BA_transcriptome_annotated.fasta' for the output file name.
- Click on "Launch Analysis".
- Click on 'Analyses' from the DE workspace and monitor the 'Status' of the analysis (e.g., Idle, Submitted, Pending, Running, Completed, Failed).
- Once launched, an analysis will continue whether the user remains logged in or not.
- Email notifications report the progress of the analysis; they can be switched off under 'Preferences'.
- If the analysis fails or does not proceed in the anticipated timeline, check these tips for troubleshooting. (Using the sample data, the analysis should be complete in less than 5 minutes.)
- To re-run an analysis, click the analysis "App" in the 'Analyses' window.
- Access analysis results in one of two ways:
  - In the 'Analyses' window, click on the analysis "Name" to open the output folder.
  - In the 'Data' window, click on your user name, then navigate to the folder that holds the output of the analysis. (Find the output for the sample data at Community Data > iplant_training > rna-seq_without_genome > J_annotate_transcripts > output_from_sample_data.)
- The output should include a mixture of unchanged sequence names and names that have the gene/protein names (from refseq_protein) appended. This step is not required, but it provides a convenient way to track the identities of the annotated transcript sequences. These sequences are coding sequences rather than complete transcripts, which helps tie them to the peptide sequences that they match. If the full transcript sequences for some of these are needed later, they can be found within the names created for the coding sequences in BA_transcripts_annotated.fasta.
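As a quick check that the output really contains a mixture of unchanged and annotated names, the headers of the input and output FASTA files can be compared after downloading both. The sketch below is an illustration only. It assumes the app preserves the order of the sequences, and the helper names are hypothetical.

```python
from typing import Iterable, List, Tuple


def read_headers(fasta_lines: Iterable[str]) -> List[str]:
    """Return the header text (without '>') of every FASTA record."""
    return [
        line[1:].rstrip("\r\n") for line in fasta_lines if line.startswith(">")
    ]


def count_annotated(original: List[str], annotated: List[str]) -> Tuple[int, int]:
    """Return (changed, unchanged) header counts, pairing records by position."""
    # Assumption: both files list the same records in the same order.
    if len(original) != len(annotated):
        raise ValueError("the two files contain different numbers of records")
    changed = sum(1 for old, new in zip(original, annotated) if old != new)
    return changed, len(original) - changed
```

If the first count is zero, no name in the list matched a transcript name, which usually points to a formatting difference between the list file and the FASTA headers.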