Identify best matches (app: Best Hit for Blat Output)
Description: The Best Hit for Blat Output app extracts the best matches from the alignments generated during the mapping in Section F (Map transcripts). Aligners like Blat can produce many matches for one input sequence, and many of these may be of poor quality. Blat ships with tools for filtering its results, including pslReps, which is available in the DE as the app Best Hit for Blat Output. pslReps selects the best alignments for each query sequence using a ‘near best in genome’ approach. Documentation: https://genome.ucsc.edu/goldenPath/help/blatSpec.html#pslRepsUsage.
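To make the idea of keeping only the near-best alignments per query more concrete, here is a minimal Python sketch. It is not the pslReps algorithm: the scoring formula, the function names, and the `near_top` tolerance of 0.01 are all assumptions chosen for illustration. It relies only on the standard PSL column layout (21 tab-separated columns, with qName in column 10).

```python
from collections import defaultdict


def psl_rows(lines):
    """Yield alignment rows as lists of fields, skipping header and blank lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # A PSL alignment row has 21 tab-separated columns and starts with an integer.
        if len(fields) >= 21 and fields[0].isdigit():
            yield fields


def simple_score(fields):
    """Simplified score (an assumption, not the pslReps scoring scheme)."""
    matches = int(fields[0])
    mismatches = int(fields[1])
    rep_matches = int(fields[2])
    q_num_insert = int(fields[4])
    t_num_insert = int(fields[6])
    return matches + rep_matches - mismatches - q_num_insert - t_num_insert


def near_best_hits(lines, near_top=0.01):
    """Keep, for each query, the rows scoring within `near_top` of its best score."""
    by_query = defaultdict(list)
    for fields in psl_rows(lines):
        by_query[fields[9]].append(fields)  # column 10 (index 9) is qName

    kept = []
    for rows in by_query.values():
        best = max(simple_score(r) for r in rows)
        # Allow a small relative deviation from the best score for this query.
        threshold = best - abs(best) * near_top
        kept.extend(r for r in rows if simple_score(r) >= threshold)
    return kept
```

In the tutorial itself the filtering is done by the app, so this sketch is only meant to show why the output contains fewer rows than the input.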
- Log into the Discovery Environment: https://de.iplantcollaborative.org/de/.
- Open the Best Hit for Blat Output app (Public Applications > NGS > Aligners > Best Hit for Blat Output).
- Change 'Analysis Name' to Identify_Best_Matches, add a 'Description' (optional), and use the default 'output folder'.
- Click on the Inputs tab.
- In the 'Input psl' field, enter the file containing the mapped transcripts (Sample data: Community Data > iplant_training > rna-seq_without_genome > H_identify_best_matches > concatenate_out.txt; the sample file has a .txt extension, so in this case disregard the request by the app for a .psl file).
- Select the 'Output psl file name' field. Adjust the name of the output file to 'BA_pep_vs_Refseq_pep.psl'.
- Click on "Launch Analysis".
- Click 'Analyses' in the DE workspace and monitor the 'Status' of the analysis (e.g., Idle, Submitted, Pending, Running, Completed, Failed).
- Once launched, an analysis will continue whether the user remains logged in or not.
- Email notifications report the progress of the analysis; they can be switched off under 'Preferences'.
- If the analysis fails or does not finish within the expected time, check these tips for troubleshooting. (With the sample data, the analysis should complete in less than 5 minutes.)
- To re-run an analysis, click the analysis "App" in the 'Analyses' window.
- Access analysis results in one of two ways:
  - In the 'Analyses' window click on the analysis "Name" to open the output folder.
  - In the 'Data' window, click on user name, then navigate to the folder that holds the output of the analysis. (Find the output for the sample at Community Data > iplant_training > rna-seq_without_genome > H_identify_best_matches > output_from_sample_data.)
- The output consists of a smaller psl file with the best matches for each input contig sequence.
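One way to confirm that the output really is a reduced set of alignments is to count alignment rows and distinct queries in both files after downloading them. The following is a small sketch under that assumption; the function names are hypothetical and the local file paths in the usage note are placeholders for wherever the files were saved.

```python
from collections import Counter


def hits_per_query(psl_path):
    """Count alignment rows per query name in a PSL-formatted file."""
    counts = Counter()
    with open(psl_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Skip header, separator, and blank lines: alignment rows have
            # 21 tab-separated columns and start with an integer.
            if len(fields) >= 21 and fields[0].isdigit():
                counts[fields[9]] += 1  # column 10 (index 9) is qName
    return counts


def compare_psl(input_path, output_path):
    """Summarise how many rows and queries each file contains."""
    before = hits_per_query(input_path)
    after = hits_per_query(output_path)
    return {
        "input_rows": sum(before.values()),
        "output_rows": sum(after.values()),
        "input_queries": len(before),
        "output_queries": len(after),
    }
```

For example, `compare_psl("concatenate_out.txt", "BA_pep_vs_Refseq_pep.psl")` would return the four counts for the sample input and the output file named in the steps above, assuming both have been downloaded to the working directory.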