The applications listed here are available for use in the Discovery Environment and are documented in: Discovery Environment Manual.

Discovery Environment Applications List

The box below searches only this space.
To search the entire iPlant wiki, enter your query in the box at the upper right.

Maintenance: Tues, 28 Jan 2020

ACCESS TO OR USAGE OF THE FOLLOWING SERVICES WILL BE UNAVAILABLE OR DISRUPTED:

Discovery Environment         8:00am to 5:00pm MST
The Discovery Environment will be unavailable while patches and updates are applied.
        ** Currently running analyses will be terminated. Please plan accordingly.

Data Store                    8:00am to 5:00pm MST
The Data Store will be unavailable during the maintenance period.
 
Data Commons                  8:00am to 5:00pm MST
The Data Commons will be unavailable during the maintenance period.
 
Atmosphere and Cloud Services 8:00am to 5:00pm MST
Marana Cloud: Atmosphere instances in the Marana Cloud will be operational; however, you will not be able to use the Data Store within your instance, and you may not be able to access the Atmosphere web interface.
 
User Portal                   8:00am to 5:00pm MST
The User Portal, http://user.cyverse.org, will be unavailable while we perform maintenance and updates.
 
Agave/Science API             8:00am to 5:00pm MST
The Agave/Science API will be unavailable during this maintenance period.
 
DNA Subway                    8:00am to 5:00pm MST
DNA Subway will be unavailable during this maintenance period.
 
The following services will NOT be affected by the maintenance: CyVerse Wiki and JIRA

Keep up to date with our maintenance schedules on the CyVerse public calendar
http://www.cyverse.org/maintenance-calendar
Check your local timezone here https://bit.ly/36iVOkX 
 
Please contact support@cyverse.org for any questions, or concerns.

 

 

 

 

 

Skip to end of metadata
Go to start of metadata

Please work through the documentation and add your comments on the bottom of this page, or email comments to support@cyverse.org. Thank you.

What is PEAR?

PEAR is an ultrafast, memory-efficient and highly accurate pair-end read merger. It is fully parallelized and can run with as low as just a few kilobytes of memory. PEAR is distributed under the Creative Commons license, and it runs on the command-line under Linux and UNIX based operating systems.

Mandatory arguments

  • Specify the name of file that contains the forward paired-end reads
  • Specify the name of file that contains the reverse paired-end reads
  • Specify the name to be used as base for the output files. PEAR outputs four files. A file containing the assembled reads with a assembled.fastq extension, two files containing the forward, resp. reverse, unassembled reads with extensions unassembled.forward.fastq, resp. unassembled.reverse.fastq, and a file containing the discarded reads with a discarded.fastq extension.

Optional arguments

p-value

Specify a p-value for the statistical test. If the computed p-value of a possible assembly exceeds the specified p-value then the paired-end read will not be assembled. Valid options are: 0.0001, 0.001, 0.01, 0.05 and 1.0. Setting 1.0 disables the test. (default: 0.01)

Minimum overlap size

Specify the minimum overlap size. The minimum overlap may be set to 1 when the statistical test is used. However, further restricting the minimum overlap size to a proper value may reduce false-positive assembles. (default: 10)

Maximum possible length

Specify the maximum possible length of the assembled sequences. Setting this value to 0 disables the restriction and assembled sequences may be arbitrary long. (default: 0)

Minimum possible length

Specify the minimum possible length of the assembled sequences. Setting this value to 0 disables the restriction and assembled sequences may be arbitrary short. (default: 50)

Minimum length of reads after trimming

Specify the minimum length of reads after trimming the low quality part (see option -q). (default: 1)

Quality score threshold for trimming

Specify the quality score threshold for trimming the low quality part of a read. If the quality scores of two consecutive bases are strictly less than the specified threshold, the rest of the read will be trimmed. (default: 0)

Maximal proportion of uncalled bases

Specify the maximal proportion of uncalled bases in a read. Setting this value to 0 will cause PEAR to discard all reads containing uncalled bases. The other extreme setting is 1 which causes PEAR to process all reads independent on the number of uncalled bases. (default: 1)

Statistical test

Specify the type of statistical test. Two options are available. (default: 1)

  1. Given the minimum allowed overlap, test using the highest OES. Note that due to its discrete nature, this test usually yields a lower p-value for the assembled read than the cut-off (specified by -p). For example, setting the cut-off to 0.05 using this test, the assembled reads might have an actual p-value of 0.02.
  2. Use the acceptance probability (m.a.p). This test methods computes the same probability as test method 1. However, it assumes that the minimal overlap is the observed overlap with the highest OES, instead of the one specified by -v. Therefore, this is not a valid statistical test and the 'p-value' is in fact the maximal probability for accepting the assembly. Nevertheless, we observed in practice that for the case the actual overlap sizes are relatively small, test 2 can correctly assemble more reads with only slightly higher false-positive rate.

 

Empirical base frequencies

Disable empirical base frequencies. (default: use empirical base frequencies)

Scoring method

Specify the scoring method. (default: 2)

  1. OES with +1 for match and -1 for mismatch.
  2. Assembly score (AS). Use +1 for match and -1 for mismatch multiplied by base quality scores.
  3. Ignore quality scores and use +1 for a match and -1 for a mismatch.

 

Base PHRED quality score

Base PHRED quality score. (default: 33)

  

Test Run

All files are located in the Community Data directory of the CyVerse Discovery Environment at the following path:

Community Data > iplantcollaborative > example_data > PEAR  (/iplant/home/shared/iplantcollaborative/example_data/PEAR)

Mandatory arguments: 

Use  Read1.fastq for forward reads, Read2.fastq for reverse reads and ouput (default) for the name of output file

Leave all the optional arguments as default.

Output

A file containing the assembled reads with a assembled.fastq extension, two files containing the forward, resp. reverse, unassembled reads with extensions unassembled.forward.fastq, resp. unassembled.reverse.fastq, and a file containing the discarded reads with a discarded.fastq extension.


Tool Source for App