The applications listed here are available for use in the Discovery Environment and are documented in: Discovery Environment Manual.

Discovery Environment Applications List

Rationale

NCBI fastq-dump can be very slow, even when you have the resources (network, I/O, CPU) to go faster and even when you have already downloaded the SRA file. This tool speeds up the process by dividing the work among multiple threads. This is possible because fastq-dump has options (-N and -X) to query specific ranges of the SRA file: the tool splits the work into the requested number of threads, runs multiple fastq-dump processes in parallel, and concatenates the results back together, as if you had executed a plain fastq-dump call.
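The range-splitting idea can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the function names `split_spot_ranges` and `build_commands` are hypothetical, and the sketch assumes spot numbers are 1-based and that the range given by -N and -X is inclusive at both ends.

```python
def split_spot_ranges(first_spot, last_spot, n_threads):
    """Split the inclusive range [first_spot, last_spot] into at most
    n_threads contiguous, non-overlapping chunks of near-equal size."""
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    total = last_spot - first_spot + 1
    if total <= 0:
        return []
    # Never create more chunks than there are spots.
    n_chunks = min(n_threads, total)
    base, extra = divmod(total, n_chunks)
    ranges = []
    start = first_spot
    for i in range(n_chunks):
        # The first `extra` chunks each take one leftover spot.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def build_commands(sra_id, ranges):
    """Build one fastq-dump argument list per chunk (hypothetical helper).
    Only the -N and -X options mentioned above are used."""
    return [
        ["fastq-dump", "-N", str(lo), "-X", str(hi), sra_id]
        for lo, hi in ranges
    ]
```

For example, `split_spot_ranges(1, 10, 3)` yields the chunks (1, 4), (5, 7), and (8, 10). Each resulting command could then be run as a separate process, with the per-chunk outputs concatenated in chunk order.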

parallel-fastq-dump-multi-0.6.5 is invoked using the following:

  1. Input(s)
    1. File containing SRA ids (1 SRA id per line)
  2. Parameters
  3. Outputs
    1. Output Folder Name (default - sra_out)
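As an illustration of the input format, the following minimal Python sketch reads a list with one SRA id per line. The function name `parse_sra_ids` is hypothetical, the accessions in the usage example are placeholders, and skipping blank lines and trimming whitespace are assumptions of this sketch rather than documented behavior of the app.

```python
def parse_sra_ids(text):
    """Return the SRA ids found in `text`, one id per line.
    Assumption: blank lines are skipped and surrounding whitespace is trimmed."""
    ids = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            ids.append(line)
    return ids


# Placeholder accessions, used only to show the expected file layout.
example = "SRR000001\nSRR000002\n"
ids = parse_sra_ids(example)
```

A file in this layout can be checked before submission by reading its contents and passing them to the function.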

Please work through the documentation and add your comments at the bottom of this page, or email comments to support@cyverse.org. Thank you.

Test Data

All files are located in the Community Data directory of the CyVerse Discovery Environment at the following path:

Community Data > iplantcollaborative > example_data > ncbi_sra_toolkit_fastq_dump (/iplant/home/shared/iplantcollaborative/example_data/ncbi_sra_toolkit_fastq_dump)

Run parallel-fastq-dump-multi-0.6.5 as follows:

  1. Input file
    1. sra_id_se.txt
  2. Parameters
  3. Outputs
    1. Output Folder Name (default - sra_out)

Tool Source for App

https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc&f=fastq-dump
