The applications listed here are available for use in the Discovery Environment and are documented in the Discovery Environment Manual.

Discovery Environment Applications List


CD-HIT-est 4.6.8 

Clusters the contigs in a FASTA file of assembled transcripts.


CD-HIT-EST groups the sequences of a nucleotide dataset into clusters that meet a user-defined similarity threshold, usually a sequence identity.

Quick Start

Test Data


Test data for this app appears directly in the Discovery Environment in the Data window under Community Data -> iplantcollaborative -> example_data -> CD-HIT.

Input File(s)

Use testranscripts.fasta from the directory above as test input.

Parameters Used in App

When running the app in the Discovery Environment, use the following parameters with the input file above to get the output described in the next section.

    • Global sequence identity should be set to 0.94.
    • Default settings otherwise.
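
For readers who want to reproduce the run outside the Discovery Environment, the settings above roughly correspond to a cd-hit-est command line. The Python sketch below only builds and prints that command; it does not run it. The helper name build_cd_hit_est_command is hypothetical, the output name CD-HITout.fa is taken from the Output File(s) section, and the options the app itself passes to cd-hit-est may differ from this minimal set (-i input, -o output, -c sequence identity threshold).

```python
import shlex


def build_cd_hit_est_command(input_fasta, output_fasta, identity=0.94):
    """Return a cd-hit-est argument list (hypothetical helper, not part of the app)."""
    # The identity threshold is a fraction, e.g. 0.94 for 94% identity.
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be a fraction in (0, 1]")
    return [
        "cd-hit-est",
        "-i", input_fasta,    # input FASTA file
        "-o", output_fasta,   # output file; cd-hit-est also writes <output>.clstr
        "-c", str(identity),  # sequence identity threshold
    ]


# Assumed file names: the test input and the output name used on this page.
cmd = build_cd_hit_est_command("testranscripts.fasta", "CD-HITout.fa")
print(shlex.join(cmd))
```

The argument list can be passed to subprocess.run on a machine where cd-hit-est is installed.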

Output File(s)

Expect CD-HITout.fa and CD-HITout.fa.clstr as output.

CD-HITout.fa contains the representative sequence of each cluster in FASTA format.

CD-HITout.fa.clstr lists the clusters and the sequences assigned to each one.
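
The .clstr file is plain text and can be summarized with a short script. The sketch below assumes the usual CD-HIT cluster layout: a ">Cluster N" header line followed by one line per member, with the representative sequence marked by "*". The function name parse_clstr and the sample records are made up for illustration and are not taken from the test data.

```python
import re

# One member line looks like: "1<TAB>2214nt, >contig_7... at +/96.50%"
MEMBER_RE = re.compile(r"^\d+\s+(\d+)(?:nt|aa),\s+>(.+?)\.\.\.\s+(.*)$")


def parse_clstr(lines):
    """Map cluster number -> list of member records (hypothetical helper)."""
    clusters = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = int(line.split()[1])
            clusters[current] = []
            continue
        match = MEMBER_RE.match(line)
        if match is None or current is None:
            continue  # skip lines that do not fit the assumed layout
        length, name, status = match.groups()
        clusters[current].append({
            "name": name,                    # may be truncated by CD-HIT
            "length": int(length),
            "representative": status == "*"  # "*" marks the representative
        })
    return clusters


# Made-up sample in the assumed format, not real output from the test data.
SAMPLE = """>Cluster 0
0\t2799nt, >contig_1... *
1\t2214nt, >contig_7... at +/96.50%
>Cluster 1
0\t1500nt, >contig_3... *
"""

clusters = parse_clstr(SAMPLE.splitlines())
for number, members in clusters.items():
    rep = next(m["name"] for m in members if m["representative"])
    print(f"Cluster {number}: {len(members)} sequence(s), representative {rep}")
```

To summarize a real run, pass the lines of CD-HITout.fa.clstr to parse_clstr instead of the sample text.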

Tool Source for App
