- This example workflow demonstrates how to use CLUSTALW to perform a multiple sequence alignment from the command line. It also shows how to run the program in non-interactive mode, the first step toward wrapping it programmatically.
- The starting point is a set of DNA sequences.
- You need access to a Linux/Unix shell.
- This workflow assumes that you have the BioPerl libraries installed and the CLUSTALW binary executable compiled and installed in your path, either by putting it in /usr/local/bin or by editing your $PATH environment variable.
The DNA sequences
- I downloaded coding sequences (CDS) for actin genes from five metazoan species from NCBI.
- A complete CDS starts at the start codon (ATG, which encodes methionine) and ends at a stop codon (TAG, TGA or TAA).
- Codons are three-nucleotide units that encode particular amino acids or the stop-translation signal.
- This is a sample CDS for C. elegans, in FASTA format:
- View the whole FASTA file
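The CDS rules above (starts with ATG, ends with TAG, TGA or TAA, read in three-nucleotide codons) can be checked with a few lines of Python. This is only a minimal sketch: the function names `split_codons` and `is_complete_cds` are hypothetical, and the toy sequence is made up for illustration, not taken from the actin data.

```python
# Stop codons as listed in the text above.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq):
    """Split a nucleotide string into consecutive three-letter codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def is_complete_cds(seq):
    """Return True if seq has a length divisible by three,
    starts with ATG, and ends with a stop codon."""
    codons = split_codons(seq)
    if len(seq) % 3 != 0 or len(codons) < 2:
        return False
    return codons[0] == "ATG" and codons[-1] in STOP_CODONS


# Toy sequence (hypothetical, for illustration only).
toy = "ATGGCTTGTTAA"
print(split_codons(toy))
print(is_complete_cds(toy))
```

A sequence that fails this check may be a partial CDS or may contain extra untranslated sequence, which is worth knowing before aligning it.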
Doing the multiple sequence alignment with CLUSTALW
- CLUSTALW can be run from the command line
- It is a binary executable that uses interactive menus
- A basic multiple sequence alignment starts with loading the file (select option 1, then enter the filename, actin.fa)
- Then open the multiple alignment menu by choosing option 2.
- Then select option 1 to run the alignment, and accept the default output file names when prompted. The alignment will be performed, saved to a file and printed to the screen.
- You are done; the alignment file is named actin.aln.
Using CLUSTALW non-interactively
- A menu-driven interface is not useful for pipelines or programmatic access.
- Fortunately, we can run the application by passing the commands via STDIN.
- This is accomplished by creating a text file with the sequence of commands in it.
- Annotated version:
- select menu option 1 and load the input file
- select option 2 (multiple alignments), then option 1 to run the alignment
- provide output file names for the alignment and guide tree files
- exit from the alignment display, then the alignment menu, then the main menu
- To run the program non-interactively, save the commands as clustalw_commands.txt, then run CLUSTALW using this incantation:
- Program output will scroll rapidly on screen, and the multiple sequence alignment will be saved in actin.aln.
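As a step toward programmatic wrapping, the same idea can be sketched in Python: build the menu keystrokes as a string and feed them to CLUSTALW on STDIN. Several things here are assumptions rather than facts from this page: the executable name `clustalw`, the guide tree file name `actin.dnd`, the two blank lines used to leave the alignment display and alignment menu, and `X` as the exit key for the main menu. The keystrokes are reconstructed from the annotated steps above, so check them against the menus of your installed version. The function names `build_commands` and `run_clustalw` are hypothetical.

```python
import shutil
import subprocess


def build_commands(fasta="actin.fa", aln="actin.aln", tree="actin.dnd"):
    """Return the menu keystrokes, one per line, as a single string."""
    lines = [
        "1",    # main menu: load sequences
        fasta,  # input file name
        "2",    # main menu: multiple alignments
        "1",    # alignment menu: run the alignment
        aln,    # output file for the alignment
        tree,   # output file for the guide tree (assumed name)
        "",     # leave the alignment display (assumed keystroke)
        "",     # leave the alignment menu (assumed keystroke)
        "X",    # exit the main menu (assumed keystroke)
    ]
    return "\n".join(lines) + "\n"


def run_clustalw(commands, executable="clustalw"):
    """Run CLUSTALW with the given keystrokes supplied on STDIN."""
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable} not found on PATH")
    return subprocess.run(
        [executable],
        input=commands,
        text=True,
        capture_output=True,
        check=True,
    )
```

Calling `run_clustalw(build_commands())` in the directory containing actin.fa should behave like redirecting the saved command file into the program; the captured screen output is then available as the `stdout` attribute of the returned object.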