BT_22SEP10

Agenda

  • Update on NINJA/WINDJAMMER development
  • Discussion of MPI pre-processing of the distance matrix

Notes

Minutes

  • NINJA/WINDJAMMER: development of the MPI implementation of Neighbor-Joining in WINDJAMMER is complete, and the program's behavior mirrors NINJA both internally and externally, except for externalized memory.
  • Robert reports that initial estimates indicate a performance improvement of 20-35X compared to NINJA, but more detailed and rigorous comparative benchmarking is needed. The test was run on ca. 2,500 nodes.
  • Adding more nodes would not further improve performance, as the limiting factor is the amount of available memory. Sheldon indicates that users might be interested in a non-parallel version of the program and that the source code should be made available via a publicly accessible repository.
  • Pre-processing of the distance matrix: generating the pairwise distance matrix for 218K taxa takes 1 to 2 days and is a task that can be parallelized. It is a key requirement of this collaboration that this function be added to WINDJAMMER.
  • Rob suggests an algorithm in which the sequences are divided into blocks of 1000 and each block is sent to a node, which performs all pairwise comparisons among its 1000 sequences. In a second step, the blocks of sequences are exchanged among nodes so that every block is eventually compared to every other block. The only roadblocks to an implementation are the inclusion of a pairwise alignment function and file format issues.
  • Travis will work with Rob to solve these problems by helping to extract the relevant C code from quicktree and by helping with the file format issues.
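
The block scheme discussed above can be sketched roughly as follows. This is only a serial illustration, not the WINDJAMMER code: the function names, the round-robin assignment of block pairs to nodes, and the simple mismatch-fraction distance (a stand-in for the pairwise function to be taken from quicktree) are all assumptions made for the example, and no MPI calls are used.

```python
from itertools import combinations_with_replacement


def split_into_blocks(items, block_size=1000):
    """Split a list into consecutive blocks of at most block_size items."""
    return [items[i:i + block_size] for i in range(0, len(items), block_size)]


def block_pairs(num_blocks):
    """Every unordered pair of block indices, including a block with itself."""
    return list(combinations_with_replacement(range(num_blocks), 2))


def assign_round_robin(tasks, num_nodes):
    """Hypothetical scheduling: deal the block-pair tasks out to nodes in turn."""
    schedule = [[] for _ in range(num_nodes)]
    for k, task in enumerate(tasks):
        schedule[k % num_nodes].append(task)
    return schedule


def p_distance(a, b):
    """Placeholder distance: fraction of differing positions.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def compare_blocks(block_a, block_b, offset_a, offset_b):
    """Distances for one block pair, keyed by global indices (i, j) with i < j."""
    same_block = offset_a == offset_b
    out = {}
    for i, seq_a in enumerate(block_a):
        # Within a single block, only compare each sequence to later ones.
        start = i + 1 if same_block else 0
        for j in range(start, len(block_b)):
            out[(offset_a + i, offset_b + j)] = p_distance(seq_a, block_b[j])
    return out


def distance_matrix(seqs, block_size=1000, num_nodes=4):
    """Serial simulation: each node's task list would run in parallel under MPI."""
    blocks = split_into_blocks(seqs, block_size)
    result = {}
    for node_tasks in assign_round_robin(block_pairs(len(blocks)), num_nodes):
        for a, b in node_tasks:
            result.update(
                compare_blocks(blocks[a], blocks[b], a * block_size, b * block_size)
            )
    return result
```

Assuming roughly 218,000 sequences, blocks of 1000 give 218 blocks and 218 × 219 / 2 = 23,871 block pairs (including each block compared with itself), so there are far more independent tasks than the ca. 2,500 nodes used in the test.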

Action items

  • A1: Robert will provide benchmarking data comparing the performance of WINDJAMMER and NINJA for different sized trees.
  • A2: Robert will work on a prototype implementation of the parallel matrix computation algorithm.
  • The next meeting will take place on Tuesday, Oct 6th at 9 AM Eastern/10 AM Central.