BT_26JAN11

Notes

The meeting can be skipped until February. I have a few things to report:

1. There was a question about why Windjammer, NINJA, and RapidNJ were building different trees. I have resolved that issue. By default I was forcing negative distances to zero; the other codes were not. When negative distances are allowed, all three codes produce the same or similar trees.
2. I have read over the two papers mentioned by a reviewer. The 2006 paper has an MPI implementation that uses a manager-worker strategy. This technique won't scale. They also ran on only up to 32 processors and had a speedup of 7 at 32 processors (about 22% parallel efficiency). The largest case they ran was about 10K taxa. The 2009 paper used a GPU with CUDA to do the neighbor joining as well. They are doing the canonical O(N^3) algorithm. Again, they only did about 10K taxa.
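
To make the two points above concrete, here is a minimal sketch of the canonical O(N^3) neighbor joining with an optional clamp on negative distances. It is not Windjammer's code. The function name `neighbor_joining`, the `clamp_negative` flag, and the choice to apply the clamp to the updated distances (rather than, say, to branch lengths) are all assumptions made for illustration.

```python
from itertools import combinations


def neighbor_joining(dist, clamp_negative=False):
    """Canonical O(N^3) neighbor joining (illustrative sketch only).

    dist: symmetric dict-of-dicts distance matrix keyed by string labels.
    clamp_negative: hypothetical flag; if True, negative updated
        distances are forced to zero.
    Returns the joins in order as (left, right, new_node) tuples.
    """
    d = {a: dict(row) for a, row in dist.items()}  # work on a copy
    joins = []
    next_id = 0
    while len(d) > 2:
        n = len(d)
        nodes = sorted(d)
        # Row sums, skipping the diagonal.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Scan every pair for the minimum Q; ties keep the first pair found.
        best = None
        for a, b in combinations(nodes, 2):
            q = (n - 2) * d[a][b] - r[a] - r[b]
            if best is None or q < best[0]:
                best = (q, a, b)
        _, a, b = best
        u = "node%d" % next_id
        next_id += 1
        # Distances from the new internal node to every remaining node.
        new_row = {}
        for k in nodes:
            if k == a or k == b:
                continue
            duk = 0.5 * (d[a][k] + d[b][k] - d[a][b])
            if clamp_negative and duk < 0:
                duk = 0.0  # assumed location of the clamp
            new_row[k] = duk
        # Replace rows/columns a and b with the new node u.
        del d[a]
        del d[b]
        for k in d:
            del d[k][a]
            del d[k][b]
            d[k][u] = new_row[k]
        d[u] = new_row
        joins.append((a, b, u))
    return joins


# Additive 4-taxon example for the tree ((A,B),(C,D)).
D = {
    "A": {"A": 0, "B": 3, "C": 9, "D": 10},
    "B": {"A": 3, "B": 0, "C": 10, "D": 11},
    "C": {"A": 9, "B": 10, "C": 0, "D": 7},
    "D": {"A": 10, "B": 11, "C": 7, "D": 0},
}
print(neighbor_joining(D))
print(neighbor_joining(D, clamp_negative=True))
```

On an additive matrix like this one no negative distances arise, so both settings give the same joins. On non-additive input the updated distances can go negative, and clamping them changes the later Q values, which is one way the resulting trees can diverge.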

I have several technical issues I have to resolve:

1. There is an issue with running the 218K case on 2512 cores on Ranger. This worked until recently. The way memory is managed on the system has changed, and we think that some of the freed memory is being treated as committed, so the program runs out of memory when memory is actually available. I'm testing different MPI implementations as well as memory managers to track down the problem.
2. The reader for distance matrices needs to be modified so that it doesn't overload the Lustre file system.
3. Once the previous issues have been handled, I'm going to test Windjammer on Longhorn and the new Lonestar.