NESCENT MAY31-JUNE01 2010

Participants

  • Bengt Sennblad
  • Jamie Estill
  • Jim Leebens-Mack
  • Cecile Ane
  • Todd Vision
  • Sheldon McKay (Tuesday)

Agenda Topics

  • Introductory presentations and discussions
  • Brainstorming of ideas for visualization of large, coupled gene and species trees

Notes

Ideas about visualization 

  1. Impressed with the Kubach tree visualization tool for the big species and big gene trees.  Especially liked the semantic zooming of clade detail, which we felt was a better way to collapse detail than a hyperbolic tree.
  2. Would like to make sure the DE has three tree panels displayed simultaneously: A - the species tree, B - the gene tree "ATV-style" (with labeled nodes), and C - a very highly magnified "fat tree" (for one or at most a few branches at a time).  The "fat tree" was deemed useful for those interested in hybridization and species tree questions, and useful pedagogically; it should be readily available if not always present, and not an 'expert option'.  Additional requirements:
    1. The region shown in C would have a corresponding panning box in B.
    2. Selecting regions on the gene tree would light up corresponding regions on the species tree and vice versa.
  3. The WG liked the idea of highlighting orthology groups dynamically on mouse-over, as in the Princeton Orthology Database (http://ppod.princeton.edu/), but felt that there may be better visual strategies than the one used there to indicate which genes are in an orthology group.
  4. Colorization of labels by e.g. species is nice, but likely to be problematic unless there are a limited number of groups, e.g. all the genes in one organismal clade relative to all the others.
  5. It would be desirable to make sure the user can easily see how many duplication events or other incongruences separate two genes, and where they fall on the tree.  Not sure how to accomplish that, though.
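
One possible starting point for item 5, as a rough sketch only: if the gene tree nodes have already been labeled as duplications by some reconciliation step, the events separating two genes are the duplication nodes on the path between them. The tree encoding (a child-to-parent dictionary), the node names, and the function names below are all hypothetical and chosen for illustration; nothing here reflects an existing DE or iPlant API.

```python
def path_to_root(parent, node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def duplications_between(parent, dup_nodes, gene_a, gene_b):
    """List the duplication nodes on the path joining two genes.

    parent    -- dict mapping each non-root node to its parent (assumed encoding)
    dup_nodes -- set of nodes already inferred to be duplications (assumed input)
    """
    path_a = path_to_root(parent, gene_a)
    path_b = path_to_root(parent, gene_b)
    ancestors_a = set(path_a)
    # The first ancestor of gene_b that is also an ancestor of gene_a is their LCA.
    lca = next(n for n in path_b if n in ancestors_a)
    # Nodes strictly below the LCA on each side, plus the LCA itself.
    on_path = path_a[:path_a.index(lca)] + path_b[:path_b.index(lca)] + [lca]
    return [n for n in on_path if n in dup_nodes]


# Toy gene tree: n1 (speciation) -> n2 (duplication) and geneC;
# n2 -> n3 and n4, each a speciation with one gene copy per species.
parent = {
    "n2": "n1", "geneC": "n1",
    "n3": "n2", "n4": "n2",
    "a1_sp1": "n3", "a1_sp2": "n3",
    "a2_sp1": "n4", "a2_sp2": "n4",
}
dup_nodes = {"n2"}

print(duplications_between(parent, dup_nodes, "a1_sp1", "a2_sp1"))
print(duplications_between(parent, dup_nodes, "a1_sp1", "a1_sp2"))
```

In an interface, the returned nodes could be the ones highlighted when the user selects two genes; the length of the list gives the "how many" and the node identities give the "where".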

Ideas for publications

  1. Tests of the accuracy and scalability of different algorithms with different biological models.  
    1. Accuracy and timing as a function of size, as part of a short overview of the scale challenge - Bengt & Todd
    2. Accuracy as a function of model violations - on tree sizes small enough to use Prime-GSR - Bengt & Cecile
      1. See "Planning of simulation study" above for details
    3. Accuracy using Bowers benchmark dataset, with a more biologically focused paper - Jamie & Jim
    4. Jim is interested in an additional paper on the effect of heterogeneity in background duplication rates and polyploidy, if time allows.  This would naturally follow on from items 2 and 3 above.
  2. Confidence measures on phylogenies when using non-probabilistic objective functions?  [Not sure this stands on its own as currently conceived] - Cecile & Todd
  3. iPlant discovery environment tree reconciliation gene catalog, pipeline, and interface
    1. Rolled into 1KP pilot data, presenting analysis as an enhanced publication - Jamie & Jim
    2. Possibly a separate application note, with more focus on the interface - Todd, Sheldon, Andrew
  4. Review of the state of the art in gene tree reconciliation, with a focus on combinations of different processes happening simultaneously.  [Combine with 1.1?] - All, Todd as lead