TE100921 - Minutes

In Attendance

Barb Banbury (bbanbury@utk.edu)
Jeremy Beaulieu (jeremy.beaulieu@yale.edu)
Joe Felsenstein (joe@gs.washington.edu)
Eric Lyons (elyons@iplantcollaborative)
Naim Matasci (nmatasci@iplantcollaborative.org)
Sheldon McKay (sheldon.mckay@gmail.com)
Brian O'Meara (omeara.brian@gmail.com)

Announcements

David Ackerly won't be able to participate in meeting discussions but will remain available via email.

Group composition

Given that a variety of skills and backgrounds are represented, there is no need to change the group composition.

Project status overview

  • Any open issues with the PIC implementation?

    • No, but PIC in the DE has not been widely used by the community, so no extensive usage testing has taken place.
  • Status of Discrete Trait Reconstruction implementation?  

    • Overall progress with the implementation of additional methods has been slow. The community needs to be able to perform additional analyses for the CI to be adopted. One possibility would be to implement all the components of the Phylip package, given that a wrapper for CONTRASTS already exists and the various package components share architectural features (e.g. input file formats). Brian fears that the needs analysis might take very long and that some of the components might not be germane to trait evolutionary analysis (e.g. tree building methods). Joe also points out that there might be scalability issues. Sheldon indicates that the needs analysis can be streamlined and that the needs of the different software components are similar to those of the PIC. Brian and Sheldon agree that those additional tools are nevertheless relevant to phylogenetic analyses (e.g. obtaining a consensus tree) and should be implemented. Regarding scalability, most users are expected to work initially with small to medium sized trees. For tree building, the Big Trees WG is working on improving the performance of tree building algorithms (RAxML, NINJA/WINDJAMMER). Barb also agrees that the tools present in Phylip are relevant to phylogenetic analysis. Sheldon adds that he is familiar with Phylip and could assist in implementing it.
      Another option is to explore ape (Analyses of Phylogenetics and Evolution), a package for the R statistical software that offers a comprehensive suite of phylogenetic methods. One difference between ape and Phylip is that there are several web implementations of Phylip, but none of ape. A key point in favor of ape is that it can estimate the uncertainty of ancestral state reconstructions. Regarding the performance issues with ape mentioned by Sheldon, Naim notes that the memory overflow problem previously identified by Liya has been patched, that he could generate a tree with 1 million taxa in less than 30 minutes on a PowerBook Pro laptop, and that a PIC analysis of that tree took another 30 minutes.
      Naim will explore the feasibility of integrating ape into the DE and provide the relevant timeline. In parallel, implementation of other software (e.g. AncML) will be pursued.
  • General issues: 

    • Collection of existing big trees: Brian proposed to allow users to “donate” trees for internal testing, possibly through a checkbox when uploading for analysis.
    • Visualization issues: Naim will attend Tree-viz meetings. Brian mentioned that he produced a wish list of visualization features during needs analysis. Naim will look for the list on the wiki.
    • Method appropriateness warnings: Joe mentioned that models designed for small/medium trees might be unsuitable for large trees and that appropriateness should be tested. One test would be to estimate parameters separately for different parts of the tree and use a likelihood ratio test to check whether they differ from the parameters estimated for the whole tree. Because users will most likely analyze small to medium sized trees in the early stages, the issue is not of high priority and will be addressed at a later stage.
    • API: Naim contacted Rion Dooley at the Texas Advanced Computing Center to get a timeline regarding the development of the API that will allow an easy integration of services into the discovery environment. Rion expects the wrapper API to be out in Jan 2011.
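
The likelihood ratio test Joe suggested under "Method appropriateness warnings" could be sketched roughly as follows. This is only an illustration and not part of any existing DE tool. It assumes that the same model is fitted independently to each part of the tree (so the log-likelihoods of the parts add up) and that the usual chi-square approximation holds. The function names and the log-likelihood values are hypothetical; in practice the log-likelihoods would come from the fitting software.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) of a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form base cases: df = 1 (odd) or df = 2 (even).
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return min(q, 1.0)


def likelihood_ratio_test(loglik_whole, logliks_parts, params_per_fit):
    """Compare one parameter set for the whole tree against one set per part.

    loglik_whole   -- maximized log-likelihood of the whole-tree fit
    logliks_parts  -- maximized log-likelihoods of the per-part fits
    params_per_fit -- number of free parameters in each individual fit
    Returns (test statistic, degrees of freedom, approximate p-value).
    """
    if len(logliks_parts) < 2 or params_per_fit < 1:
        raise ValueError("need at least two parts and one parameter per fit")
    # The per-part model is the richer one; a negative statistic is clamped to zero.
    stat = max(0.0, 2.0 * (sum(logliks_parts) - loglik_whole))
    # Extra parameters of the per-part model relative to the whole-tree model.
    df = (len(logliks_parts) - 1) * params_per_fit
    return stat, df, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical values: one whole-tree fit and two subtree fits, 2 parameters each.
    stat, df, p = likelihood_ratio_test(-120.0, [-55.0, -60.0], 2)
    print(f"statistic={stat:.2f} df={df} p={p:.4f}")
```

A small p-value would suggest that the parts of the tree are better described by different parameter values, which is the kind of situation that could trigger an appropriateness warning.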

Action items

  • A1: To explore the feasibility of integrating ape into the DE and provide the relevant timeline.
    • Assigned to Naim
  • A2: To investigate an optional “make trees available for internal testing” functionality when trees are uploaded for analysis.
    • Unassigned (Naim?, Sheldon?)
  • A3: To look for the tree visualization wish list on the Confluence wiki. Completed: after searching and emailing Nicole, we found some documents indicating the needs of TE with regard to visualization. Edge coloring seems to be the main priority (see “Metadata wish list”). Brian also provided some examples of relevant trees (see “Cross cutting needs analysis”).
    • Assigned to Naim (completed)
  • Naim will also set up a Jira project for Trait Evolution to track progress. Completed: the Jira page is set up and users have been added (except Joe). The page is located at https://pods.iplantcollaborative.org/jira/browse/TRAILEVOL

Abbreviations

  • PIC: Phylogenetically Independent Contrasts
  • CI: Cyberinfrastructure
  • WG: Working group
  • API: Application Programming Interface