iPToL_ET_11JUN09

Attendees: Michael Sanderson, Nirav Merchant, Dan Stanzione, Brenton Elmore, Sriram Srinivasan, Karla Gendler, Andy Lenards, Nick Murphy, Edwin Skidmore, Scott Menor, Adam Kubach, Jerry Lu

Discovery Environment (DE): no strong vision of the overall look of the DE, but a strong vision of the problems that need to be solved; more concerned with characterizing and solving those problems

What cuts across all solutions (video conferencing, chat, data sharing)? Not sure how to rank collaborative tools against the ability to work with the tools as an individual

ToL proper, through Val Tannen, has been developing tools to streamline workflow in multi-institution, multi-person projects (Val's project is called pPod)
Field right now is single-user, single-gene; people have made careers looking at a single part of the tree

What do we need to provide to enable collaboration five years in the future? Once the tree is complete, will there be third-party annotation in phylogenetic terms?
Has to be decentralized; anyone, anywhere has to be able to annotate the tree

Audiences served: the person studying orchids, the person downloading GenBank data and running algorithms, and the collaborative group

Provenance and experimental reproducibility
Experimental reproducibility would be a really good thing to have
Exciting product would be reproducibility (can you, yourself, get the published tree from the data provided?)
Two obstacles: people have never thought about it, and producing the tree is complex
Ability to share at the user's discretion (is there a desire for that? Perhaps at some level; can have Type I and Type II errors)
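
A minimal sketch of what a provenance record for a tree-building run might capture, so that someone else could try to regenerate a published tree from the same data. Nothing here was specified in the meeting: the function name, the record fields, and the tool name and parameters in the usage lines are all hypothetical.

```python
import hashlib
import json


def provenance_record(input_data: bytes, tool: str, version: str, parameters: dict) -> dict:
    """Build a record identifying the exact inputs and settings of one analysis run.

    All field names are illustrative assumptions, not an agreed format.
    """
    record = {
        # Hash of the raw input (e.g. an alignment file) so a mismatch is detectable.
        "input_sha256": hashlib.sha256(input_data).hexdigest(),
        "tool": tool,
        "version": version,
        "parameters": parameters,
    }
    # Serialize with sorted keys so identical runs always yield the same identifier.
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    record["record_id"] = hashlib.sha256(canonical).hexdigest()
    return record


if __name__ == "__main__":
    # Hypothetical tool name and parameters, for illustration only.
    run = provenance_record(b">taxonA\nACGT\n", "example-tree-builder", "1.0", {"seed": 42})
    print(json.dumps(run, indent=2))
```

Two runs with identical data, tool version, and parameters get the same record_id, which gives a cheap first check before attempting to reproduce the tree itself.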

Sharing: trees that haven’t gone to completion could still be the starting point for an analysis

Fungal Tree of Life is a good model for collaborative environment

  • http://www.aftol.org/
  • (software environment is MOR)
  • continually pulls from GenBank, so the tree changes dynamically; the tree is then annotated and the classification changes; has tools for people who are generating new data

Mike’s view now (vs. above): data all lives in GenBank; if it doesn’t live there now, it will be there shortly; therefore can just take data from one place; but not all data is there yet, and it won’t be until people publish their data

Two different ways of looking at it: build a tree and it stays that way, or say the tree’s major branches are about right and then have discussion/debate over reorganizing the “twigs”

Mike has a grant dealing with efficient storage: a database to store 1 million phylogenetic trees

Good to consider a species tree and lots of other trees (not good to have only one species tree)

Do people actually look at the trees or just run tools on them? How do you show uncertainty?
Conventional: draw a set of lineages that all come to one point (polytomies); seems like we should be able to present it better

TreeJuxtaposer
Characterize statistically the difference between trees and put that into the visualization
Anyone is going to be wondering about congruence of trees: how do you represent this?
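
One simple way to put a number on the difference between two trees is to count the clades found in one tree but not the other (a rooted, clade-based form of the Robinson-Foulds distance). The sketch below is illustrative only and is not how TreeJuxtaposer works; the nested-tuple tree representation and the function names are assumptions made for the example.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf names.

    A tree is a nested tuple whose leaves are strings, e.g. (("A", "B"), "C").
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree on these taxa
    return found


def rf_distance(tree_a, tree_b):
    """Count clades present in exactly one of the two trees (same leaf set assumed)."""
    return len(clades(tree_a) ^ clades(tree_b))


if __name__ == "__main__":
    tree_a = ((("A", "B"), "C"), ("D", "E"))
    tree_b = ((("A", "C"), "B"), ("D", "E"))
    print(rf_distance(tree_a, tree_b))
```

The clades in the symmetric difference are also the specific places where the trees disagree, so a viewer could highlight those branches rather than reporting only the count.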

Success: a 500K tree; more feasible is creating a 50K tree and the tools to work with it

What makes a tree abysmal? Wouldn’t use it to have confidence that the 10 species of Arabidopsis are accurately placed on the tree; not enough data in the analysis; missing too much information; no credible assessment of the tree; no validation

Verification and validation? What other kinds of validation tools need to be developed? Built-in statistical methods; Alexis has built in some heuristic models; does the giant tree display the same small tree?
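
The question of whether the giant tree displays a given small tree can be checked mechanically: prune the big tree down to the small tree's taxa and test whether every clade of the small tree survives. A minimal sketch, assuming rooted trees written as nested tuples with string leaves; the function names are hypothetical.

```python
def prune(tree, keep):
    """Restrict a nested-tuple tree to the leaves in `keep`; return None if none remain."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kept = [p for p in (prune(child, keep) for child in tree) if p is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]  # suppress nodes left with a single child
    return tuple(kept)


def all_clades(tree):
    """Return every internal-node clade (root included) as a frozenset of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    leaves(tree)
    return found


def leaf_names(tree):
    """Return the set of leaf names in a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaf_names(child) for child in tree))


def displays(big_tree, small_tree):
    """True if big_tree, restricted to small_tree's taxa, contains all of small_tree's clades."""
    induced = prune(big_tree, leaf_names(small_tree))
    return induced is not None and all_clades(small_tree) <= all_clades(induced)


if __name__ == "__main__":
    big = ((("A", "B"), "C"), (("D", "E"), "F"))
    print(displays(big, (("A", "B"), "D")))
    print(displays(big, (("A", "D"), "B")))
```

Using a subset test rather than equality means a polytomy in the small tree still counts as displayed when the big tree resolves it further.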

Computational steering: your tree may not converge, but can you point it in a direction so that it does converge?

Tree is going to be information overload. What would be the ideal way for someone not savvy with phylogenetics to view it and use it? Currency of communication is taxonomy; reconciling the phylogenetic vision of plant diversity with the classical classifications that people grew up with

View of world is based on model organisms; do we have viewpoints or lenses that are tailored towards certain communities? Do we start looking at different languages? Can we turn on/off annotation?

Modern set of terms will be out of sync with classical botanists somewhere

Automate the application of names to the tree: could have system where definitions are in place but annotations move around the tree
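
One way to read "definitions are in place but annotations move around the tree" is a node-based definition: a name is tied to a fixed set of specifier taxa and always labels the smallest clade containing them, whatever the current tree looks like. A minimal sketch under that assumption; the nested-tuple trees, the function names, and the name "CladeX" are all hypothetical.

```python
def leaf_set(node):
    """Return the leaf names under a node of a nested-tuple tree."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def smallest_clade(tree, specifiers):
    """Return the leaf set of the smallest clade containing all specifiers, or None."""
    specifiers = frozenset(specifiers)
    if not specifiers <= leaf_set(tree):
        return None  # a specifier is missing from this tree
    node = tree
    while not isinstance(node, str):
        for child in node:
            if specifiers <= leaf_set(child):
                node = child  # all specifiers sit under this child, so descend
                break
        else:
            break  # specifiers are split across children: this node is the clade
    return leaf_set(node)


def apply_names(tree, definitions):
    """Map each name to the clade its fixed definition picks out on this tree."""
    return {name: smallest_clade(tree, spec) for name, spec in definitions.items()}


if __name__ == "__main__":
    definitions = {"CladeX": {"A", "B"}}  # the definition stays fixed
    old_tree = ((("A", "B"), "C"), "D")
    new_tree = ((("A", "C"), "B"), "D")
    print(sorted(apply_names(old_tree, definitions)["CladeX"]))
    print(sorted(apply_names(new_tree, definitions)["CladeX"]))
```

When the tree is revised, rerunning apply_names re-attaches every name without anyone editing the definitions; here the same definition covers a larger clade on the revised tree.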

Is there something like WHOid in phylogenetics world?

APWeb might be too sophisticated for K-12 and even basic undergraduate

“Gold Standard Tree” might not be a completely resolved tree; it would express both uncertainty and certainty, and may be a representation of what we know with confidence

  • AI follow-up with Andrew: understand why people use the "web site" vs. Excel spreadsheets
  • This is a social issue
  • People are using lots of .xls spreadsheets with email traffic

pPod with Val Tannen and Reid Beeman at University of Florida

Communication methods now are Excel spreadsheets going back and forth by email

In working groups, hope to ask: do you really want this tool, with this interface? If so, we’ll wrap it

Look at ways to parallelize algorithms for scalability

Tree Structure: make a decision that won’t handcuff iPlant

What is a DE? Skepticism about whether it can be executed in a useful way; but in the meantime we have a much more defined set of tasks

What if all you could deliver was the ability to select a tree, do something with it, and publish it?

Any low-hanging fruit in data integration? Where do the trees come from: the user, or somewhere else?

Early task to work on is input selection and output selection (visualize and export)?

First version of the DE will be a workbench model