Agenda

(minutes in orange)

Required features leading up to July workshop (from Naim's email):
- Spinning wheel/Placeholder if user clicks on the URL before the tree is available.
- Handle missing trees (i.e., non-tree files being passed)
- Support for Nexus trees
- File fingerprinting
- Passing decoration metadata. Two cases: in one case, the tree+decorations are the result of an analysis that has just been run whereas in another case the decorations are added to a tree after it has been loaded.

Email response from Karen:
> Spinning wheel/Placeholder if user clicks on the URL before the tree
> is available: How long would it take to implement?
The other option is to deal with this on the DE side. We can either
wait until we provide the viewer URL to the user, or provide the URL
right away and then wait for the viewer.

We don't expect many very large trees. It would nevertheless be better to have a redirect page than a 404 error page. Kris to look into the possibility.
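
One way the placeholder-instead-of-404 idea could look is sketched below. It is only an illustration: `TreeStatus`, `viewer_response`, and the `retry_seconds` default are hypothetical names and values, not part of the viewer or the DE. The sketch relies on the `Refresh` response header, which is non-standard but widely supported by browsers.

```python
from enum import Enum


class TreeStatus(Enum):
    """Hypothetical states a tree can be in from the viewer's point of view."""
    READY = "ready"
    LOADING = "loading"
    UNKNOWN = "unknown"


def viewer_response(status, tree_url, retry_seconds=5):
    """Return (HTTP status code, headers, body) for a request to a viewer URL."""
    if status is TreeStatus.READY:
        # Tree is available: send the user on to the viewer page.
        return 302, {"Location": tree_url}, ""
    if status is TreeStatus.LOADING:
        # Tree not yet available: serve a placeholder that reloads itself.
        headers = {"Refresh": str(retry_seconds)}
        return 200, headers, "Your tree is still loading. This page will refresh."
    # The viewer has never heard of this tree.
    return 404, {}, "No such tree."
```

With this shape, a user who clicks the URL early sees the placeholder and is redirected once the status changes to ready.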

> Missing trees (i.e. non tree files being passed): How is that handled?
Good question. Again, we should ask whether this is best handled on
the viewer side or the DE side. If we think about sending a file to
the viewer as a job, then perhaps it makes more sense for the DE to
provide the feedback - 'your job is complete', 'your job failed', etc.
I assume similar situations are going to come up with other analysis
tools in the DE.

The trees will be parsed at ingestion (see below) and therefore tree files should always be valid.

> Support for Nexus trees: How long would it take to implement it?
NEXUS is a file format that contains trees as newick strings (in
addition to multiple sequence alignments). I thought we had agreed
that things like file parsers should not be coded into each analysis
tool (given the obvious redundancy). Also, we should not have to code a
NEXUS parser - there are plenty already (for example, http://sourceforge.net/projects/ncl/).

Sheldon proposed adding a tool that would extract the newick string from a NEXUS file. Andy L. says that the newick parser that will parse files at intake can also be used to extract newick strings from NEXUS files. This seems to be the most technically robust solution and is very easy to implement.
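
For illustration only, extracting newick strings from a NEXUS file might look roughly like the sketch below. This is not the intake parser Andy L. refers to. The function name `extract_newick` is hypothetical, and the sketch assumes each TREE statement sits on a single line and that no TRANSLATE table is in use. An existing parser would handle the full format.

```python
import re

# Matches statements such as:  TREE name = [&R] ((A,B),C);
# Assumption: the whole statement is on one line.
TREE_STATEMENT = re.compile(
    r"^\s*u?tree\s+[^=]+=\s*(?:\[[^\]]*\]\s*)*(.+?;)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_newick(nexus_text):
    """Return the newick strings found in TREE statements, in file order."""
    return [match.group(1) for match in TREE_STATEMENT.finditer(nexus_text)]


example = """#NEXUS
BEGIN TREES;
  TREE tree1 = [&R] ((A,B),C);
  TREE tree2 = (A,(B,C));
END;
"""

for newick in extract_newick(example):
    print(newick)
```

A file with no TREE statements yields an empty list, which intake could treat as "no tree found".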

> File fingerprinting implementation: How long would it take to implement?
More details, please.

Kris will implement some form of hashing (aka fingerprinting aka memoization) to avoid storing the same tree multiple times (either from the same user or from different users). This is particularly important for large reference trees (e.g., 55K) that take longer to load into the db. A side issue is that the database would need to be cleaned regularly, either by checking which trees are still present in the DE or by deleting all trees below a certain size (in which case we need to make sure that the URL is not stored with the file). Alternatively, only trees over a given threshold will be stored.
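
A minimal sketch of the fingerprinting idea, under stated assumptions: the hash is SHA-256 over the tree text with surrounding whitespace stripped, and `TreeStore` is a hypothetical in-memory stand-in for the viewer database. Neither choice comes from the discussion above. Two files that describe the same tree with different formatting would still get different fingerprints under this scheme.

```python
import hashlib


def tree_fingerprint(tree_text):
    """Return a SHA-256 hex digest of the tree text, ignoring surrounding whitespace."""
    return hashlib.sha256(tree_text.strip().encode("utf-8")).hexdigest()


class TreeStore:
    """Hypothetical in-memory stand-in for the viewer database."""

    def __init__(self):
        self._trees = {}      # fingerprint -> tree text
        self.load_count = 0   # number of times a tree was actually loaded

    def add(self, tree_text):
        """Store the tree only if its fingerprint is new; return the fingerprint."""
        key = tree_fingerprint(tree_text)
        if key not in self._trees:
            self._trees[key] = tree_text.strip()
            self.load_count += 1
        return key


store = TreeStore()
first = store.add("((A,B),C);")
second = store.add("((A,B),C);\n")  # same tree submitted again
print(first == second, store.load_count)
```

Because the fingerprint is derived from the content alone, the same tree submitted by different users maps to the same key and is loaded once.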

The viewer should be moved to a production server with access to large storage.

> Also, we should think about how decoration metadata are passed. I see
> two different cases: in one case, the tree+decorations are the result
> of an analysis that has just been run whereas in another case the
> decorations are added to a tree after it has been loaded.
When you say "after it has been loaded", do you mean loaded into the
viewer? There are two different ways to think about this - the user
might want to add annotations through interaction with the viewer. Or,
they may have a separate file (perhaps CSV) that contains metadata
associated with taxonomic names. In the latter case, I expect we would
need an intermediate service to check names, translate metadata into
visual elements, etc.

Very unlikely that decorations will be available for the July workshop. Naim and Kris will discuss possible options.

iPToL

TV_14JUN2011
