Agenda

Sharon to provide summary of pipeline requirements including common components and shared elements from discussions with Gordon and Stephen
Discussion/update on Faceplant project
Jerry to provide update on BIEN collaboration including current status of resolution service and upcoming meeting in St. Louis
- update on collaboration
- upcoming workshop in St. Louis
Open discussion

Action Items

Participants: Pam, Doug, Val, Sharon, Jerry
Sharon reported on her progress in learning the data assembly pipelines from Gordon and Stephen. The components shared by both pipelines are:
- an iPlant sequence database to hold sequence data from GenBank (and potentially other sources), synchronized with GenBank, with a user interface or API to mark and query data
- defining homologous sets, with a different approach in each pipeline; both involve running BLAST, filtering the BLAST results by various criteria, and dealing with (recording) reverse complements
- multiple sequence alignments (MSAs), with a different QC approach in each pipeline: one uses tree pruning with manual inspection, guided by knowledge of the phylogenetic backbone; the other uses profile alignment with MDA scores
- MSA concatenation to generate the superMatrix, which is then fed to RAxML
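
For reference, the MSA concatenation step could be sketched roughly as follows. This is only an illustration and is not taken from either Gordon's or Stephen's pipeline: the function name concatenate_alignments, the representation of each alignment as a dict of taxon name to aligned sequence, and the choice to pad missing taxa with gap characters are all assumptions made for the example.

```python
from typing import Dict, List


def concatenate_alignments(
    alignments: List[Dict[str, str]], gap: str = "-"
) -> Dict[str, str]:
    """Join per-locus alignments side by side into one supermatrix.

    Hypothetical helper: each alignment maps taxon name -> aligned sequence.
    A taxon missing from a locus is padded with gap characters (an assumption).
    """
    # Collect every taxon seen in any locus, keeping first-seen order.
    taxa: List[str] = []
    seen = set()
    for aln in alignments:
        for name in aln:
            if name not in seen:
                seen.add(name)
                taxa.append(name)

    parts: Dict[str, List[str]] = {name: [] for name in taxa}
    for aln in alignments:
        # All rows of a single alignment must have the same width.
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError("sequences in one alignment must have equal length")
        width = widths.pop()
        for name in taxa:
            parts[name].append(aln.get(name, gap * width))

    return {name: "".join(chunks) for name, chunks in parts.items()}


# Toy data, invented for the example.
locus_a = {"taxon1": "ACGT", "taxon2": "AC-T"}
locus_b = {"taxon1": "GGA", "taxon3": "GCA"}
matrix = concatenate_alignments([locus_a, locus_b])
for name, row in matrix.items():
    print(name, row)
```

Writing the result out in a format that RAxML reads, and recording the column range of each locus for partitioned analyses, are left out of this sketch.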

Sharon also updated the group on the facePlant discussion
Jerry reported on the progress of the collaboration with BIEN (deliverable, upcoming meeting)
- Pam suggested the upcoming meeting in St. Louis might/should include Nico from TOLKIN

The meeting attendees strongly suggested talking with Alex to find out what form the superMatrix should take: a set of multi-FASTA files or some database
Val brought up the issue of storing multiple sequence alignments in databases and the challenge of running multiple sequence alignments on huge datasets, such as 500,000 sequences
Pam suggested bringing the facePlant people together with the APWEB people; the popularity of APWEB could help bring users to facePlant
Pam also mentioned that, in addition to GenBank sequences, the data assembly infrastructure should allow users to upload their own sequence data for phylogenetic analysis in combination with GenBank data