TR_08FEB10

Call-in/WebEx Information

Action items

  • none

Agenda

  1. Brief review of previous supplement meeting (3 min.)
  2. Prototype Development Progress Update (10 min.)
  3. Review goals of NESCent trip (2 min.)
  4. Identifying Target Users [#] (5 min.)
  5. Discussion of points marked as [REVISIT] in previous meetings [1,2] (~40 min.)

Discussion Points

  1. Context: A point was made that when building the user interface (the entry point), we might consider allowing our user to search for terms that appear in GenBank/GenPept Accession Records.
    • Question: Beyond the Definition Line (as seen in FASTA), what terms would users like to be indexed in searches? The source/organism? Any of the fields in FEATURES?
      Taxon name searches would be good. Otherwise, terms in the def line would be fine (Jim)
  2. Context: In Phytome's subfamily selection page, it was mentioned that a user would need to be familiar w/ the organisms and have to recognize the IDs. The correlation was not obvious. It was suggested that the presence of the alias, for Phytome, in the subfamily selection page makes this a richer interface.
    • Question: How should we provide the context/metadata to such a page so that it is clearer to a user not intimately familiar with the organisms which subfamilies (or individual genes) to select? It seems like a 'needle in a stack of needles' problem. In other words, the user will know exactly which one they want only after looking through them all in great detail. Given the skill level indicated for the target user, making this step easy is a must.
      Taxon name could be appended to the gene names or users could be given a table with taxon names listed in same row as gene names .... and perhaps best blast hit annotation. (Jim)
  3. Context: When developing a solution, consideration should be given to allowing the user to mark or track hits from the original search all the way through the analysis.
    • Questions:
      • Would such functionality make the previous issue of marking which subfamilies to select easier?
        Probably. Gene table with check boxes could include info mentioned above (Jim)
      • Is there an example of an application (desktop or web) that provides functionality like this?
      • Would simple checkbox flagging be a sufficient first step to provide this?
        If I understand what you are getting at here, PlantTribes may exhibit a useful way to organize the results of gene database queries - e.g. see YABBT example (Jim)
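
A minimal sketch of the gene table with checkboxes suggested above, assuming each hit carries a gene ID, a taxon name, and the definition line of its best BLAST hit. The `Hit` class, its field names, and the example IDs are hypothetical and only illustrate how a checkbox flag could travel with a hit through later analysis steps:

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One row of the proposed gene table (all field names are assumptions)."""
    gene_id: str            # identifier shown on the subfamily selection page
    taxon: str              # source taxon name, listed in the same row
    best_hit_defline: str   # definition line of the best pre-run BLAST hit
    selected: bool = False  # checkbox state, kept with the hit downstream


def format_table(hits):
    """Render each hit as a tab-separated row with a text checkbox."""
    rows = []
    for h in hits:
        box = "[x]" if h.selected else "[ ]"
        rows.append(f"{box}\t{h.gene_id}\t{h.taxon}\t{h.best_hit_defline}")
    return rows


def selected_ids(hits):
    """Return the gene IDs the user has flagged, in table order."""
    return [h.gene_id for h in hits if h.selected]


# Hypothetical example data, not taken from any real database.
hits = [
    Hit("gene_A1", "Arabidopsis thaliana", "hypothetical MADS-box protein"),
    Hit("gene_R7", "Raphanus sativus", "hypothetical protein kinase"),
]
hits[1].selected = True
for row in format_table(hits):
    print(row)
```

Keeping the flag on the record itself, rather than in the page state, is one way to let the same selection be shown again at the BLAST, tree, and results steps.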

Open Questions

  • Do we need to be able to map the protein back to CDS (aka Coding Sequence)? [notes]
  • A recent tech talk was given by Eric Lyons about CoGe and it made me curious...
    • Can CoGe do any one of the gene/species tree reconciliation that we are discussing?
    • If yes, would it be possible to do several steps of this example in an environment like CoGe? Would it take manual assembling of data from the users? At what points (steps in our example workflow) would the user need to leave CoGe? [reference]

Notes

Attendees: Jim Leebens-Mack, Cecile Ane, Adam Kubach, Jerry Lu, Natalie Henriques, Michael Gonzales, Andrew Lenards, Todd Vision, Nicole Hopkins

  • Timeline for prototype
    • We are behind schedule regarding the proposed timeline (originally discussed here)
    • We will begin roughing out the user interface during the NESCent visit.
  • Goals for NESCent visit
    • Andy: develop a firm understanding of the entry points for the prototype, and the underlying assumptions of what will be present
      • discuss 1KP as the gene catalog
      • in-scope use cases
      • begin user interface design in whiteboard sessions
    • Todd: get Andy up to speed w/ domain knowledge so that he can make more efficient use of the working group
  • Finding Target Users for Personas
    • result: personas will be developed by ET/Core-SW internally and verified by domain researchers. No interviews of external individuals will be done.
  • Discussion Points
    • #1 (see above for context/question)
      • Beyond the Definition Line (as seen in FASTA): taxon names (not just the binomial, but also the higher taxonomic names included in the GenBank record) and common name. It would be nice to include information from the FEATURES section of GenBank records. However, use of this section is not standardized, so it is not clear how to index it for search.
      • A user may come in from a publication, so they will want to find the gene described in the publication and proceed from that point.
        • [NiceToHave]: search on publication/journal/pubmedID
    • #3 (see above for context/question)
      • this type of UI discussion is difficult over tele-con without having a visual-mockup.
        • AI: Andy will present rough UI for tracking results/hits
      • Phytome gave users metadata to judge whether a result was what they were looking for by including, for each record, the taxon name plus an "alias": the identifier used by the genome sequencing project.
      • In the case of a non-sequenced genome (like the radish workflow example), we would use the EMBL or GenBank ID as the alias value.
      • Phytome deals w/ Sanger
        • Todd described an approach used by Phytome to develop a consensus by calling out to other EST data sources using the identifier.
      • A few groups are putting assemblies into the GenBank Short Read Archive (not very accessible).
        • having an alias for this database is needed, try and link back to the source of the data.
      • PlantTribes used a different analysis than Phytome
        • Phytome: look at underlying sequence
        • PlantTribes: see if there is some fraction of unigene overlap
      • Jim said the information that is provided depends on where we are in the process
        • BLAST results - put information in a table there (taxon, ID of unigene/gene).
          • additional information from EST -> not usually annotated w/ unigenes, just have the sequence
          • Pre-run BLAST and show the Definition Line for the best hit
          • then, check genes in context of the tree
          • then, provide more precise information (source taxon, unigene ID/gene ID)
    • Big Tree (from Feb. 1 REVISIT point)
      • NCBI taxonomic tree is available as a stand-in until the "Big Tree" is available
        • Discussion of "annotations" regarding trees, this usually means the ability to "attach" information to taxa (like this)
      • An approach to visualization for tree reconciliation was mentioned regarding linked hyperbolic trees. Here is context:
        discussion
        • All tip names for clades as a whole (names should be in NCBI hierarchy)
        • ways to see how one tree maps to another - "branch of one links to the node of another"
        • So, tree reconciliation maps tips of species tree to the tips of the gene tree
          • Sidenote: need to show Todd the prototype tree viewer from the Tree Viz group during the NESCent visit (done, Feb. 10 - feedback provided to Adam Kubach)
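
As a rough illustration of the tip mapping mentioned in the tree reconciliation discussion above, the sketch below groups gene-tree tips under the species-tree tip they come from. It covers only the leaf-mapping step and is not a full reconciliation; the function name `map_tips` and the gene IDs are hypothetical:

```python
from collections import defaultdict


def map_tips(gene_to_taxon, species_tips):
    """Group gene-tree tips by the species-tree tip they belong to.

    gene_to_taxon: dict of gene tip label -> source taxon name (assumed input)
    species_tips:  set of tip labels in the species tree
    Returns (mapping of species tip -> gene tips, gene tips with no species tip).
    """
    by_species = defaultdict(list)
    unmapped = []
    for gene, taxon in gene_to_taxon.items():
        if taxon in species_tips:
            by_species[taxon].append(gene)
        else:
            # Taxon is missing from the species tree (e.g. the stand-in tree).
            unmapped.append(gene)
    return dict(by_species), unmapped


# Hypothetical example data.
species_tips = {"Arabidopsis thaliana", "Oryza sativa"}
gene_to_taxon = {
    "gene_A1": "Arabidopsis thaliana",
    "gene_A2": "Arabidopsis thaliana",
    "gene_O1": "Oryza sativa",
    "gene_R7": "Raphanus sativus",
}
mapping, unmapped = map_tips(gene_to_taxon, species_tips)
print(mapping)
print(unmapped)
```

Tips that end up in `unmapped` would need a taxon added to the species tree (or a higher-level NCBI name) before the two trees could be linked in a viewer.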

{note transcription is complete (Andy)}