2014.06.05 BIEN db

BIEN Database

June 5, 2014


Aaron Marcuse-Kubitza, Mark Schildhauer, Martha Narro


  • Data dictionary (Aaron)
  • Taxon name rescrub (Aaron)


Data Dictionary (for VegBIEN and associated tables)

  • We need to have the draft Data Dictionary ready to show to reviewers by 5 PM Pacific time on Monday, June 9th, so they have time to review the terms prior to the Thursday call.
    • This deadline cannot be changed because next week is the only time in June when key people are available to do the review.
  • The Data Dictionary is needed for several purposes including, but not limited to:
    • Scientists who need to understand column headings in analytical tables such as viewFullOccurrence.
    • Scientists, informaticians, developers who will access the full BIEN3 database to create extracts or additional analytical tables.
      • These users will also need the VegBIEN schema in addition to the Data Dictionary.
  • The staged approach to defining the terms was refined a bit more and now is:
    • 1) Define the viewFullOccurrence terms.
    • 2) Identify and define the additional biological, ecological, geospatial, provenance and institutional attributes in VegBIEN and associated tables.
    • 3) Last, define the housekeeping terms like IDs in VegBIEN and associated tables.
  • Input on how to present the terms and definitions
    • Aaron showed the format he's planning to use.
    • Needs to add a column for the database table the terms come from.
    • The provenance of derived terms needs to be defined beyond stating that they are “derived”. Describe how they are derived and provide the provenance of each term.
    • Capture the actual definitions. For example, if they are already defined in VegCore, copy and paste the definition from VegCore.
    • Include both the definition from VegCore and the link to the source it came from. The link is important for providing additional clarification and context to the user/reviewer.
      • But we don't want to rely solely on the link, since the definition at the source could change and diverge from the term as defined when VegBIEN was created.
  • Aaron expressed concerns about getting the terms defined by the Monday deadline. Mark and Martha reiterated that they think this will be easy and only take a day once he gets started. It's mostly copying and pasting.
  • To get started, Aaron is to create the definitions, with links to the source, for 6 terms by sometime this evening.
    • Send them to Mark and Martha for feedback.
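
One way to picture the entry format discussed above is the minimal Python sketch below. It is only an illustration, not the format Aaron showed: the class name, the field names, the `problems` helper, and the placeholder values are all hypothetical. It records the definition text together with its source link and database table, and flags a derived term that does not say how it is derived.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DictionaryEntry:
    """One row of a draft data dictionary (all field names are hypothetical)."""
    term: str
    definition: str                    # definition text copied from the source, not just linked
    source_link: str                   # link to the source, for extra context
    table: str                         # database table the term comes from
    derived: bool = False              # True if the term is computed from other terms
    derivation: Optional[str] = None   # how a derived term is derived


def problems(entry: DictionaryEntry) -> List[str]:
    """Return a list of things still missing from an entry."""
    issues = []
    if not entry.definition.strip():
        issues.append("missing definition text")
    if not entry.source_link.strip():
        issues.append("missing source link")
    if not entry.table.strip():
        issues.append("missing database table")
    if entry.derived and not (entry.derivation or "").strip():
        issues.append("derived term lacks a description of how it is derived")
    return issues


# Placeholder values only; these are not real VegBIEN terms or definitions.
complete = DictionaryEntry(
    term="exampleTerm",
    definition="Placeholder definition copied from the source.",
    source_link="https://example.org/placeholder",
    table="example_table",
)
incomplete = DictionaryEntry(
    term="exampleDerivedTerm",
    definition="Placeholder definition.",
    source_link="",
    table="example_table",
    derived=True,
)

print(problems(complete))
print(problems(incomplete))
```

Keeping the definition text and the link as separate fields reflects the point above that the link gives context while the copied text stays stable if the source later changes.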

Taxon name rescrubbing

  • Nicole and Jerry are working on the problem with a rare, problematic name causing the TNRS development server to crash. 
  • The scrubbing is again running in the background.
  • Aaron should not do any other work on taxon scrubbing, so that he can get the draft Data Dictionary completed by Monday.
    • After the Data Dictionary has been reviewed and is satisfactory, return to the taxon scrubbing task.


  • The top priority is now to get the Data Dictionary drafted by 5 PM Monday, June 9th.
  • It would be good if the taxon name scrubbing could continue running in the background, but if additional problems crop up, do not let them become a distraction. File a bug report, contact iPlant support, and return to working on the Data Dictionary.

To Do


  • In the table with definitions, add a column for the database table the terms come from.
  • For derived terms, describe how they are derived and provide the provenance of the term.
  • Include both the definition from VegCore and the link to the source it came from.
  • To get started, create the definitions, with links to the source, for about 6 terms by sometime this evening.
    • Send them to Martha (and Mark) for feedback.
  • Continue working on defining terms to get as many as possible done before the Friday call.


  • Provide feedback on the initial set of terms Aaron sends out.

Mark, Martha, Aaron

  • Have a call at 9 AM tomorrow, Friday, June 6th, to review what Aaron has gotten done on the Data Dictionary.