2014.03.27 BIEN db

BIEN Database

March 27, 2014

Participants

Aaron, Brad, Mark, Martha

Agenda

  • Review progress and address questions on quantitative validations (Aaron, 50 min.)
  • Outline next steps (5 min.)

Previous Week

Aaron's To Do List
Write the additional plot query assigned last week (#19)

Recall that Aaron will complete the validations using the current schema.

Remaining (multi-week) validation work for Aaron:
1. Write the specimen output queries
2. Complete all plot input queries, including modifying Brad's FIA input queries
3. Write the specimen input queries
4. Validate all plot datasets
5. Validate all specimen datasets

Notes

  • Specimen input queries are finished.
  • Specimen output queries are about half done.

Order of work for Aaron

  • First
    • Specimen input validation queries are done, but Aaron needs to switch them to use the concatenated taxon name.
    • Write output queries for specimens (estimated 1-2 days)
  • Second
    • TEAM: rename columns (0.5-1 day), denormalize (0.5-1 day)
    • Madidi: columns already renamed, only needs denormalizing (0.5-1 day)
    • Estimated total for both: 2-3 days.
  • Third
    • Write the VegCore input queries for plots (maybe 3 days).

Decisions

plots aggregating validations
  • won't denormalize SALVIAS because input queries already exist for it (Brad)
  • validate FIA last because it's a special case (Brad)
specimens aggregating validations
  • OK to run NY validations while writing the specimens output queries instead of at the end with the other specimens datasources (Brad)
  • when writing specimens output queries based on NY input queries, treat the query name, not the query implementation, as authoritative (Brad)
  • use taxonoccurrence as the main specimen table
  • use concatenated taxon name instead of concatenating the ranks, since not all specimens datasources provide the ranks
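
A minimal sketch of why the concatenated name is the safer grouping key, assuming a hypothetical specimen table (the table name, column names, and sample rows are invented for illustration and are not the BIEN schema). When a datasource supplies no ranks, concatenating the rank columns yields NULL, so those rows collapse into one group. Grouping on the concatenated name keeps them separate.

```python
import sqlite3

# Hypothetical table and columns, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE specimen ("
    " id INTEGER PRIMARY KEY, taxon_name TEXT, genus TEXT, species TEXT)"
)
conn.executemany(
    "INSERT INTO specimen (taxon_name, genus, species) VALUES (?, ?, ?)",
    [
        ("Quercus alba", "Quercus", "alba"),  # datasource provides the ranks
        ("Quercus alba", None, None),         # datasource provides only the full name
        ("Acer rubrum", None, None),
    ],
)

# Grouping on concatenated ranks: rows without ranks all become NULL.
by_ranks = conn.execute(
    "SELECT genus || ' ' || species, COUNT(*) FROM specimen "
    "GROUP BY 1 ORDER BY 1"
).fetchall()

# Grouping on the concatenated taxon name: every row keeps its name.
by_name = conn.execute(
    "SELECT taxon_name, COUNT(*) FROM specimen GROUP BY 1 ORDER BY 1"
).fetchall()

print(by_ranks)
print(by_name)
```
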
new-style import
  • needs to include the denormalization of normalized datasources
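
As a rough illustration of the denormalization step, the sketch below joins two normalized tables into one flat table. All table names, column names, and rows are hypothetical. The actual TEAM and Madidi schemas are not described in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical normalized source: plots and their observations.
    CREATE TABLE plot (plot_id INTEGER PRIMARY KEY, plot_name TEXT);
    CREATE TABLE observation (
        obs_id INTEGER PRIMARY KEY,
        plot_id INTEGER REFERENCES plot(plot_id),
        taxon_name TEXT,
        individual_count INTEGER
    );
    INSERT INTO plot VALUES (1, 'P1'), (2, 'P2');
    INSERT INTO observation VALUES
        (1, 1, 'Quercus alba', 3),
        (2, 1, 'Acer rubrum', 1),
        (3, 2, 'Quercus alba', 2);

    -- Denormalize: one flat row per observation, plot attributes repeated.
    CREATE TABLE plot_observation_flat AS
    SELECT o.obs_id, p.plot_name, o.taxon_name, o.individual_count
    FROM observation o
    JOIN plot p ON p.plot_id = o.plot_id;
    """
)

flat_rows = conn.execute(
    "SELECT plot_name, taxon_name, individual_count "
    "FROM plot_observation_flat ORDER BY obs_id"
).fetchall()
print(flat_rows)
```
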
NY
  • use artificial key as pkey instead of removing rows that are missing an accessionNumber (Brad)
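
The NY decision could look roughly like the sketch below. The table name, the row_id column, and the sample values are assumptions made up for illustration. Only accessionNumber comes from the notes. An artificial integer key identifies every row, so rows with a missing accessionNumber are retained rather than removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# row_id is an artificial (surrogate) key; accessionNumber may be NULL.
conn.execute(
    "CREATE TABLE ny_specimen ("
    " row_id INTEGER PRIMARY KEY,"
    " accessionNumber TEXT,"
    " taxon_name TEXT)"
)

# Made-up sample rows; two of them lack an accessionNumber.
source_rows = [
    ("12345", "Quercus alba"),
    (None, "Acer rubrum"),
    (None, "Pinus strobus"),
]
conn.executemany(
    "INSERT INTO ny_specimen (accessionNumber, taxon_name) VALUES (?, ?)",
    source_rows,
)

# All rows are kept because the key does not depend on accessionNumber.
kept = conn.execute("SELECT COUNT(*) FROM ny_specimen").fetchone()[0]

# Rows that would have been removed if accessionNumber were the pkey.
would_drop = conn.execute(
    "SELECT COUNT(*) FROM ny_specimen WHERE accessionNumber IS NULL"
).fetchone()[0]

print(kept, would_drop)
```
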

To Do for Aaron

aggregating validations
  1. finish specimens output queries
    • use concatenated taxon name instead of concatenating the ranks
    • in #1, use taxonoccurrence instead of location as the main specimen table
  2. run specimens output queries on NY to test them
  3. denormalize normalized plots datasources: TEAM, Madidi
  4. write denormalized plots input queries
  5. finish fixing plots output queries
  6. validate plots datasources: SALVIAS, VegBank, CVS, TEAM, Madidi, CTFS, FIA
  7. validate specimens datasources
new-style import
NY
FIA