Cultivated species meeting - 3 Feb 2014

Approaches to detecting cultivated species in the BIEN3 database

Present at meeting: Brad Boyle, Bob Peet, Peter Jorgensen, Brian Enquist

Background

Methods used previously for BIEN2:

  • Any "cultivated" field in original database. Problem: rarely available
  • Filter by keywords in specimen and locality descriptions ("cultivated", "planted", "farm", "garden", etc.). Problem: false positives
  • Filter by proximity to herbarium, <=3 km (assumption: many herbaria have botanical gardens nearby). Problem: false positives
  • Filter out observations of a few well-known taxa by country (e.g., Pinus in any country south of Nicaragua). Problem: very coarse, highly incomplete.
  • Result: column 'isCultivated', 0=no, 1=yes.
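
For reference, the keyword and herbarium-proximity filters could be sketched roughly as follows. This is a minimal illustration, not the BIEN2 code: the function names, the keyword list (copied from the examples above), and the herbarium coordinate list are all assumptions.

```python
import math
import re

# Assumed keyword list, copied from the examples in these notes
KEYWORDS = ("cultivated", "planted", "farm", "garden")
# Match a keyword at the start of a word, case-insensitively
KEYWORD_RE = re.compile(r"\b(" + "|".join(KEYWORDS) + r")", re.IGNORECASE)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_cultivated(description, lat, lon, herbaria, max_km=3.0):
    """Return 1 if a keyword matches or the point lies within max_km of a herbarium, else 0.

    herbaria is a list of (lat, lon) tuples (hypothetical input).
    """
    if description and KEYWORD_RE.search(description):
        return 1
    if lat is not None and lon is not None:
        for h_lat, h_lon in herbaria:
            if haversine_km(lat, lon, h_lat, h_lon) <= max_km:
                return 1
    return 0
```

Note that a prefix match like this also flags "Gardenia", which is the kind of false positive listed above.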

What do we want to detect?

  • Cultivated (non-natural observation of individual plant, planted by human)
  • Introduced (observation of species not native to declared locality)
  • "Invasive" is too subjective, and may apply to native or non-native species. Out of scope for BIEN3.

Some potential approaches:

Blacklists

  1. Filter by exact location or polygon (e.g., all of St. Louis; Shaw Arboretum)
  2. Filter out families and genera known to be endemic to Old World
    • Challenge: taxonomic issues could lead to many false positives or false negatives
  3. Filter out species endemic to Old World countries
  4. Filter out known introduced or invasive species in New World countries
    • "Invasive" can apply to native or introduced species. 
    • Therefore, a very clear definition of "invasive" is required.
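
A possible sketch of the location blacklist (item 1), assuming each blacklisted area is stored as a simple polygon of (lat, lon) vertices. The ray-casting test is one standard option, not something decided at the meeting; the area name and coordinates below are placeholders, and in practice a GIS library or PostGIS would probably do this work.

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test; polygon is a list of (lat, lon) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Does this edge straddle the point's latitude?
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which the edge crosses that latitude
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside

def blacklisted_area(lat, lon, areas):
    """Return the name of the first blacklisted area containing the point, else None."""
    for name, polygon in areas.items():
        if point_in_polygon(lat, lon, polygon):
            return name
    return None

# Placeholder square polygon; real boundaries would come from a cited source
AREAS = {"example_garden": [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]}
```

Returning the area name rather than a bare flag keeps a record of which blacklist entry triggered the decision.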

Whitelists

  1. Filter in observations of species known to be native to a particular checklist region
    • Challenge: individual plants can still be planted or cultivated outside their natural range within a region
      • For example: Pinus caribaea plantation in Nicaragua
      • Perhaps use these lists to flag observations as "likely native"
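
The "likely native" flag could look something like the sketch below. The checklist structure (region name mapped to a set of species names), the function name, and the three status strings are assumptions for illustration; the sample entry is not real checklist data.

```python
# Illustrative checklist data only; real lists would come from the cited sources
CHECKLISTS = {"Nicaragua": {"Pinus caribaea"}}

def native_flag(species, region, checklists):
    """Flag an observation against a regional checklist.

    Returns "unknown" when no checklist exists for the region, so that a
    missing checklist is not confused with a species absent from one.
    """
    if region not in checklists:
        return "unknown"
    if species in checklists[region]:
        return "likely native"
    return "not on checklist"
```

A planted individual of a listed species is still flagged "likely native" here, so the flag describes the species in the region, not the individual plant.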

Outlier algorithms

  • Look for extremes of range size, latitudinal and longitudinal breadth
    • Before range modeling (extreme lat, long, etc)
    • After modeling, or as part of modeling algorithm: major disjuncts
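
As a starting point for the pre-modeling check, latitudinal extremes could be flagged with a simple interquartile-range fence, sketched below. The IQR rule and the multiplier k=3 are assumptions chosen for illustration; no specific outlier method was agreed at the meeting. The same function could be applied to longitudes.

```python
import statistics

def latitude_outliers(lats, k=3.0):
    """Return indices of latitudes lying more than k * IQR outside the quartiles."""
    if len(lats) < 4:
        return []  # too few records to estimate quartiles meaningfully
    q1, _, q3 = statistics.quantiles(lats, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [i for i, lat in enumerate(lats) if lat < lo or lat > hi]
```

Flagged records would still need review, since a genuine disjunct population looks the same as a cultivated specimen to this test.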

Additional suggestions:

  • Important to cite sources for each decision

Data sources

  • Global Invasive Species Database (http://www.issg.org/database/welcome/)
  • USDA Plants
    • Country, state and county lists
  • Smithsonian lists (Caribbean, Guiana Shield, etc.)
  • Tropicos API: 
    • Get Name Distributions
    • Catalogs of Peru, Bolivia, Ecuador

Deliverables

Peter: send catalogs for Peru, Bolivia, Ecuador to Brad (DONE)

Bob: send to Brad more information regarding access to Smithsonian lists

All: send any new ideas for sources to Brad

Brad:

  • Investigate "Get Name Distributions" Tropicos API call
  • Get started!