Meeting Report RCN UBE DataInInquiry for Education

Everything we are reporting is preliminary, based on discussions during the workshop. It will be refined and may change somewhat, but the major points are unlikely to change. The group's goal is to write a final workshop report by the end of August.

There are two aspects to making data more accessible for undergraduate teaching purposes. One is overcoming the technical challenges surrounding making diverse, distributed datasets discoverable and integrating the data once found. The second is overcoming obstacles to using data to achieve learning objectives, including clearly stating what those learning objectives are. On the technical side, many of the items listed below should be considered by the Data Strategy group.

Technical Solutions

  • The group identified some technical solutions that exist now and should be made more widely known and adopted:
    • DataONE's DataUp tool provides an easy “on ramp” for introducing students and scientists to using ontology terms to describe their data. It lets people replace the column and row headings in Excel spreadsheets with standardized ontology terms.
    • Metacat is used for registering data that has been described in EML (Ecological Metadata Language) so that it can be searched later.
    • BioPortal (bioportal.bioontology.org) for discovering which ontologies a term may belong to.
    • Taxonomic Name Resolution Service (http://tnrs.iplantcollaborative.org/) for standardizing plant taxonomic names within and, importantly, between datasets prior to merging them.
  • Best Practices
  • Tools that need to be developed to address technical challenges:
    • There wasn't much discussion of what needs to be developed.
    • Semantic web solutions were mentioned, but the comment was that none were yet ready for widespread adoption.
    • Most discussion focused on the need to use ontologies.
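
The name-standardization step noted above for the Taxonomic Name Resolution Service can be illustrated with a minimal Python sketch. It does not call TNRS. The synonym table, the function names, and the two small datasets are all made up for illustration, and in practice the mapping would come from a name-resolution service rather than a hand-typed dictionary. The point is only that names are resolved to one accepted form in each dataset before the datasets are merged.

```python
import csv
import io

# Hypothetical synonym table (an assumption for this sketch): in practice the
# mapping would come from a name-resolution service, not be typed by hand.
ACCEPTED_NAMES = {
    "quercus alba": "Quercus alba",
    "quercus alba l.": "Quercus alba",
    "acer saccharum": "Acer saccharum",
    "acer saccharum marshall": "Acer saccharum",
}


def standardize_name(raw_name):
    """Return the accepted name for raw_name, or None if it is unknown."""
    key = " ".join(raw_name.split()).lower()  # collapse whitespace, ignore case
    return ACCEPTED_NAMES.get(key)


def read_rows(csv_text, name_column):
    """Parse CSV text and add a 'taxon' column holding the standardized name."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["taxon"] = standardize_name(row[name_column])
        rows.append(row)
    return rows


def merge_on_taxon(left_rows, right_rows):
    """Inner-join two row lists on 'taxon', skipping unresolved names.

    Assumes at most one row per taxon in right_rows.
    """
    right_by_taxon = {r["taxon"]: r for r in right_rows if r["taxon"]}
    merged = []
    for left in left_rows:
        right = right_by_taxon.get(left["taxon"])
        if left["taxon"] and right:
            merged.append({**right, **left})
    return merged


# Two made-up datasets that spell the same species differently.
TRAITS_CSV = "species,leaf_area_cm2\nQuercus alba L.,45.2\nAcer  saccharum,60.1\n"
PLOTS_CSV = (
    "scientific_name,plot\n"
    "quercus alba,A1\n"
    "Acer saccharum Marshall,B2\n"
    "Unknownus plantus,C3\n"
)

traits = read_rows(TRAITS_CSV, "species")
plots = read_rows(PLOTS_CSV, "scientific_name")
merged = merge_on_taxon(traits, plots)
```

Merging on the raw name columns would match none of these rows, because no spelling is shared between the two files. Rows whose names cannot be resolved are left out of the merge rather than guessed at, so they can be reviewed by hand.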

Technical Recommendations

  • Data providers should produce:
    • better information on the scope of their repositories
    • a diagram/schema of the data that scientists can understand
    • a description of the nature of the data (raw; “cleaned up,” and if so how; derived; curated)
    • a description of what can and should not be done with the data (this could be a list of best-practice references in the literature, e.g., in climate modeling: http://journals.ametsoc.org/doi/abs/10.1175/BAMS-D-11-00054.1)
    • metadata and ontology terms, with instructions for users on how to take advantage of them
    • There is a critical need for tutorials or “how to” instructions providing examples of how to find data at beginner, intermediate and advanced levels. Data repositories can be very opaque to first-time users. NCBI provides such support and can be referred to for examples.
    • These tools are best when they are matched to (or designed around) domain topics that extend learning and/or statistical methodologies that are often used with such data (e.g., multivariate techniques when pulling potential “driver” data from multiple sources, or ordination techniques when pulling down community data).

Technical Follow-up

  • If iPlant's Data Strategy group is interested in making DataUp available within our CI, follow up with William Michener at DataONE.
  • Martha will follow up with Lou Gross to see what they are doing to help users with R.
  • The Data Strategy group should work more closely with Brian Heidorn and others in Library and Information Sciences, including finding out more about their course on data management and curation.

Education

Some big issues

  • Data is seen in academia as "belonging" to Mathematics and/or Computer Science
  • Most faculty appear clueless about, and/or resistant to, using data in teaching
  • Need pilots, dissemination and funding
    • Funding agencies such as NSF (CAUSE), HHMI
    • Disseminate through professional societies that have educational branches: ASM, ASCB, ASPB, ....

Some goals for using data in teaching

  • Teach ecological concepts using data
  • Teach data handling skills
  • Teach data literacy

Some Objectives

  • Identify appropriate data sets and ways to access them.
  • Develop prototype tools that students would use to work with the data.
  • Research how students learn with data sets.
  • Develop and test educational modules.
  • Introduce students to scientific workflows.

Some initiatives

  • Sam Donovan (BioQuest) is heading a group that plans to write a white paper on using data in teaching.
  • Brian Heidorn (UA) and others to work on funding ideas to put the objectives above into place.
  • More to come from the final workshop report...
  • In the meantime, peruse the workshop materials at https://sites.google.com/a/umich.edu/dataininquiry/