DATA COMMONS USER MANUAL

DRAFT. NOT FOR GENERAL USE.

Reserved DOIs are used for datasets that need a permanent identifier but cannot yet be made public. CyVerse only issues DOIs for datasets that will ultimately be made public, either in their current form or in a revised form. We cannot offer permanent identifiers for private datasets, but you can easily share data with selected collaborators before publication.

There are two types of reserved DOIs:

  1. Embargoed Reserved DOI: Datasets that are embargoed and will be released at a later date, but will not change before public release.
  2. Versioned Reserved DOI: Datasets that will undergo changes before public release. In this case, the revised, published dataset is considered a new version of the original one and will have a different DOI. The two DOIs will be linked via their versions. At publication, all versions of the dataset should be made public, unless no research was ever done on the older versions. In other words, if anyone refers to an older version's DOI in a publication or talk, that version should be made public.

Step 1: Before you begin, review the related pages and determine the type of DOI

  1. First read Is Data Commons Curated Data right for my data?
  2. Check out the Permanent Identifier FAQs.
  3. Determine whether you need an Embargoed Reserved DOI or a Versioned Reserved DOI, as described above.

Step 2: Organize the dataset in the CyVerse Data Store

There are several steps to properly organizing your dataset. These include determining what data to include, how many identifiers to request, how to organize the data into folders, and creating the ReadMe file and data inventory.

Step 2.1. Determine what to include

A data collection may be composed of multiple files and different datasets. In preparing your data for publication:

  1. Identify the data and other materials that you consider useful for validation and reuse of your research:
    • Data associated with a research project may include multiple files with different roles.
    • If there are components of your dataset that belong in a public repository such as NCBI (e.g., fastq files), submit them to the repository, rather than to CyVerse Curated Data.
  2. Beyond data, you will include the ReadMe file (see Step 2.5), and you may include scripts or links to scripts to run your analysis.

Step 2.2. Determine how many permanent identifiers to request

To determine how many DOIs to request for a given data collection, consider the following:

  • Think about its size and components.
  • How many studies or publications does it represent?
  • Is your data collection formed by different datasets and are those likely to be used separately?
  • Do you want to create a data collection with one DOI for the entire project and additional related DOIs for distinct datasets so that they are cited individually?

If you are uncertain about how many DOIs to request, contact us at doi@cyverse.org.

Step 2.3. Organize your data into folder(s)

  1. Organize your data so that there is one folder for each DOI (see CyVerse Curated Data folder-naming guidelines for naming conventions).
  2. Within a folder, include all files in your data package plus the ReadMe file and the inventory.
    • You may have subfolders within a data package.
    • You may include compressed files in a package, as described on the Permanent Identifier FAQs, but do not compress the entire folder/package.
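
The layout rules above can be checked before you submit. The following is a minimal sketch, not an official CyVerse validator: the function name `check_package` is hypothetical, and it assumes that the ReadMe file name starts with "readme", that the inventory file name contains "inventory", and that both sit at the top level of the package. Adjust these assumptions to match your own file names.

```python
from typing import Iterable, List

# Assumed archive extensions; extend as needed.
ARCHIVE_SUFFIXES = (".zip", ".tar", ".tar.gz", ".tgz")


def check_package(paths: Iterable[str]) -> List[str]:
    """Return a list of problems found in a package's relative file paths."""
    # Entries ending in "/" are treated as subfolders, which are allowed.
    files = [p for p in paths if not p.endswith("/")]
    # Assumption: the ReadMe and inventory live at the top level of the package.
    top_level = [p.lower() for p in files if "/" not in p]
    problems = []
    if not any(name.startswith("readme") for name in top_level):
        problems.append("no ReadMe file found")
    if not any("inventory" in name for name in top_level):
        problems.append("no inventory file found")
    # A package made of a single archive suggests the whole folder was compressed.
    if len(files) == 1 and files[0].lower().endswith(ARCHIVE_SUFFIXES):
        problems.append("entire package appears to be compressed")
    return problems


# Hypothetical package listing, relative to the dataset folder.
print(check_package(["readme.txt", "inventory.csv", "data/", "data/images.zip"]))
```

An empty list means none of the three checks failed. Compressed files inside the package, such as `data/images.zip` here, are accepted.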

Step 2.4. Name your top-level folder according to the guidelines

The folder containing your dataset should be named using the $Creator_$subject_$date format.

For more details on folder naming, see the CyVerse Curated Data Folder-Naming Guidelines.
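
As a minimal sketch, the three-part pattern can be assembled programmatically. The helper `build_folder_name` is hypothetical. Two choices in it are assumptions rather than rules from the guidelines: it strips whitespace and punctuation from each part, and it takes the date as a string you have already formatted. The Folder-Naming Guidelines remain the authority on what each part should contain.

```python
import re


def build_folder_name(creator: str, subject: str, date: str) -> str:
    """Join the three parts as Creator_subject_date (hypothetical helper)."""
    parts = []
    for label, value in (("creator", creator), ("subject", subject), ("date", date)):
        # Assumption: keep only letters, digits, and hyphens inside each part,
        # so that underscores appear only as separators between parts.
        cleaned = re.sub(r"[^A-Za-z0-9-]", "", value)
        if not cleaned:
            raise ValueError(f"{label} is empty after cleaning: {value!r}")
        parts.append(cleaned)
    return "_".join(parts)


# Example values are placeholders, not a real dataset.
print(build_folder_name("Smith", "maize roots", "2018"))  # Smith_maizeroots_2018
```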

Step 2.5. Create a ReadMe file

Create a text file labeled "readMe" with the following information:

Step 2.6. Create an inventory

Step 2.7. Supporting documents on data management and organization

Here is a useful guide to data organization: Research Data Management: File Organization (PDF).

Step 3: Add metadata to your folder

  1. In the Data window, click the checkbox next to the folder, then select Metadata > Edit / View Metadata.
    1. Alternatively, you can choose Edit / View Metadata from the three dot menu next to the file.
  2. Click on + Select Template and choose the DOI Request / Datacite metadata template.
  3. Complete the required fields (marked with an asterisk) and as many of the optional fields as possible.
  4. Save the template. For more information, including how to apply metadata in bulk, see Using Metadata in the DE.
  5. For a Versioned Reserved DOI, you must include a version number in the metadata. We recommend using 1, 2, 3, ... for versioning, but you may use another system.
  6. You may add any additional metadata that is appropriate. We encourage the use of additional metadata to make your data better understood and more discoverable.
  7. We also encourage the use of metadata on subfolders and individual files in your datasets.
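
If you keep track of versions outside the DE, the recommended 1, 2, 3, ... scheme is easy to automate. The sketch below is illustrative only. The function names are hypothetical, and the attribute name `version` is an assumption, so use whatever name the metadata template actually shows.

```python
from typing import List, Mapping


def next_version(existing_versions: List[str]) -> str:
    """Return the next number in the recommended 1, 2, 3, ... scheme."""
    if not existing_versions:
        return "1"
    numbers = []
    for value in existing_versions:
        text = value.strip()
        if not text.isdecimal():
            raise ValueError(f"not an integer version: {value!r}")
        numbers.append(int(text))
    return str(max(numbers) + 1)


def has_version(metadata: Mapping[str, str], key: str = "version") -> bool:
    """True when the metadata carries a non-empty version value.

    The default key name is an assumption, not the template's field name.
    """
    return bool(metadata.get(key, "").strip())


print(next_version(["1", "2"]))  # 3
```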

Step 4: For Versioned Reserved DOI only: Copy data if you will modify your dataset before public release

  1. If the final published dataset is likely to be different from the original version, follow these steps. 
  2. To maintain a record of the original dataset, make a copy of the complete dataset (the folder and all its contents). Copy the metadata from the original dataset to the duplicate. NOTE: You must use iCommands to copy data (use icp -r). This step cannot be done in the DE. 

  3. One of the copies should remain in your private directory. This is the copy that can be edited before release. You may rename this folder if you want.

  4. Use the other copy for the remaining steps.
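
The recursive copy can also be scripted. The following sketch only builds the `icp -r` command line and prints it; it does not run anything. The function name and both Data Store paths are hypothetical placeholders. To execute the command, you would pass the list to `subprocess.run` on a machine where iCommands is installed and you are already logged in. The sketch does not copy metadata, so that part of the step still has to be done separately.

```python
import shlex
from typing import List


def icp_copy_command(source: str, destination: str) -> List[str]:
    """Build the argument list for a recursive iCommands copy (icp -r)."""
    if source.rstrip("/") == destination.rstrip("/"):
        raise ValueError("source and destination must differ")
    return ["icp", "-r", source, destination]


# Hypothetical Data Store paths; replace them with your own.
cmd = icp_copy_command(
    "/iplant/home/username/Smith_maize_2018",
    "/iplant/home/username/Smith_maize_2018_working",
)
print(shlex.join(cmd))
```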

Step 5: Submit the request for the DOI

  1. In the Data window, click the checkbox next to the folder.
  2. Select Metadata > Request DOI.

  3. After confirming that you have read the manual (this page), click I need a DOI. You will receive an email confirming that your request has been received, and a notification will appear in the Notifications list in the DE (Notifications at the top right of the screen).

Step 6: Contact the DOI curators

  1. At this point, if you do not contact the curators, your dataset will be made public!
  2. Send an email to doi@cyverse.org asking for a reserved DOI. Include:
    1. the name of the directory (folder)
    2. whether your dataset needs an embargoed reserved DOI or a versioned reserved DOI
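
To make sure the request contains both required items, you can draft the email from a small template. This is a sketch only. The function name, the subject line, and the body wording are assumptions, and the code builds the message without sending it.

```python
from email.message import EmailMessage


def reserved_doi_request(folder: str, doi_type: str, sender: str) -> EmailMessage:
    """Draft (but do not send) the reserved DOI request email."""
    if doi_type not in ("embargoed", "versioned"):
        raise ValueError("doi_type must be 'embargoed' or 'versioned'")
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "doi@cyverse.org"
    msg["Subject"] = "Reserved DOI request"  # assumed wording
    msg.set_content(
        "Please reserve the DOI for the following dataset.\n"
        f"Folder: {folder}\n"
        f"Type: {doi_type} reserved DOI\n"
    )
    return msg


# Placeholder folder name and sender address.
draft = reserved_doi_request("Smith_maize_2018", "embargoed", "you@example.org")
print(draft)
```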


Step 7: Wait for CyVerse validation checks

After you submit your request, a CyVerse Curated Data curator validates your dataset, its metadata, and its overall organization. Validation is based solely on the required DOI metadata, the folder-naming conventions, and the dataset's potential utility to the CyVerse and larger scientific community, not on the quality of your data.

  • If the curator determines that the dataset is adequately organized and the DataCite metadata are accurate, they will provide a DOI, and you will be notified of the DOI and the location of its corresponding landing page in the Community Data > commons_repo > curated folder in the DE.
  • If the curator determines that minor changes are needed (e.g. typos in metadata or the readme file), they may make those changes themselves.
  • If the curator determines that substantive changes are needed, they will contact you with required changes.
  • If the curator determines that your dataset is not appropriate for the Curated Data site, you will be notified.

To check the status of your request, click Notifications at the top right of the DE screen. For more information on using notifications in the DE, see Viewing and Deleting Notifications.

NOTE

Because the dataset is reserved, a curator will change the permissions on the folder so that it is not visible to registered users or the public. The dataset will not be visible at http://datacommons.cyverse.org/. You will be given permission to view the data, which you can share with others.


Step 8: Share the reserved data with collaborators

  1. Following the normal procedure, upon creation of the DOI, the dataset will move to the Community Data > commons_repo > curated folder in the DE and be visible at datacommons.cyverse.org/commons_repo/curated.
  2. Within an hour, the curator will remove public permission from this dataset and give you read permission. If you need to authorize others to read the dataset, send the curator the CyVerse usernames of your collaborators or reviewers.
  3. The curator will retain ownership permission of the data, in case any changes are needed. 

NOTE

The DOI indicates that the dataset is stable, so it must not change. If you need to make corrections or changes to the metadata, please ask the curator at doi@cyverse.org to do so. This is important, because the metadata must be updated in multiple locations.

If you need to make changes to the dataset, but don't feel a new version is warranted, please email doi@cyverse.org.

Step 9: For Versioned Reserved DOI only: Edit the original dataset and create new versions

  1. Follow this step only if you plan to make changes to the original dataset and release a new version.
  2. When collaborators suggest changes to the dataset, make the changes in the original copy in your private data directory.
  3. When you want to release a new stable version of the dataset, follow the steps above. Be sure to include version numbers in the metadata.
  4. The curator will add links among different versions of the dataset.

TIP

You should release a new stable version of the dataset any time changes will be cited in a presentation or publication. The author should cite the DOI of the dataset that includes their changes.

Step 10A: For Versioned Reserved DOI only: Publish your final dataset

  1. When no more changes are needed to the dataset and it is ready for public release, follow the steps on Requesting a Permanent Identifier in the Data Commons for the final dataset.
  2. Be sure to include the appropriate version number in the metadata.
  3. Upon publication, the curator will:
    • Create a DOI for the final dataset and make it public.
    • Verify that all versions of the dataset are properly linked.
    • Make all older versions of the dataset public.

Step 10B: For Embargoed Reserved DOI only: Publish your final dataset

  1. Contact the curator at doi@cyverse.org to ask for the embargo to be removed from your dataset.
  2. Upon publication, the curator will make the dataset public. It will then be visible to anyone at http://datacommons.cyverse.org/ and to any logged-in CyVerse user in the Discovery Environment.