
Data Depositor Guide

Welcome! Constellation is a research data repository service of the Oak Ridge Leadership Computing Facility (OLCF), available to OLCF users as well as Oak Ridge National Laboratory (ORNL) staff. Datasets published in Constellation are assigned a Digital Object Identifier (DOI) by the DOE Office of Scientific and Technical Information (OSTI) and reported for inclusion in the DOE Data Explorer. Constellation can assist projects in meeting the objectives of the 2023 DOE Public Access Plan for federally funded research.

 

Data submitted to this repository may be the results of research:

1) conducted using a project allocation on OLCF resources including Frontier, Summit, Slate, Andes, etc., or

2) funded wholly or in part by the US Department of Energy under its contract with UT-Battelle, LLC (DE-AC05-00OR22725) for the management of ORNL.

For questions and help with data deposits, email doi_support@ornl.gov.

 

Steps to publish a dataset in Constellation

There are seven steps involved in submitting and publishing your dataset on Constellation:

  1. Obtain a Globus identity
  2. Create your Constellation user account
  3. Reserve a DOI for your dataset
  4. Add metadata and create a README file
  5. Add data files and submit your dataset for review
  6. Address any questions or concerns about your dataset received from ORNL curators
  7. Receive notification that your dataset has been published

See "Detailed submission instructions" below for more information on each step.

Staff members of Oak Ridge National Laboratory should also follow relevant SBMS procedures to Review and Release Scientific and Technical Information. A submission workflow for datasets is available in RESolution, where submitters will be asked to provide the DOI that has been reserved in Constellation. Datasets should be approved by a Releasing Official prior to final Constellation publication.

 

Detailed submission instructions

1. Obtain a Globus identity

a. Go to https://www.globus.org/get-started and click 'Log In.' Most non-profit organizations, including national labs, support login with institutional credentials.

b. If your institution is not in the dropdown list, create a personal 'Globus ID' at https://www.globusid.org/create. You will be asked to verify your email address. Then return to https://www.globus.org/get-started and log in.

c. After logging in, your Globus identity can be found under Settings > Account > Identity. Your primary identity is the first field displayed to the right of the crown icon and may be in the form of an email address or [globus_username]@globusid.org.

[Image: where to find your Globus identity]

 

2. Create your Constellation user account

a. Log in to the Constellation website using your ORNL UCAMS/XCAMS or OLCF credentials.

b. Enter your Globus primary identity and review and accept the User Agreement.

c. You will be taken to your account dashboard.

 

3. Reserve a DOI for your dataset

a. Start the process of adding your dataset by clicking on the "Reserve new DOI" button in your user dashboard.

b. Enter a draft title for your dataset.

c. Indicate whether the data you are submitting was created using resources of the Oak Ridge Leadership Computing Facility (OLCF). OLCF resources include Frontier, Summit, Slate (Onyx, Marble), Andes, etc.

d. Click Save. Your dataset and reserved DOI will now be visible in your dashboard under "Draft datasets."

NOTE: This step does not issue a DOI. Instead, it creates a request record with OSTI that is finalized when the dataset is published. Until you submit a dataset for review, you can update the DOI metadata and data files as many times as needed.

 

4. Add metadata and create a README file

a. In your user dashboard, find your dataset in the "Draft datasets" table and click the "add" button under "Metadata." This will open a form that enables entry of metadata associated with the new DOI. The fields of this form are described below:


Title (required) Add a short descriptive title that will help users understand what the dataset contains. Titles may not exceed 255 characters.

Author Information (required) You can add multiple dataset authors, and you must have at least one author. Fill out all required fields for each author including: First Name, Last Name, Affiliation, and E-mail.

Sponsoring Organizations (required) Enter the name of the organization that sponsored (provided funding for) the dataset.

Originating Organizations (required) Enter the name of the organization that performed the research or issued the dataset.

Other Contributing Organizations (optional) Add any organizations that contributed to the dataset through significant review, site management, data collection, etc.

Primary DOE Contract Number(s) (required) Enter the DOE contract number(s) under which the work was funded. Separate multiple numbers with a semicolon and a space. This field is validated against the OSTI contract authority. Contract numbers that fail this validation, as well as non-DOE contract numbers, should instead be entered in the 'Other Contract Numbers' field.

OLCF Project Identifier or Title (required if OLCF resources used) Enter the Oak Ridge Leadership Computing Facility project identifier assigned to the research underlying the dataset. Project identifiers are usually six alphanumeric characters, such as 'ABC001'. If the project did not use an OLCF resource, leave this field blank.

Other Contract Number(s) (optional) Enter other funding identifiers that do not fit elsewhere in the form, including ORNL funding and non-DOE grants. Multiple contract numbers may be entered in this field, separated with a semicolon and a space.

Other Identifying Numbers (optional) Add any other identifying numbers, such as a product number, that do not fit anywhere else in this form.

Dataset Type (required) Select the data's main or most important content type.

Topics (optional) Select the main subject categories of the dataset from a list provided by OSTI. Multiple topics may be selected.

Keywords (optional) Enter a few terms that describe the most important content of the dataset and help users discover the data.

Description (required) Add a narrative description of the dataset being published, similar to an abstract. Descriptions should not exceed 300 words.

Software Needed (optional) Enter the name of any software needed to access the dataset contents.

Related Identifiers (optional) The DOI being created may be related to other DOIs or web content. Select whether the related identifier is a DOI or URL, enter the identifier link, and select a phrase that communicates the relationship of the Constellation dataset to the related item.
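
The field constraints described above can be checked locally before the form is filled in. The following is a minimal Python sketch and is not part of Constellation: the dictionary keys (title, description, contract_numbers, olcf_project_id) and the function names are hypothetical, and the project-identifier pattern only encodes the "usually six alphanumeric" hint from the field description. The authoritative validation is whatever the Constellation form and OSTI perform.

```python
import re

MAX_TITLE_CHARS = 255        # "Titles may not exceed 255 characters"
MAX_DESCRIPTION_WORDS = 300  # "Descriptions should not exceed 300 words"
# Assumption: "usually six alphanumeric" is read as exactly six letters or digits.
PROJECT_ID_PATTERN = re.compile(r"[A-Za-z0-9]{6}")


def split_contract_numbers(field: str) -> list[str]:
    """Split a contract-number field on the documented '; ' separator."""
    return [part.strip() for part in field.split("; ") if part.strip()]


def check_metadata(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []

    title = record.get("title", "").strip()
    if not title:
        problems.append("Title is required.")
    elif len(title) > MAX_TITLE_CHARS:
        problems.append(f"Title has {len(title)} characters (limit {MAX_TITLE_CHARS}).")

    description = record.get("description", "").strip()
    word_count = len(description.split())
    if not description:
        problems.append("Description is required.")
    elif word_count > MAX_DESCRIPTION_WORDS:
        problems.append(f"Description has {word_count} words (limit {MAX_DESCRIPTION_WORDS}).")

    if not split_contract_numbers(record.get("contract_numbers", "")):
        problems.append("At least one primary DOE contract number is required.")

    # The field also accepts a project title, so a mismatch is only a hint.
    project_id = record.get("olcf_project_id", "").strip()
    if project_id and not PROJECT_ID_PATTERN.fullmatch(project_id):
        problems.append(f"{project_id!r} does not look like a six-character project identifier.")

    return problems
```

Calling check_metadata on a draft record returns a list of messages to address; the check covers only the limits stated in this guide and does not verify contract numbers against the OSTI contract authority.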

 

b. When all required fields are complete, click the “Save” button to save the DOI request as a draft. If any required field is omitted, you will see a “Data Entry Error” alert indicating the error. To resolve the issue, enter the missing information in the form and click the "Save" button again.

c. A README file is recommended for inclusion with all data deposits. A README provides additional context and citation information for your dataset and should help future researchers understand and reuse your data.

Download Constellation's README template:

 

5. Add data files and submit your dataset for review

a. You can access the Globus endpoint for a DOI reservation by clicking the "upload" link in your user dashboard table of draft datasets. This will open the Globus web interface to your assigned directory in the "OLCF DOI-UPLOADS" collection.

Files may be transferred from another Globus collection (e.g., an OLCF storage system) or uploaded from your local machine. To add data files over 1 GB in size from your laptop or desktop, install Globus Connect Personal to create your own Globus collection.

b. Add your README file or other dataset documentation to the same directory.

c. To send your dataset for curator review and publication, return to your Constellation dashboard. Open the metadata form, change the dropdown status at the bottom of the page to "Needs Approval" and click "Save."
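
To decide ahead of time whether a local upload will need Globus Connect Personal, the files in a dataset directory can be compared against the 1 GB threshold mentioned in step 5a. The sketch below uses only the Python standard library. The function name files_over_limit is hypothetical, and reading "1 GB" as 10^9 bytes is an assumption, since this guide does not define the unit.

```python
from pathlib import Path

ONE_GB = 10**9  # Assumption: decimal gigabyte; the guide does not define the unit.


def files_over_limit(directory: str, limit_bytes: int = ONE_GB) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for each file larger than limit_bytes."""
    root = Path(directory)
    large = []
    # Walk the directory tree in a stable order so the report is reproducible.
    for path in sorted(root.rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if size > limit_bytes:
                large.append((str(path.relative_to(root)), size))
    return large
```

For example, files_over_limit("my_dataset") lists the files in a local my_dataset directory (a placeholder name) that would need to go through a Globus Connect Personal collection rather than a browser upload.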

 

6. Address any questions or concerns about your dataset received from ORNL curators

a. Datasets that are submitted using the OLCF Constellation Portal are reviewed prior to publication by a member of the Constellation Data Curation team. Curators make sure data does not contain personally identifiable information (PII), check metadata and documentation for completeness, and make suggestions to improve data discoverability and reuse. You will be contacted by email from doi_support@ornl.gov if the curator has any concerns or is requesting changes to the dataset.

b. Your account dashboard will display a list of datasets that are in the process of being uploaded, reviewed, and published. Datasets are organized based on their current status:

        i. Draft datasets (you are working on compiling the submission)

        ii. My datasets under review (dataset has been submitted for curator review and approval)

        iii. My approved datasets (dataset has been approved by a curator and is being moved to the data repository)

        iv. My published datasets (dataset is available for public download)
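
The four dashboard sections listed above form a simple forward progression, which can be sketched as a small Python enumeration. This is an illustration only and not Constellation's implementation: the names DatasetStage and next_stage are hypothetical, and the sketch models only the forward path, because this guide does not describe how a dataset is handled when a curator requests changes.

```python
from enum import Enum


class DatasetStage(Enum):
    """Dashboard sections from step 6b, in the order the guide lists them."""
    DRAFT = "Draft datasets"
    UNDER_REVIEW = "My datasets under review"
    APPROVED = "My approved datasets"
    PUBLISHED = "My published datasets"


# Enum members iterate in definition order, which matches the listed order.
STAGE_ORDER = list(DatasetStage)


def next_stage(stage: DatasetStage):
    """Return the stage that follows `stage`, or None once the dataset is published."""
    index = STAGE_ORDER.index(stage)
    return STAGE_ORDER[index + 1] if index + 1 < len(STAGE_ORDER) else None
```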
 

 

7. Receive notification that your dataset has been published

Your dataset’s status will change to “Published” once it has been approved by a curator and the data files have been moved to the ORNL data archive. The associated DOI will also be transmitted to OSTI for activation. With the “Published” status, your dataset is available for download by anyone. You will receive an email from Constellation staff to confirm that all publication steps are complete.