Summit Darshan Archival Dataset

  • Karimi, Ahmad Maroof | Oak Ridge National Laboratory
  • Khan, Awais | Oak Ridge National Laboratory
  • Oral, Sarp | Oak Ridge National Laboratory
  • Zimmer, Christopher | Oak Ridge National Laboratory
Overview

Description

The Summit Darshan Archival Dataset contains 2021 Darshan log data from Summit for 25 applications, grouped by science domain. The data has been processed, and all proprietary fields are anonymized. The resulting data is converted into a tabular structure and saved in parquet file format. In this notebook, we demonstrate how to access the data.

Data Organization: The data is organized into two directories:

  • Darshan total (`darshan_total`): Lists the high-level aggregate values generated by running the `darshan-parser --total` command on `.darshan` files. There is one parquet file for each application. Note: the `uid` and `exe` fields are masked.
  • Darshan detail (`darshan_detail`): Contains detailed job-level log information extracted by running the `darshan-parser` command on the raw `.darshan` files. The data is arranged in a directory hierarchy of the form `year/month/day` (for example, `2021/12/07`). For instance, the data for `job_id` 3819766 of application `App11`, which was executed on `2021-12-07`, is found under the directory for that date. Note: the `uid` and `filename` fields are masked.
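As an illustration of the `year/month/day` layout described above, the following is a minimal Python sketch of how one job's records might be located and read. It is not the dataset's official access code. The helper names `detail_day_dir` and `load_job`, the local root directory, the assumption that the date hierarchy sits directly under `darshan_detail`, and the assumption that the job identifier is stored in a column named `job_id` are all hypothetical and should be checked against the downloaded files. Reading requires `pandas` with a parquet engine such as `pyarrow`.

```python
from datetime import date
from pathlib import Path


def detail_day_dir(root, day):
    """Return the assumed darshan_detail directory for one calendar day."""
    # Assumption: the year/month/day hierarchy sits directly under
    # darshan_detail. Adjust this if the download is laid out differently,
    # for example with a per-application level in between.
    return Path(root) / "darshan_detail" / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"


def load_job(root, day, job_id, job_column="job_id"):
    """Read all parquet files under the day's directory and keep one job."""
    # Imported lazily so the path helper works without pandas installed.
    import pandas as pd

    day_dir = detail_day_dir(root, day)
    files = sorted(day_dir.rglob("*.parquet"))
    if not files:
        raise FileNotFoundError(f"no parquet files under {day_dir}")

    table = pd.concat((pd.read_parquet(f) for f in files), ignore_index=True)

    # Assumption: the job identifier lives in a column named `job_id`.
    # Comparing as strings avoids a mismatch between integer and text storage.
    return table[table[job_column].astype(str) == str(job_id)]
```

With these assumptions, `load_job("path/to/dataset", date(2021, 12, 7), 3819766)` would return the rows for the job mentioned above, provided the files exist locally under the assumed layout.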

Funding resources

DOE contract number

DE-AC05-00OR22725

Originating research organization

Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)

Sponsoring organization

Oak Ridge Leadership Computing Facility, Oak Ridge National Laboratory

Details

DOI

10.13139/OLCF/2305496

Release date

February 15, 2024

Dataset

Dataset type

ND Numeric Data

Software

.parquet file reader

Acknowledgements

Users should acknowledge the OLCF in all publications and presentations that speak to work performed on OLCF resources:

This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Category

  • 97 MATHEMATICS AND COMPUTING

Keywords

  • Darshan
  • Summit storage system