April 2020 Darshan counters from the Summit supercomputer

10.13139/OLCF/1865904

This dataset contains the Darshan counters collected from the Summit supercomputer during April 2020.

1. Description of methods used for collection/generation of data: When a job submitted to the Summit HPC system completes successfully and has made I/O calls (captured by the Darshan tool), a Darshan log file is written to the Alpine file system. One job can contain multiple `jsrun` commands, and Darshan generates a separate log for each `jsrun` command, so a job can have one or more Darshan logs associated with it.

2. Methods for processing the data: We first use the `darshan-util` tool to parse the Darshan logs. We then restructure the logs and merge the data from multiple Darshan logs that belong to the same Summit job.
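The merging step described above can be illustrated with a minimal pandas sketch. It is not the authors' pipeline: the table layout, the `job_id` column, the `n_logs` column, and the helper `merge_logs_per_job` are all hypothetical, and summing is only an assumption that suits additive counters such as byte counts (timestamps or maxima would need a different aggregation).

```python
import pandas as pd

# Hypothetical parsed records: one row per Darshan log (i.e. per jsrun command).
# The layout and the job_id column are illustrative, not the dataset's actual schema.
logs = pd.DataFrame(
    {
        "job_id": [101, 101, 102],
        "POSIX_BYTES_READ": [1000, 500, 200],
        "POSIX_BYTES_WRITTEN": [0, 300, 50],
    }
)


def merge_logs_per_job(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse all logs of a job into one row by summing additive counters."""
    merged = df.groupby("job_id", as_index=False).sum()
    # Record how many logs (jsrun commands) contributed to each job.
    # Both groupby calls sort by job_id, so the rows line up.
    merged["n_logs"] = df.groupby("job_id").size().to_numpy()
    return merged


per_job = merge_logs_per_job(logs)
print(per_job)
```

Job 101 has two logs in this toy input, so its counters are summed into a single row, while job 102 passes through unchanged.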

Published: 2022-05-03 18:37:36

Dataset Properties

Field Value
Authors
  • Karimi, Ahmad Maroof (Oak Ridge National Laboratory)
  • Xie, Bing (Oak Ridge National Laboratory)
  • Paul, Arnab K. (Oak Ridge National Laboratory)
  • Oral, Sarp (Oak Ridge National Laboratory)
  • Wang, Feiyi (Oak Ridge National Laboratory)
Project Identifier STF008
Dataset Type ND Numeric Data
Subjects
  • 97 MATHEMATICS AND COMPUTING
Keywords
  • Supercomputer I/O subsystem
  • Summit supercomputer
  • Access Patterns
  • Darshan log
Software Needed CSV files can be read in several ways; we recommend Python with the pandas library.
Originating Organizations Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organizations Office of Science (SC)
DOE Contract DE-AC05-00OR22725
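
The Software Needed field recommends pandas for reading the CSV files. A minimal sketch of that workflow follows; the inline CSV text stands in for a file from the dataset, and its column names and values are hypothetical rather than taken from the actual files.

```python
import io

import pandas as pd

# Stand-in for a CSV file from the dataset. The contents and column names
# are hypothetical; replace the StringIO object with a real file path.
sample_csv = io.StringIO(
    "job_id,POSIX_BYTES_READ,POSIX_BYTES_WRITTEN\n"
    "101,1500,300\n"
    "102,200,50\n"
)

df = pd.read_csv(sample_csv)

# Inspect the column types, then compute a simple aggregate.
print(df.dtypes)
total_read = int(df["POSIX_BYTES_READ"].sum())
print(total_read)
```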

Acknowledgements

Papers using this dataset are requested to include the following text in their acknowledgements:

Support for 10.13139/OLCF/1865904 is provided by the U.S. Department of Energy, project STF008 under Contract DE-AC05-00OR22725. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility.