SARS-CoV2 Docking Dataset for MLMol Language Model (50M)

  • Tsaris, Aristeidis | Oak Ridge National Laboratory
  • Gounley, John | Oak Ridge National Laboratory
  • Blanchard, Andrew | Oak Ridge National Laboratory
Overview

Description

This is a processed molecular dataset derived from https://doi.ccs.ornl.gov/ui/doi/348, totaling 50M molecules for training and 486K molecules for validation. Instructions for using the dataset and training with it can be found at https://code.ornl.gov/candle/mlmol

Funding resources

DOE contract number

DE-AC05-00OR22725

Originating research organization

Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)

Sponsoring organization

Office of Science (SC)

Details

DOI

10.13139/ORNLNCCS/1868526

Release date

May 20, 2022

Dataset

Dataset type

ND Numeric Data

Acknowledgements

Users should acknowledge the OLCF in all publications and presentations that speak to work performed on OLCF resources:

This work was carried out [in part] at Oak Ridge National Laboratory, managed by UT-Battelle, LLC for the U.S. Department of Energy under contract DE-AC05-00OR22725.