SARS-CoV2 Docking Dataset for MLMol Language Model (50M)
- Tsaris, Aristeidis | Oak Ridge National Laboratory
- Gounley, John | Oak Ridge National Laboratory
- Blanchard, Andrew | Oak Ridge National Laboratory
Overview
Description
This is a processed molecular dataset derived from https://doi.ccs.ornl.gov/ui/doi/348, totaling 50M molecules for training and 486K molecules for validation. Instructions for using the dataset and training with it are available at https://code.ornl.gov/candle/mlmol
Funding resources
DOE contract number
DE-AC05-00OR22725
Originating research organization
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring organization
Office of Science (SC)
Details
DOI
10.13139/ORNLNCCS/1868526
Release date
May 20, 2022
Dataset
Dataset type
ND Numeric Data
Acknowledgements
Users should acknowledge the OLCF in all publications and presentations that speak to work performed on OLCF resources:
This work was carried out [in part] at Oak Ridge National Laboratory, managed by UT-Battelle, LLC for the U.S. Department of Energy under contract DE-AC05-00OR22725.