Structural Models and Sequence Alignment Results of the Desulfovibrio vulgaris Proteome
- Davidson, Russell B | Oak Ridge National Laboratory
- Coletti, Mark | Oak Ridge National Laboratory
- Gao, Mu | Georgia Tech
- Sedova, Ada | Oak Ridge National Laboratory
Overview
Description
This dataset contains structural models for the primary transcripts of the Desulfovibrio vulgaris proteome, as well as sequence alignment results for a subset of the encoded proteins. For each protein, the five models inferred by AlphaFold 2 are provided. The model with the highest pTM score for each protein was energy minimized; this minimized structure and its AlphaFold pickle output file are also provided. This set of structures represents an alternate source of models for the D. vulgaris proteome to those available in the AlphaFold Protein Structure Database (AFDB). Comparing the two sources is complicated by the fact that the proteins reported in the AFDB originate from an outdated version of the D. vulgaris sequence. The different versions of the D. vulgaris gene annotation are collected in the Chronology subdirectory; further study of the effect of these changes on the structural space of the proteome is currently underway. For proteins that have been annotated as hypothetical, sequence alignment results from the HHblits and SAdLSA alignment methods are provided. These methods are often better at resolving sequence homology than other methods, so the results from both are provided to identify possible homologs for these challenging proteins. Numerous sequence databases are used for these alignments.
References AlphaFold v2 Multimer: https://doi.org/10.1101/2021.10.04.463034.
References HHblits: https://doi.org/10.1186/s12859-019-3019-7.
References SAdLSA: https://doi.org/10.3389/fbinf.2021.689960.
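The selection of the highest pTM-scoring model described above could be reproduced from the per-model pickle files along the following lines. This is a minimal sketch, not the authors' workflow. It assumes each pickle holds a dictionary with a "ptm" entry, as AlphaFold writes for its pTM and multimer models. The file pattern result_model_*.pkl and the function names are illustrative assumptions and may not match the layout of this dataset. Unpickling real AlphaFold output may also require NumPy (and possibly JAX) to be installed.

```python
import pickle
from pathlib import Path


def load_ptm(pickle_path):
    """Return the pTM score stored in one AlphaFold result pickle.

    Assumes the pickle holds a dict with a "ptm" entry.
    """
    with open(pickle_path, "rb") as handle:
        result = pickle.load(handle)
    return float(result["ptm"])


def best_model_by_ptm(ptm_scores):
    """Return the (name, score) pair with the highest pTM score."""
    if not ptm_scores:
        raise ValueError("no pTM scores supplied")
    name = max(ptm_scores, key=ptm_scores.get)
    return name, ptm_scores[name]


def rank_models(directory, pattern="result_model_*.pkl"):
    """Score every pickle matching `pattern` (a hypothetical naming scheme)
    in `directory` and return the best (file name, pTM) pair."""
    scores = {
        path.name: load_ptm(path)
        for path in sorted(Path(directory).glob(pattern))
    }
    return best_model_by_ptm(scores)
```

Calling rank_models on a directory holding the five pickles for one protein returns the file name and pTM score of the top model. Only pickle files from a trusted source should be loaded, since unpickling can execute arbitrary code.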
Funding resources
DOE contract number
ERKPA05, ERKP917
Originating research organization
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring organization
Office of Science (SC); Office of Science (SC), Biological and Environmental Research (BER) (SC-23)
Related resources
- References (DOI): https://doi.org/10.1101/2021.10.04.463034
- References (DOI): https://doi.org/10.1186/s12859-019-3019-7
- References (DOI): https://doi.org/10.3389/fbinf.2021.689960
Details
DOI
10.13139/ORNLNCCS/1988139
Release date
July 12, 2023
Dataset
Dataset type
ND Numeric Data
Software
Python
Acknowledgements
Users should acknowledge the OLCF in all publications and presentations that speak to work performed on OLCF resources:
This work was carried out [in part] at Oak Ridge National Laboratory, managed by UT-Battelle, LLC for the U.S. Department of Energy under contract DE-AC05-00OR22725.
Category
- 59 BASIC BIOLOGICAL SCIENCES
- 74 ATOMIC AND MOLECULAR PHYSICS
- 97 MATHEMATICS AND COMPUTING
Keywords
- Protein Structure Prediction
- Protein Sequence Alignment
- Functional Annotation