Structural Models and Sequence Alignment Results of the Rhodospirillum rubrum Proteome

  • Davidson, Russell B | Oak Ridge National Laboratory
  • Coletti, Mark | Oak Ridge National Laboratory
  • Gao, Mu | Georgia Tech
  • Sedova, Ada | Oak Ridge National Laboratory
Overview

Description

This dataset contains structural models for the primary transcripts of the Rhodospirillum rubrum proteome, as well as sequence alignment results for a subset of the encoded proteins. For each protein, the five models inferred by AlphaFold 2 are provided. The model with the highest pTM score for each protein was energy minimized; this minimized structure and its AlphaFold pickle output file are also provided. This set of structures represents an alternative source of models for the R. rubrum proteome to those available in the AlphaFold Protein Structure Database. For proteins that have been annotated as hypothetical, sequence alignment results from the HHblits and SAdLSA alignment methods are provided. These methods are often better able to resolve sequence homology than other methods, so the results from both are provided to help identify possible homologs for these challenging proteins. Numerous sequence databases are used for these alignments. References: AlphaFold v2 Multimer: https://doi.org/10.1101/2021.10.04.463034; HHblits: https://doi.org/10.1186/s12859-019-3019-7; SAdLSA: https://doi.org/10.3389/fbinf.2021.689960.
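
The selection of the highest-pTM model described above can be sketched in Python, the software listed for this dataset. This is a minimal illustration and not the authors' pipeline. It assumes each AlphaFold pickle holds a dict with a `ptm` entry, as AlphaFold 2 writes for its pTM-enabled models. The directory layout, the `*.pkl` file pattern, and the function names are hypothetical and are not taken from this dataset. Unpickling real AlphaFold output typically also requires NumPy to be installed, and pickle files should only be loaded from a trusted source.

```python
import pickle
from pathlib import Path


def load_ptm(pickle_path):
    """Return the pTM score stored in one AlphaFold result pickle.

    Assumption: the pickle holds a dict with a 'ptm' entry. The files
    in this dataset may be organized differently.
    """
    with open(pickle_path, "rb") as handle:
        result = pickle.load(handle)
    return float(result["ptm"])


def best_model_by_ptm(model_dir, pattern="*.pkl"):
    """Return (path, pTM) for the highest-scoring pickle in model_dir.

    'model_dir' and 'pattern' are hypothetical; adjust them to match
    the actual file names in the downloaded archive.
    """
    paths = sorted(Path(model_dir).glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {model_dir}")
    scores = {path: load_ptm(path) for path in paths}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Calling `best_model_by_ptm` on a directory that holds the five per-protein pickles would return the path of the top-ranked model and its score, which could then be matched against the energy-minimized structure provided for that protein.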

Funding resources

DOE contract number

ERKPA05, ERKP917

Originating research organization

Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)

Sponsoring organization

Office of Science (SC); Office of Science (SC), Biological and Environmental Research (BER) (SC-23)

Related resources

Details

DOI

10.13139/ORNLNCCS/1987876

Release date

July 7, 2023

Dataset

Dataset type

ND Numeric Data

Software

Python

Acknowledgements

Papers using this dataset are requested to include the following text in their acknowledgements:

Support for 10.13139/ORNLNCCS/1987876 is provided by the U.S. Department of Energy, project BIF135 under Contract ERKPA05, ERKP917. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility.

Category

  • 59 BASIC BIOLOGICAL SCIENCES
  • 74 ATOMIC AND MOLECULAR PHYSICS
  • 97 MATHEMATICS AND COMPUTING

Keywords

  • Protein Structure Prediction
  • Protein Sequence Alignment
  • Functional Annotation