Direct-coupling analysis of residue coevolution captures native contacts across many protein families
Journal Article


Authors: Morcos, F.; Pagnani, A.; Lunt, B.; Bertolino, A.; Marks, D. S.; Sander, C.; Zecchina, R.; Onuchic, J. N.; Hwa, T.; Weigt, M.
Article Title: Direct-coupling analysis of residue coevolution captures native contacts across many protein families
Abstract: The similarity in the three-dimensional structures of homologous proteins imposes strong constraints on their sequence variability. It has long been suggested that the resulting correlations among amino acid compositions at different sequence positions can be exploited to infer spatial contacts within the tertiary protein structure. Crucial to this inference is the ability to disentangle direct and indirect correlations, as accomplished by the recently introduced direct-coupling analysis (DCA). Here we develop a computationally efficient implementation of DCA, which allows us to evaluate the accuracy of contact prediction by DCA for a large number of protein domains, based purely on sequence information. DCA is shown to yield a large number of correctly predicted contacts, recapitulating the global structure of the contact map for the majority of the protein domains examined. Furthermore, our analysis captures clear signals beyond intradomain residue contacts, arising, e.g., from alternative protein conformations, ligand-mediated residue couplings, and interdomain interactions in protein oligomers. Our findings suggest that contacts predicted by DCA can be used as a reliable guide to facilitate computational predictions of alternative protein conformations, protein complex formation, and even the de novo prediction of protein domain structures, contingent on the existence of a large number of homologous sequences which are being rapidly made available due to advances in genome sequencing.
Keywords: controlled study; dna binding protein; binding affinity; protein conformation; protein domain; proteins; accuracy; reproducibility of results; computational biology; protein binding; algorithms; rna; dna; amino acid sequence; protein multimerization; sequence alignment; clinical evaluation; models, molecular; binding sites; amino acids; dna binding; protein interaction mapping; predictive value; cross coupling reaction; statistical sequence analysis; residue-residue covariation; maximum-entropy modeling; contact map prediction; direct coupling analysis
Journal Title: Proceedings of the National Academy of Sciences of the United States of America
Volume: 108
Issue: 49
ISSN: 0027-8424
Publisher: National Academy of Sciences  
Date Published: 2011-12-06
Start Page: E1293
End Page: E1301
Language: English
DOI: 10.1073/pnas.1111471108
PROVIDER: scopus
PMCID: PMC3241805
PUBMED: 22106262
Notes: Export Date: 1 February 2012; CODEN: PNASA; Source: Scopus
MSK Authors
  1. Chris Sander