Affinity regression predicts the recognition code of nucleic acid-binding proteins Journal Article


Authors: Pelossof, R.; Singh, I.; Yang, J. L.; Weirauch, M. T.; Hughes, T. R.; Leslie, C. S.
Article Title: Affinity regression predicts the recognition code of nucleic acid-binding proteins
Abstract: Predicting the affinity profiles of nucleic acid-binding proteins directly from the protein sequence is a challenging problem. We present a statistical approach for learning the recognition code of a family of transcription factors or RNA-binding proteins (RBPs) from high-throughput binding data. Our method, called affinity regression, trains on protein binding microarray (PBM) or RNAcompete data to learn an interaction model between proteins and nucleic acids using only protein domain and probe sequences as inputs. When trained on mouse homeodomain PBM profiles, our model correctly identifies residues that confer DNA-binding specificity and accurately predicts binding motifs for an independent set of divergent homeodomains. Similarly, when trained on RNAcompete profiles for diverse RBPs, our model correctly predicts the binding affinities of held-out proteins and identifies key RNA-binding residues, despite the high level of sequence divergence across RBPs. We expect that the method will be broadly applicable to modeling and predicting paired macromolecular interactions in settings where high-throughput affinity data are available. © 2015 Nature America, Inc.
Keywords: binding affinity; protein domain; protein motif; proteins; protein dna binding; protein binding; transcription factor; high throughput screening; rna binding protein; rna; statistical analysis; amino acid sequence; microarray analysis; molecular recognition; nucleic acids; binding affinities; binding energy; homeodomain protein; dna binding; biochemistry; biomolecules; protein microarray; dna binding motif; throughput; protein nucleic acid interaction; protein rna binding; nucleic acid binding protein; rna-binding protein; priority journal; article; dna-binding specificity; acid-binding proteins; interaction model; rna-binding residues; sequence divergences; statistical approach; affinity regression
Journal Title: Nature Biotechnology
Volume: 33
Issue: 12
ISSN: 1087-0156
Publisher: Nature Publishing Group
Date Published: 2015-12-01
Start Page: 1242
End Page: 1249
Language: English
DOI: 10.1038/nbt.3343
PROVIDER: scopus
PUBMED: 26571099
PMCID: PMC4871164
Notes: Article -- Export Date: 7 January 2016 -- 1242 -- Source: Scopus
MSK Authors
  1. Christina Leslie
    106 Leslie
  2. Irtisha Singh
    8 Singh
  3. Li Yang
    3 Yang