REUNION: Transcription factor binding prediction and regulatory association inference from single-cell multi-omics data
Journal Article


Authors: Yang, Y.; Pe’er, D.
Article Title: REUNION: Transcription factor binding prediction and regulatory association inference from single-cell multi-omics data
Abstract: Motivation: Profiling of gene expression and chromatin accessibility by single-cell multi-omics approaches can help to systematically decipher how transcription factors (TFs) regulate target gene expression via cis-region interactions. However, integrating information from different modalities to discover regulatory associations is challenging, in part because motif scanning approaches miss many likely TF binding sites. Results: We develop REUNION, a framework for predicting genome-wide TF binding and cis-region-TF-gene “triplet” regulatory associations using single-cell multi-omics data. The first component of REUNION, Unify, utilizes information theory-inspired complementary score functions that incorporate TF expression, chromatin accessibility, and target gene expression to identify regulatory associations. The second component, Rediscover, takes Unify estimates as input for pseudo semi-supervised learning to predict TF binding in accessible genomic regions that may or may not include detected TF motifs. Rediscover leverages latent chromatin accessibility and sequence feature spaces of the genomic regions, without requiring chromatin immunoprecipitation data for model training. Applied to peripheral blood mononuclear cell data, REUNION outperforms alternative methods in TF binding prediction on average performance. In particular, it recovers missing region-TF associations from regions lacking detected motifs, which circumvents the reliance on motif scanning and facilitates discovery of novel associations involving potential co-binding transcriptional regulators. Newly identified region-TF associations, even in regions lacking a detected motif, improve the prediction of target gene expression in regulatory triplets, and are thus likely to genuinely participate in the regulation. © The Author(s) 2024. Published by Oxford University Press.
Keywords: metabolism; computational biology; protein binding; transcription factor; algorithms; transcription factors; mononuclear cell; leukocytes, mononuclear; algorithm; chromatin; genomics; binding site; binding sites; bioinformatics; software; single cell analysis; single-cell analysis; procedures; humans; human; multiomics
Journal Title: Bioinformatics
Volume: 40
Issue: Suppl. 1
ISSN: 1367-4803
Publisher: Oxford University Press  
Date Published: 2024-07-01
Start Page: i567
End Page: i575
Language: English
DOI: 10.1093/bioinformatics/btae234
PUBMED: 38940155
PROVIDER: scopus
PMCID: PMC11211829
Notes: The MSK Cancer Center Support Grant (P30 CA008748) is acknowledged in the PubMed record and PDF. The corresponding MSK author is Dana Pe’er. Source: Scopus
MSK Authors
  1. Dana Pe'er
  2. Yang Yang