Mediation analysis using incomplete information from publicly available data sources
Journal Article


Authors: Derkach, A.; Kantor, E. D.; Sampson, J. N.; Pfeiffer, R. M.
Article Title: Mediation analysis using incomplete information from publicly available data sources
Abstract: Our work was motivated by the question of whether, and to what extent, well-established risk factors mediate the racial disparity observed for colorectal cancer (CRC) incidence in the United States. Mediation analysis examines the relationships between an exposure, a mediator and an outcome. All available methods require access to a single complete data set with these three variables. However, because population-based studies usually include few non-White participants, these approaches have limited utility in answering our motivating question. Recently, we developed novel methods to integrate several data sets with incomplete information for mediation analysis. These methods have two limitations: (i) they only consider a single mediator and (ii) they require a data set containing individual-level data on the mediator and exposure (and possibly confounders) obtained by independent and identically distributed sampling from the target population. Here, we propose a new method for mediation analysis with several different data sets that accommodates complex survey and registry data, and allows for multiple mediators. The proposed approach yields unbiased causal effect estimates and confidence intervals with nominal coverage in simulations. We apply our method to data from U.S. cancer registries, a U.S.-population-representative survey and summary-level odds ratio estimates to rigorously evaluate what proportion of the difference in CRC risk between non-Hispanic Whites and Blacks is mediated by three potentially modifiable risk factors (CRC screening history, body mass index, and regular aspirin use). © 2024 John Wiley & Sons Ltd.
Keywords: adult; controlled study; case control study; united states; sensitivity analysis; colorectal cancer; incidence; prevalence; risk factors; cancer screening; risk factor; colorectal neoplasms; simulation; questionnaire; body mass; register; registries; acetylsalicylic acid; colonoscopy; colorectal tumor; computer simulation; epidemiology; observational study; computer model; logistic regression analysis; health status disparities; sigmoidoscopy; ethnicity; african american; health disparity; caucasian; hispanic; ethnology; aspirin; registry data; humans; human; male; female; article; information sources; data integration; mediation analysis; black or african american; white people; direct and indirect effects; summary level information; survey sampling; data source; information source
Journal Title: Statistics in Medicine
Volume: 43
Issue: 14
ISSN: 0277-6715
Publisher: John Wiley & Sons  
Date Published: 2024-06-30
Start Page: 2695
End Page: 2712
Language: English
DOI: 10.1002/sim.10076
PUBMED: 38606437
PROVIDER: scopus
PMCID: PMC12093256
Notes: The MSK Cancer Center Support Grant (P30 CA008748) is acknowledged in the PubMed record and PDF. The corresponding MSK author is Andriy Derkach. Source: Scopus
MSK Authors
  1. Elizabeth David Kantor
  2. Andriy Derkach