Multiomic integration of public oncology databases in Bioconductor (Journal Article)

Authors: Ramos, M.; Geistlinger, L.; Oh, S.; Schiffer, L.; Azhar, R.; Kodali, H.; de Bruijn, I.; Gao, J.; Carey, V. J.; Morgan, M.; Waldron, L.
Article Title: Multiomic integration of public oncology databases in Bioconductor
Abstract: PURPOSE Investigations of the molecular basis for the development, progression, and treatment of cancer increasingly use complementary genomic assays to gather multiomic data, but management and analysis of such data remain complex. The cBioPortal for cancer genomics currently provides multiomic data from > 260 public studies, including The Cancer Genome Atlas (TCGA) data sets, but integration of different data types remains challenging and error prone for computational methods and tools using these resources. Recent advances in data infrastructure within the Bioconductor project enable a novel and powerful approach to creating fully integrated representations of these multiomic, pan-cancer databases. METHODS We provide a set of R/Bioconductor packages for working with TCGA legacy data and cBioPortal data, with special considerations for loading time; efficient representations in and out of memory; analysis platform; and an integrative framework, such as MultiAssayExperiment. Large methylation data sets are provided through out-of-memory data representation to provide responsive loading times and analysis capabilities on machines with limited memory. RESULTS We developed the curatedTCGAData and cBioPortalData R/Bioconductor packages to provide integrated multiomic data sets from the TCGA legacy database and the cBioPortal web application programming interface using the MultiAssayExperiment data structure. This suite of tools provides coordination of diverse experimental assays with clinicopathological data with minimal data management burden, as demonstrated through several greatly simplified multiomic and pan-cancer analyses. CONCLUSION These integrated representations enable analysts and tool developers to apply general statistical and plotting methods to extensive multiomic data through user-friendly commands and documented examples. © 2020 American Society of Clinical Oncology. All rights reserved.
Journal Title: JCO Clinical Cancer Informatics
Volume: 4
ISSN: 2473-4276
Publisher: American Society of Clinical Oncology  
Date Published: 2020-01-01
Start Page: 958
End Page: 971
Language: English
DOI: 10.1200/cci.19.00119
PUBMED: 33119407
PROVIDER: scopus
PMCID: PMC7608653
Notes: Article -- Export Date: 1 December 2020 -- Source: Scopus
MSK Authors
  1. Jianjiong Gao