A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns Journal Article


Authors: Jiao, W.; Atwal, G.; Polak, P.; Karlic, R.; Cuppen, E.; Danyi, A.; de Ridder, J.; van Herpen, C.; Lolkema, M. P.; Steeghs, N.; Getz, G.; Morris, Q.; PCAWG Tumor Subtypes and Clinical Translation Working Group; & PCAWG Consortium
Contributors: Abeshouse, A.; Al-Ahmadie, H.; Armenia, J.; Chen, H. W.; Davidson, N. R.; Gao, J.; Ghossein, R.; Giri, D. D.; Gundem, G.; Heins, Z.; Huse, J.; Iacobuzio-Donahue, C. A.; Kahles, A.; King, T. A.; Kundra, R.; Lehmann, K. V.; Levine, D. A.; Liu, E. M.; Ochoa, A.; Pastore, A.; Rätsch, G.; Reis-Filho, J.; Reuter, V.; Roehrl, M. H. A.; Sanchez-Vega, F.; Sander, C.; Schultz, N.; Senbabaoglu, Y.; Singer, S.; Socci, N. D.; Stark, S. G.; Vázquez-García, I.; Yellapantula, V. D.; Zhang, H.
Article Title: A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns
Abstract: In cancer, the primary tumour’s organ of origin and histopathology are the strongest determinants of its clinical behaviour, but in 3% of cases a patient presents with a metastatic tumour and no obvious primary. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we train a deep learning classifier to predict cancer type based on patterns of somatic passenger mutations detected in whole genome sequencing (WGS) of 2606 tumours representing 24 common cancer types produced by the PCAWG Consortium. Our classifier achieves an accuracy of 91% on held-out tumor samples and 88% and 83% respectively on independent primary and metastatic samples, roughly double the accuracy of trained pathologists when presented with a metastatic tumour without knowledge of the primary. Surprisingly, adding information on driver mutations reduced accuracy. Our results have clinical applicability, underscore how patterns of somatic passenger mutations encode the state of the cell of origin, and can inform future strategies to detect the source of circulating tumour DNA. © 2020, The Author(s).
Keywords: controlled study; human tissue; primary tumor; somatic mutation; genetics; mutation; histopathology; diagnostic accuracy; neoplasm; neoplasms; reproducibility; reproducibility of results; metastasis; biology; computational biology; pathology; validation study; mutational analysis; human genome; neoplasm metastasis; intermethod comparison; genome; cancer classification; tumor; genome, human; procedures; high throughput sequencing; cell component; dna sequencing; cancer; humans; human; male; female; article; whole genome sequencing; circulating tumor dna; deep learning; accuracy assessment; passenger mutation pattern
Journal Title: Nature Communications
Volume: 11
ISSN: 2041-1723
Publisher: Nature Publishing Group
Date Published: 2020-02-05
Start Page: 728
Language: English
DOI: 10.1038/s41467-019-13825-8
PUBMED: 32024849
PROVIDER: scopus
PMCID: PMC7002586
Notes: Article -- Erratum issued, see DOI: 10.1038/s41467-022-32329-6 -- Export Date: 13 January 2023 -- Source: Scopus
MSK Authors
  1. Ronald A Ghossein
  2. Dilip D Giri
  3. Douglas A Levine
  4. Tari King
  5. Samuel Singer
  6. Jason T Huse
  7. Nicholas D Socci
  8. Chris Sander
  9. Victor Reuter
  10. Jianjiong Gao
  11. Nikolaus D Schultz
  12. Gunnar Ratsch
  13. Andre Kahles
  14. Hsiao-Wei Chen
  15. Kjong Van Stephan Fritz Lehmann
  16. Stefan G Stark
  17. Michael H Roehrl
  18. Alessandro Pastore
  19. Joshua Armenia
  20. Zachary Joseph Heins
  21. Ritika Kundra
  22. Hongxin Zhang
  23. Angelica Ochoa
  24. Gunes Gundem
  25. Minwei Liu