Using somatic variant richness to mine signals from rare variants in the cancer genome Journal Article


Authors: Chakraborty, S.; Arora, A.; Begg, C. B.; Shen, R.
Article Title: Using somatic variant richness to mine signals from rare variants in the cancer genome
Abstract: To date, the vast preponderance of somatic variants observed in the cancer genome have been rare variants, and it is common in practice to encounter in a new tumor variants that have not been observed previously. Here we focus on probability estimation for encountering such hitherto unseen variants. We draw upon statistical methodology that has been developed in other fields of study, notably in species estimation in ecology, and word frequency estimation in computational linguistics. Analysis of whole-exome and targeted panel sequencing data sets reveals substantial variability in variant “richness” between genes that could be harnessed for clinically relevant problems. We quantify the variant-tissue association and show a strong gene-specific, lineage-dependent pattern of encountering new variants. This variability is largely determined by the proportion of observed variants that are rare. Our findings suggest that variants that occur at very low frequencies can harbor important signals that are clinically consequential. © 2019, The Author(s).
Keywords: protein phosphorylation; gene mutation; single nucleotide polymorphism; somatic mutation; mutation; validation process; pancreas cancer; glioma; endometrium cancer; colorectal cancer; genetic variability; gene frequency; protein p53; carcinogenesis; tumor suppressor gene; algorithm; probability; mismatch repair; microsatellite instability; uterine cervix cancer; genome; tumor; bioinformatics; tumor gene; ethnicity; genetic marker; linguistics; cyclin dependent kinase 1; isocitrate dehydrogenase 1; nucleic acid base substitution; ecology; machine learning; cancer; human; article; circulating tumor dna; whole exome sequencing; transcriptional regulator atrx
Journal Title: Nature Communications
Volume: 10
ISSN: 2041-1723
Publisher: Nature Publishing Group  
Date Published: 2019-12-03
Start Page: 5506
Language: English
DOI: 10.1038/s41467-019-13402-z
PUBMED: 31796730
PROVIDER: scopus
PMCID: PMC6890761
Notes: Article -- Source: Scopus
MSK Authors
  1. Colin B Begg
  2. Ronglai Shen
  3. Arshi Arora
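
Illustrative note: the abstract centers on the probability of encountering a previously unseen variant and points to species estimation in ecology as the source of the methodology. The abstract does not name a specific estimator, so the sketch below assumes the classic Good-Turing missing-mass estimate (number of items seen exactly once, divided by the total number of observations) purely as an illustration of that family of methods; it is not presented as the authors' procedure. The function name and the toy variant labels are hypothetical.

```python
from collections import Counter


def good_turing_unseen_mass(observations):
    """Classic Good-Turing estimate of the probability that the next
    observation is an item not seen so far: N1 / N, where N1 is the number
    of distinct items seen exactly once and N is the number of observations.

    Assumption: this is the textbook estimator from species estimation,
    used here only as an illustration.
    """
    if not observations:
        raise ValueError("at least one observation is required")
    counts = Counter(observations)
    # N1: distinct items that occur exactly once (the "rare" items)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(observations)


# Hypothetical toy labels, not real data: two variants seen twice,
# two variants seen once.
toy_variants = [
    "geneA:v1", "geneA:v1",
    "geneA:v2",
    "geneB:v1", "geneB:v1",
    "geneC:v1",
]
estimate = good_turing_unseen_mass(toy_variants)
print(f"Estimated probability of an unseen variant: {estimate:.3f}")
```

Under this estimator, a gene whose observed variants are mostly singletons yields a high estimate, which is consistent with the abstract's statement that the variability is largely determined by the proportion of observed variants that are rare.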