Orchard: Building large cancer phylogenies using stochastic combinatorial search
Journal Article


Authors: Kulman, E.; Kuang, R.; Morris, Q.
Article Title: Orchard: Building large cancer phylogenies using stochastic combinatorial search
Abstract: Phylogenies depicting the evolutionary history of genetically heterogeneous subpopulations of cells from the same cancer, i.e., cancer phylogenies, offer valuable insights about cancer development and guide treatment strategies. Many methods exist that reconstruct cancer phylogenies using point mutations detected with bulk DNA sequencing. However, these methods become inaccurate when reconstructing phylogenies with more than 30 mutations, or, in some cases, fail to recover a phylogeny altogether. Here, we introduce Orchard, a cancer phylogeny reconstruction algorithm that is fast and accurate using up to 1000 mutations. Orchard samples without replacement from a factorized approximation of the posterior distribution over phylogenies, a novel result derived in this paper. Each factor in this approximate posterior corresponds to a conditional distribution for adding a new mutation to a partially built phylogeny. Orchard optimizes each factor sequentially, generating a sequence of incrementally larger phylogenies that ultimately culminate in a complete tree containing all mutations. Our evaluations demonstrate that Orchard outperforms state-of-the-art cancer phylogeny reconstruction methods in reconstructing more plausible phylogenies across 90 simulated cancers and 14 B-progenitor acute lymphoblastic leukemias (B-ALLs). Remarkably, Orchard accurately reconstructs cancer phylogenies using up to 1,000 mutations. Additionally, we demonstrate that the large and accurate phylogenies reconstructed by Orchard are useful for identifying patterns of somatic mutations and genetic variations among distinct cancer cell subpopulations. © 2024 Kulman et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords: gene mutation; genetics; mutation; neoplasm; neoplasms; allele; biological model; classification; computational biology; genotype; gene frequency; genetic variation; lung cancer; algorithms; simulation; algorithm; probability; cancer cell; cell subpopulation; models, genetic; computer simulation; bioinformatics; stochastic model; stochastic processes; gene encoding; phylogeny; point mutations; procedures; plant diseases; dna sequencing; humans; human; article; stochastic systems; b cell acute lymphoblastic leukemia; malignant neoplasm; cancer development; reconstruction algorithm; reconstruction algorithms; evolutionary history; stochastics; orchards; combinatorial search; conditional distribution; phylogeny reconstruction; posterior distributions; orchard; stochastic combinatorial search
Journal Title: PLoS Computational Biology
Volume: 20
Issue: 12
ISSN: 1553-7358
Publisher: Public Library of Science
Date Published: 2024-12-30
Start Page: e1012653
Language: English
DOI: 10.1371/journal.pcbi.1012653
PUBMED: 39775053
PROVIDER: scopus
PMCID: PMC11723595
Notes: The MSK Cancer Center Support Grant (P30 CA008748) is acknowledged in the PubMed record and PDF. Corresponding MSK author is Quaid Morris -- Source: Scopus
MSK Authors
  1. Quaid Morris
  2. Ethan Kulman