Scalable log-ratio lasso regression for enhanced microbial feature selection with FLORAL
Journal Article


Authors: Fei, T.; Funnell, T.; Waters, N. R.; Raj, S. S.; Baichoo, M.; Sadeghi, K.; Dai, A.; Miltiadous, O.; Shouval, R.; Lv, M.; Peled, J. U.; Ponce, D. M.; Perales, M. A.; Gönen, M.; van den Brink, M. R. M.
Article Title: Scalable log-ratio lasso regression for enhanced microbial feature selection with FLORAL
Abstract: Identifying predictive biomarkers of patient outcomes from high-throughput microbiome data is of high interest, while existing computational methods do not satisfactorily account for complex survival endpoints, longitudinal samples, and taxa-specific sequencing biases. We present FLORAL, an open-source tool to perform scalable log-ratio lasso regression and microbial feature selection for continuous, binary, time-to-event, and competing risk outcomes, with compatibility for longitudinal microbiome data as time-dependent covariates. The proposed method adapts the augmented Lagrangian algorithm for a zero-sum constraint optimization problem while enabling a two-stage screening process for enhanced false-positive control. In extensive simulation and real-data analyses, FLORAL achieved consistently better false-positive control compared to other lasso-based approaches and better sensitivity over popular differential abundance testing methods for datasets with smaller sample sizes. In a survival analysis of allogeneic hematopoietic cell transplant recipients, FLORAL demonstrated considerable improvement in microbial feature selection by utilizing longitudinal microbiome data over solely using baseline microbiome data. © 2024 The Author(s)
Keywords: survival analysis; survival rate; genetics; nonhuman; sensitivity analysis; hematopoietic stem cell transplantation; algorithms; algorithm; amplicon; computer simulation; rna 16s; software; signal detection; microorganism; longitudinal data; microflora; microbiota; microbiome; variable selection; humans; human; article; rna sequencing; lasso; population abundance; feature selection; least absolute shrinkage and selection operator; compositional data; cp: microbiology; cp: systems biology
Journal Title: Cell Reports Methods
Volume: 4
Issue: 11
ISSN: 2667-2375
Publisher: Cell Press
Date Published: 2024-11-18
Start Page: 100899
Language: English
DOI: 10.1016/j.crmeth.2024.100899
PUBMED: 39515336
PROVIDER: scopus
PMCID: PMC11705925
Notes: The MSK Cancer Center Support Grant (P30 CA008748) is acknowledged in the PDF -- Corresponding author is MSK author: Teng Fei -- Source: Scopus
MSK Authors
  1. Mithat Gonen
  2. Doris Ponce
  3. Miguel-Angel Perales
  4. Jonathan U Peled
  5. Sandeep Sunder Raj
  6. Roni Shouval
  7. Anqi Dai
  8. Nicholas R. Waters
  9. Teng Fei