DeepHeme, a high-performance, generalizable deep ensemble for bone marrow morphometry and hematologic diagnosis Journal Article


Authors: Sun, S.; Yin, Z.; Van Cleave, J. G.; Wang, L.; Fried, B.; Bilal, K. H.; Lucas, F.; Isgor, I. S.; Webb, D. C.; Singi, S.; Brown, L.; Shouval, R.; Lin, J.; Yan, E. S.; Spector, J. D.; Ardon, O.; Boiocchi, L.; Sardana, R.; Baik, J.; Zhu, M.; Syed, A.; Yabe, M.; Lu, C. M.; Roshal, M.; Vanderbilt, C.; Goldgof, D. B.; Dogan, A.; Prakash, S.; Carmichael, I.; Butte, A. J.; Goldgof, G. M.
Article Title: DeepHeme, a high-performance, generalizable deep ensemble for bone marrow morphometry and hematologic diagnosis
Abstract: Cytomorphological analysis of the bone marrow aspirate (BMA) is pivotal for the diagnostic workup of a broad range of hematological disorders. However, this skill is error prone, highly complex, and time consuming. Deep learning–based models for the automatic classification of bone marrow cell morphology demonstrate the potential to improve diagnostic efficiency and accuracy. However, existing deep learning approaches in this field fall short of expert-level performance and lack generalizability beyond a single dataset. Working with multiple hematopathologists, we curated a dataset from the University of California, San Francisco, which included a training set of 30,394 images from 40 patients with morphologically normal marrows and a test set of 8507 images from 10 different patients, all derived from 400×-equivalent whole-slide images (WSIs). We then developed DeepHeme, a snapshot ensemble deep learning classifier, which outperformed previous models in accuracy while expanding the total number of differentiable cell classes. We externally validated DeepHeme using an independent dataset from the Memorial Sloan Kettering Cancer Center, which included 2694 images from 10 morphologically normal patients and 11,076 images from 655 patients with normal or diseased marrow, scanned using a different WSI system, demonstrating robust generalizability. At the level of individual cell classifications, we systematically compared DeepHeme’s diagnostic performance with that of three medical experts from different academic hospitals, demonstrating that DeepHeme achieved accuracy comparable to, or exceeding, that of human experts. Accurate and generalizable cell classification represents a step toward automated analysis of hematopathology slides and the development of quantitative, morphology-based, predictive markers. Copyright © 2025 The Authors, some rights reserved.
Keywords: adolescent; adult; child; clinical article; aged; sensitivity and specificity; phenotype; bone marrow; cohort analysis; myelodysplastic syndrome; algorithm; infant; image quality; hematopoiesis; bone marrow cell; lymphatic leukemia; chi square test; megakaryocyte; predictive value; myeloproliferative neoplasm; hematologic disease; receiver operating characteristic; artificial neural network; learning algorithm; human; male; female; article; morphometry; deep learning; bone marrow aspiration; deepheme
Journal Title: Science Translational Medicine
Volume: 17
Issue: 802
ISSN: 1946-6234
Publisher: American Association for the Advancement of Science
Date Published: 2025-06-11
Start Page: eadq2162
Language: English
DOI: 10.1126/scitranslmed.adq2162
PROVIDER: scopus
PUBMED: 40498857
Notes: The MSK Cancer Center Support Grant (P30 CA008748) is acknowledged in the PDF. The corresponding MSK author is Gregory M. Goldgof. Source: Scopus
MSK Authors
  1. Aijazuddin Syed
    53 Syed
  2. Ahmet Dogan
    469 Dogan
  3. Mikhail Roshal
    235 Roshal
  4. Mariko Yabe
    51 Yabe
  5. Jee Yeon Baik
    45 Baik
  6. Menglei Zhu
    37 Zhu
  7. Roni Shouval
    168 Shouval
  8. Orly Ardon
    25 Ardon
  9. Irem Sahver Isgor
    12 Isgor
  10. Gregory Goldgof
    10 Goldgof
  11. Brenda Fried
    2 Fried
  12. Khawaja Hassan Bilal
    7 Bilal
  13. Dylan Webb
    3 Webb
  14. Siddharth Shriram Singi
    6 Singi
  15. Zhanghan Yin
    3 Yin
  16. Ethan Yan
    1 Yan