Foundational segmentation models and clinical data mining enable accurate computer vision for lung cancer (Journal Article)


Authors: Swinburne, N. C.; Jackson, C. B.; Pagano, A. M.; Stember, J. N.; Schefflein, J.; Marinelli, B.; Panyam, P. K.; Autz, A.; Chopra, M. S.; Holodny, A. I.; Ginsberg, M. S.
Article Title: Foundational segmentation models and clinical data mining enable accurate computer vision for lung cancer
Abstract: This study aims to assess the effectiveness of integrating Segment Anything Model (SAM) and its variant MedSAM into the automated mining, object detection, and segmentation (MODS) methodology for developing robust lung cancer detection and segmentation models without post hoc labeling of training images. In a retrospective analysis, 10,000 chest computed tomography scans from patients with lung cancer were mined. Line measurement annotations were converted to bounding boxes, excluding boxes < 1 cm or > 7 cm. The You Only Look Once object detection architecture was used for teacher-student learning to label unannotated lesions on the training images. Subsequently, a final tumor detection model was trained and employed with SAM and MedSAM for tumor segmentation. Model performance was assessed on a manually annotated test dataset, with additional evaluations conducted on an external lung cancer dataset before and after detection model fine-tuning. Bootstrap resampling was used to calculate 95% confidence intervals. Data mining yielded 10,789 line annotations, resulting in 5403 training boxes. The baseline detection model achieved an internal F1 score of 0.847, improving to 0.860 after self-labeling. Tumor segmentation using the final detection model attained internal Dice similarity coefficients (DSCs) of 0.842 (SAM) and 0.822 (MedSAM). After fine-tuning, external validation showed an F1 of 0.832 and DSCs of 0.802 (SAM) and 0.804 (MedSAM). Integrating foundational segmentation models into the MODS framework results in high-performing lung cancer detection and segmentation models using only mined clinical data. Both SAM and MedSAM hold promise as foundational segmentation models for radiology images.
Keywords: lung neoplasms; artificial intelligence; computed tomography; data mining
Journal Title: Journal of Imaging Informatics in Medicine
Volume: 38
Issue: 3
ISSN: 2948-2925
Publisher: Springer
Date Published: 2025-06-01
Start Page: 1552
End Page: 1562
Language: English
ACCESSION: WOS:001340091700003
DOI: 10.1007/s10278-024-01304-6
PROVIDER: wos
PMCID: PMC12092863
PUBMED: 39438365
Notes: The MSK Cancer Center Support Grant (P30 CA008748) is acknowledged in the PubMed record and PDF. Corresponding MSK author is Nathaniel C. Swinburne -- Source: Wos
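Illustrative sketch (not from the article): the abstract mentions three computational steps, namely converting line measurements to bounding boxes with a 1 cm to 7 cm size filter, scoring segmentations with the Dice similarity coefficient, and estimating 95% confidence intervals by bootstrap resampling. The Python below shows one way these could look. The abstract does not say how a line becomes a box, so the square box centered on the line midpoint is an assumption, as are the function names, the `mm_per_pixel` parameter, the flat 0/1 mask representation, and the percentile-style interval on the mean.

```python
import math
import random


def line_to_box(x1, y1, x2, y2, mm_per_pixel, min_cm=1.0, max_cm=7.0):
    """Turn a line measurement (pixel endpoints) into a square bounding box.

    Assumption: the box is centered on the line midpoint with side length
    equal to the line length. Returns (xmin, ymin, xmax, ymax) in pixels,
    or None when the length falls outside [min_cm, max_cm].
    """
    length_px = math.hypot(x2 - x1, y2 - y1)
    length_cm = length_px * mm_per_pixel / 10.0
    if length_cm < min_cm or length_cm > max_cm:
        return None
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half = length_px / 2.0
    # Coordinates are not clipped to the image extent in this sketch.
    return (cx - half, cy - half, cx + half, cy + half)


def dice(mask_a, mask_b):
    """Dice similarity coefficient for two equal-length sequences of 0/1 values."""
    size_a = sum(1 for p in mask_a if p)
    size_b = sum(1 for q in mask_b if q)
    overlap = sum(1 for p, q in zip(mask_a, mask_b) if p and q)
    if size_a + size_b == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * overlap / (size_a + size_b)


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2.0) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1.0 - alpha / 2.0) * n_resamples))]
    return lower, upper
```

For example, `bootstrap_ci(per_case_dice_scores)` would return a lower and upper bound around the mean of a hypothetical list of per-case DSC values; the study's actual resampling unit and statistic are not given in the abstract.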
MSK Authors
  1. Michelle S Ginsberg
    235 Ginsberg
  2. Andrei Holodny
    207 Holodny
  3. Joseph Nathaniel Stember
    19 Stember
  4. Andrew Michael Pagano
    14 Pagano
  5. Prashanth Kumar Panyam
    3 Panyam
  6. Arthur J Autz
    2 Autz
  7. Mohapar Singh Chopra
    1 Chopra