Abstract: |
Background/Objectives: Improved survival due to advances in medical therapy has resulted in increasing numbers of cancer patients living with bone metastases; however, understanding the prognostic implications of bone metastases requires larger population-based studies that outline their incidence and prevalence across primary cancer types, including less common ones. This study aimed to evaluate the incidence and prevalence of bone metastases in solid organ tumors by analyzing reports of staging CT studies with natural language processing (NLP). Methods: In this retrospective study, 639,470 reports representing 129,326 unique patients were analyzed; 6279 randomly selected reports were manually annotated and labeled for the presence or absence of bone metastases. From these data, a BERT-based NLP model was developed and applied to the patient database. The 5-year cumulative incidence and the prevalence of bone metastases were calculated for each cancer type. Results: The accuracy of the NLP model on a validation set was 97.1%, with a positive predictive value (precision) of 88.0% and a sensitivity (recall) of 86.3%. The 5-year cumulative incidence of bone metastases was highest in prostate, breast, head and neck, and lung cancer (52%, 41%, 36%, and 33%, respectively) and lowest in central nervous system cancer and testicular cancer (8% and 5%, respectively). Prevalence was highest in prostate, breast, and lung cancer (32%, 25%, and 23%, respectively) and lowest in central nervous system cancer and testicular cancer (4% each). Conclusions: NLP was used to demonstrate patterns of bone metastases across a broad range of cancer types and is a valuable tool for population-based assessment of bone metastases. © 2025 by the authors. |
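As a rough illustration of the quantities reported above, the following Python sketch shows how accuracy, positive predictive value, and sensitivity follow from confusion-matrix counts, and how a per-cancer-type prevalence could be derived from report-level model output. The function names, the input layout, and the rule that a patient counts as positive if any of their reports is positive are assumptions made for this example; the abstract does not specify how report-level labels were aggregated to patients, and the numbers in the usage note are illustrative only, not study data.

```python
from collections import defaultdict


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision (PPV) and recall (sensitivity) from confusion counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


def prevalence_by_cancer_type(report_labels):
    """Share of unique patients per cancer type with at least one positive report.

    report_labels: iterable of (patient_id, cancer_type, predicted_positive).
    Assumption (not from the source): a patient is counted as positive
    if any one of their reports is classified as positive.
    """
    patient_positive = {}
    for patient_id, cancer_type, positive in report_labels:
        key = (cancer_type, patient_id)
        patient_positive[key] = patient_positive.get(key, False) or bool(positive)

    totals = defaultdict(int)
    positives = defaultdict(int)
    for (cancer_type, _), positive in patient_positive.items():
        totals[cancer_type] += 1
        positives[cancer_type] += int(positive)
    return {c: positives[c] / totals[c] for c in totals}
```

For example, with made-up counts, `classification_metrics(8, 2, 2, 88)` gives an accuracy of 0.96 with precision and recall both 0.8, which shows how accuracy can sit well above precision and recall when negative reports dominate.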