Semisupervised training of a brain MRI tumor detection model using mined annotations Journal Article


Authors: Swinburne, N. C.; Yadav, V.; Kim, J.; Choi, Y. R.; Gutman, D. C.; Yang, J. T.; Moss, N.; Stone, J.; Tisnado, J.; Hatzoglou, V.; Haque, S. S.; Karimi, S.; Lyo, J.; Juluru, K.; Pichotta, K.; Gao, J.; Shah, S. P.; Holodny, A. I.; Young, R. J.; for the MSK Mind Consortium
Contributors: Sabbatini, P.; Stetson, P. D.; Schultz, N.; Hellmann, M.; Lakhman, Y.; Gonen, M.; Razavi, P.; Boehm, K.; Sutton, E.; Khosravi, P.; Vanguri, R.; Jee, J.; Fong, C.; Pasha, A.; Rose, D.; Elsherif, E.; Aukerman, A.; Patel, D.; Begum, A.; Zakszewski, E.; Gross, B.; Philip, J.; Geneslaw, L.; Pimienta, R.; Rangavajhala, S. N.
Article Title: Semisupervised training of a brain MRI tumor detection model using mined annotations
Abstract: Background: Artificial intelligence (AI) applications for cancer imaging conceptually begin with automated tumor detection, which can provide the foundation for downstream AI tasks. However, supervised training requires many image annotations, and performing dedicated post hoc image labeling is burdensome and costly. Purpose: To investigate whether clinically generated image annotations can be data mined from the picture archiving and communication system (PACS), automatically curated, and used for semisupervised training of a brain MRI tumor detection model. Materials and Methods: In this retrospective study, the cancer center PACS was mined for brain MRI scans acquired between January 2012 and December 2017 and included all annotated axial T1 postcontrast images. Line annotations were converted to boxes, excluding boxes shorter than 1 cm or longer than 7 cm. The resulting boxes were used for supervised training of object detection models using RetinaNet and Mask region-based convolutional neural network (R-CNN) architectures. The best-performing model trained from the mined data set was used to detect unannotated tumors on training images themselves (self-labeling), automatically correcting many of the missing labels. After self-labeling, new models were trained using this expanded data set. Models were scored for precision, recall, and F1 using a held-out test data set comprising 754 manually labeled images from 100 patients (403 intra-axial and 56 extra-axial enhancing tumors). Model F1 scores were compared using bootstrap resampling. Results: The PACS query extracted 31 150 line annotations, yielding 11 880 boxes that met inclusion criteria. This mined data set was used to train models, yielding F1 scores of 0.886 for RetinaNet and 0.908 for Mask R-CNN. Self-labeling added 18 562 training boxes, improving model F1 scores to 0.935 (P < .001) and 0.954 (P < .001), respectively.
Conclusion: The application of semisupervised learning to mined image annotations significantly improved tumor detection performance, achieving an excellent F1 score of 0.954. This development pipeline can be extended for other imaging modalities, repurposing unused data silos to potentially enable automated tumor detection across radiologic modalities. © RSNA, 2022.
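Illustrative sketch (not part of the published record): the abstract mentions converting line annotations to boxes with a 1 cm to 7 cm size filter and scoring models by F1. The Python below is a minimal sketch of those two steps under stated assumptions. The abstract does not say how a line becomes a box, so the square-box construction here is an assumption, as are the function names `line_to_box` and `f1_score`, the millimetre units, and the reading of the size limits as applying to the line length.

```python
import math
from typing import Optional, Tuple

# (x_min, y_min, x_max, y_max), assumed to be in millimetres
Box = Tuple[float, float, float, float]


def line_to_box(x1: float, y1: float, x2: float, y2: float,
                min_len_mm: float = 10.0,
                max_len_mm: float = 70.0) -> Optional[Box]:
    """Turn a line annotation into a square box, or None if out of range.

    ASSUMPTION: the box is a square centred on the line midpoint with a
    side equal to the line length. The abstract only states that lines
    were converted to boxes and filtered to the 1 cm to 7 cm range.
    """
    length = math.hypot(x2 - x1, y2 - y1)
    if length < min_len_mm or length > max_len_mm:
        return None  # excluded by the size filter
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half = length / 2.0
    return (cx - half, cy - half, cx + half, cy + half)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, a 50 mm line from (0, 0) to (30, 40) passes the filter, while a 5 mm line returns `None`. How predicted boxes are matched to ground-truth tumors to obtain the true-positive, false-positive, and false-negative counts is not described in the abstract and is left out of this sketch.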
Journal Title: Radiology
Volume: 303
Issue: 1
ISSN: 0033-8419
Publisher: Radiological Society of North America, Inc.
Date Published: 2022-04-01
Start Page: 80
End Page: 89
Language: English
DOI: 10.1148/radiol.210817
PUBMED: 35040676
PROVIDER: scopus
PMCID: PMC8962822
Notes: Article -- Export Date: 2 May 2022 -- Source: Scopus
MSK Authors
  1. John Kyungjin Lyo
    39 Lyo
  2. Yuliya Lakhman
    97 Lakhman
  3. Robert J Young
    231 Young
  4. Mithat Gonen
    1030 Gonen
  5. Paul J Sabbatini
    262 Sabbatini
  6. Sasan Karimi
    115 Karimi
  7. Sofia S Haque
    149 Haque
  8. Andrei Holodny
    207 Holodny
  9. Jianjiong Gao
    132 Gao
  10. Jonathan T Yang
    166 Yang
  11. Matthew David Hellmann
    412 Hellmann
  12. Nikolaus D Schultz
    490 Schultz
  13. Benjamin E Gross
    44 Gross
  14. John Philip
    49 Philip
  15. Elizabeth Jane Sutton
    70 Sutton
  16. Pedram Razavi
    182 Razavi
  17. Jamie Tisnado
    16 Tisnado
  18. Jacqueline Blair Stone
    27 Stone
  19. Krishna Juluru
    35 Juluru
  20. Peter D Stetson
    51 Stetson
  21. Nelson Moss
    89 Moss
  22. Sohrab Prakash Shah
    88 Shah
  23. Christopher Joseph Fong
    43 Fong
  24. Justin Jee
    55 Jee
  25. David Gutman
    5 Gutman
  26. Ye R Choi
    8 Choi
  27. Rami Sesha Vanguri
    15 Vanguri
  28. Kevin Michael Boehm
    13 Boehm
  29. Vivek Yadav
    2 Yadav
  30. Arfath Pasha
    6 Pasha
  31. Doori Rose
    7 Rose
  32. Anika Begum
    3 Begum
  33. Druv Mukesh Patel
    5 Patel