E pluribus unum: Prospective acceptability benchmarking from the Contouring Collaborative for Consensus in Radiation Oncology crowdsourced initiative for multiobserver segmentation Journal Article


Authors: Lin, D.; Wahid, K. A.; Nelms, B. E.; He, R.; Naser, M. A.; Duke, S.; Sherer, M. V.; Christodouleas, J. P.; Mohamed, A. S. R.; Cislo, M.; Murphy, J. D.; Fuller, C. D.; Gillespie, E. F.
Article Title: E pluribus unum: Prospective acceptability benchmarking from the Contouring Collaborative for Consensus in Radiation Oncology crowdsourced initiative for multiobserver segmentation
Abstract: Purpose: Contouring Collaborative for Consensus in Radiation Oncology (C3RO) is a crowdsourced challenge engaging radiation oncologists across various expertise levels in segmentation. An obstacle to artificial intelligence (AI) development is the paucity of multiexpert datasets; consequently, we sought to characterize whether aggregate segmentations generated from multiple nonexperts could meet or exceed recognized expert agreement. Approach: Participants who contoured ≥1 region of interest (ROI) for the breast, sarcoma, head and neck (H&N), gynecologic (GYN), or gastrointestinal (GI) cases were identified as a nonexpert or recognized expert. Cohort-specific ROIs were combined into single simultaneous truth and performance level estimation (STAPLE) consensus segmentations. STAPLE(nonexpert) ROIs were evaluated against STAPLE(expert) contours using Dice similarity coefficient (DSC). The expert interobserver DSC (IODSC_expert) was calculated as an acceptability threshold between STAPLE(nonexpert) and STAPLE(expert). To determine the number of nonexperts required to match the IODSC_expert for each ROI, a single consensus contour was generated using variable numbers of nonexperts and then compared to the IODSC_expert. Results: For all cases, the DSC values for STAPLE(nonexpert) versus STAPLE(expert) were higher than the comparator expert IODSC_expert for most ROIs. The minimum number of nonexpert segmentations needed for a consensus ROI to achieve IODSC_expert acceptability criteria ranged between 2 and 4 for breast, 3 and 5 for sarcoma, 3 and 5 for H&N, 3 and 5 for GYN, and 3 for GI. Conclusions: Multiple nonexpert-generated consensus ROIs met or exceeded expert-derived acceptability thresholds. Five nonexperts could potentially generate consensus segmentations for most ROIs with performance approximating experts, suggesting nonexpert segmentations as feasible cost-effective AI inputs.
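The abstract's evaluation rests on two computations: a Dice similarity coefficient between consensus masks, and a consensus contour built from a variable number of non-expert segmentations. The sketch below illustrates that workflow in Python on synthetic masks; it substitutes a simple per-voxel majority vote for STAPLE, and every name and value (the 64x64 grid, the 15% noise rate, the 0.85 threshold iodsc_expert) is a hypothetical placeholder rather than data or code from the study.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom


def consensus(masks, threshold=0.5):
    """Per-voxel majority vote -- a simplified stand-in for STAPLE."""
    stack = np.stack([m.astype(bool) for m in masks])
    return stack.mean(axis=0) >= threshold


# Hypothetical synthetic data: one "expert consensus" ROI and noisy
# non-expert variants of it (15% of voxels flipped at random).
rng = np.random.default_rng(0)
expert = np.zeros((64, 64), dtype=bool)
expert[16:48, 16:48] = True
nonexperts = [np.logical_xor(expert, rng.random(expert.shape) < 0.15)
              for _ in range(8)]

iodsc_expert = 0.85  # hypothetical expert interobserver DSC threshold

# Smallest number of non-expert masks whose consensus meets the threshold,
# mirroring the abstract's "minimum number of nonexpert segmentations" analysis.
for n in range(1, len(nonexperts) + 1):
    score = dice(consensus(nonexperts[:n]), expert)
    if score >= iodsc_expert:
        print(f"{n} non-expert mask(s) reach DSC {score:.3f} >= {iodsc_expert}")
        break
```

The majority vote is chosen only to keep the sketch dependency-free; the study itself used STAPLE, which additionally weights each rater by an estimated sensitivity and specificity.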
Keywords: risk; radiation oncology; artificial intelligence; carcinoma; segmentation; therapy; performance; target volume delineation; lung-cancer; head; variability; contouring; atlas implementation; crowdsourcing; autosegmentation
Journal Title: Journal of Medical Imaging
Volume: 10
Issue: Suppl. 1
ISSN: 2329-4302
Publisher: SPIE  
Date Published: 2023-02-01
Start Page: S11903
Language: English
ACCESSION: WOS:001057742800003
DOI: 10.1117/1.JMI.10.S1.S11903
PROVIDER: wos
PMCID: PMC9907021
PUBMED: 36761036
Notes: Article -- MSK Cancer Center Support Grant (P30 CA008748) acknowledged in PDF -- MSK corresponding author is Erin Gillespie -- Source: Wos
MSK Authors
  1. Erin Faye Gillespie
  2. Diana Lin
  3. Michael Anthony Cislo