Results of the 2016 International Skin Imaging Collaboration International Symposium on Biomedical Imaging challenge: Comparison of the accuracy of computer algorithms to dermatologists for the diagnosis of melanoma from dermoscopic images Journal Article


Authors: Marchetti, M. A.; Codella, N. C. F.; Dusza, S. W.; Gutman, D. A.; Helba, B.; Kalloo, A.; Mishra, N.; Carrera, C.; Celebi, M. E.; DeFazio, J. L.; Jaimes, N.; Marghoob, A. A.; Quigley, E.; Scope, A.; Yélamos, O.; Halpern, A. C.; for the International Skin Imaging Collaboration
Article Title: Results of the 2016 International Skin Imaging Collaboration International Symposium on Biomedical Imaging challenge: Comparison of the accuracy of computer algorithms to dermatologists for the diagnosis of melanoma from dermoscopic images
Abstract: Background: Computer vision may aid in melanoma detection. Objective: We sought to compare melanoma diagnostic accuracy of computer algorithms to dermatologists using dermoscopic images. Methods: We conducted a cross-sectional study using 100 randomly selected dermoscopic images (50 melanomas, 44 nevi, and 6 lentigines) from an international computer vision melanoma challenge dataset (n = 379), along with individual algorithm results from 25 teams. We used 5 methods (nonlearned and machine learning) to combine individual automated predictions into "fusion" algorithms. In a companion study, 8 dermatologists classified the lesions in the 100 images as either benign or malignant. Results: The average sensitivity and specificity of dermatologists in classification was 82% and 59%. At 82% sensitivity, dermatologist specificity was similar to the top challenge algorithm (59% vs. 62%, P = .68) but lower than the best-performing fusion algorithm (59% vs. 76%, P = .02). Receiver operating characteristic area of the top fusion algorithm was greater than the mean receiver operating characteristic area of dermatologists (0.86 vs. 0.71, P = .001). Limitations: The dataset lacked the full spectrum of skin lesions encountered in clinical practice, particularly banal lesions. Readers and algorithms were not provided clinical data (eg, age or lesion history/symptoms). Results obtained using our study design cannot be extrapolated to clinical practice. Conclusion: Deep learning computer vision systems classified melanoma dermoscopy images with accuracy that exceeded some but not all dermatologists.
Keywords: melanoma; classification; skin cancer; computer vision; performance; cutaneous melanoma; system; clinical-trial; multicenter; melanocytic lesions; dermatologist; machine learning; cancer; computer algorithm; international skin; imaging collaboration; international symposium on biomedical imaging; reader study
Journal Title: Journal of the American Academy of Dermatology
Volume: 78
Issue: 2
ISSN: 0190-9622
Publisher: Mosby Elsevier  
Date Published: 2018-02-01
Start Page: 270
End Page: 277.e1
Language: English
ACCESSION: WOS:000422791000015
DOI: 10.1016/j.jaad.2017.08.016
PROVIDER: wos
PMCID: PMC5768444
PUBMED: 28969863
Notes: Article -- Source: Wos
MSK Authors
  1. Elizabeth Ann Quigley
    11 Quigley
  2. Allan C Halpern
    287 Halpern
  3. Jennifer Defazio
    11 Defazio
  4. Stephen Dusza
    148 Dusza
  5. Alon Scope
    111 Scope
  6. Ashfaq A Marghoob
    364 Marghoob
  7. Aadi Kalloo
    2 Kalloo