Effect of patient-contextual skin images in human- and artificial intelligence-based diagnosis of melanoma: Results from the 2020 SIIM-ISIC melanoma classification challenge
Journal Article


Authors: Kurtansky, N. R.; Primiero, C. A.; Betz-Stablein, B.; Combalia, M.; Guitera, P.; Halpern, A.; Kentley, J.; Kittler, H.; Liopyris, K.; Malvehy, J.; Rinner, C.; Tschandl, P.; Weber, J.; Rotemberg, V.; Soyer, H. P.
Article Title: Effect of patient-contextual skin images in human- and artificial intelligence-based diagnosis of melanoma: Results from the 2020 SIIM-ISIC melanoma classification challenge
Abstract: Background: While the high accuracy of reported AI tools for melanoma detection is promising, the lack of holistic consideration of the patient is often criticized. Along with medical history, a dermatologist would also consider intra-patient nevi patterns, such that nevi that are different from others on a given patient are treated with suspicion. Objective: To evaluate whether patient-contextual lesion-images improve diagnostic accuracy for melanoma in a dermoscopic image-based AI competition and a human reader study. Methods: An international online AI competition was held in 2020. The task was to classify dermoscopy images as melanoma or benign lesions. A multi-source dataset of dermoscopy images grouped by patient was provided, and additional use of public datasets was permitted. Competitors were judged on area under the receiver operating characteristic curve (AUROC) on a private leaderboard. Concurrently, a human reader study was hosted using a subset of the test data. Participants gave their initial diagnosis of an index case (melanoma vs. benign) and were then presented with seven additional lesion-images of that patient before giving a second prediction of the index case. Outcome measures were sensitivity and specificity. Results: The top 50 of 3308 AI competition entries achieved AUROC scores ranging from 0.943 to 0.949. Few algorithms considered intra-patient lesion patterns and instead most evaluated images independently. The median sensitivity and specificity of human readers before receiving contextual images were 60.0% and 86.7%, and after were 60.0% and 85.7%. Human and AI algorithm performance varied by image source. Conclusion: This study provided an open-source state-of-the-art algorithm for melanoma detection that has been evaluated at multiple centres. Patient-contextual images did not positively impact performance of AI algorithms or human readers. Providing seven contextual images and no total body image may have been insufficient to test the applicability of the intra-patient lesion patterns.
Keywords: nevi; short-term; ugly-duckling sign; cancer; international-symposium
Journal Title: Journal of the European Academy of Dermatology and Venereology
Volume: 39
Issue: 8
ISSN: 0926-9959
Publisher: Wiley Blackwell  
Date Published: 2025-08-01
Start Page: 1489
End Page: 1499
Language: English
ACCESSION: WOS:001373984000001
DOI: 10.1111/jdv.20479
PROVIDER: wos
PMCID: PMC12145458
PUBMED: 39648687
Notes: The MSK Cancer Center Support Grant (P30 CA008748) is acknowledged in the PubMed record and PDF. Corresponding MSK author is Nicholas R. Kurtansky -- Source: Wos
MSK Authors
  1. Allan C Halpern
    399 Halpern
  2. Jochen Weber
    18 Weber