Accurate identification of colonoscopy quality and polyp findings using natural language processing Journal Article


Authors: Lee, J. K.; Jensen, C. D.; Levin, T. R.; Zauber, A. G.; Doubeni, C. A.; Zhao, W. K.; Corley, D. A.
Article Title: Accurate identification of colonoscopy quality and polyp findings using natural language processing
Abstract: Objectives: The aim of this study was to test the ability of a commercially available natural language processing (NLP) tool to accurately extract examination quality-related and large polyp information from colonoscopy reports with varying report formats. Background: Colonoscopy quality reporting often requires manual data abstraction. NLP is another option for extracting information; however, limited data exist on its ability to accurately extract examination quality and polyp findings from unstructured text in colonoscopy reports with different reporting formats. Study Design: NLP strategies were developed using 500 colonoscopy reports from Kaiser Permanente Northern California and then tested using 300 separate colonoscopy reports that underwent manual chart review. Using findings from manual review as the reference standard, we evaluated the NLP tool's sensitivity, specificity, positive predictive value (PPV), and accuracy for identifying colonoscopy examination indication, cecal intubation, bowel preparation adequacy, and polyps ≥10 mm. Results: The NLP tool was highly accurate in identifying examination quality-related variables from colonoscopy reports. Compared with manual review, sensitivity for screening indication was 100% (95% confidence interval: 95.3%-100%), PPV was 90.6% (82.3%-95.8%), and accuracy was 98.2% (97.0%-99.4%). For cecal intubation, sensitivity was 99.6% (98.0%-100%), PPV was 100% (98.5%-100%), and accuracy was 99.8% (99.5%-100%). For bowel preparation adequacy, sensitivity was 100% (98.5%-100%), PPV was 100% (98.5%-100%), and accuracy was 100% (100%-100%). For polyp(s) ≥10 mm, sensitivity was 90.5% (69.6%-98.8%), PPV was 100% (82.4%-100%), and accuracy was 95.2% (88.8%-100%). Conclusion: NLP yielded a high degree of accuracy for identifying examination quality-related and large polyp information from diverse types of colonoscopy reports. © 2017 Wolters Kluwer Health, Inc. All rights reserved.
Keywords: colonoscopy; quality; natural language processing
Journal Title: Journal of Clinical Gastroenterology
Volume: 53
Issue: 1
ISSN: 0192-0790
Publisher: Lippincott Williams & Wilkins  
Date Published: 2019-01-01
Start Page: e25
End Page: e30
Language: English
DOI: 10.1097/mcg.0000000000000929
PROVIDER: scopus
PMCID: PMC5847417
PUBMED: 28906424
Notes: J. Clin. Gastroenterol. -- Export Date: 2 January 2019 -- Article -- CODEN: JCGAD -- PubMed ID: 28906424 -- Source: Scopus
MSK Authors
  1. Ann G Zauber