Extracting and classifying diagnosis dates from clinical notes: A case study (Journal Article)


Authors: Fu, J. T.; Sholle, E.; Krichevsky, S.; Scandura, J.; Campion, T. R.
Article Title: Extracting and classifying diagnosis dates from clinical notes: A case study
Abstract: Myeloproliferative neoplasms (MPNs) are chronic hematologic malignancies that may progress over long disease courses. The original date of diagnosis is an important piece of information for patient care and research, but is not consistently documented. We describe an attempt to build a pipeline for extracting dates with natural language processing (NLP) tools and techniques and classifying them as relevant diagnoses or not. Inaccurate and incomplete date extraction and interpretation impacted the performance of the overall pipeline. Existing lightweight Python packages tended to have low specificity for identifying and interpreting partial and relative dates in clinical text. A rules-based regular expression (regex) approach achieved recall of 83.0% on dates manually annotated as diagnosis dates, and 77.4% on all annotated dates. With only 3.8% of annotated dates representing initial MPN diagnoses, additional methods of targeting candidate date instances may alleviate noise and class imbalance. © 2020 Elsevier Inc.
Keywords: controlled study; aged; major clinical study; disease course; disease classification; note; diagnostic procedure; classification; chronic myeloid leukemia; information processing; feasibility study; medical information; clinical research; myeloproliferative neoplasm; polycythemia vera; thrombocythemia; acute myeloid leukemia; natural language processing; human; male; priority journal; clinical text; temporality
Journal Title: Journal of Biomedical Informatics
Volume: 110
ISSN: 1532-0464
Publisher: Elsevier Inc.  
Date Published: 2020-10-01
Start Page: 103569
Language: English
DOI: 10.1016/j.jbi.2020.103569
PUBMED: 32949781
PROVIDER: scopus
Notes: Export Date: 1 October 2020 -- Source: Scopus
MSK Authors
  1. Julia Tsejan Fu