
First published June 7, 2004 as JAMIA PrePrint; doi:10.1197/jamia.M1552
Journal of the American Medical Informatics Association 2004;11(5):392-402
© 2004 American Medical Informatics Association


A more recent version of this article appeared on September 1, 2004

Submitted on February 4, 2004
Accepted on April 13, 2004

Automated Encoding of Clinical Documents Based on Natural Language Processing

Carol Friedman PhD1*, Lyudmila Shagina MS1, Yves Lussier MD1, and George Hripcsak MD1

Affiliation of the authors: 1 Department of Biomedical Informatics, College of Physicians and Surgeons, Columbia University, New York, NY

* To whom correspondence should be addressed.

Objective To develop a method based on natural language processing (NLP) that automatically maps an entire clinical document to codes with modifiers, and to quantitatively evaluate the method.

Methods An existing NLP system, MedLEE, was adapted to automatically generate codes. The method involves matching of structured output generated by MedLEE consisting of findings and modifiers to obtain the most specific code. Recall and precision applied to UMLS coding were evaluated in two separate studies. Recall was measured using a test set of 150 randomly selected sentences, which were processed using MedLEE. Results were compared with a reference standard determined manually by 7 experts. Precision was measured using a second test set of 150 randomly selected sentences from which UMLS codes were automatically generated by the method and then validated by experts.
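The matching step described above can be illustrated with a minimal sketch. It is not MedLEE's implementation: the coding table, the placeholder code names, and the function `most_specific_code` are all hypothetical, and the sketch assumes that "most specific" means the table entry whose required modifiers are all present in the structured output and that requires the largest number of modifiers.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

# Hypothetical coding table: (finding, required modifiers) -> placeholder code.
# The code names are illustrative only and are not real UMLS identifiers.
CodingTable = Dict[Tuple[str, FrozenSet[str]], str]

CODING_TABLE: CodingTable = {
    ("pain", frozenset()): "CODE_PAIN",
    ("pain", frozenset({"chest"})): "CODE_CHEST_PAIN",
    ("pain", frozenset({"chest", "severe"})): "CODE_SEVERE_CHEST_PAIN",
}


def most_specific_code(
    finding: str,
    modifiers: Iterable[str],
    table: CodingTable = CODING_TABLE,
) -> Optional[str]:
    """Return the code whose required modifiers are all observed and that
    requires the most modifiers; return None if no entry matches the finding."""
    observed = frozenset(modifiers)
    best_code: Optional[str] = None
    best_size = -1
    for (entry_finding, entry_modifiers), code in table.items():
        # An entry matches when the finding agrees and every modifier it
        # requires appears among the observed modifiers.
        if entry_finding == finding and entry_modifiers <= observed:
            if len(entry_modifiers) > best_size:
                best_code, best_size = code, len(entry_modifiers)
    return best_code
```

For example, under this assumed table a finding of "pain" with modifiers "chest" and "left" would map to the chest pain entry, because "left" is not required by any more specific entry.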

Results Recall of the system for UMLS coding of all terms was .77 (95% CI .72-.81), and for coding terms that had corresponding UMLS codes recall was .83 (.79-.87). Recall of the system for extracting all terms was .84 (.81-.88). Recall of the experts ranged from .69 to .91 for extracting terms. The precision of the system was .89 (.87-.91), and precision of the experts ranged from .61 to .91.
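For readers unfamiliar with the measures reported above, the following sketch shows how recall and precision can be computed from counts, together with a 95% confidence interval. The abstract does not state how its intervals were derived; the normal (Wald) approximation used here is an assumption, and the function names are illustrative.

```python
import math
from typing import Tuple


def proportion_with_ci(successes: int, total: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (proportion, lower, upper) using the normal approximation.

    z = 1.96 corresponds to a 95% interval. The bounds are clipped to [0, 1].
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def recall(true_positives: int, false_negatives: int) -> Tuple[float, float, float]:
    """Recall: the fraction of reference-standard items that the system found."""
    return proportion_with_ci(true_positives, true_positives + false_negatives)


def precision(true_positives: int, false_positives: int) -> Tuple[float, float, float]:
    """Precision: the fraction of system-generated items judged correct."""
    return proportion_with_ci(true_positives, true_positives + false_positives)
```

Calling `recall(tp, fn)` or `precision(tp, fp)` with counts from an evaluation returns the point estimate and its interval bounds. With small samples or proportions near 0 or 1, an interval method other than the Wald approximation may be more appropriate.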

Conclusions Extraction of relevant clinical information and UMLS coding were accomplished using a method based on NLP. The method appeared to be comparable to or better than 6 experts. The advantage of the method is that it maps text to codes along with other related information, rendering the coded output suitable for effective retrieval.







