Research Paper
Affiliation of the authors: Department of Biomedical Informatics, College of Physicians and Surgeons, Columbia University, New York, NY
Correspondence and reprints: Carol Friedman, PhD, Department of Biomedical Informatics, Columbia University, 622 West 168 Street, VC-5, New York, NY 10032; e-mail: <friedman{at}dbmi.columbia.edu>.
Received for publication: 02/04/04; accepted for publication: 04/13/04.
Objective: The aim of this study was to develop a method based on natural language processing (NLP) that automatically maps an entire clinical document to codes with modifiers and to quantitatively evaluate the method.
Methods: An existing NLP system, MedLEE, was adapted to automatically generate codes. The method matches the structured output generated by MedLEE, which consists of findings and modifiers, to obtain the most specific code. Recall and precision of Unified Medical Language System (UMLS) coding were evaluated in two separate studies. Recall was measured using a test set of 150 randomly selected sentences, which were processed using MedLEE. Results were compared with a reference standard determined manually by seven experts. Precision was measured using a second test set of 150 randomly selected sentences, from which UMLS codes were automatically generated by the method and then validated by experts.
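The idea of choosing the most specific code from a finding and its modifiers can be illustrated with a minimal sketch. This is not MedLEE's implementation: the lookup table, the placeholder code names (`CODE_PAIN` and so on), and the function `most_specific_code` are all hypothetical, and the sketch assumes that "most specific" means the table entry that accounts for the largest number of the extracted modifiers.

```python
from typing import AbstractSet, FrozenSet, List, Optional, Tuple

# Hypothetical table: (finding, modifiers the code requires, placeholder code).
# The code names are illustrative placeholders, not real UMLS identifiers.
CODE_TABLE: List[Tuple[str, FrozenSet[str], str]] = [
    ("pain", frozenset(), "CODE_PAIN"),
    ("pain", frozenset({"chest"}), "CODE_CHEST_PAIN"),
    ("pain", frozenset({"chest", "severe"}), "CODE_SEVERE_CHEST_PAIN"),
]


def most_specific_code(finding: str, modifiers: AbstractSet[str]) -> Optional[str]:
    """Return the code whose required modifiers are all present in the
    extracted modifiers and which covers the most of them; None if no
    entry matches the finding."""
    best_code: Optional[str] = None
    best_size = -1
    for entry_finding, required, code in CODE_TABLE:
        # An entry qualifies only if every modifier it requires was extracted.
        if entry_finding == finding and required <= modifiers:
            if len(required) > best_size:
                best_code, best_size = code, len(required)
    return best_code


if __name__ == "__main__":
    # Extra modifiers that no entry uses ("left") are simply ignored here.
    print(most_specific_code("pain", {"chest", "severe", "left"}))
```

In this sketch, a finding whose modifiers match no specialized entry falls back to the general code for the finding, and a finding absent from the table yields `None`.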
Results: Recall of the system for UMLS coding of all terms was .77 (95% CI, .72 to .81), and for coding terms that had corresponding UMLS codes recall was .83 (.79 to .87). Recall of the system for extracting all terms was .84 (.81 to .88). Recall of the experts ranged from .69 to .91 for extracting terms. The precision of the system was .89 (.87 to .91), and precision of the experts ranged from .61 to .91.
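For readers unfamiliar with these measures, the following sketch shows how recall, precision, and a 95% confidence interval for a proportion might be computed. The abstract does not say how the intervals were derived, so the Wilson score interval used here is an assumption, and the counts in the usage example are hypothetical rather than taken from the study.

```python
import math
from typing import Tuple


def proportion_with_ci(successes: int, total: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (proportion, lower, upper) using the Wilson score interval.
    z = 1.96 corresponds to an approximate 95% interval."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return p, center - half, center + half


def recall(true_pos: int, false_neg: int) -> Tuple[float, float, float]:
    # Recall: share of reference-standard terms that the system found.
    return proportion_with_ci(true_pos, true_pos + false_neg)


def precision(true_pos: int, false_pos: int) -> Tuple[float, float, float]:
    # Precision: share of system-generated codes that experts judged correct.
    return proportion_with_ci(true_pos, true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    r, r_lo, r_hi = recall(true_pos=80, false_neg=20)
    print(f"recall {r:.2f} (95% CI, {r_lo:.2f} to {r_hi:.2f})")
```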
Conclusion: Extraction of relevant clinical information and UMLS coding were accomplished using a method based on NLP. The method appeared to be comparable to or better than six experts. The advantage of the method is that it maps text to codes along with other related information, rendering the coded output suitable for effective retrieval.
This article has been cited by other articles:
S. B. Johnson, S. Bakken, D. Dine, S. Hyun, E. Mendonca, F. Morrison, T. Bright, T. Van Vleck, J. Wrenn, and P. Stetson. An Electronic Health Record Based on Structured Narrative. J. Am. Med. Inform. Assoc., January 1, 2008; 15(1): 54-64.
E. S. Chen, G. Hripcsak, H. Xu, M. Markatou, and C. Friedman. Automated Acquisition of Disease Drug Knowledge from Biomedical and Clinical Documents: An Initial Study. J. Am. Med. Inform. Assoc., January 1, 2008; 15(1): 87-98.
C. Clark, K. Good, L. Jezierny, M. Macpherson, B. Wilson, and U. Chajewska. Identifying Smokers with a Medical Extraction System. J. Am. Med. Inform. Assoc., January 1, 2008; 15(1): 36-39.
O. Uzuner, Y. Luo, and P. Szolovits. Evaluating the State-of-the-Art in Automatic De-identification. J. Am. Med. Inform. Assoc., September 1, 2007; 14(5): 550-563.
A. Wright, H. Goldberg, T. Hongsermeier, and B. Middleton. A Description and Functional Taxonomy of Rule-based Decision Support Content at a Large Integrated Delivery Network. J. Am. Med. Inform. Assoc., July 1, 2007; 14(4): 489-496.
J. Voorham, P. Denig, and Groningen Initiative to Analyse Type 2 Diabetes Tr. Computerized Extraction of Information on the Quality of Diabetes Care from Free Text in Electronic Patient Records of General Practitioners. J. Am. Med. Inform. Assoc., May 1, 2007; 14(3): 349-354.
Y. Huang and H. J. Lowe. A Novel Hybrid Approach to Automated Negation Detection in Clinical Radiology Reports. J. Am. Med. Inform. Assoc., May 1, 2007; 14(3): 304-311.
Y. A. Lussier and Y. Liu. Computational Approaches to Phenotyping: High-Throughput Phenomics. Proceedings of the ATS, January 1, 2007; 4(1): 18-25.
J. Y. Sun and Y. Sun. A System for Automated Lexical Mapping. J. Am. Med. Inform. Assoc., May 1, 2006; 13(3): 334-343.
B. Hazlehurst, H. R. Frost, D. F. Sittig, and V. J. Stevens. MediClass: A System for Detecting and Classifying Encounter-based Clinical Events in Any Electronic Medical Record. J. Am. Med. Inform. Assoc., September 1, 2005; 12(5): 517-529.
A. Turchin, I. S. Kohane, and M. L. Pendergrass. Identification of Patients With Diabetes From the Text of Physician Notes in the Electronic Medical Record. Diabetes Care, July 1, 2005; 28(7): 1794-1795.