
J Am Med Inform Assoc. 2001;8:598-609.
© 2001 American Medical Informatics Association


Research Paper

Use of General-purpose Negation Detection to Augment Concept Indexing of Medical Documents

A Quantitative Study Using the UMLS

Pradeep G. Mutalik, MD, Aniruddha Deshpande, MD and Prakash M. Nadkarni, MD

Affiliation of the authors: Yale University School of Medicine, New Haven, Connecticut.

Correspondence and reprints: Pradeep G. Mutalik, MD, Department of Diagnostic Radiology, Yale University School of Medicine, 333 Cedar Street, New Haven, CT 06510; e-mail: <Pradeep.Mutalik@yale.edu>.

Received for publication: 12/22/00; accepted for publication: 05/15/01.

Objectives: To test the hypothesis that most instances of negated concepts in dictated medical documents can be detected by a strategy that relies on tools developed for the parsing of formal (computer) languages—specifically, a lexical scanner ("lexer") that uses regular expressions to generate a finite state machine, and a parser that relies on a restricted subset of context-free grammars, known as LALR(1) grammars.
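As a rough illustration of the lexer half of this strategy, the sketch below uses Python's `re` module to split a sentence into tokens, treating the negation words named in the Results as their own token class. It is not Negfinder's lexer: the token classes, the `TOKEN_SPEC` table, and the `tokenize` function are all assumptions made for this example, and a real lexer generator would compile such patterns into a finite state machine ahead of time.

```python
import re

# Hypothetical token classes for illustration only. The negation words are
# the four named in the Results ("no", "denies/denied", "not", "without").
TOKEN_SPEC = [
    ("NEGATION", r"\b(?:no|not|without|denies|denied)\b"),
    ("TERMINATOR", r"[.;:]"),
    ("WORD", r"[A-Za-z][A-Za-z'-]*"),
    ("SKIP", r"\s+|[^\sA-Za-z.;:]+"),  # whitespace, digits, other punctuation
]

# One master pattern; alternatives are tried in the order listed above,
# so a negation word is never classified as a plain WORD.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.IGNORECASE,
)


def tokenize(text):
    """Return a list of (token_class, lowercased_text) pairs."""
    tokens = []
    for match in MASTER.finditer(text):
        kind = match.lastgroup
        if kind != "SKIP":
            tokens.append((kind, match.group().lower()))
    return tokens


if __name__ == "__main__":
    for token in tokenize("Patient denies chest pain."):
        print(token)
```

A parser would then consume this token stream; the word-boundary anchors keep words such as "note" from being mistaken for "no" or "not".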

Methods: A diverse training set of 40 medical documents from a variety of specialties was manually inspected and used to develop a program (Negfinder) that contained rules to recognize a large set of negated patterns occurring in the text. Negfinder's lexer and parser were developed using tools normally used to generate programming language compilers. The input to Negfinder consisted of medical narrative that was preprocessed to recognize UMLS concepts: the text of a recognized concept had been replaced with a coded representation that included its UMLS concept ID. The program generated an index with one entry per instance of a concept in the document, where the presence or absence of negation of that concept was recorded. This information was used to mark up the text of each document by color-coding it to make it easier to inspect. The parser was then evaluated in two ways: 1) a test set of 60 documents (30 discharge summaries, 30 surgical notes) marked up by Negfinder was inspected visually to quantify false-positive and false-negative results; and 2) a different test set of 10 documents was independently examined for negatives by a human observer and by Negfinder, and the results were compared.
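The index described above (one entry per concept instance, with a negation flag) can be sketched as follows. Everything specific here is an assumption for illustration: the abstract does not give the coded representation, so the `[[C0000001]]` marker format and the concept IDs are placeholders, and the single rule used (a negation word applies to every following concept until a sentence terminator or "but") is a deliberate simplification that stands in for, and is much weaker than, the LALR(1) grammar the authors describe.

```python
import re
from typing import NamedTuple


class IndexEntry(NamedTuple):
    concept_id: str  # UMLS concept ID taken from the coded marker
    position: int    # ordinal of this concept instance in the document
    negated: bool


# Hypothetical coded form of a preprocessed concept: [[C0000001]].
TOKEN = re.compile(
    r"\[\[(?P<cui>C\d{7})\]\]"
    r"|(?P<neg>\b(?:no|not|without|denies|denied)\b)"
    r"|(?P<end>[.;:]|\bbut\b)",
    re.IGNORECASE,
)


def build_index(text):
    """Return one IndexEntry per concept marker found in `text`.

    Simplified rule (an assumption, not the paper's grammar): a negation
    word stays in force until a terminator or the word "but".
    """
    entries = []
    negating = False
    for match in TOKEN.finditer(text):
        if match.group("cui"):
            cui = match.group("cui").upper()
            entries.append(IndexEntry(cui, len(entries), negating))
        elif match.group("neg"):
            negating = True
        else:  # terminator or "but" ends the scope of negation
            negating = False
    return entries


if __name__ == "__main__":
    sample = (
        "Patient denies [[C0000001]] or [[C0000002]]. "
        "She has [[C0000003]] but no [[C0000004]]."
    )
    for entry in build_index(sample):
        print(entry)
```

An index of this shape is enough to drive a color-coded markup pass, since each entry identifies which concept instance to highlight and whether it was flagged as negated.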

Results: In the first evaluation using marked-up documents, 8,358 instances of UMLS concepts were detected in the 60 documents, of which 544 were negations detected by the program and verified by human observation (true-positive results, or TPs). Thirteen instances were wrongly flagged as negated (false-positive results, or FPs), and the program missed 27 instances of negation (false-negative results, or FNs), yielding a sensitivity of 95.3 percent and a specificity of 97.7 percent. In the second evaluation using independent negation detection, 1,869 concepts were detected in 10 documents, with 135 TPs, 12 FPs, and 6 FNs, yielding a sensitivity of 95.7 percent and a specificity of 91.8 percent. One of the words "no," "denies/denied," "not," or "without" was present in 92.5 percent of all negations.
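The reported percentages can be checked from the counts given above. The short sketch below recomputes them; the function names are chosen for this example. Sensitivity is TP / (TP + FN). The reported "specificity" figures are reproduced by TP / (TP + FP), the quantity more commonly called positive predictive value or precision, so that is the formula used here; this reading is inferred from the arithmetic and is not stated in the abstract.

```python
def sensitivity(tp, fn):
    """Fraction of true negations that the program detected."""
    return tp / (tp + fn)


def positive_predictive_value(tp, fp):
    """Fraction of flagged negations that were correct.

    The abstract's "specificity" figures match this formula.
    """
    return tp / (tp + fp)


def as_percent(fraction):
    return round(100 * fraction, 1)


# Counts taken from the Results paragraph.
evaluations = {
    "marked-up documents (60)": {"tp": 544, "fp": 13, "fn": 27},
    "independent detection (10)": {"tp": 135, "fp": 12, "fn": 6},
}

if __name__ == "__main__":
    for name, c in evaluations.items():
        sens = as_percent(sensitivity(c["tp"], c["fn"]))
        ppv = as_percent(positive_predictive_value(c["tp"], c["fp"]))
        print(f"{name}: sensitivity {sens}%, TP/(TP+FP) {ppv}%")
```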

Conclusions: Negation of most concepts in medical narrative can be reliably detected by a simple strategy. The reliability of detection depends on several factors, the most important being the accuracy of concept matching.



