
First published January 31, 2005 as JAMIA PrePrint; doi:10.1197/jamia.M1695
J Am Med Inform Assoc. 2005;12:275-285. DOI 10.1197/jamia.M1695.
© 2005 American Medical Informatics Association


Application of Information Technology

Improved Identification of Noun Phrases in Clinical Radiology Reports Using a High-Performance Statistical Natural Language Parser Augmented with the UMLS Specialist Lexicon

Yang Huang, MS, Henry J. Lowe, MD, Dan Klein, PhD and Russell J. Cucina, MD, MS

Affiliations of the authors: Stanford Medical Informatics and Stanford Center for Clinical Informatics, Stanford University School of Medicine, Stanford, CA (YH, HJL); Computer Science Division, University of California at Berkeley, Berkeley, CA (DK); Department of Medicine, University of California, San Francisco, San Francisco, CA (RJC).

Correspondence and reprints: Yang Huang, MS, Stanford Medical Informatics, MSOB X215, 251 Campus Drive, Stanford, CA 94305-5479; e-mail: <huangy{at}stanford.edu>.

Received for publication: 09/09/04; accepted for publication: 12/28/04.

Objective: The aim of this study was to develop and evaluate a method of extracting noun phrases with full phrase structures from a set of clinical radiology reports using natural language processing (NLP) and to investigate the effects of using the UMLS® Specialist Lexicon to improve noun phrase identification within clinical radiology documents.

Design: The noun phrase identification (NPI) module is composed of a sentence boundary detector, a statistical natural language parser trained on a nonmedical domain, and a noun phrase (NP) tagger. The NPI module processed a set of 100 XML-represented clinical radiology reports in Health Level 7 (HL7)® Clinical Document Architecture (CDA)–compatible format. Computed output was compared with manual markups: maximal (longest) NPs were marked up by four physicians and one author, and base (simple) NPs were marked up by one author. An extended lexicon of biomedical terms was created from the UMLS Specialist Lexicon and used to improve NPI performance.
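The distinction between maximal and base NPs can be illustrated with a minimal Python sketch. It is not the authors' NP tagger. It assumes the parser emits Penn Treebank-style bracketed trees, that a maximal NP is an NP node not contained in any other NP, and that a base NP is an NP node with no NP nested inside it. The function names and the example sentence are hypothetical.

```python
def parse_tree(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return read()


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1] for word in leaves(child)]


def contains_np(node):
    """True if any proper descendant of node is labelled NP."""
    if isinstance(node, str):
        return False
    return any(
        not isinstance(child, str) and (child[0] == "NP" or contains_np(child))
        for child in node[1]
    )


def noun_phrases(node, inside_np=False):
    """Return (maximal, base) lists of NP strings under the assumed definitions."""
    maximal, base = [], []
    if isinstance(node, str):
        return maximal, base
    is_np = node[0] == "NP"
    if is_np and not inside_np:
        maximal.append(" ".join(leaves(node)))   # outermost NP
    if is_np and not contains_np(node):
        base.append(" ".join(leaves(node)))      # NP with no nested NP
    for child in node[1]:
        m, b = noun_phrases(child, inside_np or is_np)
        maximal += m
        base += b
    return maximal, base


# Hypothetical parse of a radiology-style sentence.
tree = parse_tree(
    "(S (NP (NP (DT No) (NN evidence)) (PP (IN of) (NP (JJ focal) (NN consolidation))))"
    " (VP (VBZ is) (VP (VBN seen))))"
)
maximal_nps, base_nps = noun_phrases(tree)
```

In this toy tree the whole subject phrase is the single maximal NP, while its two innermost NPs are the base NPs.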

Results: The test set consisted of 50 randomly selected reports. The sentence boundary detector achieved 99.0% precision and 98.6% recall. The overall maximal NPI precision and recall were 78.9% and 81.5% before using the UMLS Specialist Lexicon and 82.1% and 84.6% after. The overall base NPI precision and recall were 88.2% and 86.8% before using the UMLS Specialist Lexicon and 93.1% and 92.6% after, reducing false positives by 31.1% and false negatives by 34.3%.
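For readers unfamiliar with the metrics, the following Python sketch shows one way precision and recall could be computed for noun phrase identification. It assumes exact-match scoring of NP spans against the manual markup; the abstract does not state the matching criterion, so that is an assumption. The span tuples `(sentence_id, start_token, end_token)` and the toy data are hypothetical.

```python
def precision_recall(predicted, gold):
    """Exact-match precision and recall over sets of NP spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: fraction of predicted spans that appear in the gold markup.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of gold spans that the system found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data, not taken from the study.
gold_spans = {(0, 0, 2), (0, 4, 6), (1, 0, 1), (1, 3, 5)}
predicted_spans = {(0, 0, 2), (0, 4, 6), (1, 0, 1), (1, 3, 4)}

precision, recall = precision_recall(predicted_spans, gold_spans)
```

Here three of the four predicted spans match the gold markup, so both precision and recall are 0.75. A predicted span that is off by a single token counts as both a false positive and a false negative under exact matching.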

Conclusion: The sentence boundary detector performed with high precision and recall. After adaptation using the UMLS Specialist Lexicon, the statistical parser's NPI performance on radiology reports increased to levels comparable to the parser's native performance in its newswire training domain and to that reported by other researchers in the general nonmedical domain.







