Journal of the American Medical Informatics Association 5:62-75 (1998)
© 1998 American Medical Informatics Association


Research Paper

An Experiment Comparing Lexical and Statistical Methods for Extracting MeSH Terms from Clinical Free Text

Gregory F. Cooper, MD, PhD and Randolph A. Miller, MD

Affiliations of the authors: Center for Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA (GFC); Division of Biomedical Informatics, Vanderbilt University, Nashville, TN (RAM).

Correspondence and reprints: Gregory F. Cooper, MD, PhD, University of Pittsburgh, Center for Biomedical Informatics, Suite 8084 Forbes Tower, 200 Lothrop Street, Pittsburgh, PA 15213-2582. e-mail: gfc@cbmi.upmc.edu.

Abstract

Objective: A primary goal of the University of Pittsburgh's 1990-94 UMLS-sponsored effort was to develop and evaluate PostDoc (a lexical indexing system) and Pindex (a statistical indexing system) comparatively, and then in combination as a hybrid system. Each system takes as input a portion of the free text from a narrative part of a patient's electronic medical record and returns a list of suggested MeSH terms to use in formulating a Medline search that includes concepts in the text. This paper describes the systems and reports an evaluation. The intent is for this evaluation to serve as a step toward the eventual realization of systems that assist healthcare personnel in using the electronic medical record to construct patient-specific searches of Medline.

Design: The authors tested the performances of PostDoc, Pindex, and a hybrid system, using text taken from randomly selected clinical records, which were stratified to include six radiology reports, six pathology reports, and six discharge summaries. They identified concepts in the clinical records that might conceivably be used in performing a patient-specific Medline search. Each system was given the free text of each record as an input. The extent to which a system-derived list of MeSH terms captured the relevant concepts in these documents was determined based on blinded assessments by the authors.
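The stratified selection described in the design can be pictured with a short sketch. It is illustrative only: the abstract does not say how the records were drawn, so the function name `stratified_sample`, the record identifiers, and the fixed seed are all assumptions made for this example.

```python
import random


def stratified_sample(records_by_type, per_type=6, seed=0):
    """Draw the same number of records at random from each document type.

    records_by_type maps a document type (a stratum) to a list of record
    identifiers. The seed is fixed only so the sketch is reproducible.
    """
    rng = random.Random(seed)
    sample = {}
    for doc_type, records in records_by_type.items():
        # random.Random.sample draws without replacement within one stratum.
        sample[doc_type] = rng.sample(records, per_type)
    return sample


# Hypothetical pools of record identifiers, 20 per document type.
pools = {
    "radiology": [f"rad-{i}" for i in range(20)],
    "pathology": [f"path-{i}" for i in range(20)],
    "discharge": [f"dc-{i}" for i in range(20)],
}
chosen = stratified_sample(pools)
print({doc_type: len(ids) for doc_type, ids in chosen.items()})
```

With six records per stratum and three strata, the sketch yields the 18 documents implied by the design.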

Results: PostDoc output a mean of approximately 19 MeSH terms per report, which included about 40% of the relevant report concepts. Pindex output a mean of approximately 57 terms per report and captured about 45% of the relevant report concepts. A hybrid system captured approximately 66% of the relevant concepts and output about 71 terms per report.
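The two measures reported above, mean terms output per report and the fraction of relevant concepts captured, can be illustrated with a small sketch. Everything in it is hypothetical: the toy reports, the reviewer judgments, and the key names `lexical` and `statistical` are invented for illustration. It also makes two assumptions the abstract does not confirm: that the hybrid output is the union of the two systems' term lists, and that the capture fraction is pooled over all reports rather than averaged per report.

```python
from statistics import mean

# Hypothetical toy data, one entry per report. "relevant" holds the concepts a
# reviewer marked as useful for a search. Each system maps every MeSH term it
# suggested to the relevant concept a reviewer judged it to capture, or None.
reports = [
    {
        "relevant": {"pneumonia", "pleural effusion", "cardiomegaly"},
        "lexical": {"Pneumonia": "pneumonia", "Lung": None},
        "statistical": {
            "Pleural Effusion": "pleural effusion",
            "Pneumonia": "pneumonia",
            "Radiography": None,
        },
    },
    {
        "relevant": {"adenocarcinoma", "colon"},
        "lexical": {"Colon": "colon"},
        "statistical": {"Adenocarcinoma": "adenocarcinoma", "Biopsy": None},
    },
]


def hybrid(report):
    """Union of both systems' suggestions (an assumed combination rule)."""
    merged = dict(report["lexical"])
    merged.update(report["statistical"])
    return merged


def summarize(outputs, relevant_sets):
    """Return (mean terms per report, pooled fraction of concepts captured)."""
    terms_per_report = mean(len(output) for output in outputs)
    captured = sum(
        len({c for c in output.values() if c is not None} & relevant)
        for output, relevant in zip(outputs, relevant_sets)
    )
    total = sum(len(relevant) for relevant in relevant_sets)
    return terms_per_report, captured / total


relevant_sets = [r["relevant"] for r in reports]
print(summarize([r["lexical"] for r in reports], relevant_sets))      # (1.5, 0.4)
print(summarize([r["statistical"] for r in reports], relevant_sets))  # (2.5, 0.6)
print(summarize([hybrid(r) for r in reports], relevant_sets))         # (3.5, 0.8)
```

Under the union assumption, terms suggested by both systems are counted once. That would be consistent with the reported hybrid mean of about 71 terms being slightly below the sum of the individual means (19 + 57 = 76).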

Conclusion: The outputs of PostDoc and Pindex are complementary in capturing MeSH terms from clinical free text. The results suggest possible approaches to reduce the number of terms output while maintaining the percentage of terms captured, including the use of UMLS semantic types to constrain the output list to contain only clinically relevant MeSH terms.
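One way to picture the semantic-type constraint suggested in the conclusion is a simple filter over the output list. This is a minimal sketch under stated assumptions, not the authors' method: the lookup table `SEMANTIC_TYPES` and the allowed set `CLINICAL_TYPES` are small illustrative stand-ins, and in practice the term-to-type assignments would come from the UMLS Metathesaurus rather than a hand-written dictionary.

```python
# Illustrative choice of semantic types treated as clinically relevant.
CLINICAL_TYPES = {
    "Disease or Syndrome",
    "Body Part, Organ, or Organ Component",
    "Diagnostic Procedure",
}

# Hypothetical lookup from MeSH term to semantic types (toy stand-in for UMLS).
SEMANTIC_TYPES = {
    "Pneumonia": {"Disease or Syndrome"},
    "Lung": {"Body Part, Organ, or Organ Component"},
    "Radiography": {"Diagnostic Procedure"},
    "Pennsylvania": {"Geographic Area"},
}


def filter_terms(terms, allowed=CLINICAL_TYPES, types=SEMANTIC_TYPES):
    """Keep terms with at least one allowed semantic type, preserving order.

    Terms missing from the lookup are dropped, which is a design choice of
    this sketch rather than something the source specifies.
    """
    return [t for t in terms if types.get(t, set()) & allowed]


suggested = ["Pneumonia", "Pennsylvania", "Lung", "Unknown Term"]
print(filter_terms(suggested))  # ['Pneumonia', 'Lung']
```

A filter of this kind shortens the output list. Whether it also maintains the percentage of concepts captured depends on how well the allowed types cover the relevant concepts, which is the question the conclusion leaves open.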



