J Am Med Inform Assoc. 1999;6:76-87.
© 1999 American Medical Informatics Association


Research Paper

Representing Information in Patient Reports Using Natural Language Processing and the Extensible Markup Language

Carol Friedman, PhD, George Hripcsak, MD, Lyuda Shagina and Hongfang Liu

Columbia University, New York (CF, GH, LS); Queens College City University New York, New York (CF, HL).

Correspondence and reprints: Carol Friedman, PhD, Department of Medical Informatics, Columbia University, 161 Fort Washington Avenue, AP-1310, New York, NY 10032. e-mail: <friedma{at}flux.cpmc.columbia.edu>.

Received for publication: 05/18/98; accepted for publication: 09/16/98.

Objective: To design a document model that provides reliable and efficient access to clinical information in patient reports for a broad range of clinical applications, and to implement an automated method using natural language processing that maps textual reports to a form consistent with the model.

Methods: A document model that encodes structured clinical information in patient reports while retaining the original contents was designed using the extensible markup language (XML), and a document type definition (DTD) was created. An existing natural language processor (NLP) was modified to generate output consistent with the model. Two hundred reports were processed using the modified NLP system, and the resulting XML output was checked against the DTD using a validating XML parser.

Results: The modified NLP system successfully processed all 200 reports. The output of one report was invalid, and 199 reports were valid XML forms consistent with the DTD.

Conclusions: Natural language processing can be used to automatically create an enriched document that contains a structured component whose elements are linked to portions of the original textual report. In this integrated document model, documents containing specific information can be accurately and efficiently retrieved by querying the structured components. If manual review of the documents is desired, the salient information in the original reports can also be identified and highlighted. An XML tagging model has the additional benefit that software tools for manipulating XML documents are readily available.
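As an illustration only, the linkage between a structured component and the original text might be sketched as follows. The element and attribute names (`report`, `structured`, `finding`, `phr`, `idref`, `certainty`) and the sample sentence are hypothetical and are not taken from the DTD described in the paper; the sketch uses Python's standard `xml.etree.ElementTree`, which parses XML but does not validate against a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical enriched report. All tag and attribute names are assumptions
# made for this sketch, not the authors' actual document model.
REPORT = """
<report>
  <structured>
    <finding value="pneumonia" certainty="high" idref="p1"/>
    <finding value="effusion" certainty="no" idref="p2"/>
  </structured>
  <text>There is <phr id="p1">evidence of pneumonia</phr> in the left lower lobe. <phr id="p2">No pleural effusion</phr> is seen.</text>
</report>
"""


def find_and_highlight(xml_source, value):
    """Query the structured component, then return the linked source phrases.

    Findings whose certainty is "no" (negated) are skipped, so retrieval is
    driven by the structured data rather than by keyword matching on the text.
    """
    root = ET.fromstring(xml_source)
    # Index every tagged phrase of the original text by its id.
    phrases = {p.get("id"): p for p in root.iter("phr")}
    hits = []
    for finding in root.iter("finding"):
        if finding.get("value") == value and finding.get("certainty") != "no":
            phrase = phrases[finding.get("idref")]
            hits.append("".join(phrase.itertext()))
    return hits


def dangling_links(xml_source):
    """Return idref values that point at no phrase in the original text."""
    root = ET.fromstring(xml_source)
    ids = {p.get("id") for p in root.iter("phr")}
    return [f.get("idref") for f in root.iter("finding") if f.get("idref") not in ids]
```

Under these assumptions, a query for a finding returns the exact phrases a reviewer would want highlighted, while a negated finding such as the effusion above is not retrieved even though the word appears in the text.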






