Submitted on March 16, 2007
Accepted on June 11, 2007
Affiliation of the authors: 1 Department of Informatics, University of Szeged, Szeged, Hungary; 2 Research Group on Artificial Intelligence of the Hungarian Academy of Sciences and University of Szeged, Szeged, Hungary
* To whom correspondence should be addressed.
Objective The anonymisation of medical records is of great importance in the human life sciences because a de-identified text can be made publicly available to non-hospital researchers as well, facilitating research on human diseases. Here the authors have developed a de-identification model that can successfully remove personal health information (PHI) from discharge records to make them conform to the guidelines of the Health Insurance Portability and Accountability Act (HIPAA).
Design We introduce here a novel, machine learning-based iterative Named Entity Recognition approach intended for use on semi-structured documents like discharge records. Our method identifies PHI in several steps. First, it labels all entities whose tags can be inferred from the structure of the text; it then uses this information to find further PHI phrases in the flow-text parts of the document.
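The two-step idea in the Design paragraph can be illustrated with a minimal sketch. It is not the authors' model: a regular expression stands in for the machine-learned tagger, and the header field names (`PATIENT`, `DOCTOR`, `HOSPITAL`), the `FIELD: value` layout and the function names are all hypothetical, since the abstract does not describe the record format.

```python
import re

# Hypothetical header fields; the real record layout is not given in the abstract.
FIELD_PATTERN = re.compile(r"^(PATIENT|DOCTOR|HOSPITAL):\s*(.+)$", re.MULTILINE)


def label_structured(record):
    """Pass 1: collect phrase -> tag pairs from 'FIELD: value' header lines."""
    return {m.group(2).strip(): m.group(1) for m in FIELD_PATTERN.finditer(record)}


def label_flow_text(record, known):
    """Pass 2: replace every occurrence of an already-labelled phrase with its tag."""
    # Longest phrases first, so a full name is replaced before any shorter overlap.
    for phrase in sorted(known, key=len, reverse=True):
        record = re.sub(re.escape(phrase), f"[{known[phrase]}]", record)
    return record


def deidentify(record):
    """Run both passes: structure-derived labels first, then the flow text."""
    return label_flow_text(record, label_structured(record))


if __name__ == "__main__":
    sample = (
        "PATIENT: John Smith\n"
        "DOCTOR: Mary Jones\n"
        "\n"
        "John Smith was seen by Mary Jones on the ward."
    )
    print(deidentify(sample))
```

In the sketch the first pass only trusts the document structure, and the second pass reuses those labels as a dictionary for the free text, which mirrors the order of steps described above without claiming anything about the features or learner the system actually used.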
Measurements Following the standard evaluation method of the first Workshop on Challenges in Natural Language Processing for Clinical Data, we used token-level Precision, Recall and Fβ=1 measure metrics for evaluation.
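For readers unfamiliar with these metrics, the following sketch computes token-level precision, recall and the F measure with β=1 from two aligned tag sequences. It rests on assumptions: the function name `token_prf`, the `"O"` label for non-PHI tokens and the pooling of all PHI classes into one count are illustrative choices, and the challenge's official scorer may differ in detail.

```python
def token_prf(gold, pred, beta=1.0, outside="O"):
    """Token-level precision, recall and F-beta over aligned tag sequences.

    A token counts as correct when it is predicted as PHI and its predicted
    tag equals the gold tag. Tokens tagged `outside` are non-PHI.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")

    true_pos = sum(1 for g, p in zip(gold, pred) if p != outside and g == p)
    n_pred = sum(1 for p in pred if p != outside)  # tokens the system marked as PHI
    n_gold = sum(1 for g in gold if g != outside)  # tokens that really are PHI

    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    b2 = beta * beta
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


if __name__ == "__main__":
    gold = ["PATIENT", "O", "DOCTOR", "O", "DATE"]
    pred = ["PATIENT", "O", "O", "DATE", "DATE"]
    print(token_prf(gold, pred))
```

With β=1 the F measure reduces to the harmonic mean of precision and recall, so a system has to do well on both to reach a high score.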
Results Our system achieved outstanding accuracy on the standard evaluation dataset of the de-identification challenge, with an F measure of 99.7534% for the best submitted model.
Conclusion Our system is competitive with the current state-of-the-art solutions, and we describe here several techniques that can be beneficial in other tasks that need to handle structured documents such as clinical records.