| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH |
Submitted on March 13, 2007
Accepted on June 11, 2007
Affiliation of the authors: 1 The MITRE Corporation, Bedford, MA; Department of Computer Science, Brandeis University, Waltham, MA; 2 Center for Biomedical Informatics, Harvard Medical School, Boston, MA; 3 The MITRE Corporation, Bedford, MA; 4 The MITRE Corporation, Bedford, MA; Stanford Biomedical Informatics, Palo Alto, CA
* To whom correspondence should be addressed.
Objective This paper describes a successful approach to de-identification, developed for participation in a recent AMIA-sponsored challenge evaluation.
Method Our approach focused on the rapid adaptation of two existing named entity recognition toolkits, Carafe and LingPipe.
Results The "out of the box" Carafe system achieved a very good score (phrase F-measure of 0.9664) with only four hours of work to adapt it to the de-identification task. With further tuning, we were able to reduce the token-level error term by over 36% through task-specific feature engineering and the introduction of a lexicon, achieving a phrase F-measure of 0.9736.
Conclusions We were able to achieve good performance on the de-identification task by the rapid retargeting of existing toolkits. For the Carafe system, we developed a method for tuning the balance of recall vs. precision, as well as a confidence score that correlated well with the measured F-score.
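The Results above are reported as phrase F-measures and as a relative reduction in error. As a rough illustration of how such figures are typically computed, the following minimal sketch scores exact-match phrases and derives a relative error reduction. The phrase representation `(start, end, label)`, the function names, and the example labels are assumptions made for illustration; this is not the challenge's official scorer or the authors' code.

```python
from typing import Set, Tuple

# Hypothetical phrase representation: (start_token, end_token, label).
Phrase = Tuple[int, int, str]


def phrase_prf(gold: Set[Phrase], predicted: Set[Phrase], beta: float = 1.0):
    """Return (precision, recall, F_beta) over exact-match phrases."""
    true_pos = len(gold & predicted)  # span and label must both match
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


def relative_error_reduction(old_score: float, new_score: float) -> float:
    """Fraction of the old error (1 - score) removed by the new score."""
    old_error = 1.0 - old_score
    new_error = 1.0 - new_score
    return (old_error - new_error) / old_error


# Toy data (invented for illustration, not taken from the challenge corpus).
gold = {(0, 1, "PATIENT"), (5, 5, "DATE"), (9, 10, "HOSPITAL"), (12, 12, "DOCTOR")}
predicted = {(0, 1, "PATIENT"), (5, 5, "DATE"), (9, 10, "DOCTOR")}
precision, recall, f1 = phrase_prf(gold, predicted)
```

Applied to the two phrase F-measures quoted above, `relative_error_reduction(0.9664, 0.9736)` gives roughly 0.21. The 36% figure in the Results refers to a token-level error term, a different measure that cannot be recomputed from the phrase-level scores alone.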
This article has been cited by other articles:
M. Bloomrosen and D. Detmer. Advancing the Framework: Use of Health Data--A Report of a Working Conference of the American Medical Informatics Association. J. Am. Med. Inform. Assoc., November 1, 2008; 15(6): 715 - 722.
F. J. Friedlin and C. J. McDonald. A Software Tool for Removing Patient Identifying Information from Clinical Documents. J. Am. Med. Inform. Assoc., September 1, 2008; 15(5): 601 - 610.
O. Uzuner, Y. Luo, and P. Szolovits. Evaluating the State-of-the-Art in Automatic De-identification. J. Am. Med. Inform. Assoc., September 1, 2007; 14(5): 550 - 563.