
First published November 23, 2004 as JAMIA PrePrint; doi:10.1197/jamia.M1641
Journal of the American Medical Informatics Association 2005;12(2):207-216
© 2005 American Medical Informatics Association


A more recent version of this article appeared on March 1, 2005

Submitted on June 17, 2004
Accepted on November 17, 2004

Text Categorization Models for High Quality Article Retrieval in Internal Medicine

Yindalon Aphinyanaphongs, MS¹*, Ioannis Tsamardinos, PhD¹, Alexander Statnikov, MS¹, Douglas Hardin, PhD², and Constantin F. Aliferis, MD, PhD¹

Affiliation of the authors: ¹ Department of Biomedical Informatics, Vanderbilt University, Nashville, TN; ² Department of Mathematics, Vanderbilt University, Nashville, TN

* To whom correspondence should be addressed.

Objective Finding the best scientific evidence that applies to a patient problem is becoming exceedingly difficult due to the exponential growth of medical publications. The objective of this study was to apply machine learning techniques to automatically identify high quality, content-specific articles for one time period in internal medicine and to compare their performance with that of the Boolean-based PubMed clinical query filters of Haynes et al.

Design The selection criteria of the ACP Journal Club for articles in internal medicine were the basis for identifying high quality articles in the areas of etiology, prognosis, diagnosis, and treatment. Naive Bayes, a specialized AdaBoost algorithm, and linear and polynomial support vector machines were applied to identify these articles.

Measurements The machine learning models were compared in each category to each other and to the clinical query filters using area under the receiver operating characteristic curve, 11-point average recall-precision, and a sensitivity/specificity match method.
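
As an illustration of the three measurements named above, the following is a minimal sketch in plain Python, not the authors' implementation. The function names, the toy scores, and the toy labels are all hypothetical. The matching routine assumes one plausible reading of a "sensitivity/specificity match": lower the model's threshold until its sensitivity reaches that of a reference filter, then report the specificity at that point. The abstract does not spell out the exact procedure.

```python
from typing import List, Tuple


def roc_auc(scores: List[float], labels: List[int]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _ranked(scores: List[float], labels: List[int]) -> List[Tuple[float, int]]:
    # Highest score first; tied scores keep their input order.
    return sorted(zip(scores, labels), key=lambda pair: -pair[0])


def eleven_point_average_precision(scores: List[float], labels: List[int]) -> float:
    """Mean interpolated precision at recall levels 0.0, 0.1, ..., 1.0."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    points = []  # (recall, precision) after each retrieved document
    true_pos = 0
    for k, (_, y) in enumerate(_ranked(scores, labels), start=1):
        true_pos += y
        points.append((true_pos / total_pos, true_pos / k))
    interpolated = []
    for i in range(11):
        level = i / 10
        # Interpolated precision: best precision at any recall >= level.
        at_or_above = [p for r, p in points if r >= level - 1e-12]
        interpolated.append(max(at_or_above, default=0.0))
    return sum(interpolated) / 11


def specificity_at_matched_sensitivity(
    scores: List[float], labels: List[int], target_sensitivity: float
) -> Tuple[float, float]:
    """Walk down the ranking until sensitivity >= target (assumed matching rule)."""
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    true_pos = false_pos = 0
    for _, y in _ranked(scores, labels):
        true_pos += y
        false_pos += 1 - y
        if true_pos / total_pos >= target_sensitivity:
            break
    return true_pos / total_pos, 1 - false_pos / total_neg


# Hypothetical toy data: model scores and gold-standard labels (1 = high quality).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 0, 1, 1, 0, 0]
print(roc_auc(toy_scores, toy_labels))
print(eleven_point_average_precision(toy_scores, toy_labels))
print(specificity_at_matched_sensitivity(toy_scores, toy_labels, 0.6))
```

The first two measures score the whole ranking and need no threshold, whereas the matching step fixes a single operating point so that a ranking model can be compared with a Boolean filter, which returns only a yes/no decision per article.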

Results In most categories, the data-induced models have sensitivity, specificity, and precision better than or comparable to those of the clinical query filters. The polynomial support vector machine models perform best among all learning methods in ranking the articles, as evaluated by area under the receiver operating characteristic curve and 11-point average recall-precision.

Conclusions This research shows that machine learning methods can automatically build models for retrieving high quality, content-specific articles in internal medicine for a given time period, using inclusion or citation by the ACP Journal Club as a gold standard, and that these models perform better than the currently used PubMed clinical query filters.





