
First published October 12, 2005 as JAMIA PrePrint; doi:10.1197/jamia.M1909
Journal of the American Medical Informatics Association 2006;13(1):96-105
© 2006 American Medical Informatics Association


A more recent version of this article appeared on January 1, 2006

Submitted on July 12, 2005
Accepted on September 16, 2005

Using citation data to improve retrieval from MEDLINE

Elmer V. Bernstam MD, MSE1*, Jorge R. Herskovic MD, MS1, Yindalon Aphinyanaphongs MS2, Constantin F. Aliferis MD, PhD2, Madurai G. Sriram1, and William R. Hersh MD3

Affiliation of the authors: 1 School of Health Information Sciences, The University of Texas Health Science Center at Houston, Houston, TX; 2 Department of Biomedical Informatics, Vanderbilt University, Nashville, TN; 3 Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR

* To whom correspondence should be addressed.

Objective To determine whether algorithms developed for the World Wide Web can be applied to the biomedical literature in order to identify articles that are important as well as relevant.

Design and Measurements A direct comparison of eight algorithms: simple PubMed queries, clinical queries (sensitive and specific versions), vector cosine comparison, citation count, journal impact factor, PageRank, and machine learning based on polynomial support vector machines. The objective was to prioritize important articles, where an article was considered important if it was included in a pre-existing bibliography of important literature in surgical oncology.
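As an illustration of the two citation-based rankers named above, the following is a minimal sketch that computes simple citation counts and PageRank over a small citation graph. It is not the authors' implementation: the toy graph, the function names, the damping factor of 0.85, the fixed iteration count, and the handling of articles that cite nothing are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy citation graph: article id -> list of article ids it cites.
TOY_CITATIONS = {
    "A": ["C"],
    "B": ["C"],
    "C": ["D"],
    "D": [],
    "E": ["C", "D"],
}


def citation_counts(citations):
    """Count how many times each article is cited (its in-degree)."""
    counts = defaultdict(int)
    for citing, cited_list in citations.items():
        counts[citing] += 0  # ensure articles that are never cited still appear
        for cited in cited_list:
            counts[cited] += 1
    return dict(counts)


def pagerank(citations, damping=0.85, iterations=50):
    """Power-iteration PageRank; damping and iterations are assumed values."""
    nodes = set(citations)
    for cited_list in citations.values():
        nodes.update(cited_list)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every article receives the same baseline ("teleport") share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            out = citations.get(node, [])
            if out:
                # Split this article's rank evenly among the articles it cites.
                share = damping * rank[node] / len(out)
                for cited in out:
                    new_rank[cited] += share
            else:
                # An article with no references spreads its rank over all articles.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


def rank_by(scores):
    """Order article ids by descending score, breaking ties by id."""
    return sorted(scores, key=lambda article: (-scores[article], article))


if __name__ == "__main__":
    print(rank_by(citation_counts(TOY_CITATIONS)))
    print(rank_by(pagerank(TOY_CITATIONS)))
```

Citation count looks only at how many articles cite a given article, while PageRank also weighs how highly ranked the citing articles are, so the two orderings can differ even on a small graph.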

Results Citation-based algorithms were more effective than non-citation-based algorithms at identifying important articles. The most effective strategies were simple citation count and PageRank, which on average identified over six important articles in the first 100 results, compared to 0.85 for the best non-citation-based algorithm (p < 0.001). We saw similar differences between citation-based and non-citation-based algorithms at 10, 20, 50, 200, 500, and 1000 results (p < 0.001). Citation lag affects the performance of PageRank more than that of simple citation count. However, in spite of citation lag, citation-based algorithms remain more effective than non-citation-based algorithms.
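The measure reported above, the number of important articles among the first k results, can be sketched as follows. This is a hedged illustration rather than the study's evaluation code: the function names and the idea of averaging over a list of per-query rankings are assumptions for this example, and the article ids in any call are placeholders.

```python
def important_in_top_k(ranked_ids, important_ids, k):
    """Count how many of the first k ranked articles are in the important set."""
    important = set(important_ids)
    return sum(1 for article in ranked_ids[:k] if article in important)


def mean_important_in_top_k(rankings, important_ids, k):
    """Average the top-k count over several ranked result lists (e.g. one per query)."""
    if not rankings:
        return 0.0
    total = sum(important_in_top_k(r, important_ids, k) for r in rankings)
    return total / len(rankings)
```

Calling `mean_important_in_top_k(rankings, bibliography_ids, 100)` for each algorithm's rankings would give a per-algorithm figure comparable in kind to the averages quoted above; changing `k` corresponds to the other cutoffs (10, 20, 50, and so on).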

Conclusion Algorithms that have proven successful on the World Wide Web can be applied to biomedical information retrieval. Citation-based algorithms can help identify important articles within large sets of relevant results. Further studies are needed to determine whether citation-based algorithms can effectively meet actual user information needs.







