Submitted on June 16, 2004
Accepted on October 20, 2004
Affiliation of the authors: 1 Section of Medical Informatics, Stanford University, Stanford, CA; 2 Department of Genetics, Stanford Medical Center, Stanford, CA; 3 Section of Medical Informatics, Stanford University, Stanford, CA; Department of Genetics, Stanford Medical Center, Stanford, CA
* To whom correspondence should be addressed.
Objective Biomedical databases summarize current scientific knowledge, but building them generally requires years of laborious curation, much of it spent identifying pertinent articles and data in the voluminous biomedical literature. Manually extracting useful information from such large volumes of text is difficult, and automated text analysis tools are becoming essential to assist in these curation activities. Our goal was to develop an automated method to identify articles among Medline citations that contain pharmacogenetics data pertaining to gene-drug relationships.
Design We built and evaluated several candidate statistical models that characterize pharmacogenetics articles in terms of word usage and the profile of Medical Subject Headings (MeSH) used in those articles. The best-performing model was used to scan the entire Medline article database (11 million articles) to identify candidate pharmacogenetics articles.
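To make the idea of classifying articles by word usage and MeSH profile concrete, the following is a minimal sketch only. The abstract does not say which statistical models were evaluated, so the naive Bayes log-odds scorer, the function names (`features`, `train`, `log_odds`), and the four-article toy training set are all assumptions introduced for illustration.

```python
import math
from collections import Counter

def features(text, mesh_terms):
    """Combine lowercase word tokens and MeSH headings as tagged features."""
    words = ["w:" + w for w in text.lower().split()]
    mesh = ["m:" + m.lower() for m in mesh_terms]
    return words + mesh

def train(labeled_articles):
    """Count feature occurrences per class (True = pharmacogenetics article)."""
    counts = {True: Counter(), False: Counter()}
    n_docs = {True: 0, False: 0}
    for text, mesh_terms, label in labeled_articles:
        counts[label].update(features(text, mesh_terms))
        n_docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, n_docs, vocab

def log_odds(model, text, mesh_terms):
    """Naive Bayes log-odds with add-one smoothing; > 0 favors the positive class."""
    counts, n_docs, vocab = model
    totals = {label: sum(c.values()) for label, c in counts.items()}
    score = math.log(n_docs[True] / n_docs[False])
    for f in features(text, mesh_terms):
        if f not in vocab:
            continue  # ignore features never seen in training
        p_pos = (counts[True][f] + 1) / (totals[True] + len(vocab))
        p_neg = (counts[False][f] + 1) / (totals[False] + len(vocab))
        score += math.log(p_pos / p_neg)
    return score

# Hypothetical toy training set, not data from the study.
TOY_TRAINING = [
    ("cyp2d6 polymorphism alters codeine response",
     ["Pharmacogenetics", "Polymorphism, Genetic"], True),
    ("genotype predicts warfarin dose",
     ["Pharmacogenetics", "Anticoagulants"], True),
    ("surgical repair of knee ligament", ["Orthopedic Procedures"], False),
    ("hospital staffing survey results", ["Health Services"], False),
]

model = train(TOY_TRAINING)
candidate = log_odds(model, "cyp2d6 genotype and drug response", ["Pharmacogenetics"])
unrelated = log_odds(model, "knee surgery survey", ["Orthopedic Procedures"])
print(candidate > 0, unrelated < 0)
```

In a scan of a large citation database, each article would be scored this way and those above a chosen threshold kept as candidates. The threshold and the feature weighting are design choices that the abstract does not specify.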
Results A pharmacologist reviewed a sample of the articles identified by the Medline scan to assess the precision of the method. Our approach identified 4,892 pharmacogenetics articles in the literature, with 92% precision. The automated method acquired these articles in a fraction of the time that accumulating them manually would be expected to take. We have built a Web resource (http://pharmdemo.stanford.edu/pharmdb/main.spy) to provide access to our results.
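The precision figure above can be read as the fraction of reviewed articles that the expert judged relevant. The sketch below shows that arithmetic. The abstract does not give the size of the reviewed sample, so the counts of 92 relevant and 8 irrelevant articles are hypothetical, chosen only to reproduce the reported 92%. The helper names are also illustrative.

```python
def precision(true_positives, false_positives):
    """Fraction of reviewed, flagged articles judged relevant by the expert."""
    reviewed = true_positives + false_positives
    if reviewed == 0:
        raise ValueError("no reviewed articles")
    return true_positives / reviewed

def estimated_relevant(total_flagged, sample_precision):
    """Extrapolate the sample precision to all flagged articles."""
    return round(total_flagged * sample_precision)

# Hypothetical review sample (the actual sample size is not stated in the abstract).
sample_precision = precision(true_positives=92, false_positives=8)

# 4,892 flagged articles is the figure reported in the abstract.
expected_hits = estimated_relevant(4892, sample_precision)
print(sample_precision, expected_hits)
```

Under these assumptions, roughly 4,500 of the 4,892 flagged articles would be expected to be genuine pharmacogenetics articles.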
Conclusion A statistical classification approach can screen the primary literature for pharmacogenetics articles with high precision. Such methods may help curators acquire pertinent literature when building biomedical databases.