Review Paper
Affiliation of the authors: Columbia University, New York, New York.
Correspondence and reprint requests: George Hripcsak, MD, MS, Department of Medical Informatics, Columbia University, 622 West 168th Street, VC5, New York, NY 10032; e-mail: <hripcsak{at}columbia.edu>.
Medical informatics systems are often designed to perform at the level of human experts. Evaluation of the performance of these systems is often constrained by the lack of reference standards, either because the appropriate response is not known or because no simple appropriate response exists. Even when performance can be assessed, it is not always clear whether the performance is sufficient or reasonable. These challenges can be addressed if an evaluator enlists the help of clinical domain experts. 1) The experts can carry out the same tasks as the system, and then their responses can be combined to generate a reference standard. 2) The experts can judge the appropriateness of system output directly. 3) The experts can serve as comparison subjects against which the system can be compared. These are separate roles that have different implications for study design, metrics, and issues of reliability and validity. Diagrams help delineate the roles of experts in complex study designs.
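As an illustration of the first role, the sketch below combines several experts' responses into a reference standard and scores a system against it. The abstract does not say how responses should be combined; majority voting, the binary labels, and the function names `majority_reference` and `sensitivity_specificity` are assumptions made only for this example.

```python
from collections import Counter


def majority_reference(expert_labels):
    """Combine expert labels into one reference label per case by majority vote.

    expert_labels holds one inner list per case, with one label per expert.
    A tie yields None so the case can be excluded or adjudicated separately.
    (Majority voting is an assumed combination rule, not one from the source.)
    """
    reference = []
    for labels in expert_labels:
        counts = Counter(labels).most_common()
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            reference.append(None)  # tie between the two most frequent labels
        else:
            reference.append(counts[0][0])
    return reference


def sensitivity_specificity(system, reference):
    """Score binary system output against the reference, skipping tied cases."""
    tp = fn = tn = fp = 0
    for sys_label, ref_label in zip(system, reference):
        if ref_label is None:
            continue
        if ref_label and sys_label:
            tp += 1
        elif ref_label and not sys_label:
            fn += 1
        elif not ref_label and not sys_label:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity
```

With three experts and four cases, for example, `majority_reference` would turn the votes into a single label per case, and `sensitivity_specificity` would then compare the system's output with those labels. The same experts should not also serve as comparison subjects against a reference standard they helped build, since the abstract treats these as separate roles.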