
First published August 21, 2007 as JAMIA PrePrint; doi:10.1197/jamia.M2080
Journal of the American Medical Informatics Association 2007;14(6):788-797
© 2007 American Medical Informatics Association


A more recent version of this article appeared on November 1, 2007

Submitted on February 10, 2006
Accepted on July 26, 2007

Topological analysis of large-scale biomedical terminology structures

Michael E. Bales, MPH (1), Yves A. Lussier, MD (2), and Stephen B. Johnson, PhD (1)*

Affiliations of the authors: (1) Department of Biomedical Informatics, Columbia University, New York, NY; (2) Department of Medicine, University of Chicago, Chicago, IL

* To whom correspondence should be addressed.

Objective: To characterize global structural features of large-scale biomedical terminologies using emerging statistical approaches.

Design: Given the rapid growth of terminologies, this research was designed to address scalability. We selected 16 terminologies covering a variety of domains from the UMLS Metathesaurus, a collection of terminological systems. Each was modeled as a network in which nodes were atomic concepts and links were relationships asserted by the source vocabulary. For comparison, we created three random networks of equivalent size and density for each terminology.
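The modeling step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the concept identifiers, the function names (`build_network`, `count_edges`, `random_network_like`), and the choice of a uniform random graph with a fixed number of edges are all assumptions made here, since the abstract does not state how the random networks were generated.

```python
import random
from collections import defaultdict


def build_network(relationships):
    """Build an undirected adjacency map from (concept_a, concept_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in relationships:
        if a == b:
            continue  # ignore self-relationships
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def count_edges(adjacency):
    """Each undirected link is stored twice, once per endpoint."""
    return sum(len(neighbors) for neighbors in adjacency.values()) // 2


def random_network_like(adjacency, seed=None):
    """Random graph with the same node count and edge count (assumed model)."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    target = count_edges(adjacency)
    random_adjacency = {node: set() for node in nodes}
    placed = 0
    while placed < target:
        a, b = rng.sample(nodes, 2)  # two distinct nodes
        if b not in random_adjacency[a]:
            random_adjacency[a].add(b)
            random_adjacency[b].add(a)
            placed += 1
    return random_adjacency


# Hypothetical concept identifiers and relationships, for illustration only.
relationships = [
    ("C1", "C2"), ("C1", "C3"), ("C2", "C3"),
    ("C3", "C4"), ("C4", "C5"), ("C5", "C6"),
]
terminology = build_network(relationships)
# Three random comparison networks of equal size and density.
baselines = [random_network_like(terminology, seed=s) for s in range(3)]
```

Matching both the node count and the link count keeps the density identical, so any difference in clustering or path length between a terminology and its baselines reflects structure rather than size.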

Measurements: Average node degree, node degree distribution, clustering coefficient, and average path length.
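For readers unfamiliar with these measures, the sketch below computes each one on a small undirected graph using only the Python standard library. The definitions are the common textbook ones (mean local clustering, mean shortest-path length over reachable pairs); the abstract does not specify the exact variants the authors used, so treat these as assumptions. The four-node graph is hypothetical.

```python
from collections import Counter, deque


def average_degree(adj):
    """Mean number of links per node."""
    return sum(len(neighbors) for neighbors in adj.values()) / len(adj)


def degree_distribution(adj):
    """Map each degree value to the number of nodes with that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())


def clustering_coefficient(adj):
    """Mean local clustering; nodes with fewer than 2 neighbors count as 0."""
    total = 0.0
    for neighbors in adj.values():
        k = len(neighbors)
        if k < 2:
            continue
        # Every link among the neighbors is seen from both ends, so halve it.
        links = sum(1 for u in neighbors for v in adj[u] if v in neighbors) / 2
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over reachable ordered pairs, using BFS."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical graph: a triangle A-B-C with one extra node D attached to C.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(average_degree(graph))          # 2.0
print(degree_distribution(graph))
print(clustering_coefficient(graph))  # about 0.583
print(average_path_length(graph))     # about 1.333
```

The all-pairs BFS runs in time proportional to nodes times links, which is adequate for a sketch; terminologies with hundreds of thousands of concepts would likely need sampling or a dedicated graph library.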

Results: Eight of 16 terminologies exhibited the small-world characteristics of a short average path length and strong local clustering. An overlapping subset of nine exhibited a power law distribution in node degrees, indicative of a scale-free architecture. We attribute these features to specific design constraints. Constraints on node connectivity, common in more synthetic classification systems, localize the effects of changes and deletions. In contrast, small-world and scale-free features, common in comprehensive medical terminologies, promote flexible navigation and less restrictive, organic-like growth.
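A power law in node degrees means the number of nodes with degree k falls off roughly as k raised to a negative exponent, which appears as a straight line on log-log axes. The sketch below estimates that slope with an ordinary least-squares fit. This is a crude diagnostic chosen here for brevity, not necessarily the fitting procedure the authors used, and a roughly linear log-log plot is consistent with, but does not prove, a scale-free architecture. The function name `loglog_slope` and the degree counts are hypothetical.

```python
import math


def loglog_slope(degree_counts):
    """Least-squares slope of log(count) against log(degree)."""
    points = [
        (math.log(k), math.log(c))
        for k, c in degree_counts.items()
        if k > 0 and c > 0  # log is undefined at zero
    ]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two distinct positive degrees")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Hypothetical degree counts that follow count = 1024 * k**-2 exactly.
counts = {1: 1024, 2: 256, 4: 64, 8: 16}
slope = loglog_slope(counts)  # close to -2, so the exponent is about 2
```

By contrast, a random network of the same size and density has degrees concentrated around the mean, so its log-log plot bends sharply downward instead of following a line.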

Conclusion: Although often thought of as synthetic, grid-like structures, some controlled terminologies are structurally indistinguishable from natural language networks. This paradoxical result suggests that terminology structure is shaped not only by formal logic-based semantics, but also by rules analogous to those that govern social networks and biological systems. Graph theoretic modeling shows early promise as a framework for describing terminology structure. Deeper understanding of these techniques may inform the development of scalable terminologies and ontologies.






