Metadata mapping and reuse in caBIG.

Isaac Kunz, Ming Chin Lin, Lewis Frey

Research output: Contribution to journal (Article)

16 Citations (Scopus)

Abstract

BACKGROUND: This paper proposes that interoperability across biomedical databases can be improved by using a repository of Common Data Elements (CDEs), UML model class-attributes, and simple lexical algorithms to facilitate the building of domain models. This is examined in the context of an existing system, the National Cancer Institute (NCI)'s cancer Biomedical Informatics Grid (caBIG). The goal is to demonstrate the deployment of open source tools that can be used to effectively map models and enable the reuse of existing information objects and CDEs in the development of new models for translational research applications. This effort is intended to help developers reuse appropriate CDEs to enable interoperability of their systems when developing within the caBIG framework or other frameworks that use metadata repositories.

RESULTS: The Dice (di-grams) and Dynamic algorithms are compared, and both have similar performance in matching UML model class-attributes to CDE class object-property pairs. With the algorithms used, the baselines for automatically finding the matches are reasonable for the data models examined. This suggests that automatic mapping of UML models and CDEs is feasible within the caBIG framework and potentially within any framework that uses a metadata repository.

CONCLUSION: This work opens up the possibility of using mapping algorithms to reduce the cost and time required to map local data models to a reference data model such as those used within caBIG. This effort contributes to facilitating the development of interoperable systems within caBIG as well as other metadata frameworks. Such efforts are critical to address the need to develop systems to handle enormous amounts of diverse data that can be leveraged from new biomedical methodologies.
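
The lexical matching described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "Dice (di-grams)" means the Dice coefficient over character di-grams and that "Dynamic" refers to a dynamic-programming edit distance; the abstract confirms neither detail. The function names (`bigrams`, `dice`, `edit_similarity`, `best_match`) and the sample attribute and CDE labels are hypothetical.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def bigrams(text: str) -> list:
    """Character di-grams of a lower-cased, whitespace-normalized string."""
    s = " ".join(text.lower().split())
    return [s[i:i + 2] for i in range(len(s) - 1)]


def dice(a: str, b: str) -> float:
    """Dice coefficient on di-gram multisets: 2 * overlap / (size_a + size_b)."""
    x, y = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        return 0.0  # neither string is long enough to form a di-gram
    overlap = sum((x & y).values())  # multiset intersection
    return 2.0 * overlap / total


def edit_similarity(a: str, b: str) -> float:
    """Assumed stand-in for the 'Dynamic' algorithm: 1 - edit distance / longest length."""
    a, b = a.lower(), b.lower()
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution or match
        prev = curr
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - prev[-1] / longest


def best_match(attribute: str, candidates: Iterable[str],
               score: Callable[[str, str], float] = dice) -> Tuple[str, float]:
    """Return the candidate label with the highest similarity to the attribute."""
    return max(((c, score(attribute, c)) for c in candidates), key=lambda p: p[1])


if __name__ == "__main__":
    # Hypothetical UML class-attribute and CDE object-property labels.
    cde_labels = ["Patient Birth Date", "Specimen Collection Date"]
    print(best_match("Patient.birthDate", cde_labels, dice))
    print(best_match("Patient.birthDate", cde_labels, edit_similarity))
```

Both scores fall in the range 0 to 1, so a single threshold could be applied to either one when deciding whether a candidate CDE is worth presenting to a developer for review. Any such threshold would need tuning against the data models at hand.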

Original language: English
Journal: BMC Bioinformatics
Volume: 10 Suppl 2
Publication status: Published - 2009
Externally published: Yes

Fingerprint

Informatics
Metadata
Reuse
Cancer
Grid
Data structures
Neoplasms
Repository
Data Model
Interoperability
Translational Medical Research
National Cancer Institute (U.S.)
Attribute
Dice
Model
Dynamic Algorithms
Domain Model
Reference Model
Common Data Elements
Databases

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications

Cite this

Kunz, I., Lin, M. C., & Frey, L. (2009). Metadata mapping and reuse in caBIG. BMC Bioinformatics, 10 Suppl 2.

Metadata mapping and reuse in caBIG. / Kunz, Isaac; Lin, Ming Chin; Frey, Lewis.

In: BMC Bioinformatics, Vol. 10 Suppl 2, 2009.

Kunz, I, Lin, MC & Frey, L 2009, 'Metadata mapping and reuse in caBIG.', BMC Bioinformatics, vol. 10 Suppl 2.
Kunz, Isaac ; Lin, Ming Chin ; Frey, Lewis. / Metadata mapping and reuse in caBIG. In: BMC Bioinformatics. 2009 ; Vol. 10 Suppl 2.
@article{2cf09e36fcd7475c94f1602a67b62f6a,
title = "Metadata mapping and reuse in caBIG.",
abstract = "BACKGROUND: This paper proposes that interoperability across biomedical databases can be improved by using a repository of Common Data Elements (CDEs), UML model class-attributes, and simple lexical algorithms to facilitate the building of domain models. This is examined in the context of an existing system, the National Cancer Institute (NCI)'s cancer Biomedical Informatics Grid (caBIG). The goal is to demonstrate the deployment of open source tools that can be used to effectively map models and enable the reuse of existing information objects and CDEs in the development of new models for translational research applications. This effort is intended to help developers reuse appropriate CDEs to enable interoperability of their systems when developing within the caBIG framework or other frameworks that use metadata repositories. RESULTS: The Dice (di-grams) and Dynamic algorithms are compared, and both have similar performance in matching UML model class-attributes to CDE class object-property pairs. With the algorithms used, the baselines for automatically finding the matches are reasonable for the data models examined. This suggests that automatic mapping of UML models and CDEs is feasible within the caBIG framework and potentially within any framework that uses a metadata repository. CONCLUSION: This work opens up the possibility of using mapping algorithms to reduce the cost and time required to map local data models to a reference data model such as those used within caBIG. This effort contributes to facilitating the development of interoperable systems within caBIG as well as other metadata frameworks. Such efforts are critical to address the need to develop systems to handle enormous amounts of diverse data that can be leveraged from new biomedical methodologies.",
author = "Isaac Kunz and Lin, {Ming Chin} and Lewis Frey",
year = "2009",
language = "English",
volume = "10 Suppl 2",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Metadata mapping and reuse in caBIG.

AU - Kunz, Isaac

AU - Lin, Ming Chin

AU - Frey, Lewis

PY - 2009

Y1 - 2009

N2 - BACKGROUND: This paper proposes that interoperability across biomedical databases can be improved by using a repository of Common Data Elements (CDEs), UML model class-attributes, and simple lexical algorithms to facilitate the building of domain models. This is examined in the context of an existing system, the National Cancer Institute (NCI)'s cancer Biomedical Informatics Grid (caBIG). The goal is to demonstrate the deployment of open source tools that can be used to effectively map models and enable the reuse of existing information objects and CDEs in the development of new models for translational research applications. This effort is intended to help developers reuse appropriate CDEs to enable interoperability of their systems when developing within the caBIG framework or other frameworks that use metadata repositories. RESULTS: The Dice (di-grams) and Dynamic algorithms are compared, and both have similar performance in matching UML model class-attributes to CDE class object-property pairs. With the algorithms used, the baselines for automatically finding the matches are reasonable for the data models examined. This suggests that automatic mapping of UML models and CDEs is feasible within the caBIG framework and potentially within any framework that uses a metadata repository. CONCLUSION: This work opens up the possibility of using mapping algorithms to reduce the cost and time required to map local data models to a reference data model such as those used within caBIG. This effort contributes to facilitating the development of interoperable systems within caBIG as well as other metadata frameworks. Such efforts are critical to address the need to develop systems to handle enormous amounts of diverse data that can be leveraged from new biomedical methodologies.

UR - http://www.scopus.com/inward/record.url?scp=63049129906&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=63049129906&partnerID=8YFLogxK

M3 - Article

VL - 10 Suppl 2

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

ER -