Ontology-based data integration involves the use of one or more ontologies to effectively combine data or information from multiple heterogeneous sources. It is one of the multiple data integration approaches and may be classified as Global-As-View (GAV). The effectiveness of ontology‐based data integration is closely tied to the consistency and expressivity of the ontology used in the integration process.
Data from multiple sources are characterized by multiple types of heterogeneity. The following hierarchy is often used:
Syntactic heterogeneity: is a result of differences in representation format of data
Schematic or structural heterogeneity: the native model or structure to store data differ in data sources leading to structural heterogeneity. Schematic heterogeneity that particularly appears in structured databases is also an aspect of structural heterogeneity.
Semantic heterogeneity: differences in interpretation of the 'meaning' of data are source of semantic heterogeneity
System heterogeneity: use of different operating system, hardware platforms lead to system heterogeneity
Ontologies, as formal models of representation with explicitly defined concepts and named relationships linking them, are used to address the issue of semantic heterogeneity in data sources. In domains like bioinformatics and biomedicine, the rapid development, adoption and public availability of ontologies has made it possible for the data integration community to leverage them for semantic integration of data and information.
Ontologies enable the unambiguous identification of entities in heterogeneous information systems and assertion of applicable named relationships that connect these entities together. Specifically, ontologies play the following roles:
Content Explication The ontology enables accurate interpretation of data from multiple sources through the explicit definition of terms and relationships in the ontology.
Query Model In some systems like SIMS, the query is formulated using the ontology as a global query schema.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Enterprise information integration (EII) is the ability to support an unified view of data and information for an entire organization. In a data virtualization application of EII, a process of information integration, using data abstraction to provide a unified interface (known as uniform data access) for viewing all the data within an organization, and a single set of structures and naming conventions (known as uniform information representation) to represent this data; the goal of EII is to get a large set of heterogeneous data sources to appear to a user or system as a single, homogeneous data source.