Summary
In computing and data management, data mapping is the process of creating data element mappings between two distinct data models. Data mapping is used as a first step for a wide variety of data integration tasks, including:
- Data transformation or data mediation between a data source and a destination
- Identification of data relationships as part of data lineage analysis
- Discovery of hidden sensitive data, such as the last four digits of a social security number embedded in another user ID, as part of a data masking or de-identification project
- Consolidation of multiple databases into a single database, identifying redundant columns of data for consolidation or elimination

For example, a company that would like to transmit and receive purchases and invoices with other companies might use data mapping to create data maps from its own data to standardized ANSI ASC X12 messages for items such as purchase orders and invoices. X12 standards are generic Electronic Data Interchange (EDI) standards designed to allow a company to exchange data with any other company, regardless of industry. The standards are maintained by the Accredited Standards Committee X12 (ASC X12), with the American National Standards Institute (ANSI) accredited to set standards for EDI; the X12 standards are therefore often called ANSI ASC X12 standards.

The W3C introduced R2RML as a standard for mapping data in a relational database to data expressed in terms of the Resource Description Framework (RDF). In the future, tools based on semantic web languages such as RDF, the Web Ontology Language (OWL), and standardized metadata registries may make data mapping a more automatic process; this would be accelerated if each application published its metadata. Fully automated data mapping is a very difficult problem (see semantic translation).

Data mappings can be done in a variety of ways: using procedural code, creating XSLT transforms, or using graphical mapping tools that automatically generate executable transformation programs.
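As a concrete illustration of the procedural approach, the sketch below maps one record between two hypothetical schemas in Python. The field names and conversions are invented for illustration; they stand in for a real map, such as one targeting an X12 850 purchase order, and are not part of the X12 standard.

```python
# A minimal sketch of procedural data mapping. Source and target field
# names are hypothetical; a real mapping to an X12 purchase order would
# follow the segment layout defined by ASC X12.

# Declarative map: each target field names the source field it comes from,
# plus a conversion function applied to the raw value.
FIELD_MAP = {
    "po_number":   ("OrderID",   str),
    "po_date":     ("OrderDate", str),
    "buyer_name":  ("Customer",  str.strip),
    "total_cents": ("TotalUSD",  lambda v: int(round(float(v) * 100))),
}

def map_record(source: dict) -> dict:
    """Apply the field map to one source record, producing a target record."""
    target = {}
    for target_field, (source_field, convert) in FIELD_MAP.items():
        if source_field in source:  # tolerate missing optional fields
            target[target_field] = convert(source[source_field])
    return target

source_row = {"OrderID": "4711", "OrderDate": "2024-01-15",
              "Customer": " ACME Corp ", "TotalUSD": "199.99"}
print(map_record(source_row))
# {'po_number': '4711', 'po_date': '2024-01-15',
#  'buyer_name': 'ACME Corp', 'total_cents': 19999}
```

Keeping the map itself declarative, as a data structure rather than ad hoc assignments, is broadly similar to what graphical mapping tools produce when they generate executable transformation programs.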
Related courses (13)
CS-489: Experience design
As we move towards a design economy, the success of new products, systems and services depends increasingly on the excellence of personal experience. This course introduces students to the notion and p ...
CS-401: Applied data analysis
This course teaches the basic techniques, methodologies, and practical skills required to draw meaningful insights from a variety of data, with the help of the most acclaimed software tools in the dat ...
FIN-525: Financial big data
The course introduces modern methods to acquire, clean, and analyze large quantities of financial data efficiently. The second part expands on how to apply these techniques and robust statistics to fi ...
Related lectures (49)
Data Interactions and Analysis
Covers an assignment on data wrangling and analysis using Python's pandas library on real-world datasets.
Data Science: Python for Engineers - Part II
Explores data wrangling, numerical data processing, and scientific visualization using Python for engineers.
Introduction to Data Stream Processing
Covers the basics of data stream processing, including tools such as Apache Storm and Kafka, key concepts such as event time and window operations, and the challenges of stream processing.
Related publications (64)

Data Transformation in the Processing of Neuronal Signals: A Powerful Tool to Illuminate Informative Contents

Mohammad Ali Shaeri

Neuroscientists seek efficient solutions for deciphering the sophisticated unknowns of the brain. Effective development of complicated brain-related tools is the focal point of research in neuroscience and neurotechnology. Thanks to today's technological a ...
2023

E-Scan: Consuming Contextual Data with Model Plugins

Anastasia Ailamaki, Viktor Sanca

Extracting value and insights from increasingly heterogeneous data sources involves multiple systems combining and consuming the data. With multi-modal and context-rich data such as strings, text, videos, or images, the problem of standardizing the data mo ...
2023

Columnar Storage Optimization and Caching for Data Lakes

Haoqiong Bian

As a unified data repository, a data lake plays a vital role in enterprise data management and analysis. It composes raw files into tables that are processed in-situ by various computation engines and applications. Therefore, the read performance of the ...
2022
Related concepts (5)
Consolidation (computing)
In computing, consolidation is the coherent grouping of data. It generally concerns data that is logically organized or linked together. More specifically, for spreadsheets, it means combining several tables drawn from different sheets (sheets being components of a spreadsheet file) or even from different workbooks. Data consolidation consists of gathering several similar pieces of data in order to obtain a report that is easier to consult than the raw information held on the server, with as little loss of information as possible.
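As a minimal sketch of this idea, the snippet below consolidates two similar tables into a single summary report using pandas (assumed to be installed); the region names and columns are hypothetical stand-ins for tables kept on separate spreadsheet sheets.

```python
# A minimal consolidation sketch with pandas; table names and columns
# are hypothetical.
import pandas as pd

north = pd.DataFrame({"product": ["A", "B"], "sales": [100, 150]})
south = pd.DataFrame({"product": ["A", "B"], "sales": [80, 120]})

# Stack the similar tables, then aggregate into one easier-to-read report.
combined = pd.concat([north, south], ignore_index=True)
report = combined.groupby("product", as_index=False)["sales"].sum()
print(report)
#   product  sales
# 0       A    180
# 1       B    270
```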
Data transformation (computing)
In computing, data transformation is the process of converting data from one format or structure into another format or structure. It is a fundamental aspect of most data integration and data management tasks such as data wrangling, data warehousing, data integration and application integration. Data transformation can be simple or complex based on the required changes to the data between the source (initial) data and the target (final) data. Data transformation is typically performed via a mixture of manual and automated steps.
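A minimal sketch of such a transformation, using only Python's standard library: flat CSV-style rows are reshaped into a nested structure grouped by customer (the column names are hypothetical), changing both the format (CSV to JSON) and the structure (flat to nested).

```python
# A minimal data transformation sketch: flat CSV rows -> nested JSON.
# The column names below are hypothetical.
import csv
import io
import json

raw = "customer,item,qty\nacme,bolt,4\nacme,nut,9\nglobex,bolt,2\n"

# Group line items under each customer (a structure change, not just a
# format change).
orders = {}
for row in csv.DictReader(io.StringIO(raw)):
    orders.setdefault(row["customer"], []).append(
        {"item": row["item"], "qty": int(row["qty"])}
    )

print(json.dumps(orders, indent=2))
# {"acme": [{"item": "bolt", "qty": 4}, {"item": "nut", "qty": 9}],
#  "globex": [{"item": "bolt", "qty": 2}]}
```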
Semantic heterogeneity
Semantic heterogeneity arises when database schemas or datasets for the same domain are developed by independent parties, resulting in differences in the meaning and interpretation of data values. Beyond structured data, the problem of semantic heterogeneity is compounded by the flexibility of semi-structured data and the various tagging methods applied to documents or unstructured data. Semantic heterogeneity is one of the more important sources of differences in heterogeneous datasets.