Data architecture consist of models, policies, rules, and standards that govern which data is collected and how it is stored, arranged, integrated, and put to use in data systems and in organizations. Data is usually one of several architecture domains that form the pillars of an enterprise architecture or solution architecture.
A data architecture aims to set data standards for all its data systems as a vision or a model of the eventual interactions between those data systems. Data integration, for example, should be dependent upon data architecture standards since data integration requires data interactions between two or more data systems. A data architecture, in part, describes the data structures used by a business and its computer applications software. Data architectures address data in storage, data in use, and data in motion; descriptions of data stores, data groups, and data items; and mappings of those data artifacts to data qualities, applications, locations, etc.
Essential to realizing the target state, data architecture describes how data is processed, stored, and used in an information system. It provides criteria for data processing operations to make it possible to design data flows and also control the flow of data in the system.
The data architect is typically responsible for defining the target state, aligning during development and then following up to ensure enhancements are done in the spirit of the original blueprint.
During the definition of the target state, the data architecture breaks a subject down to the atomic level and then builds it back up to the desired form. The data architect breaks the subject down by going through three traditional architectural stages:
Conceptual - represents all business entities.
Logical - represents the logic of how entities are related.
Physical - the realization of the data mechanisms for a specific type of functionality.
The "data" column of the Zachman Framework for enterprise architecture –
In this second, broader sense, data architecture includes a complete analysis of the relationships among an organization's functions, available technologies, and data types.