A dimension is a structure that categorizes facts and measures in order to enable users to answer business questions. Commonly used dimensions are people, products, place and time. (Note: People and time sometimes are not modeled as dimensions.) In a data warehouse, dimensions provide structured labeling information to otherwise unordered numeric measures. The dimension is a data set composed of individual, non-overlapping data elements. The primary functions of dimensions are threefold: to provide filtering, grouping and labelling. These functions are often described as "slice and dice". A common data warehouse example involves sales as the measure, with customer and product as dimensions. In each sale a customer buys a product. The data can be sliced by removing all customers except for a group under study, and then diced by grouping by product. A dimensional data element is similar to a categorical variable in statistics. Typically dimensions in a data warehouse are organized internally into one or more hierarchies. "Date" is a common dimension, with several possible hierarchies: "Days (are grouped into) Months (which are grouped into) Years", "Days (are grouped into) Weeks (which are grouped into) Years" "Days (are grouped into) Months (which are grouped into) Quarters (which are grouped into) Years" etc. A slowly changing dimension is a set of data attributes that change slowly over a period of time rather than changing regularly e.g. address or name. These attributes can change over a period of time and that will get combined as a slowly changing dimension. These dimension can be classified in types: Type 0 (Retain original): Attributes never change. No history. Type 1 (Overwrite): Old values are overwritten with new values for attribute. No history. Type 2 (Add new row): A new row is created with either a start date / end date or a version for a new value. This creates history. Type 3 (Add new attribute): A new column is created for a new value. History is limited to the number of columns designated for storing historical data.
Christoph Koch, Peter Lindner, Zhekai Jiang, Sachin Basil John