In computer science, a rough set, first described by Polish computer scientist Zdzisław I. Pawlak, is a formal approximation of a crisp set (i.e., conventional set) in terms of a pair of sets which give the lower and the upper approximation of the original set. In the standard version of rough set theory (Pawlak 1991), the lower- and upper-approximation sets are crisp sets, but in other variations, the approximating sets may be fuzzy sets.
The following section contains an overview of the basic framework of rough set theory, as originally proposed by Zdzisław I. Pawlak, along with some of the key definitions. More formal properties and boundaries of rough sets can be found in Pawlak (1991) and cited references. The initial and basic theory of rough sets is sometimes referred to as "Pawlak Rough Sets" or "classical rough sets", as a means to distinguish from more recent extensions and generalizations.
Let be an information system (attribute–value system), where is a non-empty, finite set of objects (the universe) and is a non-empty, finite set of attributes such that for every . is the set of values that attribute may take. The information table assigns a value from to each attribute and object in the universe .
With any there is an associated equivalence relation :
The relation is called a -indiscernibility relation. The partition of is a family of all equivalence classes of and is denoted by (or ).
If , then and are indiscernible (or indistinguishable) by attributes from .
The equivalence classes of the -indiscernibility relation are denoted .
For example, consider the following information table:
{| class="wikitable" style="text-align:center; width:30%" border="1"
|+ Sample Information System
! Object !! !! !! !! !!
|-
!
| 1 || 2 || 0 || 1 || 1
|-
!
| 1 || 2 || 0 || 1 || 1
|-
!
| 2 || 0 || 0 || 1 || 0
|-
!
| 0 || 0 || 1 || 2 || 1
|-
!
| 2 || 1 || 0 || 2 || 1
|-
!
| 0 || 0 || 1 || 2 || 2
|-
!
| 2 || 0 || 0 || 1 || 0
|-
!
| 0 || 1 || 2 || 2 || 1
|-
!
| 2 || 1 || 0 || 2 || 2
|-
!
| 2 || 0 || 0 || 1 || 0
|}
When the full set of attributes is considered, we see that we have the following seven equivalence classes:
Thus, the two objects within the first equivalence class, , cannot be distinguished from each other based on the available attributes, and the three objects within the second equivalence class, , cannot be distinguished from one another based on the available attributes.