Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
In computer science, a disjoint-set data structure, also called a union–find data structure or merge–find set, is a data structure that stores a collection of disjoint (non-overlapping) sets. Equivalently, it stores a partition of a set into disjoint subsets. It provides operations for adding new sets, merging sets (replacing them by their union), and finding a representative member of a set. The last operation makes it possible to find out efficiently if any two elements are in the same or different sets. While there are several ways of implementing disjoint-set data structures, in practice they are often identified with a particular implementation called a disjoint-set forest. This is a specialized type of forest which performs unions and finds in near-constant amortized time. To perform a sequence of m addition, union, or find operations on a disjoint-set forest with n nodes requires total time O(mα(n)), where α(n) is the extremely slow-growing inverse Ackermann function. Disjoint-set forests do not guarantee this performance on a per-operation basis. Individual union and find operations can take longer than a constant times α(n) time, but each operation causes the disjoint-set forest to adjust itself so that successive operations are faster. Disjoint-set forests are both asymptotically optimal and practically efficient. Disjoint-set data structures play a key role in Kruskal's algorithm for finding the minimum spanning tree of a graph. The importance of minimum spanning trees means that disjoint-set data structures underlie a wide variety of algorithms. In addition, disjoint-set data structures also have applications to symbolic computation, as well as in compilers, especially for register allocation problems. Disjoint-set forests were first described by Bernard A. Galler and Michael J. Fischer in 1964. In 1973, their time complexity was bounded to , the iterated logarithm of , by Hopcroft and Ullman. In 1975, Robert Tarjan was the first to prove the (inverse Ackermann function) upper bound on the algorithm's time complexity,.
Touradj Ebrahimi, Evgeniy Upenik, Michela Testolina
Pascal Fua, Doruk Oner, Mateusz Kozinski, Leonardo Citraro