A vector clock is a data structure used for determining the partial ordering of events in a distributed system and detecting causality violations. Just as in Lamport timestamps, inter-process messages contain the state of the sending process's logical clock. A vector clock of a system of N processes is an array/vector of N logical clocks, one clock per process; a local "largest possible values" copy of the global clock-array is kept in each process.
Denote as the vector clock maintained by process , the clock updates proceed as follows:
Initially all clocks are zero.
Each time a process experiences an internal event, it increments its own logical clock in the vector by one. For instance, upon an event at process , it updates .
Each time a process sends a message, it increments its own logical clock in the vector by one (as in the bullet above, but not twice for the same event) then it pairs the message with a copy of its own vector and finally sends the pair.
Each time a process receives a message-vector clock pair, it increments its own logical clock in the vector by one and updates each element in its vector by taking the maximum of the value in its own vector clock and the value in the vector in the received pair (for every element). For example, if process receives a message from , it first increments its own logical clock in the vector by one and then updates its entire vector by setting .
Lamport originated the idea of logical Lamport clocks in 1978. However, the logical clocks in that paper were scalars, not vectors. The generalization to vector time was developed several times, apparently independently, by different authors in the early 1980's. At least 6 papers contain the concept.
The papers canonically cited in reference to vector clocks are Colin Fidge’s and Friedemann Mattern’s 1988 works,
as they (independently) established the name "vector clock" and the mathematical properties of vector clocks.
Vector clocks allow for the partial causal ordering of events.
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Computing is nowadays distributed over several machines, in a local IP-like network, a cloud or a P2P network. Failures are common and computations need to proceed despite partial failures of machin
A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another. Distributed computing is a field of computer science that studies distributed systems. The components of a distributed system interact with one another in order to achieve a common goal. Three significant challenges of distributed systems are: maintaining concurrency of components, overcoming the lack of a global clock, and managing the independent failure of components.
Most existing distributed systems use logical clocks to order events in the implementation of various consistency models. Although logical clocks are straightforward to implement and maintain, they may affect the scalability, availability, and latency of t ...
EPFL2014
In modern distributed systems, failures are the norm rather than the exception. In many cases, these failures are not benign. Settings such as the Internet might incur malicious (also called Byzantine or arbitrary) behavior and asynchrony. As a result, and ...
EPFL2008
, , ,
GentleRain is a new causally consistent geo-replicated data store that provides throughput comparable to eventual consistency and superior to current implementations of causal consistency. GentleRain uses a periodic aggregation protocol to deter- mine whet ...