The Lamport timestamp algorithm is a simple logical clock algorithm used to determine the order of events in a distributed computer system. As different nodes or processes will typically not be perfectly synchronized, this algorithm is used to provide a partial ordering of events with minimal overhead, and conceptually provide a starting point for the more advanced vector clock method. The algorithm is named after its creator, Leslie Lamport.
Distributed algorithms such as resource synchronization often depend on some method of ordering events to function. For example, consider a system with two processes and a disk. The processes send messages to each other, and also send messages to the disk requesting access. The disk grants access in the order the messages were received. Suppose process $P_1$ sends a message to the disk requesting write access, and then sends a read instruction message to process $P_2$. Process $P_2$ receives the message, and as a result sends its own read request message to the disk. If there is a timing delay causing the disk to receive both messages at the same time, it can determine which message happened-before the other: $a$ happens-before $b$ if one can get from $a$ to $b$ by a sequence of moves of two types: moving forward while remaining in the same process, and following a message from its sending to its reception. A logical clock algorithm provides a mechanism to determine facts about the order of such events. Note that if two events happen in different processes that do not exchange messages directly or indirectly via third-party processes, then we say that the two events are concurrent, that is, nothing can be said about the ordering of the two events.
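The two kinds of moves in the happened-before definition can be read as edges of a directed graph over events, so that one event happens-before another exactly when the second is reachable from the first. The following Python sketch illustrates this for the disk example; the function name `happened_before` and the event labels are hypothetical names chosen for illustration, not part of any published algorithm.

```python
from collections import defaultdict


def happened_before(edges, a, b):
    """Return True if event b is reachable from event a via one or more edges."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    stack, seen = list(graph[a]), set()
    while stack:
        event = stack.pop()
        if event == b:
            return True
        if event not in seen:
            seen.add(event)
            stack.extend(graph[event])
    return False


# Hypothetical labels for the events in the disk example.
edges = [
    # Move of type 1: forward in time within the first process.
    ("p1_send_write_to_disk", "p1_send_read_instr"),
    # Move of type 2: a message followed from its sending to its reception.
    ("p1_send_read_instr", "p2_recv_read_instr"),
    # Move of type 1: forward in time within the second process.
    ("p2_recv_read_instr", "p2_send_read_to_disk"),
]

print(happened_before(edges, "p1_send_write_to_disk", "p2_send_read_to_disk"))  # True
print(happened_before(edges, "p2_send_read_to_disk", "p1_send_write_to_disk"))  # False
```

Under these assumptions the write request happens-before the read request, which is the fact the disk needs in order to serve them in a causally sensible order.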
Lamport invented a simple mechanism by which the happened-before ordering can be captured numerically. A Lamport logical clock is a numerical software counter value maintained in each process.
Conceptually, this logical clock can be thought of as a clock that only has meaning in relation to messages moving between processes. When a process receives a message, it re-synchronizes its logical clock with that sender. The above-mentioned vector clock is a generalization of the idea into the context of an arbitrary number of parallel, independent processes.
Algorithm
The algorithm follows some simple rules:
- A process increments its counter before each local event (e.g., message sending event);
- When a process sends a message, it includes its counter value with the message, after incrementing it as described in the first rule;
- On receiving a message, the counter of the recipient is updated, if necessary, to the greater of its current counter and the timestamp in the received message. The counter is then incremented by 1 before the message is considered received.
In pseudocode, the algorithm for sending is:
    # event is known
    time = time + 1;
    # event happens
    send(message, time);
The algorithm for receiving a message is:
    (message, timestamp) = receive();
    time = max(timestamp, time) + 1;
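The send and receive rules can be combined into a small runnable sketch. The Python class below is one possible rendering, not a reference implementation; the class name `LamportClock` and its methods `tick`, `send` and `receive` are assumptions made for illustration, and messages are modeled as plain `(payload, timestamp)` tuples rather than real network traffic.

```python
class LamportClock:
    """Minimal per-process logical clock (illustrative sketch)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Increment the counter before each local event.
        self.time += 1
        return self.time

    def send(self, payload):
        # Sending is a local event: tick first, then attach the counter.
        return (payload, self.tick())

    def receive(self, message):
        # Take the greater of the local counter and the message timestamp,
        # then add 1 before the message counts as received.
        payload, timestamp = message
        self.time = max(timestamp, self.time) + 1
        return payload


p1, p2 = LamportClock(), LamportClock()
for _ in range(3):
    p2.tick()               # three local events on p2; its counter is now 3
msg = p1.send("hello")      # p1 counter becomes 1; message carries timestamp 1
p2.receive(msg)             # p2 counter becomes max(1, 3) + 1 = 4
reply = p2.send("ack")      # p2 counter becomes 5; message carries timestamp 5
p1.receive(reply)           # p1 counter becomes max(5, 1) + 1 = 6
print(p1.time, p2.time)     # 6 5
```

The last step shows the re-synchronization effect: the counter of `p1` jumps from 1 to 6 because the reply carries a larger timestamp than its own.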
Considerations
For every two different events $a$ and $b$ occurring in the same process, with $C(x)$ denoting the timestamp of an event $x$, it is necessary that $C(a)$ never equals $C(b)$.
Therefore it is necessary that:
- The logical clock be set so that there is a minimum of one clock "tick" (increment of the counter) between events $a$ and $b$;
- In a multi-process or multi-threaded environment, it might be necessary to attach the process ID (PID) or any other unique ID to the timestamp so that it is possible to differentiate between events $a$ and $b$ which may occur simultaneously in different processes.
Causal ordering
For any two events, $a$ and $b$, if there’s any way that $a$ could have influenced $b$, then the Lamport timestamp of $a$ will be less than the Lamport timestamp of $b$. It’s also possible to have two events where we can’t say which came first; when that happens, it means that they couldn’t have affected each other. If $a$ and $b$ can’t have any effect on each other, then it doesn’t matter which one came first.
Implications
A Lamport clock may be used to create a partial ordering of events between processes. Given a logical clock following these rules, the following relation is true: if $a \to b$ then $C(a) < C(b)$, where $\to$ means happened-before.
This relation only goes one way, and is called the clock consistency condition: if one event comes before another, then that event's logical clock comes before the other's. The strong clock consistency condition, which is two way (if $C(a) < C(b)$ then $a \to b$), can be obtained by other techniques such as vector clocks. Using only a simple Lamport clock, only a partial causal ordering can be inferred from the clock.
However, via the contrapositive, it's true that $C(a) \nless C(b)$ implies $a \nrightarrow b$. So, for example, if $C(a) \geq C(b)$ then $a$ cannot have happened-before $b$.
Another way of putting this is that $C(a) < C(b)$ means that $a$ may have happened-before $b$, or be incomparable with $b$ in the happened-before ordering, but $a$ did not happen after $b$.
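A small worked scenario may help show why the clock consistency condition only goes one way. The Python sketch below assumes two processes whose only interaction is a single message; the helper names `local_event` and `receive_event` and the event names are hypothetical, and the happened-before facts are stated in comments rather than computed.

```python
def local_event(clock):
    # Increment the counter before a local event; return the new value.
    return clock + 1


def receive_event(clock, msg_timestamp):
    # Take the greater of the two values, then increment by 1.
    return max(clock, msg_timestamp) + 1


p1 = p2 = 0                          # counters of the two processes

p1 = c_a = local_event(p1)           # a: first process sends a message (timestamp 1)
p2 = c_x = local_event(p2)           # x: local event on the second process
p2 = c_y = local_event(p2)           # y: another local event on the second process
p2 = c_r = receive_event(p2, c_a)    # r: second process receives the message from a

print(c_a, c_x, c_y, c_r)            # 1 1 2 3

# a happened-before r (the message links them), and the timestamps agree.
print(c_a < c_r)                     # True
# a and y are concurrent (no chain of moves connects them), yet the
# timestamp of a is still smaller, so a smaller timestamp alone does
# not establish happened-before.
print(c_a < c_y)                     # True
```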
Nevertheless, Lamport timestamps can be used to create a total ordering of events in a distributed system by using some arbitrary mechanism to break ties (e.g., the ID of the process). The caveat is that this ordering is artificial and cannot be depended on to imply a causal relationship.
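As an illustration of tie-breaking, the sketch below sorts a hypothetical event log by the pair (timestamp, process ID). The log entries, the process names and the choice of lexicographic order on process IDs are all assumptions made for the example; any fixed total ordering of the processes would serve.

```python
# Each entry is (Lamport timestamp, process ID, description); the data is made up.
events = [
    (2, "P2", "write y"),
    (1, "P2", "write x"),
    (1, "P1", "send m"),
    (3, "P2", "receive m"),
]

# Order by timestamp first, then break ties with the process ID.
total_order = sorted(events, key=lambda e: (e[0], e[1]))

for timestamp, pid, label in total_order:
    print(timestamp, pid, label)
# 1 P1 send m
# 1 P2 write x
# 2 P2 write y
# 3 P2 receive m
```

Here "send m" is placed before "write x" only because "P1" sorts before "P2"; the two events share a timestamp and the resulting order says nothing about causality between them.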
Lamport's logical clock in distributed systems
In a distributed system, it is not possible in practice to synchronize time across entities (typically thought of as processes) within the system; hence, the entities can use the concept of a logical clock based on the events through which they communicate.
If two entities do not exchange any messages, then they probably do not need to share a common clock; events occurring on those entities are termed as concurrent events.
Among the processes on the same local machine we can order the events based on the local clock of the system.
When two entities communicate by message passing, then the send event is said to happen-before the receive event, and the logical order can be established among the events.
A distributed system is said to have partial order if we can have a partial order relationship among the events in the system. If 'totality', i.e., causal relationship among all events in the system, can be established, then the system is said to have total order.
A single entity cannot have two events occur simultaneously. If the system has total order we can determine the order among all events in the system. If the system has partial order between processes, which is the type of order Lamport's logical clock provides, then we can only tell the ordering between entities that interact. Lamport addressed ordering two events with the same timestamp (or counter): "To break ties, we use any arbitrary total ordering of the processes." Thus two timestamps or counters may be the same within a distributed system, but in applying the logical clocks algorithm events that occur will always maintain at least a strict partial ordering.
Lamport clocks lead to a situation where all events in a distributed system are totally ordered. That is, if $a \to b$, then we can say $a$ actually happened before $b$.
Note that with Lamport’s clocks, nothing can be said about the actual time of $a$ and $b$. If the logical clock says $a \to b$, that does not mean in reality that $a$ actually happened before $b$ in terms of real time.
Lamport clocks show non-causality, but do not capture all causality. Knowing $a \to c$ and $b \to c$ shows $c$ did not cause $a$ or $b$ but we cannot say which initiated $c$.
This kind of information can be important when trying to replay events in a distributed system (such as when trying to recover after a crash). If one node goes down, and we know the causal relationships between messages, then we can replay those messages and respect the causal relationship to get that node back up to the state it needs to be in.
Alternatives to potential causality
The happened-before relation captures potential causality, not true causality. In 2011-12, Munindar Singh proposed a declarative, multiagent approach based on true causality called information protocols. An information protocol specifies the constraints on communications between the agents that constitute a distributed system. However, instead of specifying message ordering (e.g., via a state machine, a common way of representing protocols in computing), an information protocol specifies the information dependencies between the communications that agents (the protocol's endpoints) may send. An agent may send a communication in a local state (its communication history) only if the communication and the state together satisfy the relevant information dependencies. For example, an information protocol for an e-commerce application may specify that to send a Quote with parameters ID (a uniquifier), item, and price, Seller must already know the ID and item from its state but can generate whatever price it wants. A remarkable thing about information protocols is that although emissions are constrained, receptions are not. Specifically, agents may receive communications in any order whatsoever -- receptions simply bring information and there is no point delaying them. This means that information protocols can be enacted over unordered communication services such as UDP.
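The Quote example can be sketched as a simple check on an agent's local state. The Python below is only an illustration of the dependency idea described above, not the syntax or semantics of any actual protocol language; the function `may_send`, the dictionary layout and the sample values are all hypothetical, and the sketch models only the requirement that needed parameters are already known.

```python
def may_send(known, needs):
    """Return True if every parameter the message depends on is already known."""
    return all(param in known for param in needs)


# Hypothetical local state of Seller, built from communications received so far.
seller_state = {"ID": "order-17", "item": "fig"}

# Quote depends on ID and item; price is generated freely by Seller.
quote_needs = ["ID", "item"]

if may_send(seller_state, quote_needs):
    quote = {param: seller_state[param] for param in quote_needs}
    quote["price"] = 10  # Seller may pick any price it wants
    print(quote)  # {'ID': 'order-17', 'item': 'fig', 'price': 10}

# With an empty state the dependencies are not satisfied, so Quote may not be sent.
print(may_send({}, quote_needs))  # False
```

Note that the check concerns only sending; in line with the description above, nothing in the sketch restricts the order in which communications are received.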
The bigger idea is that of application semantics, the idea of designing distributed systems based on the content of the messages, an idea implicated in the end-to-end principle. Current approaches largely ignore semantics and focus on providing application-agnostic ("syntactic") message delivery and ordering guarantees in communication services, which is where ideas like potential causality help. But if we had a suitable way of doing application semantics, then we wouldn't need such communication services. An unordered, unreliable communication service would suffice. The real value of the information protocols approach is that it provides the foundations for an application semantics approach.
References
- "Distributed Systems 3rd edition (2017)". DISTRIBUTED-SYSTEMS.NET. Retrieved 2021-03-20.
- Lamport, L. (1978). "Time, clocks, and the ordering of events in a distributed system" (PDF). Communications of the ACM. 21 (7): 558–565. doi:10.1145/359545.359563. S2CID 215822405.
- "Clocks and Synchronization — Distributed Systems alpha documentation". books.cs.luc.edu. Retrieved 2017-12-13.
- "Information-Driven Interaction-Oriented Programming: BSPL, the Blindingly Simple Protocol Language" (PDF). Retrieved 24 April 2013.