Leslie Lamport, “Time, Clocks, and the Ordering of Events in a Distributed System,” Communications of the ACM, July 1978, 21(7):558-565.
Members of the selection committee were James Anderson (chair), Cynthia Dwork, Vassos Hadzilacos, and Fred Schneider. The following is a description of the winning paper’s contributions, written by Keith Marzullo:
This paper contained two main ideas, each of which led to a line of research that has dominated distributed computing for a decade or more:
- A precise characterization of causality in distributed systems (called the clock condition) and a framework (similar to Minkowski diagrams of special relativity) for explaining and reasoning about event ordering in distributed protocols. The simplest way to implement the clock condition is with what Lamport called in this paper “logical clocks.” The logical clock abstraction had an immediate impact on the field and is the reason that this paper is cited so often. Research enabled by this contribution includes the vector and matrix clock abstractions, the notion of consistent cuts (answering the question “what is a state in a distributed system?”), stable and nonstable predicate detection, and the semantic basis for epistemic logics (such as the logics of Knowledge, Common Knowledge, and Belief that have been advocated for specifying distributed protocols). Finally, it is worth noting that this was among the very first papers to show how distributed systems are fundamentally different from other concurrent systems, and the first to show how a rigorous mathematical basis (the “happens before” relation) can be used to reason about such differences.
- The state machine approach as a generalization of n-modular redundancy. This is the contribution that has proven the most influential, both to the theory and the practice of distributed computing. The paper gives a distributed mutual exclusion protocol that ensures access to the critical section is granted in causal order. More importantly, the paper explains how this protocol is an example of a general approach (the so-called “state machine” approach) for managing replication. Topics that grew out of this approach include:
- Byzantine agreement. Such protocols ensure that all state machines get the same set of inputs in spite of failures. Much work has grown out of this problem, including fast protocols, impossibility results, failure-model hierarchies, and so on.
- Byzantine clock synchronization and ordered multicast protocols. Such protocols are used to order concurrent requests in the same way. Combined with agreement protocols, they ensure that nonfaulty deterministic state machines pass through the same sequence of states.
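The logical clock rules behind the clock condition are simple enough to sketch directly. The following is an illustrative sketch (the class and method names are mine, not the paper's): each process increments its counter on every local event and send, and on receipt advances past the timestamp carried by the message, so that if event a happens before event b then C(a) < C(b).

```python
# Illustrative sketch of Lamport logical clocks (names are mine, not
# the paper's). Each process keeps an integer counter; timestamps are
# (counter, process id) pairs so that ties break deterministically,
# extending the causal partial order to a total order.

class Process:
    def __init__(self, pid):
        self.pid = pid
        self.clock = 0

    def local_event(self):
        # Rule 1: tick on every local event.
        self.clock += 1
        return (self.clock, self.pid)

    def send(self):
        # A send is itself an event; the message carries the timestamp.
        self.clock += 1
        return self.clock

    def receive(self, msg_timestamp):
        # Rule 2: advance past both the local clock and the message's.
        self.clock = max(self.clock, msg_timestamp) + 1
        return (self.clock, self.pid)

p, q = Process(1), Process(2)
t = p.send()       # p's clock becomes 1
e = q.receive(t)   # q's clock becomes max(0, 1) + 1 = 2
assert e == (2, 2)
```

Comparing (counter, pid) pairs lexicographically yields the total ordering on which the paper's mutual exclusion protocol, and the state machine approach generally, depends.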
It is not surprising that the first papers defining both of the above problem areas were authored or co-authored by Lamport, as he was simply working through the protocols and sub-problems that spun off from the results in the original paper.
A good deal of more practical systems work also traces its roots to the state machine approach. A large number of distributed system toolkits (e.g., ISIS and successors, Transis, Relacs, Spread, Newtop) embody the state machine approach or some variant. And the designs of most critical systems (e.g., Boeing 777 control system, US, English, and French air traffic control systems, stock trading systems, etc.) are based on active replication schemes that are derived from the work in this paper.
The following is a brief historical perspective of the paper written by the author himself:
Jim Gray once told me that he had heard two different opinions of this paper: that it’s trivial and that it’s brilliant. I can’t argue with the former, and I am disinclined to argue with the latter.
The origin of this paper was a note titled “The Maintenance of Duplicate Databases” by Paul Johnson and Bob Thomas. I believe their note introduced the idea of using message timestamps in a distributed algorithm. I happen to have a solid, visceral understanding of special relativity. This enabled me to grasp immediately the essence of what they were trying to do. Special relativity teaches us that there is no invariant total ordering of events in space-time; different observers can disagree about which of two events happened first. There is only a partial order in which an event e1 precedes an event e2 iff e1 can causally affect e2. I realized that the essence of Johnson and Thomas’s algorithm was the use of timestamps to provide a total ordering of events that was consistent with the causal order. This realization may have been brilliant. Having realized it, everything else was trivial. Because Thomas and Johnson didn’t understand exactly what they were doing, they didn’t get the algorithm quite right; their algorithm permitted anomalous behavior that essentially violated causality. I quickly wrote a short note pointing this out and correcting the algorithm.
It didn’t take me long to realize that an algorithm for totally ordering events could be used to implement any distributed system. A distributed system can be described as a particular sequential state machine that is implemented with a network of processors. The ability to totally order the input requests leads immediately to an algorithm to implement an arbitrary state machine by a network of processors, and hence to implement any distributed system. So, I wrote this paper, which is about how to implement an arbitrary distributed state machine. As an illustration, I used the simplest example of a distributed system I could think of – a distributed mutual exclusion algorithm.
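The state machine idea described above can be sketched in a few lines. This is a hypothetical illustration under my own naming, not code from the paper: given a deterministic transition function, every replica that applies the same totally ordered sequence of requests necessarily reaches the same state.

```python
# Hypothetical sketch of state machine replication (names and the toy
# command set are mine). A deterministic state machine fed the same
# totally ordered input sequence ends in the same state on every replica.

def apply_commands(initial_state, commands):
    """Deterministically fold an ordered request sequence into a state."""
    state = initial_state
    for op, arg in commands:
        if op == "set":
            state = arg
        elif op == "add":
            state = state + arg
    return state

# Requests tagged with (logical timestamp, process id) sort into a single
# total order, so every replica applies them in the same sequence.
requests = {(2, 1): ("add", 5), (1, 2): ("set", 10), (2, 3): ("add", 1)}
ordered = [requests[k] for k in sorted(requests)]

replica_a = apply_commands(0, ordered)
replica_b = apply_commands(0, ordered)
assert replica_a == replica_b == 16
```

The total ordering does all the work here: once inputs are agreed and ordered, replication of any deterministic service reduces to replaying the log.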
This is my most often cited paper. Many computer scientists claim to have read it. But I don’t think I ever encountered anyone who was aware that the paper said anything about state machines. They seem to think that it is about either the causality relation on events in a distributed system, or the distributed mutual exclusion problem. People have insisted that there is nothing about state machines in the paper. I’ve even had to go back and reread it to convince myself that I really did remember what I had written.
The paper describes the synchronization of logical clocks. As something of an afterthought, I decided to see what kind of synchronization it provided for real-time clocks. So, I included a theorem about real-time synchronization. I was rather surprised by how difficult the proof turned out to be. This was an indication of what lay ahead in the work on Byzantine clock synchronization. But, that’s another story.
This page maintained by Gil Neiger.
Last modified: January 23, 2003