Butler Lampson
Microsoft
OPODIS, Bordeaux, France
12 December 2006
The distinction between lazy and eager (or strict) evaluation has been studied in programming languages since Algol 60’s call by name, as a way to avoid unnecessary work and to deal gracefully with infinite structures such as streams. It is deeply integrated in some languages, notably Haskell, and can be simulated in many languages by wrapping a lazy expression in a lambda, as the sketch below shows.
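As a concrete illustration of the lambda trick, here is a minimal Python sketch; the Thunk class and its memoizing force method are illustrative names, not from any particular library.

```python
# Simulating laziness in an eager language: wrap the expression in a
# lambda, and force (and cache) it only when the value is demanded.
class Thunk:
    def __init__(self, compute):
        self._compute = compute    # a zero-argument lambda, unevaluated
        self._done = False
        self._value = None

    def force(self):
        if not self._done:         # evaluate at most once (call by need)
            self._value = self._compute()
            self._done = True
        return self._value

expensive = Thunk(lambda: sum(range(10**7)))   # nothing computed yet
print(expensive.force())                       # evaluated here, on first use
print(expensive.force())                       # cached; no recomputation
```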
Less well studied is the role of laziness, and its opposite, speculation, in computer systems, both hardware and software. A wide range of techniques can be understood as applications of these two ideas. Laziness is the idea behind:
Redo logging for maintaining persistent state and replicated state machines: the log represents the current state, but it is evaluated only after a failure or to bring a replica online (sketched below).
Copy-on-write schemes for maintaining multiple versions of a large, slowly changing state, usually in a database or file system (sketched below).
Write buffers and writeback caches in memory and file systems, which are lazy about updating the main store (sketched below).
Relaxed memory models and eventually consistent replication schemes (both of which require weakening the spec).
Clipping regions and expose events in graphics and window systems.
Carry-save adders, which defer propagating carries until a clean result is needed (sketched below).
“Infinity” and “Not a number” results of floating-point operations, which let a computation continue and defer dealing with the exceptional condition until the result is actually examined.
Futures (in programming) and out-of-order execution (in CPUs), which launch a computation but are lazy about consuming the result (sketched below). Dataflow is a generalization.
“Formatting operators” in text editors, which apply properties such as “italic” to large regions of text by attaching a sequence of functions that compute the properties; the functions are not evaluated until the text needs to be displayed (sketched below).
Stream processing in database queries, Unix pipes, etc., which conceptually applies operators to unbounded sequences of data, but rearranges the computation when possible to apply a sequence of operators to each data item in turn (sketched below).
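A toy sketch of redo logging, reducing the persistent state to a key-value map; the RedoLog class and its replay method are hypothetical names for illustration. The point is that write is a cheap append, and the state is materialized only on recovery.

```python
# A toy redo log over a key-value state. Writes only append to the log;
# the log itself represents the current state, which is materialized
# (the log is evaluated) only after a failure, or to bring a replica
# online.
class RedoLog:
    def __init__(self):
        self.entries = []

    def write(self, key, value):
        self.entries.append((key, value))   # cheap append, no evaluation

    def replay(self):
        state = {}
        for key, value in self.entries:     # evaluate the log in order
            state[key] = value
        return state

log = RedoLog()
log.write("x", 1); log.write("y", 2); log.write("x", 3)
print(log.replay())   # {'x': 3, 'y': 2}, computed only when needed
```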
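A sketch of copy-on-write versioning, assuming a flat array of cells divided into pages of an illustrative size; a write copies only the page it touches, so old versions stay cheap to keep.

```python
# A toy copy-on-write store: versions share pages, and a write copies
# only the one page it touches. The page size is illustrative.
PAGE = 4

class Version:
    def __init__(self, pages):
        self.pages = pages                   # pages shared across versions

    def read(self, i):
        return self.pages[i // PAGE][i % PAGE]

    def write(self, i, value):
        pages = list(self.pages)             # share every page...
        page = list(pages[i // PAGE])        # ...except the one we touch
        page[i % PAGE] = value
        pages[i // PAGE] = page
        return Version(pages)                # the old version is untouched

v1 = Version([[0] * PAGE for _ in range(4)])   # 16 cells in 4 pages
v2 = v1.write(5, 99)
print(v1.read(5), v2.read(5))                  # 0 99
```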
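A sketch of a write-behind buffer; the WriteBuffer class is illustrative, and a real cache would also bound the buffer and schedule flushes. Note how repeated writes to the same key are coalesced into a single eventual update of the slow store.

```python
# A write-behind buffer: writes update a small dirty map and return at
# once; the slow main store is updated lazily, at flush time.
class WriteBuffer:
    def __init__(self, store):
        self.store = store      # the slow main store (a dict here)
        self.dirty = {}         # absorbed writes, not yet applied

    def write(self, key, value):
        self.dirty[key] = value

    def read(self, key):
        return self.dirty.get(key, self.store.get(key))

    def flush(self):            # the deferred work happens here
        self.store.update(self.dirty)
        self.dirty.clear()

buf = WriteBuffer(store={})
buf.write("a", 1); buf.write("a", 2)   # two writes, one eventual update
print(buf.read("a"))                   # 2, served from the buffer
buf.flush()                            # now the main store sees it
```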
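A carry-save accumulator in miniature, using Python's unbounded integers as bit vectors; hardware does the same with a row of full adders per addition and a single carry-propagate add at the end.

```python
# A carry-save accumulator: the running total is two words, partial sums
# and saved carries. Each add costs only bitwise operations; the
# expensive carry propagation is deferred until a clean value is needed.
class CarrySave:
    def __init__(self):
        self.sum = 0
        self.carry = 0

    def add(self, x):
        s, c = self.sum, self.carry
        self.sum = s ^ c ^ x                              # sum bits, no carries
        self.carry = ((s & c) | (s & x) | (c & x)) << 1   # save the carries

    def value(self):
        return self.sum + self.carry   # propagate carries, lazily

acc = CarrySave()
for x in [7, 9, 13, 21]:
    acc.add(x)
print(acc.value())   # 50
```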
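A future using Python's standard concurrent.futures module; slow_square is a stand-in for real work. The computation is launched eagerly, but the caller is lazy about consuming the result.

```python
# A future: launch the computation now, be lazy about consuming the result.
from concurrent.futures import ThreadPoolExecutor

def slow_square(x):     # a stand-in for real work
    return x * x

with ThreadPoolExecutor() as pool:
    fut = pool.submit(slow_square, 7)   # computation launched here
    # ... other work can overlap with the computation ...
    print(fut.result())                 # block only when the value is needed
```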
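A sketch of formatting operators as deferred functions; representing properties as a dict, and the operator names, are illustrative choices. Applying an operator to a region costs O(1) however large the region is; the functions run only for characters that are actually displayed.

```python
# Formatting operators as deferred functions: applying "italic" to a
# region just records a function; properties are computed at display time.
italic = lambda props: {**props, "italic": True}
bold   = lambda props: {**props, "bold": True}

class Text:
    def __init__(self, s):
        self.s = s
        self.ops = []      # (start, end, function), unevaluated

    def apply(self, start, end, op):
        self.ops.append((start, end, op))   # O(1), however big the region

    def display_props(self, i):             # run only at display time
        props = {}
        for start, end, op in self.ops:
            if start <= i < end:
                props = op(props)
        return props

t = Text("hello world")
t.apply(0, 11, italic)
t.apply(6, 11, bold)
print(t.display_props(3))   # {'italic': True}
print(t.display_props(8))   # {'italic': True, 'bold': True}
```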
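Lazy stream processing falls out directly from Python generators, which perform exactly this rearrangement: the pipeline is declared over an unbounded sequence, but each operator runs on one item at a time.

```python
# A pipeline over an unbounded stream: nothing runs when the pipeline is
# built; each operator is applied to one item at a time, on demand.
import itertools

def naturals():
    n = 0
    while True:     # conceptually infinite
        yield n
        n += 1

evens   = (n for n in naturals() if n % 2 == 0)   # no work done yet
squares = (n * n for n in evens)

print(list(itertools.islice(squares, 5)))   # [0, 4, 16, 36, 64]
```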
Speculation is the idea behind:
Optimistic concurrency control in databases, and more recently in transactional memory (sketched below).
Prefetching in memory and file systems (sketched below).
Branch prediction and, more generally, speculative execution in modern CPUs.
Data speculation, which works especially well when the data is cached but might be updated by a concurrent process; this is a form of optimistic concurrency control.
Exponential backoff schemes for scheduling a resource, most notably in LANs such as Wi-Fi or classical Ethernet (sketched below).
All forms of caching, which speculate that it’s worth filling up some memory with data in the hope that it will be used again (sketched below).
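A minimal optimistic-concurrency sketch over a single versioned cell; real systems validate whole read sets at commit, but the read-compute-validate-retry shape is the same (the class and method names are illustrative).

```python
# Optimistic concurrency: read a version, compute speculatively without
# holding any lock, then validate and commit; on conflict, throw the
# speculative work away and retry.
import threading

class VersionedCell:
    def __init__(self, value):
        self.value, self.version = value, 0
        self.lock = threading.Lock()

    def read(self):
        with self.lock:
            return self.value, self.version

    def compare_and_set(self, expected_version, new_value):
        with self.lock:
            if self.version != expected_version:
                return False     # someone else committed first
            self.value, self.version = new_value, self.version + 1
            return True

def increment(cell):
    while True:
        value, version = cell.read()
        new_value = value + 1                         # speculative work
        if cell.compare_and_set(version, new_value):  # validate and commit
            return
        # conflict: retry with fresh data

cell = VersionedCell(0)
threads = [threading.Thread(target=increment, args=(cell,)) for _ in range(8)]
for t in threads: t.start()
for t in threads: t.join()
print(cell.read())   # (8, 8)
```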
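A sketch of sequential read-ahead, assuming block-granularity reads; fetch_block is a hypothetical stand-in for a slow device. On each read, the next block is fetched speculatively in the background.

```python
# Sequential read-ahead: on a read of block i, speculatively fetch block
# i+1 in the background, betting that access will stay sequential.
from concurrent.futures import ThreadPoolExecutor

def fetch_block(i):                 # a stand-in for a slow device read
    return bytes([i % 256]) * 4096

class ReadAhead:
    def __init__(self):
        self.pool = ThreadPoolExecutor(max_workers=1)
        self.pending = {}           # block number -> future

    def read(self, i):
        fut = self.pending.pop(i, None)
        data = fut.result() if fut else fetch_block(i)   # hit or miss
        self.pending[i + 1] = self.pool.submit(fetch_block, i + 1)
        return data

ra = ReadAhead()
for i in range(3):
    print(len(ra.read(i)))          # blocks 1 and 2 hit the prefetch
```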
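A sketch of binary exponential backoff in the classical Ethernet style; the slot time, the cap on the exponent, and the simulated collision probability are all illustrative.

```python
# Binary exponential backoff: transmit optimistically; after the k-th
# collision, wait a random number of slots in [0, 2**k - 1] and retry.
import random, time

SLOT = 0.001        # slot time, illustrative
MAX_EXPONENT = 10   # cap, as in classical Ethernet's limit of 10

def send_with_backoff(try_send):
    attempt = 0
    while not try_send():                 # speculate: just transmit
        attempt += 1
        k = min(attempt, MAX_EXPONENT)
        slots = random.randrange(2 ** k)  # randomize to break the tie
        time.sleep(slots * SLOT)

# A channel that "collides" with probability 0.7, for demonstration.
send_with_backoff(lambda: random.random() > 0.7)
print("sent")
```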
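Finally, caching as speculation in a few lines, using Python's standard functools.lru_cache: memory is bet on the guess that the same arguments will recur.

```python
# Caching as speculation: memoization bets a bounded amount of memory on
# the guess that the same arguments will come again.
from functools import lru_cache

@lru_cache(maxsize=1024)           # keep up to 1024 recent results
def slow_lookup(key):              # a stand-in for an expensive computation
    return sum(ord(c) for c in key) % 97

print(slow_lookup("hello"))        # computed
print(slow_lookup("hello"))        # served from the cache
print(slow_lookup.cache_info())    # hits=1, misses=1
```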
In both cases it is usual to insist that the laziness or speculation is strictly a matter of scheduling: it doesn’t affect the result of a computation, but only improves performance. Sometimes, however, the spec is weakened, as in eventual consistency.
I will discuss many of these examples in detail and examine what they have in common, how they differ, and what factors govern the effectiveness of laziness and speculation in computer systems.