Understanding Network Connections

Butler Lampson

 

There are lots of protocols for establishing connections (or equivalently, doing at-most-once message delivery) across a network that can delay, reorder, duplicate and lose packets.  Most of the popular ones are based on three-way handshake, but some use clocks or extra stable storage operations to reduce the number of messages required.  It’s hard to understand why the protocols work, and there are almost no correctness proofs; even careful specifications are rare.

I will give a specification for at-most-once message delivery, an informal account of the main problems an implementation must solve and the common features that most implementations share, and outlines of proofs for three implementations.  The specifications and proofs based on Lamport’s methods for using abstraction functions to understand concurrent systems, and I will say something about how his methods can be applied to many other problems of practical interest.

Understanding Network Connections

Butler Lampson

October 30, 1995

 

This is joint work with Nancy Lynch.

The errors in this talk are mine, however.


Overview

Specify at-most-once message delivery.

Describe other features we want from an implementation

Give a framework for thinking about implementations.

Show how to prove correctness of an implementation.


The Problem

Network Connections
or
Reliable At-Most-Once Messages

 

Messages are delivered in fifo order.

A message is not delivered more than once.

A message is acked only if delivered.

A message or ack is lost only if it is being sent between crash and recovery.


Pragmatics

“Everything should be made as simple as possible, but no simpler.”
                                                                                A. Einstein

Make progress: regardless of crashes,
      if both ends stay up a waiting message is sent, and
      otherwise both parties become idle.

Idle at no cost: an idle agent
      has no state that changes for each message, and
      doesn’t send any packets.

Minimize stable storage operations — <<1 per message.

Use channels that are easy to implement:
      They may lose, duplicate, or reorder messages.


Pragmatics

Some pragmatic issues we won’t discuss:
      Retransmission policy.
      Detecting failure of an attempt to send or ack, by timing it out.


Describing a System

A system is defined by a safety and a liveness property:

Safety: nothing bad ever happens. Defined by a state machine:

A set of states. A state is a pair (external state, internal state)
A set of initial states.
A set of transitions from one state to another.

Liveness: something good eventually happens.

An action is a named set of transitions; actions partition the transitions.

For instance: put(m); get(m); crashs

A history is a possible sequence of actions, starting from an initial state.

The behavior of the system is the set of possible histories.

An external action is one in which the external state changes. Correspondingly there are external histories and behaviors.


Defining Actions

An action is:

A name, possibly with parameters: put(“red”).

A guard, a predicate on the state which must be true for this action to be a possible transition: q ≠ < > and i > 3.

An effect, changes in some of the state variables: i := i + 1.

The entire action is atomic.

Example:

get(m):      m first on q            take first from q, if q now empty
                                                and status = ? then status := true

name            guard                                effect


Specifying At-Most-Once Messages

State:   q         : sequence[M]      := < >
            status : {true, false, ?}   := true
            recs/r    : Boolean             := false

“Sender Actions”

“Receiver Actions”

Name

Guard

Effect

Name

Guard

Effect

put(m)**

 

if ~recs then
  append m to q,
  status := ?

get(m)*

~recr, m first on q

take first from q,
if q now empty
 and status = ? then
   status := true

getAck(b)*

~recs , status = b

none

 

 

 

crashs**

 

recs := true

crashr**

 

recr := true

recovers*

recs

recs := false

recoverr*

recr

recr := false

lose

recs or recr

delete any element
 of q; if it’s the last
 then status := false
or status := false

 

 

 


Histories for AMO Messages

Action

q

status

Action

q

status

Action

q

status

Initially

< >

true

 

 

 

 

 

 

put(“red”)

<“red”>

?

 

 

 

 

 

 

get(“red”)

< >

true

 

 

 

 

 

 

getAck(true)

< >

true

 

 

 

 

 

 

put(“green”)

<“green”>

?

 

 

 

 

 

 

crashr, lose

<“green”>

?

crashr, lose

<“green”>

false

crashr, lose

< >

false

 

 

?

getAck(false)

<“green”>

false

getAck(false)

< >

false

put(“blue”)

<“green”,
 “blue”>

?

put(“blue”)

<“green”,
“blue”>

?

put(“blue”)

<“blue”>

?

get(“green”)

<“blue”>

?

get(“green”)

<“blue”>

?

 

 

 

get(“blue”)

< >

true

 

 

 

 

 

 

getAck(true)

< >

true

 

 

 

 

 

 


Channels

State      sr   : multiset[P]        := {}             P = I ´ M                   rs   : multiset[P]  := {}
                                                                      or I ´ Bool                

Name

Guard

Effect

Name

Guard

Effect

sndsr(p)

 

sr := sr È {p}

sndrs(p)

 

rs := rs È {p}

rcvsr(p)

p Î sr

sr := sr – {p}

rcvrs(p)

p Î rs

rs := rs – {p}

losesr

$ p | p Î sr

sr := sr – {p}

losers

$ p | p Î rs

rs := rs – {p}


Stable Implementation

State:     modes  : {acked, send, rec}  := acked                              moder   : {idle, rcvd, ack}         := idle
              goods   : set[ I ]                      := I                                      goodr   : set[ I ]             := I
              lasts     : I                                                                           lastr     : I
              cur       : M                                                                         buf       : sequence[M]     := <>                        

Name

Guard

Effect

Name

Guard

Effect

put(m)**


sndsr(i, m)*

mode = acked
and $ i | i Î good


mode = send,
i = last, m = cur

cur := m, last := i,
mode
:= send,
take i from good

none

rcvsr(i, m)
**

 

if i Î good then 
  mode := rcvd, take i from good,
  last := i, append m to buf
else if i = last and mode = idle then mode := ack

 

 

 

get(m)*

mode = rcvd,
m first on buf

take first from buf;
if it’s now empty, mode := ack

rcvrs(i, –)
**

 

if mode = send and
i = last then
  mode := acked

sndrs(i, true)*

mode = ack,
i = last

mode := idle

getAck
(true)*

mode = acked

none

 

 

 


What Does “Implements” Mean?

Divide actions into external (marked * or **) and internal (unmarked).
      External actions change external state, internal ones don’t.

An external history is a history (sequence of actions) with all the internal actions removed.

T implements S if

every external history of T is an external history of S, and

T’s liveness property implies S’s liveness property.


Abstraction Functions

Suppose we have a function f from T’s state to S’s state such that:

f takes initial states to initial states;

f maps every transition of T to a sequence of transitions of S, perhaps empty (i.e., the identity on S);

  f(t)     S or skip      ®    f(t')  
   ­                                   ­    

 

    t             T             ®      t'   

f maps every external action of T to a sequence containing the same external action of S and no other external actions.

f maps every internal action of T to a sequence of internal actions.

Then T implements S.
Why bother? Transitions are simpler than histories.


Proof of Stable Implementation

Invariants

(1) goods Ç ( {lasts} È ids(sr) È ids(rs) ) = {}

(2) goodr Ç ids(rs) = {}

(3) goodr Ê goods

(4) ((i, m) Î sr and i Î goodr) implies m = cur

Abstraction function

q          =    <cur> if modes = send and lasts Î goodr
                  < > otherwise
            +    buf

status   =    ? if modes = send
                  true otherwise


Methodology for Proofs

Simplify the spec and the implementations.
      Save clever encodings for later.

Make a “working spec” that’s easier to handle:
      It implements the actual spec.
      It has as much non-determinism as possible.
      All the prophecy is between it and the actual spec.

actual¬     implements      working¬     implements      implemen-

 spec                                 spec                                 tation

Find the abstraction function. The rest is automatic.

Give names to important functions of your state variables.

To design an implementation, first invent the guards you need,
then figure out how to implement them.


History Variables

If you add a variable h to the state space such that

If s is an old initial state then there’s an h such that (s, h) is initial;

If (s, h) ® (s', h') then s ® s';

If s ® s' then for any h there’s an h' such that (s, h) ® (s', h')

then the new state machine has the same histories as the old one.


Predicting Non-Determinism

Suppose we add mode := acked to crashs.

Consider the sequence put(“red”), snd, crashs, put(“blue”), snd.

Now we have sr = {(1, “red”), (2, “blue”)}. We need an ordering on identifiers to order these packets and maintain fifo delivery. On rcvsr(i, m) the receiver must remove all identifiers ≤ i from goodr.

But now “red” is lost if (2, “blue”) is received first. If we use the obvious abstraction function

q = the m’s from {(i, m) Î sr È (lasts, cur)| i Î goodr} sorted by i,

this loss doesn’t happen between crashs and recovers, as allowed by the spec, but later at the rcv.

We give a working spec that makes this delay explicit.


Delayed-Decision Specification

State:   q         : sequence[(M, Flag)]       := < >             Flag = {OK, ?}
            status : ({true, false, ?}, Flag)    := true
            recs/r    : Boolean                           := false

Name

Guard

Effect

Name

Guard

Effect

put(m)**

 

if ~recs then
  add (m, OK) to q,
  status := (?, OK)

get(m)*

~recr,
(m, –) first on q

take first from q,
if q now empty
 and status = (?, f)
   status := (true, f)

getAck(b)*

~recs ,
status = (b, –)

none

 

 

 

mark

recs or recr

in some element of q or in status, set flag := ?

unmark

 

in some element of q or in status, set flag := OK

drop

true

delete some element of q with flag := ?,
     and if it’s the last, status := (false, OK)
or if status = (–, ?), status := (false, OK)

 

 

 


Prophecy Variables

If you add a variable p to the state space such that

If s is an old initial state then there’s a p such that (s, p) is initial;

If (s, p) ® (s', p') then s ® s';

If s ® s' then for any h' there exists an h such that (s, h) ® (s', h')

If s is an old state, there’s a p such that (s, p) is a new state.

then the new state machine has the same histories as the old one.

If T implements S, you can always add history and prophecy variables to T and then find an abstraction function to S.


Prophecy for Delayed-Decision

Extend Flag to include a lost component drawn from the set {OK, lost}.

The lost component prophesies whether drop will attack or not.

The abstraction function is

      qS = the first components of elements of qDP that are not lost

      statusS = the first component of statusDP if it is not lost, else false.

 


Delayed-Decision with Prophecy DP

State:   q         : sequence[(M, Flag)]       := < >       Flag = ({OK, ?},
            status : ({true, false, ?}, Flag)    := (true, OK2)                {OK, lost})
            recs/r    : Boolean                           := false

Name

Guard

Effect

Name

Guard

Effect

put(m)**

 

if ~recs then
 add (m,OK2) to q,
 status := (?, OK2)

get(m)*

~recr,
(m, –) first on q
and not lost

take first from q,
if q now empty
 and status = (?, f)
   status := (true, f)

getAck(b)*

~recs , status =
    (b, –), not lost

none

 

 

 

mark

recs or recr,
$ x Î {OK, lost}

in some element of q or in status, set flag := (?, x). If last of q is lost, set status flag (?, lost)

unmark

 

in some element of q or in status that isn’t lost,
set flag := OK

drop

true

delete some lost element of q
or if status is lost, status := (false, OK2)

 

 

 


Generic Implementation

State:     modes  : {acked, send, rec}  := acked                              moder   : {idle, rcvd, ack}         := idle
              goods   : set[ I ]                      := {}                                    goodr   : set[ I ]             := {}
              lasts     : I                                                                           lastr     : I
              cur       : M                                                                         buf       : sequence[M]     := <>                        
              used     : set[I]                        := {}                                    nacks   : sequence[I]       := <>
              ack      : Boolean                   := false

Name

Guard

Effect

Name

Guard

Effect

put(m)**

choose-id
(i, m)

sndsr(i, m)*

mode = acked

mode = needid,
i Î good, m=cur

as usual

mode := needid,
cur
:= m, good:={}

mode := send,
last := i, move i to from good to used

rcvsr(i, m)
**

 

if i Î good then 
  mode := rcvd, take all Is≤i from
  good, last := i, append m to buf
else if ilast then add i to nacks
else if i = last and mode = idle
  then mode := ack

 

 

 

get(m)*

as usual

as usual

rcvrs(i, b)
**

 

if mode = send and
i = last then ack:=b
  mode := acked

sndrs(i,T)

sndrs(i,F)

mode = ack,
i = last

i first on nacks

mode := idle

take first from nacks

getAck(b)*

mode = acked,
b = ack

none

 

 

 


Generic Magic

State:     goods   : set[ I ]                      := {}                                    goodr   : set[ I ]             := {}
              lasts     : I                                                                           lastr     : I
              useds    : set[I]                        := {}                                    issued  : set[I]              := {}

Name

Guard

Effect

Name

Guard

Effect

shrink-gs(i)

grow-gs(i)

true

i Î good,
i Ï used

take i from good

add i to good

shrink-gr(i)

grow-gr(i)

i Ï goods È
     {lasts}

i Ï issued

take i from good

add i to good, issued

 

 

 

cleanup

modercvd,
lastlasts

last := nil

recovers

rec

mode := acked,
last := nil,
ack := false

recoverr

rec

mode := idle,
last := nil, take some Is from good,
clear buf, nacks


Generic Abstraction Function

msg(id) = {m  |    (id = lasts and m=current ) or id Î ids(sr)
                        or (id = lastr and m is last on buf)}
This defines a partial function msg: ID ® M.

current-q = <(current, OK)> if     modes = send      and lasts Î goodr
                                                  or modes = needid   and goodsgoodr
                    <(current, ?)>     if     modes = needid   and not goodsgoodr
                    <>                       otherwise

inflightsr = {id |       id Î ids(sr) and id Î goodr
                          and not (id = lasts  and modes = send) },
                   sorted by id to make a sequence

inflight-m = the sequence of Ms gotten from msg of each I in inflightrs.

inflightrs = {lasts} if (ack, lasts, true) Î rs and not lasts = lastr


Generic Abstraction Function

queue

   the elements of bufr paired with OK

 

 

+ the elements of inflight-m paired with ?

 

 

+ current-q

status

        (false, OK)      if modes = rec

else  (?, f)                if current-q = <(m, f)>

        (?, OK)            if modes = send, lasts = lastr and bufempty

        (true, OK)       if modes = send, lasts = lastr and buf = empty

        (true, ?)           if modes = send and lasts Î inflightrs

        (false, OK)      if modes = send
                                   and lasts Ï (goodr È {lastr} È inflightrs)

        (ack, OK)        if modes = acked


Practical Implementations

Generic

Five-packet handshake

Liskov-Shrira-Wroclawski

goods

{id | (accept, jds, id) Î rs} if mode = needid
{}             otherwise

{times} – {lasts}

goodr

{lastr}     if moder = accept
{}            otherwise

{i |     lower < i
   and (i £ upper  or moder = rec) }

lastr

lastr         if modeaccept
nil            otherwise

lastr

 

shrink-goods

mode = needid and losers(accept, jds, id) of last copy
or receivers(accept, jds, id), ids = goods – {id}

tick(id), id = times

grow-goods

sendrs(accept, jds, id)

tick(id)

cleanup

receivesr(acked), mode = ack

cleanup

shrink-goodr

mode = accept and receivesr(acked, idr, nil)

increase-lower(id),
  ids = {i | lower < iid}

grow-goodr

mode = idle and receivesr(needid, ...),

increase-upper(id),
  ids = {i | upper < iid}


Summary

Client specification of at-most-once messages.

Working spec which maximizes non-determinism.

Generic implementation with good identifiers maintained by magic.

Two practical implementations

Five-packet handshake

Liskov-Shrira-Wrowclawski


To follow up ...

You can find these slides on research.microsoft.com/lampson, as well as a paper (item 47 in the list of publications).

 

L. Lamport, A simple approach to specifying concurrent systems, Comm. acm 32, 1, Jan. 1989.

M. Abadi and L. Lamport, The existence of refinement mappings, DEC SRC research report 29, Aug. 1988.