DISTRIBUTED SYSTEMS

Optimizing Bracha's Reliable Broadcast: Shaving Rounds off a 37-Year-Old Algorithm

Bracha’s reliable broadcast has been the go-to Byzantine broadcast primitive since 1987. Three rounds, \(O(n^2)\) messages, optimal fault tolerance at \(n > 3f\). Textbook stuff.

I’ve been trying to get a fast multi-value agreement primitive: something where all nodes propose values and agree on one (or ⊥ if they’re hopelessly split). Reliable broadcast is the natural building block, but three rounds felt like overkill for the happy path.

So I went digging. Months of searching the literature turned up plenty of options, but they all came with catches: weaker fault models, synchrony assumptions, trusted setups. Nothing that felt like a clean win. Eventually I circled back to Bracha and noticed something hiding in plain sight: by using a larger quorum for the optimistic path, we can deliver in two rounds when conditions are good. No new assumptions. Same fault tolerance. Just better intersection properties.

This post covers the optimized broadcast and the MVA extension I built on top of it.

0x0: Quick Refresher on Bracha

Byzantine reliable broadcast solves a deceptively simple problem: a designated leader wants to broadcast a value \(v\) to \(n\) parties, where up to \(f < n/3\) can be malicious (Byzantine). The protocol must guarantee that all honest parties either agree on the same value or don’t deliver anything at all, but no split-brain scenarios allowed.

Asynchronous model is the hardest setting for distributed protocols. Messages can be delayed arbitrarily by the adversary, there are no timeouts or clocks you can rely on, and you can’t tell a slow node from a dead one. The upside: protocols proven correct in this model stay safe no matter what happens with timing. The downside: you can’t guarantee termination if enough nodes are faulty (see FLP impossibility).

Quorum intersection is the key trick. If you require \(n - f\) nodes to agree on something, any two such sets overlap. With \(n > 3f\), the overlap is at least \(f + 1\) nodes, so at least one is honest. This honest node “witnesses” both sets and can prevent conflicting decisions.

Correctness of a solution boils down to three properties:

  • Agreement: Honest parties that deliver, deliver the same value
  • Validity: If the leader is honest and broadcasts \(v\), all honest parties eventually deliver \(v\)
  • Totality: If one honest party delivers, all honest parties eventually deliver

Bracha’s protocol uses three message types across three rounds:

upon start as leader with input v:
    send ⟨INIT, v⟩ to all

upon ⟨INIT, v⟩ from leader ∧ sentₑ = ⊥:
    send ⟨ECHO, v⟩ to all; sentₑ ← v

upon E(v) ≥ n−f ∧ sentᵣ = ⊥:
    send ⟨READY, v⟩ to all; sentᵣ ← v

upon R(v) ≥ f+1 ∧ sentᵣ = ⊥:              // Amplification
    send ⟨READY, v⟩ to all; sentᵣ ← v

upon R(v) ≥ n−f ∧ delivered = ⊥:
    deliver(v)

Why the READY round? ECHOs alone can’t guarantee termination. A Byzantine leader can equivocate, sending different values to different parties, fragmenting honest echoes.

Standard quorum intersection gives us \(n - 2f > f\) overlap between any two \((n-f)\)-sized sets. That’s enough honest overlap to prevent two conflicting values from both reaching \(n-f\) ECHOs—but it doesn’t help when neither value reaches threshold. A Byzantine leader can fragment honest echoes across multiple values, leaving everyone stuck. The READY round breaks the deadlock.

0x1: The Counterintuitive Shortcut

The clue came from a Tendermint forum post explaining why their protocol needs a pre-vote phase. The observation: with \(n = 3f+1\) nodes, you need pre-vote. But with \(n = 5f+1\) nodes, you can skip it entirely.

I stared at this for a while. How can adding nodes let you skip a round? Rounds exist to do something. You can’t just add redundancy and magically skip a communication step…

However, it starts to make sense once you realize the extra nodes aren’t just redundancy. They change the flavor of quorum intersection. Standard quorum intersection guarantees overlap, but that overlap might be almost entirely Byzantine. A larger quorum guarantees honest nodes in the overlap. And honest nodes don’t equivocate. What if we had a majority of honest nodes right in there?

We don’t need more nodes. We need a higher threshold on the same nodes, which gives us the same stronger intersection property.

Define three thresholds:

\[ \begin{aligned} Q &= n - f &&\text{(standard quorum)} \\ Q_o &= \lfloor n/2 \rfloor + f + 1 &&\text{(optimistic threshold)} \\ Q_a &= f + 1 &&\text{(amplification threshold)} \end{aligned} \]

The magic is in \(Q_o\):

Lemma (Strong Intersection): Any two sets of size \(Q_o\) overlap in at least \(f+1\) honest parties.

Proof: Let \(S_1, S_2\) be sets of size \(Q_o\).

\[ |S_1 \cap S_2| \geq 2Q_o - n = 2(\lfloor n/2 \rfloor + f + 1) - n \geq 2f + 1 \]

Since at most \(f\) are Byzantine, the overlap contains at least \(f + 1\) honest parties. ∎

Let’s visualize why this matters. With \(n = 10\) and \(f = 3\):

Fundamentally different from standard quorum intersection. With \(Q_o\) echoes for some value \(v\), we know that \(\geq f+1\) honest parties echoed \(v\). And honest parties echo exactly once. So no conflicting value can ever reach \(Q_o\), or even \(Q\), since the remaining honest parties number at most \((n - f) - (f + 1) = n - 2f - 1 < Q - f\).

We can deliver before the READY round completes. READYs still propagate to ensure totality.

0x2: Round-Optimized Bracha

The optimized protocol adds an “optimistic delivery” path:

upon ⟨INIT, v⟩ from leader ∧ sentₑ = ⊥:
    send ⟨ECHO, v⟩ to all; sentₑ ← v

upon E(v) ≥ Qₒ ∧ delivered = ⊥:              // Optimistic
    if sentᵣ = ⊥: send ⟨READY, v⟩ to all; sentᵣ ← v
    deliver(v); delivered ← v

upon E(v) ≥ Q ∧ sentᵣ = ⊥ ∧ delivered = ⊥:   // Q echoes -> ready
    send ⟨READY, v⟩ to all; sentᵣ ← v

upon R(v) ≥ Qₐ ∧ sentᵣ = ⊥ ∧ delivered = ⊥:  // Amplification
    send ⟨READY, v⟩ to all; sentᵣ ← v

upon R(v) ≥ Q ∧ delivered = ⊥:               // Standard delivery
    if sentᵣ = ⊥: send ⟨READY, v⟩ to all; sentᵣ ← v
    deliver(v); delivered ← v

Note on Totality: When a process delivers via the optimistic path, it also sends READY(v), ensuring the value propagates through amplification. Even if only one correct process initially sees \(Q_o\) echoes (with Byzantine help), the resulting READY triggers amplification at other processes. Amplification requires only \(Q_a\) readies (no echo-backing), allowing the value to propagate to \(Q\) readies for standard delivery. READYs propagate progressively as they arrive at slower processes.

Happy case (honest leader, healthy network): All \(n - f\) honest parties ECHO the same value. If the network is well-synchronized, all honest ECHOs arrive quickly, often reaching \(Q_o\) before the READY phase begins.

Adversarial case (Byzantine leader or delayed messages): Some value reaches \(Q\) ECHOs but not \(Q_o\). Fall back to three rounds via READYs.

Safety Proof Sketch

Three cases:

  1. Two optimistic deliveries can’t conflict: Both require \(Q_o\) ECHOs. By strong intersection, \(\geq f+1\) honest parties appear in both sets. But honest parties ECHO once. Contradiction.

  2. Optimistic delivery blocks conflicting standard delivery: If \(v\) gets \(Q_o\) ECHOs, at least \(Q_o - f \geq f + 1\) honest parties echoed \(v\). The remaining honest parties (\(\leq n - f - (f+1) = n - 2f - 1\)) plus \(f\) Byzantine can produce at most \(n - f - 1 < Q\) votes for any other value.

  3. Standard paths use Bracha’s original safety: Quorum intersection ensures agreement.

The TLA+ Spec

bracha_opt.tla (download)
  1-------------------- MODULE bracha_opt --------------------
  2(*
  3 Optimized Bracha Reliable Broadcast with Fast Path
  4 ===================================================
  5 Can Bölük, April 1, 2025.
  6
  7 Single-leader Byzantine reliable broadcast with optimistic 2-round
  8 delivery. Falls back to standard 3-round Bracha when conditions are
  9 adversarial.
 10
 11 Properties (n > 3f):
 12   - Agreement    All honest parties deliver the same value
 13   - Validity     If broadcaster is honest, all deliver its input
 14   - Totality     If one honest delivers, all honest eventually deliver
 15   - Termination  All honest parties eventually deliver
 16
 17 Synchrony: Safety holds under full asynchrony. Termination requires
 18 partial synchrony (bounded message delay after GST).
 19
 20 Fault Model: A faulty broadcaster can send different values to different
 21 recipients. Modeled via a partition function assigning each honest party
 22 to a value. Faulty non-broadcasters can send arbitrary ECHO/READY.
 23 *)
 24
 25EXTENDS Naturals, FiniteSets, TLC
 26
 27\* ============================================================================
 28\* Protocol Parameters
 29\* ============================================================================
 30CONSTANTS
 31   Honest,         \* the set of correct processes
 32   Faulty,         \* the set of Byzantine processes, may be empty
 33   V               \* the set of valid values
 34
 35\* Sentinel for not yet set
 36NULL  "∅"
 37
 38\* Arbitrary value - symmetric so choice irrelevant
 39Command  CHOOSE v  V : TRUE
 40
 41\* ============================================================================
 42\* Definitions
 43\* ============================================================================
 44P  Honest  Faulty       \* the set of all processes
 45N  Cardinality(P)             \* total number of processes
 46F  Cardinality(Faulty)        \* number of Byzantine processes
 47
 48\* Thresholds
 49Q   N - F                     \* standard quorum (2f+1 when n=3f+1)
 50Qo  N ÷ 2 + F + 1          \* optimistic quorum
 51Qa  F + 1                     \* amplification (f+1)
 52
 53ASSUME  N > 3 * F
 54        V  {}
 55        Honest  Faulty = {}
 56        IsFiniteSet(Honest)
 57        IsFiniteSet(Faulty)
 58        IsFiniteSet(V)
 59
 60\* Symmetry: combine permutations across Honest, Faulty, V
 61CombinePerm(ph, pf, pv) 
 62   [x  P  V  IF x  Honest THEN ph[x]
 63                ELSE IF x  Faulty THEN pf[x]
 64                ELSE pv[x]]
 65
 66Sym  {CombinePerm(ph, pf, pv) :
 67         ph  Permutations(Honest),
 68         pf  Permutations(Faulty),
 69         pv  Permutations(V)}
 70
 71\* Broadcaster candidates (one of each type by symmetry)
 72OneHonest  IF Honest  {} THEN {CHOOSE h  Honest : TRUE} ELSE {}
 73OneFaulty  IF Faulty  {} THEN {CHOOSE f  Faulty : TRUE} ELSE {}
 74
 75\* ============================================================================
 76\* Protocol State Variables
 77\* ============================================================================
 78VARIABLES
 79   broadcaster, \* Distinguished leader process
 80
 81   \* === Honest party broadcast state ===
 82   sent_init,           \* [P -> V  {NULL}]
 83   sent_echo,           \* [P -> V  {NULL}]
 84   sent_ready,          \* [P -> V  {NULL}]
 85
 86   \* === Faulty broadcaster equivocation ===
 87   faulty_equivoc,      \* [V -> SUBSET Honest]: partition for faulty broadcaster equivocation
 88
 89   \* === Per-recipient receive tracking (count-based) ===
 90   rcvd_init_v,         \* [Honest -> V  {NULL}]: INIT value received
 91   rcvd_echo_cnt,       \* [Honest -> [V -> 0..N]]: count of ECHOs for v received
 92   rcvd_ready_cnt,      \* [Honest -> [V -> 0..N]]: count of READYs for v received
 93
 94   delivered            \* [Honest -> V  {NULL}]: Delivered value
 95
 96send_vars  sent_init, sent_echo, sent_ready
 97recv_vars  rcvd_init_v, rcvd_echo_cnt, rcvd_ready_cnt
 98vars  broadcaster, sent_init, faulty_equivoc, sent_echo, sent_ready,
 99          rcvd_init_v, rcvd_echo_cnt, rcvd_ready_cnt, delivered
100
101\* ============================================================================
102\* Helper Operators
103\* ============================================================================
104HasInit(p)  rcvd_init_v[p]  NULL
105InitValue(p)  rcvd_init_v[p]
106
107SentEchoCount(v)  Cardinality({src  P : sent_echo[src] = v})
108SentReadyCount(v)  Cardinality({src  P : sent_ready[src] = v})
109
110EchoQuorum(p, t)  {v  V : rcvd_echo_cnt[p][v]  t}
111ReadyQuorum(p, t)  {v  V : rcvd_ready_cnt[p][v]  t}
112
113\* ============================================================================
114\* Type Invariant
115\* ============================================================================
116TypeOK 
117    broadcaster  P
118    sent_init  [P  V  {NULL}]
119    faulty_equivoc  [V  SUBSET Honest]
120    sent_echo  [P  V  {NULL}]
121    sent_ready  [P  V  {NULL}]
122    rcvd_init_v  [Honest  V  {NULL}]
123    rcvd_echo_cnt  [Honest  [V  0N]]
124    rcvd_ready_cnt  [Honest  [V  0N]]
125    delivered  [Honest  V  {NULL}]
126
127\* ============================================================================
128\* Protocol Initialization
129\* ============================================================================
130Init 
131    broadcaster  OneHonest  OneFaulty
132    sent_init = [src  P  NULL]
133    faulty_equivoc = [v  V  {}]
134    sent_echo = [src  P  NULL]
135    sent_ready = [src  P  NULL]
136    rcvd_init_v = [p  Honest  NULL]
137    rcvd_echo_cnt = [p  Honest  [v  V  0]]
138    rcvd_ready_cnt = [p  Honest  [v  V  0]]
139    delivered = [p  Honest  NULL]
140
141\* ============================================================================
142\* Message Reception
143\* ============================================================================
144
145\* Receive INIT from honest broadcaster (uniform value)
146ReceiveInitFromHonest(p) 
147    delivered[p] = NULL
148    rcvd_init_v[p] = NULL
149    broadcaster  Honest
150    sent_init[broadcaster]  NULL
151    rcvd_init_v' = [rcvd_init_v EXCEPT ![p] = sent_init[broadcaster]]
152    UNCHANGED broadcaster, sent_init, faulty_equivoc, sent_echo, sent_ready,
153                  rcvd_echo_cnt, rcvd_ready_cnt, delivered
154
155\* Receive INIT from faulty broadcaster (uses equivocation partition)
156ReceiveInitFromFaulty(p) 
157    delivered[p] = NULL
158    rcvd_init_v[p] = NULL
159    broadcaster  Faulty
160     v  V : p  faulty_equivoc[v]
161    LET v  CHOOSE v  V : p  faulty_equivoc[v]
162      IN rcvd_init_v' = [rcvd_init_v EXCEPT ![p] = v]
163    UNCHANGED broadcaster, sent_init, faulty_equivoc, sent_echo, sent_ready,
164                  rcvd_echo_cnt, rcvd_ready_cnt, delivered
165
166\* Receive one more ECHO for value v (count-based)
167ReceiveEcho(p) 
168    delivered[p] = NULL
169     v  V :
170          rcvd_echo_cnt[p][v] < SentEchoCount(v)
171          rcvd_echo_cnt' = [rcvd_echo_cnt EXCEPT ![p][v] = @ + 1]
172    UNCHANGED broadcaster, sent_init, faulty_equivoc, sent_echo, sent_ready,
173                  rcvd_init_v, rcvd_ready_cnt, delivered
174
175\* Receive one more READY for value v (count-based)
176ReceiveReady(p) 
177    delivered[p] = NULL
178     v  V :
179          rcvd_ready_cnt[p][v] < SentReadyCount(v)
180          rcvd_ready_cnt' = [rcvd_ready_cnt EXCEPT ![p][v] = @ + 1]
181    UNCHANGED broadcaster, sent_init, faulty_equivoc, sent_echo, sent_ready,
182                  rcvd_init_v, rcvd_echo_cnt, delivered
183
184Receive(p) 
185    ReceiveInitFromHonest(p)
186    ReceiveInitFromFaulty(p)
187    ReceiveEcho(p)
188    ReceiveReady(p)
189
190\* ============================================================================
191\* Faulty Party Actions
192\* ============================================================================
193
194\* Faulty broadcaster assigns honest parties to value partitions (equivocation)
195FaultyEquivocate 
196    broadcaster  Faulty
197    faulty_equivoc = [v  V  {}]
198     partition  [Honest  V] :
199         faulty_equivoc' = [v ∈ V ↦ {h ∈ Honest : partition[h] = v}]
200    UNCHANGED broadcaster, sent_init, sent_echo, sent_ready,
201                  rcvd_init_v, rcvd_echo_cnt, rcvd_ready_cnt, delivered
202
203\* Faulty party sends ECHO for arbitrary value
204FaultySendEcho 
205    f  Faulty, v  V :
206       sent_echo[f] = NULL
207       sent_echo' = [sent_echo EXCEPT ![f] = v]
208       UNCHANGED broadcaster, sent_init, faulty_equivoc, sent_ready,
209                     rcvd_init_v, rcvd_echo_cnt, rcvd_ready_cnt, delivered
210
211\* Faulty party sends READY for arbitrary value
212FaultySendReady 
213    f  Faulty, v  V :
214       sent_ready[f] = NULL
215       sent_ready' = [sent_ready EXCEPT ![f] = v]
216       UNCHANGED broadcaster, sent_init, faulty_equivoc, sent_echo,
217                     rcvd_init_v, rcvd_echo_cnt, rcvd_ready_cnt, delivered
218
219FaultyStep 
220    FaultyEquivocate
221    FaultySendEcho
222    FaultySendReady
223
224\* ============================================================================
225\* Honest Party Actions
226\* ============================================================================
227
228\* Honest broadcaster sends INIT (same value to all)
229LeaderInit(self) 
230    self = broadcaster
231    self  Honest
232    sent_init[self] = NULL
233    delivered[self] = NULL
234    sent_init' = [sent_init EXCEPT ![self] = Command]
235    UNCHANGED broadcaster, faulty_equivoc, sent_echo, sent_ready,
236                  rcvd_init_v, rcvd_echo_cnt, rcvd_ready_cnt, delivered
237
238\* Upon receiving INIT: broadcast ECHO(v)
239EchoOnInit(self) 
240    sent_echo[self] = NULL
241    delivered[self] = NULL
242    HasInit(self)
243    sent_echo' = [sent_echo EXCEPT ![self] = InitValue(self)]
244    UNCHANGED broadcaster, sent_init, faulty_equivoc, sent_ready,
245                  rcvd_init_v, rcvd_echo_cnt, rcvd_ready_cnt, delivered
246
247\* Upon receiving Q echoes for v: broadcast READY(v)
248ReadyOnEchos(self) 
249    sent_ready[self] = NULL
250    delivered[self] = NULL
251     v  EchoQuorum(self, Q) :
252         sent_ready' = [sent_ready EXCEPT ![self] = v]
253    UNCHANGED broadcaster, sent_init, faulty_equivoc, sent_echo,
254                  rcvd_init_v, rcvd_echo_cnt, rcvd_ready_cnt, delivered
255
256\* Upon receiving Qa readies for v: broadcast READY(v)
257ReadyAmplify(self) 
258    sent_ready[self] = NULL
259    delivered[self] = NULL
260     v  ReadyQuorum(self, Qa) :
261         sent_ready' = [sent_ready EXCEPT ![self] = v]
262    UNCHANGED broadcaster, sent_init, faulty_equivoc, sent_echo,
263                  rcvd_init_v, rcvd_echo_cnt, rcvd_ready_cnt, delivered
264
265\* Upon receiving Q readies for v: deliver v (and send READY if not yet sent)
266DeliverOnReady(self) 
267    delivered[self] = NULL
268     v  ReadyQuorum(self, Q) :
269          delivered' = [delivered EXCEPT ![self] = v]
270          IF sent_ready[self] = NULL
271            THEN sent_ready' = [sent_ready EXCEPT ![self] = v]
272            ELSE UNCHANGED sent_ready
273    UNCHANGED broadcaster, sent_init, faulty_equivoc, sent_echo,
274                  rcvd_init_v, rcvd_echo_cnt, rcvd_ready_cnt
275
276\* Upon receiving Qo echoes for v: deliver v (fast path)
277OptimisticDeliver(self) 
278    delivered[self] = NULL
279     v  EchoQuorum(self, Qo) :
280          delivered' = [delivered EXCEPT ![self] = v]
281          IF sent_ready[self] = NULL
282            THEN sent_ready' = [sent_ready EXCEPT ![self] = v]
283            ELSE UNCHANGED sent_ready
284    UNCHANGED broadcaster, sent_init, faulty_equivoc, sent_echo,
285                  rcvd_init_v, rcvd_echo_cnt, rcvd_ready_cnt
286
287HonestStep(self) 
288    LeaderInit(self)
289    EchoOnInit(self)
290    ReadyOnEchos(self)
291    ReadyAmplify(self)
292    DeliverOnReady(self)
293    OptimisticDeliver(self)
294
295\* ============================================================================
296\* Specification
297\* ============================================================================
298Terminated   p  Honest : delivered[p]  NULL
299
300Done  Terminated  UNCHANGED vars
301
302Next 
303     p  Honest : HonestStep(p)
304     p  Honest : Receive(p)
305    FaultyStep
306    Done
307
308\* Weak fairness for honest actions and message reception
309Fairness 
310     p  Honest : WF_vars(HonestStep(p))
311     p  Honest : WF_vars(Receive(p))
312
313Spec  Init  [Next]_vars  Fairness
314
315\* ============================================================================
316\* Safety Properties
317\* ============================================================================
318
319Agreement 
320    p, q  Honest :
321      (delivered[p]  NULL  delivered[q]  NULL)  delivered[p] = delivered[q]
322
323Validity 
324   broadcaster  Honest 
325       p  Honest : delivered[p]  NULL  delivered[p] = Command
326
327\* ============================================================================
328\* Liveness
329\* ============================================================================
330
331Termination   p  Honest : (delivered[p]  NULL)
332
333=============================================================================

Model check with n=4, f=1 (or n=7, f=2 if you have patience) and verify Agreement, Validity, and Termination.

0x3: From Broadcast to Multi-Value Agreement

The round-optimized Bracha handles the single-leader case. But what if everyone has a value to propose? This is multi-value agreement (MVA):

  • Every process starts with an input value
  • All honest processes deliver the same output
  • If a supermajority agrees on input \(v\), output must be \(v\)
  • If inputs are split, output can be \(\bot\) (explicit “no consensus”)

MVA is useful whenever you have a bunch of servers that need to agree on their own observations, extracting a supermajority opinion from conflicting proposals: leader election, view changes, detecting failures, etc.

The Challenge

Without a designated leader, equivocation isn’t the threat. Fragmentation is. If honest inputs split across multiple values, no single value might reach quorum. We need:

  1. A way to detect when no value can win
  2. A safe way to terminate with \(\bot\) in that case
  3. Guarantees that \(\bot\) only happens when genuinely necessary

Additional Thresholds

Two more:

\[ \begin{aligned} Q_s &= \lfloor (n + f) / 2 \rfloor + 1 &&\text{(supermajority threshold)} \\ Q_e &= \lfloor n/2 \rfloor + 1 &&\text{(echo-backing threshold)} \end{aligned} \]

If \(\geq Q_s\) honest processes have the same input, that value must be the output (“strong validity”).

The echo-backing threshold \(Q_e\) prevents an adversary from exploiting message scheduling to make minority values win. More on this shortly.

0x4: The Multi-Value Protocol

The protocol extends round-optimized Bracha with:

  • ECHO phase: Each process echoes its own input (not a leader’s INIT)
  • READY phase: Same as before, but can also READY \(\bot\) if echoes fragment
  • ABORT phase: New mechanism for when READYs themselves fragment

Helper definitions (where \(E(v)\) and \(R(v)\) denote counts of ECHOs and READYs received for value \(v\)):

\(\text{PluralityAmongEchoes}\) $$\text{unique } \underbrace{\arg\max_v E(v)}_{\text{most echoes}} \text{, or } \bot$$
\(\text{CanWin}(x)\) $$\underbrace{R(x)}_{\text{current}} + \underbrace{(n - \text{TotalReadies})}_{\text{outstanding}} \geq Q$$
\(\text{CanReachQuorum}\) $$\underbrace{\max_v R(v)}_{\text{best}} + \underbrace{(n - \text{TotalReadies})}_{\text{outstanding}} \geq Q$$
\(\text{CanReachEchoBacking}\) $$\underbrace{\max_v E(v)}_{\text{best}} + \underbrace{\max(0, n - \text{TotalEchos} - f)}_{\text{remaining honest}} \geq Q_e$$
// State variables (per process):
//   sentᵣ ∈ V ∪ {⊥, ∅}       - what READY was broadcast (∅ = none yet)
//   delivered ∈ V ∪ {⊥, ∅}   - delivered value (∅ = none yet)
//   sentₐ ∈ {true, false}    - whether ABORT was sent

upon start:
    send ⟨ECHO, input⟩ to all

upon E(v) ≥ Q ∧ sentᵣ = ∅:                              // Q echoes -> ready
    send ⟨READY, v⟩ to all; sentᵣ ← v

upon TotalEchoes ≥ Q ∧ (∀v: E(v) < Q)                   // R2: timeout path
     ∧ sentᵣ = ∅ ∧ delivered = ∅:
    plurality ← PluralityAmongEchoes()                  // unique max, or ⊥ if tied
    hasBacking ← plurality ≠ ⊥ ∧ E(plurality) ≥ Qₑ
    if ¬hasBacking ∧ CanReachEchoBacking():
        wait                                            // more echoes may arrive
    else if hasBacking:
        send ⟨READY, plurality⟩ to all; sentᵣ ← plurality
    else:                                               // no backing possible, demote
        send ⟨READY, ⊥⟩ to all; sentᵣ ← ⊥

upon R(v) ≥ Qₐ ∧ sentᵣ = ∅:                             // R3: amplify value
    send ⟨READY, v⟩ to all; sentᵣ ← v

upon R(⊥) ≥ Qₐ ∧ sentᵣ = ∅:                             // R4: amplify ⊥
    send ⟨READY, ⊥⟩ to all; sentᵣ ← ⊥

Why ties must wait: When PluralityAmongEchoes() returns ⊥ (a tie), the protocol waits rather than immediately readying ⊥. Additional honest echoes may break the tie. Only when CanReachEchoBacking() becomes false (all honest echoes have arrived, no value can reach \(Q_e\)) does the protocol commit to ⊥.

upon E(v) ≥ Qₒ ∧ delivered = ∅                          // D1: fast path (2 rounds)
     ∧ sentᵣ ∈ {∅, v}:
    if sentᵣ = ∅: send ⟨READY, v⟩; sentᵣ ← v
    delivered ← v

upon R(v) ≥ Q ∧ delivered = ∅                           // D2: deliver on ready quorum
     ∧ (sentᵣ ∈ {∅, v} ∨ ¬CanWin(sentᵣ)):
    if sentᵣ = ∅: send ⟨READY, v⟩; sentᵣ ← v
    delivered ← v

upon R(⊥) ≥ Q ∧ delivered = ∅                           // D3: deliver on ready-⊥ quorum
     ∧ (sentᵣ ∈ {∅, ⊥} ∨ ¬CanWin(sentᵣ)):
    if sentᵣ = ∅: send ⟨READY, ⊥⟩; sentᵣ ← ⊥
    delivered ← ⊥

// Abort mechanism (READYs fragment irrecoverably)
upon sentᵣ ≠ ∅ ∧ TotalReadies ≥ Q                       // A1: initiate abort
     ∧ (∀v: R(v) < Q) ∧ R(⊥) < Q
     ∧ ¬CanReachQuorum():
    send ⟨ABORT⟩ to all; sentₐ ← true

upon Aborts ≥ Qₐ ∧ sentᵣ ≠ ∅                            // A2: amplify abort
     ∧ sentₐ = false:
    send ⟨ABORT⟩ to all; sentₐ ← true

upon Aborts ≥ Q ∧ sentᵣ ≠ ∅ ∧ delivered = ∅             // A3: deliver on abort
     ∧ TotalReadies ≥ Q ∧ (∀v: R(v) ≤ Q − 2f − 1):
    if sentₐ = false: send ⟨ABORT⟩; sentₐ ← true
    delivered ← ⊥

Abort triggers when a process has readied, received \(Q\) total readies, but no value can reach \(Q\). ABORTs propagate via amplification, and \(Q\) ABORTs force delivery of \(\bot\).

On A3’s guard: The ∀v: R(v) ≤ Q − 2f − 1 condition prevents aborting when a value could have achieved quorum elsewhere. If some value \(v\) got \(Q\) readies globally, at least \(Q - f\) correct processes sent READY(v). A3 requires having heard from \(Q\) senders, at least \(Q - f\) correct. Pigeonhole: overlap \(\geq (Q-f) + (Q-f) - (n-f) = Q - 2f\). So we’d observe \(\geq Q - 2f\) readies for \(v\) locally, blocking the abort.

Safety Properties

Agreement: Same intersection arguments, plus: if anyone delivers \(v \in V\), they saw \(Q_o\) ECHOs or \(Q\) READYs for \(v\). For D1, \(\geq Q_e\) correct processes echoed \(v\); for D2, \(\geq Q_a\) correct processes readied \(v\). Either prevents a conflicting value from reaching delivery threshold.

Strong Validity: If \(\geq Q_s\) honest processes input \(v\), then \(v\) has a strict majority among all honest echoes. By Lemma 2 in the spec, any process receiving \(Q\) echoes either sees \(Q\) for \(v\) directly, or \(v\) is the majority. Either way, all READYs go to \(v\), and \(v\) is delivered.

Integrity: If someone delivers \(\bot\), it required \(Q\) READYs for \(\bot\) or \(Q\) ABORTs. Both require \(\geq Q_a\) honest processes to have sent \(\bot\)/ABORT. Only happens when no value achieved supermajority.

Termination (partial synchrony): Every honest process eventually receives \(Q\) total echoes (since \(|Honest| \geq Q\)). From there:

  • Some value reaches \(Q_o\): optimistic delivery
  • Some value reaches \(Q\): standard delivery
  • Echoes fragment: after timeout, demote to \(\bot\) when CanReachEchoBacking() = false
  • READYs fragment: abort mechanism fires

Synchrony Model: Safety (Agreement, Validity, Integrity) holds under full asynchrony. Only demotion-to-⊥ requires partial synchrony (correct-to-correct messages arrive within Δ after GST). Consistent with FLP.

The Echo-Backing Fix

The \(E(v) \geq Q_e\) guard on R2 wasn’t in the initial design. Model checking revealed a subtle attack that only surfaces in MVA (not single-leader broadcast). This echo-backing requirement applies only to R2 (plurality ready), not to R3 (amplification).

The Attack: Consider \(n = 7, f = 2\) with thresholds \(Q = 5\), \(Q_o = 6\), \(Q_e = 4\). Honest processes \(h_1\)–\(h_4\) input \(x\); honest \(h_5\) inputs \(y\). Byzantine processes send \(\text{ECHO}(x)\) to \(h_1\) and \(\text{ECHO}(y)\) to everyone else.

The adversary schedules delivery so \(h_2\) receives its first \(Q = 5\) echoes as \(\{B_1(y), B_2(y), h_5(y), h_1(x), h_3(x)\}\), giving \(E(y) = 3, E(x) = 2\). Without echo-backing, \(h_2\) would ready \(y\) via R2 (plurality). Similarly for \(h_3, h_4, h_5\).

Meanwhile, \(h_1\) sees \(Q_o = 6\) echoes for \(x\) and delivers via D1. If \(h_2\)–\(h_5\) ready \(y\) and deliver via D2, Agreement is violated.

Why Echo-Backing Blocks This: The maximum echoes for \(y\) at any correct process is:

$$ E(y) = \underbrace{1}_{h_5} + \underbrace{2}_{\text{Byzantine}} = 3 < 4 = Q_e $$

Since \(E(y) < Q_e\), R2’s guard fails. Process \(h_2\) must wait. Eventually \(h_2(x)\) and \(h_4(x)\) arrive, yielding \(E(x) = 4 \geq Q_e\). Now \(x\) is both the plurality and has echo-backing, so \(h_2\) readies \(x\).

A value with \(Q_o\) echoes at any process guarantees \(\geq Q_e\) echoes at every correct process (since \(Q_o - f = Q_e\)). The echo-backing guard ensures we only ready values that could plausibly have fast-path support.

Liveness (partial synchrony required): What if plurality exists but can never reach echo-backing? This happens when all correct echoes have arrived but they’re fragmented. Demotion to \(\bot\) requires both:

  1. A local timeout has fired, AND
  2. CanReachEchoBacking() = false

Under partial synchrony, after GST all correct-to-correct messages arrive within Δ. A timeout calibrated to ≥ Δ ensures all correct echoes arrive first. Post-timeout, HonestEchoSenders equals the count of correct processes, so CanReachEchoBacking() accurately reflects whether any value can still reach \(Q_e\).

If any value had supermajority support (\(\geq Q_s\) correct inputs), it would achieve \(\geq Q_e\) echoes from correct processes alone, arriving before timeout.

Implementation: The timeout must fire only after all correct echoes arrive. In partial synchrony, use increasing timeouts (exponential backoff) until post-GST. Pre-GST, if a timeout fires early, don’t demote: re-check CanReachEchoBacking() (still true if correct echoes are missing). The TLA+ model makes timeout contingent on all correct echoes having been received.

Safety vs. Liveness: Safety (Agreement, Validity) holds under full asynchrony. Only demotion-to-\(\bot\) liveness requires partial synchrony. Consistent with FLP.

The CanWin Fallback

The ¬CanWin(sentᵣ) fallback in D2/D3 prevents livelock. Without it, a process that sent READY for a minority value gets stuck forever when a different value achieves \(Q\) readies. The primary guard sentᵣ ∈ {∅, v} fails, but the process can’t deliver. The fallback lets it give up when its value is mathematically eliminated.

The TLA+ Spec

rcast.tla (download)
  1-------------------- MODULE rcast --------------------
  2(*
  3 Rapidcast MVA: Byzantine Multi-Value Agreement with Optimistic Fast Path
  4 ========================================================================
  5 * Version 2. Simplified state representation, abort safety fixes.
  6 Can Bölük, April 3, 2025.
  7
  8 All processes propose values and agree on one output (or  when inputs
  9 are irreconcilably split). Delivers in 2 rounds when a supermajority
 10 shares the same input; falls back to 3-4 rounds otherwise.
 11 
 12 Properties (n > 3f):
 13   - Agreement        No two honest processes deliver different values
 14   - Strong Validity  Supermajority input v: output v
 15   - Weak Validity    Output v  V: some honest process input v
 16   - Integrity        Output : no supermajority existed
 17   - Termination      All honest processes eventually deliver
 18 
 19 Synchrony: Safety holds under full asynchrony. Termination requires
 20 partial synchrony (bounded message delay after GST).
 21 
 22 Fault Model: Byzantine processes may equivocate (send different values
 23 to different recipients). Modeled via per-recipient nondeterministic
 24 message content. Honest processes broadcast uniformly.
 25*)
 26
 27EXTENDS Naturals, FiniteSets, TLC
 28
 29\* ============================================================================
 30\* SYNCHRONY MODEL:
 31\* ============================================================================
 32\* Safety holds under full asynchrony. Termination needs partial synchrony:
 33\* ReadyOnTimeout's ⊥-demotion assumes all honest echoes arrive before
 34\* CanReachEchoBacking becomes false. Implementation: timeout > post-GST Δ.
 35\* Spec abstracts timing via fairness on honest-to-honest delivery.
 36\* ============================================================================
 37
 38\* ============================================================================
 39\* Protocol Parameters
 40\* ============================================================================
 41CONSTANTS
 42   Honest,         \* the set of correct processes
 43   Faulty,         \* the set of Byzantine processes, may be empty
 44   V               \* the set of valid values
 45
 46\* Sentinel values
 47NULL    "∅"    \* not yet set
 48Bottom  "⊥"    \* no value (abort outcome)
 49VExt  V  {Bottom}
 50
 51\* ============================================================================
 52\* Definitions
 53\* ============================================================================
 54P  Honest  Faulty
 55N  Cardinality(P)
 56F  Cardinality(Faulty)
 57
 58\* Protocol thresholds
 59Q  N - F                      \* quorum (2f+1 when n=3f+1)
 60Qo  N ÷ 2 + F + 1          \* optimistic delivery
 61Qa  F + 1                     \* amplification
 62Qs  (N + F) ÷ 2 + 1        \* supermajority
 63Qe  N ÷ 2 + 1              \* echo-backing (majority)
 64
 65ASSUME  N > 3 * F
 66        V  {}
 67        Honest  Faulty = {}
 68        IsFiniteSet(Honest)
 69        IsFiniteSet(Faulty)
 70        IsFiniteSet(V)
 71       \* Disjointness: values, parties, sentinels
 72        V  P = {}
 73        Bottom  V  NULL  V
 74        Bottom  P  NULL  P
 75
 76\* Symmetry: combine permutations across Honest, Faulty, V
 77CombinePerm(ph, pf, pv) 
 78   [x  P  V  IF x  Honest THEN ph[x]
 79                ELSE IF x  Faulty THEN pf[x]
 80                ELSE pv[x]]
 81
 82Sym  {CombinePerm(ph, pf, pv) :
 83         ph  Permutations(Honest),
 84         pf  Permutations(Faulty),
 85         pv  Permutations(V)}
 86
 87\* ============================================================================
 88\* Protocol State Variables
 89\* ============================================================================
 90VARIABLES
 91   input,              \* [Honest -> V]: honest inputs
 92
 93   \* === Honest party broadcast state ===
 94   sent_echo,          \* [Honest -> V  {NULL}]: value broadcast by honest party
 95   sent_ready,         \* [Honest -> VExt  {NULL}]: value broadcast by honest party
 96   sent_abort,         \* [Honest -> BOOLEAN]: whether honest party sent abort
 97
 98   \* === Per-sender receive tracking ===
 99   rcvd_echo,          \* [Honest -> SUBSET (P × V)]: received echoes (only non-NULL)
100   rcvd_ready,         \* [Honest -> SUBSET (P × VExt)]: received readies (only non-NULL)
101   rcvd_abort,         \* [Honest -> SUBSET P]: senders of ABORT
102
103   delivered           \* [Honest -> VExt  {NULL}]: delivered value
104
105send_vars  sent_echo, sent_ready, sent_abort
106recv_vars  rcvd_echo, rcvd_ready, rcvd_abort
107vars  input, sent_echo, sent_ready, sent_abort,
108          rcvd_echo, rcvd_ready, rcvd_abort, delivered
109
110\* ============================================================================
111\* Helper Operators
112\* ============================================================================
113Max(a, b)  IF a  b THEN a ELSE b
114SetMax(S)  IF S = {} THEN 0 ELSE CHOOSE x  S :  y  S : x  y
115
116\* Get echo value from sender s at process p (NULL if not received)
117GetEcho(p, s)  IF  v  V : s, v  rcvd_echo[p]
118                THEN CHOOSE v  V : s, v  rcvd_echo[p]
119                ELSE NULL
120
121\* Get ready value from sender s at process p (NULL if not received)
122GetReady(p, s)  IF  w  VExt : s, w  rcvd_ready[p]
123                  THEN CHOOSE w  VExt : s, w  rcvd_ready[p]
124                  ELSE NULL
125
126\* Who has sent us an echo?
127EchoSenders(p)  {s  P :  v  V : s, v  rcvd_echo[p]}
128
129\* Who has sent us a ready?
130ReadySenders(p)  {s  P :  w  VExt : s, w  rcvd_ready[p]}
131
132\* Counts derived from set cardinalities
133EchoCount(p, v)  Cardinality({s  P : s, v  rcvd_echo[p]})
134ReadyCount(p, w)  Cardinality({s  P : s, w  rcvd_ready[p]})
135AbortCount(p)  Cardinality(rcvd_abort[p])
136TotalEchos(p)  Cardinality(EchoSenders(p))
137TotalReadys(p)  Cardinality(ReadySenders(p))
138MaxEchoCount(p)  SetMax({EchoCount(p, v) : v  V})
139MaxReadyCount(p)  SetMax({ReadyCount(p, v) : v  V})
140
141\* ============================================================================
142\* Threshold Predicates
143\* ============================================================================
144\* [RCAST-ECHO-QUORUM]
145EchoQuorum(p, t)  {v  V : EchoCount(p, v)  t}
146
147\* [RCAST-READY-QUORUM]
148ReadyQuorum(p, t)  {v  V : ReadyCount(p, v)  t}
149
150\* [RCAST-BOT-QUORUM]
151HasReadyBottomQuorum(p, t)  ReadyCount(p, Bottom)  t
152
153\* [RCAST-PLURALITY]
154\* Unique max or  if tied
155PluralityAmongReceived(p) 
156   LET maxCnt  MaxEchoCount(p)
157       winners  {v  V : EchoCount(p, v) = maxCnt}
158   IN IF Cardinality(winners) = 1  maxCnt > 0
159      THEN CHOOSE v  winners : TRUE
160      ELSE Bottom
161
162\* [RCAST-CAN-REACH-QUORUM]
163\* Can any value still reach quorum?
164CanReachQuorum(p) 
165   LET remaining  N - TotalReadys(p)
166   IN MaxReadyCount(p) + remaining  Q
167
168\* [RCAST-CAN-WIN]
169\* Can w still hit quorum? (TRUE if w = NULL since question is N/A)
170CanWin(p, w) 
171   IF w = NULL THEN TRUE
172   ELSE LET remaining  N - TotalReadys(p)
173        IN ReadyCount(p, w) + remaining  Q
174
175\* [RCAST-ECHO-BACKING]
176HasEchoBacking(p, v)  EchoCount(p, v)  Qe
177
178\* [RCAST-CAN-REACH-ECHO-BACKING]
179\* Discount up to f missing senders (Byzantine may omit forever)
180HonestEchoSenders(p)  Cardinality({h  Honest :  v  V : h, v  rcvd_echo[p]})
181
182CanReachEchoBacking(p) 
183   LET honestRemaining  Cardinality(Honest) - HonestEchoSenders(p)
184   IN MaxEchoCount(p) + honestRemaining  Qe
185
186\* ============================================================================
187\* Byzantine Nondeterminism Reduction
188\* ============================================================================
189(*
190 Limit Byzantine to "interesting" values that could affect outcome.
191 Sound because unknown values can't reach thresholds (honest won't echo/ready).
192 *)
193
194\* Values that are "in play" at process p
195InterestingEchoValues(p) 
196   {input[h] : h  Honest}                              \* Honest inputs
197    {v  V : EchoCount(p, v) > 0}                 \* Already seen echoes
198
199InterestingReadyValues(p) 
200   {input[h] : h  Honest}                              \* Honest inputs
201    {v  V : EchoCount(p, v) > 0}                 \* Echoed values
202    {v  V : ReadyCount(p, v) > 0}                \* Already seen readies
203
204\* ============================================================================
205\* Type Invariant
206\* ============================================================================
207TypeOK 
208    input  [Honest  V]
209   \* Honest broadcast state
210    sent_echo  [Honest  V  {NULL}]
211    sent_ready  [Honest  VExt  {NULL}]
212    sent_abort  [Honest  BOOLEAN]
213   \* Per-sender receive tracking
214    rcvd_echo  [Honest  SUBSET (P × V)]
215    rcvd_ready  [Honest  SUBSET (P × VExt)]
216    rcvd_abort  [Honest  SUBSET P]
217    delivered  [Honest  VExt  {NULL}]
218
219\* ============================================================================
220\* Protocol Initialization
221\* ============================================================================
222Init 
223    input  [Honest  V]
224   \* Honest broadcast state
225    sent_echo = [h  Honest  NULL]
226    sent_ready = [h  Honest  NULL]
227    sent_abort = [h  Honest  FALSE]
228   \* Per-sender receive tracking
229    rcvd_echo = [p  Honest  {}]
230    rcvd_ready = [p  Honest  {}]
231    rcvd_abort = [p  Honest  {}]
232    delivered = [p  Honest  NULL]
233
234\* ============================================================================
235\* Message Reception
236\* ============================================================================
237(*
238 Byzantine equivocation: faulty senders send different values to different
239 recipients. Modeled via nondeterministic receive from faulty, constrained
240 by at-most-once per (src, dst). Honest senders broadcast uniformly.
241 *)
242
243\* Receive ECHO from an honest sender (must match what they broadcast)
244ReceiveEchoFromHonest(p, h) 
245    delivered[p] = NULL
246    h  Honest
247    ¬∃ v  V : h, v  rcvd_echo[p]
248    sent_echo[h]  NULL
249    rcvd_echo' = [rcvd_echo EXCEPT ![p] = @ ∪ {⟨h, sent_echo[h]⟩}]
250    UNCHANGED input, send_vars, rcvd_ready, rcvd_abort, delivered
251
252\* Receive ECHO from a faulty sender (limited to interesting values)
253ReceiveEchoFromFaulty(p, f) 
254    delivered[p] = NULL
255    f  Faulty
256    ¬∃ v  V : f, v  rcvd_echo[p]
257     v  InterestingEchoValues(p) :
258         rcvd_echo' = [rcvd_echo EXCEPT ![p] = @ ∪ {⟨f, v⟩}]
259    UNCHANGED input, send_vars, rcvd_ready, rcvd_abort, delivered
260
261ReceiveEcho(p) 
262     h  Honest : ReceiveEchoFromHonest(p, h)
263     f  Faulty : ReceiveEchoFromFaulty(p, f)
264
265\* Receive READY from an honest sender (value or )
266ReceiveReadyFromHonest(p, h) 
267    delivered[p] = NULL
268    h  Honest
269    ¬∃ w  VExt : h, w  rcvd_ready[p]
270    sent_ready[h]  NULL
271    rcvd_ready' = [rcvd_ready EXCEPT ![p] = @ ∪ {⟨h, sent_ready[h]⟩}]
272    UNCHANGED input, send_vars, rcvd_echo, rcvd_abort, delivered
273
274\* Receive READY(v) from a faulty sender (limited to interesting values)
275ReceiveReadyValueFromFaulty(p, f) 
276    delivered[p] = NULL
277    f  Faulty
278    ¬∃ w  VExt : f, w  rcvd_ready[p]
279     v  InterestingReadyValues(p) :
280         rcvd_ready' = [rcvd_ready EXCEPT ![p] = @ ∪ {⟨f, v⟩}]
281    UNCHANGED input, send_vars, rcvd_echo, rcvd_abort, delivered
282
283\* Receive READY() from a faulty sender
284ReceiveReadyBottomFromFaulty(p, f) 
285    delivered[p] = NULL
286    f  Faulty
287    ¬∃ w  VExt : f, w  rcvd_ready[p]
288    rcvd_ready' = [rcvd_ready EXCEPT ![p] = @ ∪ {⟨f, Bottom⟩}]
289    UNCHANGED input, send_vars, rcvd_echo, rcvd_abort, delivered
290
291ReceiveReady(p) 
292     h  Honest : ReceiveReadyFromHonest(p, h)
293     f  Faulty : ReceiveReadyValueFromFaulty(p, f)
294     f  Faulty : ReceiveReadyBottomFromFaulty(p, f)
295
296\* Receive ABORT from an honest sender
297ReceiveAbortFromHonest(p, h) 
298    delivered[p] = NULL
299    h  Honest
300    h  rcvd_abort[p]
301    sent_abort[h] = TRUE
302    rcvd_abort' = [rcvd_abort EXCEPT ![p] = @ ∪ {h}]
303    UNCHANGED input, send_vars, rcvd_echo, rcvd_ready, delivered
304
305\* Receive ABORT from a faulty sender (at any time)
306ReceiveAbortFromFaulty(p, f) 
307    delivered[p] = NULL
308    f  Faulty
309    f  rcvd_abort[p]
310    rcvd_abort' = [rcvd_abort EXCEPT ![p] = @ ∪ {f}]
311    UNCHANGED input, send_vars, rcvd_echo, rcvd_ready, delivered
312
313ReceiveAbort(p) 
314     h  Honest : ReceiveAbortFromHonest(p, h)
315     f  Faulty : ReceiveAbortFromFaulty(p, f)
316
317Receive(p) 
318    ReceiveEcho(p)
319    ReceiveReady(p)
320    ReceiveAbort(p)
321
322\* ============================================================================
323\* Honest Party Actions
324\* ============================================================================
325
326\* E1: [RCAST-SEND-ECHO]
327\* Upon start: broadcast ECHO(input)
328SendEcho(self) 
329    sent_echo[self] = NULL
330    delivered[self] = NULL
331    LET v  input[self]
332      IN sent_echo' = [sent_echo EXCEPT ![self] = v]
333    UNCHANGED input, sent_ready, sent_abort, recv_vars, delivered
334
335\* R1: [RCAST-READY-ON-ECHO-QUORUM]
336\* Upon receiving Q echoes for value v: broadcast READY(v)
337ReadyOnEchoQuorum(self) 
338    sent_ready[self] = NULL
339    delivered[self] = NULL
340     v  EchoQuorum(self, Q) :
341         sent_ready' = [sent_ready EXCEPT ![self] = v]
342    UNCHANGED input, sent_echo, sent_abort, recv_vars, delivered
343
344\* R2: [RCAST-READY-ON-TIMEOUT]
345\* Upon timeout with Q total echoes but no quorum: broadcast READY(plurality) or READY()
346\* Only proceed if: (1) plurality has echo-backing, OR (2) can't reach echo-backing (waiting won't help)
347\* This ensures we wait on ties when more echoes could break them.
348ReadyOnTimeout(self) 
349    sent_ready[self] = NULL
350    delivered[self] = NULL
351    TotalEchos(self)  Q
352    EchoQuorum(self, Q) = {}
353    LET plurality  PluralityAmongReceived(self)
354      IN  (plurality  Bottom  HasEchoBacking(self, plurality))  ¬CanReachEchoBacking(self)
355          sent_ready' = [sent_ready EXCEPT ![self] =
356              IF plurality  Bottom  HasEchoBacking(self, plurality)
357              THEN plurality
358              ELSE Bottom]
359    UNCHANGED input, sent_echo, sent_abort, recv_vars, delivered
360
361\* R3: [RCAST-READY-AMPLIFY-VALUE]
362\* Bracha amplification: f+1 READYs force the network.
363\* No echo-backing gate here - that would break totality under equivocation.
364\* Echo-backing is appropriate for originating READYs (R2), not amplification.
365ReadyAmplifyValue(self) 
366    sent_ready[self] = NULL
367    delivered[self] = NULL
368     v  ReadyQuorum(self, Qa) :
369         sent_ready' = [sent_ready EXCEPT ![self] = v]
370    UNCHANGED input, sent_echo, sent_abort, recv_vars, delivered
371
372\* R4: [RCAST-READY-AMPLIFY-BOT]
373\* Upon receiving Qa READY(): broadcast READY()
374ReadyAmplifyBottom(self) 
375    sent_ready[self] = NULL
376    delivered[self] = NULL
377    HasReadyBottomQuorum(self, Qa)
378    sent_ready' = [sent_ready EXCEPT ![self] = Bottom]
379    UNCHANGED input, sent_echo, sent_abort, recv_vars, delivered
380
381\* D1: [RCAST-OPTIMISTIC-DELIVER]
382\* Upon receiving Qo echoes for v: deliver v (fast path)
383OptimisticDeliver(self) 
384    delivered[self] = NULL
385     v  EchoQuorum(self, Qo) :
386          sent_ready[self]  {NULL, v}
387          delivered' = [delivered EXCEPT ![self] = v]
388          IF sent_ready[self] = NULL
389            THEN sent_ready' = [sent_ready EXCEPT ![self] = v]
390            ELSE UNCHANGED sent_ready
391    UNCHANGED input, sent_echo, sent_abort, recv_vars
392
393\* D2: [RCAST-DELIVER-ON-READY-VALUE]
394\* Upon receiving Q READY(v): deliver v
395DeliverOnReadyValue(self) 
396    delivered[self] = NULL
397     v  ReadyQuorum(self, Q) :
398         \* Primary guard: readied in {, v}. Fallback: readied value can't win.
399           sent_ready[self]  {NULL, v}
400            ¬CanWin(self, sent_ready[self])
401          delivered' = [delivered EXCEPT ![self] = v]
402          IF sent_ready[self] = NULL
403            THEN sent_ready' = [sent_ready EXCEPT ![self] = v]
404            ELSE UNCHANGED sent_ready
405    UNCHANGED input, sent_echo, sent_abort, recv_vars
406
407\* D3: [RCAST-DELIVER-ON-READY-BOT]
408\* Upon receiving Q READY(): deliver 
409DeliverOnReadyBottom(self) 
410    delivered[self] = NULL
411    HasReadyBottomQuorum(self, Q)
412   \* Primary guard: readied in {, }. Fallback: readied value can't win.
413     sent_ready[self]  {NULL, Bottom}
414      ¬CanWin(self, sent_ready[self])
415    delivered' = [delivered EXCEPT ![self] = Bottom]
416    IF sent_ready[self] = NULL
417      THEN sent_ready' = [sent_ready EXCEPT ![self] = Bottom]
418      ELSE UNCHANGED sent_ready
419    UNCHANGED input, sent_echo, sent_abort, recv_vars
420
421\* A1: [RCAST-SEND-ABORT]
422\* Upon detecting no value can reach quorum: broadcast ABORT
423SendAbort(self) 
424    sent_abort[self] = FALSE
425    delivered[self] = NULL
426    sent_ready[self]  NULL
427    TotalReadys(self)  Q
428    ReadyQuorum(self, Q) = {}
429    ¬HasReadyBottomQuorum(self, Q)
430    ¬CanReachQuorum(self)
431    sent_abort' = [sent_abort EXCEPT ![self] = TRUE]
432    UNCHANGED input, sent_echo, sent_ready, recv_vars, delivered
433
434\* A2: [RCAST-ABORT-AMPLIFY]
435\* Upon receiving Qa ABORTs: broadcast ABORT
436AbortAmplify(self) 
437    sent_abort[self] = FALSE
438    delivered[self] = NULL
439    sent_ready[self]  NULL
440    AbortCount(self)  Qa
441    sent_abort' = [sent_abort EXCEPT ![self] = TRUE]
442    UNCHANGED input, sent_echo, sent_ready, recv_vars, delivered
443
444\* A3: [RCAST-DELIVER-ON-ABORT]
445\* Upon receiving Q ABORTs: deliver 
446DeliverOnAbort(self) 
447    delivered[self] = NULL
448    sent_ready[self]  NULL
449    AbortCount(self)  Q
450   \* Coverage guard: must hear from enough distinct parties before deciding.
451   \* Without this, NoValueCouldHaveWon passes trivially when honest readies in flight.
452    TotalReadys(self)  Q
453   \* [RCAST-ABORT-SAFETY-GUARD]
454   \* Block  delivery if any value might have reached quorum elsewhere.
455   \* If v got Q readies globally, at least Q-F honest sent READY(v).
456   \* Having heard from Q processes (>=Q-F honest), by pigeonhole:
457   \*   overlap >= (Q-F) + (Q-F) - (N-F) = Q - 2F
458   \* So we'd see at least Q-2F readies for v locally.
459    ReadyQuorum(self, Q - 2*F) = {}
460    delivered' = [delivered EXCEPT ![self] = Bottom]
461    IF sent_abort[self] = FALSE
462      THEN sent_abort' = [sent_abort EXCEPT ![self] = TRUE]
463      ELSE UNCHANGED sent_abort
464   \* Note: sent_ready[self] /= NULL is a precondition, so no ready-if-not-sent logic needed
465    UNCHANGED input, sent_echo, sent_ready, recv_vars
466
467HonestStep(self) 
468    SendEcho(self)
469    ReadyOnEchoQuorum(self)
470    ReadyOnTimeout(self)
471    ReadyAmplifyValue(self)
472    ReadyAmplifyBottom(self)
473    OptimisticDeliver(self)
474    DeliverOnReadyValue(self)
475    DeliverOnReadyBottom(self)
476    SendAbort(self)
477    AbortAmplify(self)
478    DeliverOnAbort(self)
479
480\* ============================================================================
481\* Specification
482\* ============================================================================
483Terminated   p  Honest : delivered[p]  NULL
484
485Done  Terminated  UNCHANGED vars
486
487Next 
488     p  Honest : HonestStep(p)
489     p  Honest : Receive(p)
490   \* Faulty actions via nondeterminism in ReceiveXxxFromFaulty
491    Done
492
493\* Honest processes take enabled actions; honest-to-honest channels reliable.
494\* NO fairness for Byzantine receives - adversary can omit arbitrarily.
495Fairness 
496     p  Honest : WF_vars(HonestStep(p))
497     p  Honest :  h  Honest : WF_vars(ReceiveEchoFromHonest(p, h))
498     p  Honest :  h  Honest : WF_vars(ReceiveReadyFromHonest(p, h))
499     p  Honest :  h  Honest : WF_vars(ReceiveAbortFromHonest(p, h))
500
501Spec  Init  [Next]_vars  Fairness
502
503\* ============================================================================
504\* Safety Properties
505\* ============================================================================
506
507\* [RCAST-INV-AGREEMENT]
508Agreement 
509    p, q  Honest :
510      (delivered[p]  NULL  delivered[q]  NULL)  delivered[p] = delivered[q]
511
512HonestInputCount(v)  Cardinality({p  Honest : input[p] = v})
513H  Cardinality(Honest)
514
515HasSupermajority(v)  HonestInputCount(v)  Qs
516
517SupermajorityValue 
518   IF  v  V : HasSupermajority(v)
519   THEN CHOOSE v  V : HasSupermajority(v)
520   ELSE Bottom
521
522SimpleMajorityThreshold  H ÷ 2 + 1
523HasSimpleMajority(v)  HonestInputCount(v)  SimpleMajorityThreshold
524
525SimpleMajorityValue 
526   IF  v  V : HasSimpleMajority(v)
527   THEN CHOOSE v  V : HasSimpleMajority(v)
528   ELSE Bottom
529
530\* [RCAST-INV-STRONG-VALIDITY]
531StrongValidity 
532   LET sm  SupermajorityValue
533   IN sm  Bottom 
534       p  Honest : delivered[p]  NULL  delivered[p] = sm
535
536\* [RCAST-INV-WEAK-VALIDITY]
537WeakValidity 
538    p  Honest :
539      delivered[p]  V  HonestInputCount(delivered[p]) > 0
540
541\* [RCAST-INV-INTEGRITY]
542Integrity 
543    p  Honest :
544      delivered[p] = Bottom  SupermajorityValue = Bottom
545
546Validity  StrongValidity
547
548\* ============================================================================
549\* Liveness
550\* ============================================================================
551
552\* [RCAST-TERMINATION]
553Termination   p  Honest : (delivered[p]  NULL)
554
555=============================================================================

Note on CanReachEchoBacking: Only used for R2’s plurality demotion. The formula conservatively assumes all outstanding echoes could go to the leading value, minus f Byzantine who may not send. Post-timeout (when all honest echoes have arrived), this becomes exact. R3 amplification doesn’t require echo-backing.

The implementation uses (n - TotalEchoes) - f to approximate remaining honest senders. The TLA+ spec uses the exact Cardinality(Honest) - HonestEchoSenders(p). These are equivalent post-timeout but differ during execution.

0x5: Putting It Together

PropertyBrachaRound-OptimizedMVA
Best case3 rounds2 rounds2 rounds
Worst case3 rounds3 rounds4 rounds
Messages\(O(n^2)\)\(O(n^2)\)\(O(n^2)\)
Fault tolerance\(n > 3f\)\(n > 3f\)\(n > 3f\)
Input modelSingle leaderSingle leaderAll propose

Bigger quorums give stronger intersection properties, which lets us skip rounds when conditions are good. The cost is a slightly higher threshold, but when the network is healthy and honest parties dominate, hitting \(Q_o\) vs \(Q\) is a matter of one or two more messages, not an extra round.

0x6: Wrapping Up

So there you have it: Byzantine reliable broadcast in 2 rounds instead of 3, using the counterintuitive approach of a larger quorum for the fast path. The bigger quorum gives us stronger intersection properties, enough honest parties in the overlap to eliminate the voting round when conditions are good.

A nice reminder that even “solved” problems from 1987 can hide optimizations. Now to find what other decades-old algorithms are hiding in plain sight…

Can Bölük

Can Bölük

Security researcher and reverse engineer. Interested in Windows kernel development, low-level programming, static program analysis and cryptography.