EIP-3436: Expanded Clique Block Choice Rule


Metadata
Status: StagnantStandards Track: CoreCreated: 2021-03-25
Authors
Danno Ferrin (@shemnon)
Requires

Simple Summary


Add a four step block rule to Clique that should reduce block production deadlocks

Abstract


The current specification of Clique allows for multiple competing blocks from producers but does not provide any strategies to pick blocks aside from the current "highest total difficulty" rule. This EIP proposes a four step choice rule of highest total difficulty, shortest chain, most recently in-turn, and lowest hash. This would prevent deadlocks that have occurred in production systems.

Motivation


There has been more than one deadlock in the Goerli multi-client Clique network. The number of active validators was greater than 1/2 of the available validators so a chain halt should not have occurred. The halt was resolved by an inactive validator coming back on line. The state of the chain was in one of two configurations of 8 validators that can result in a chain halt. Three of the four clients observed a choice sequence of lowest total difficulty followed by first observed block. Geth added one extra rule of preferring the shortest chain before preferring the first observed block. This fork would have resolved itself with Geth's rule, but there is still a configuration where the chain can halt with a shortest chain rule.

Specification


When a Clique validator is arbitrating the canonical status between two different chain head blocks, they should choose the canonical block with the following ordered priorities.

  1. Choose the block with the most total difficulty.
  2. Then choose the block with the lowest block number.
  3. Then choose the block whose validator had the least recent in-turn block assignment.
  4. Then choose the block with the lowest hash.

When resolving rule 3 clients should use the following formula, where validator_index is the integer index of the validator that signed the block when sorted as per epoch checkpointing, header_number is the number of the header, and validator_count is the count of the current validators. Clients should choose the block with the largest value. Note that an in-turn block is considered to be the most recent in-turn block.


When resolving rule 4 the hash should be converted into an unsigned 256 bit integer.

Rationale


Two scenarios of a halted chain are known based on the current total difficulty then first observed rule. One of the scenarios is also resistant to the shortest chain rule.

For the first scenario where chains of different lengths can halt consider a block with 8 validators, whose addresses sort to the same order as their designation in this example. A fully in-order chain exists and validator number 8 has just produced an in-turn block and then validators 5, 7 and 8 go offline, leaving validators 1 to 6 to produce blocks. Two forks form, one with an in-order block from validator 1 and then an out of order block from validator 3. The second fork forms from validators 2, 4, and 6 in order. Both have a net total difficulty of 3 more than the common ancestor. So in this case if both forks become aware of the other fork then both are considered equally viable and neither set of validators should switch to the newly observed fork. In this case, adding a shortest chain rule would break the deadlock as the even numbered validators would adopt the shorter chain.

For the second scenario with the same validator set and in-order chain with validator 7 having just produced an in order block, then validators 7 and 8 go offline. Two forks form, 1,3,5 on one side and 2,4,6 on the other. Both forks become aware of the other fork after producing their third block. In this case both forks have equal total difficulty and equal length. So Geth's rule would not break the tie and only the arrival of one of the missing validators fix the chain. In a worst case scenario the odd and even chains would produce a block for 7 and 8 respectively, and chain halt would result with no validators that have not chosen a fork. Only a manual rollback would fix this.

One consideration when formulating the rules is that the block choice should be chosen so that it would encourage the maximum amount of in-order blocks. Selecting a chain based on shortest chain implicitly prefers the chain with more in-order blocks. When selecting between competing out of order chains the validator who is closest to producing an in-order block in the future should have their chain declined so that they are available to produce an in-order block sooner.

At least one client has been observed producing multiple blocks at the same height with the same difficulty, so a final catch-all standard of lowest block hash should break any remaining ties.

Backwards Compatibility


The current block choice rules are a mix of most total difficulty and most total difficulty plus shortest chain.

As long as the majority of the active validators implement the block choice rules then a client who only implements the existing difficulty based rule will eventually align with the chain preferred by these rules. If less than a majority implement these rules then deadlocks can still occur, and depend on the first observation of problematic blocks, which is no worse than the current situation.

If clients only partially implement the rule as long as every higher ranked rule is also implemented then the situation will be no worse than today.

Security Considerations


Malicious and motivated attackers who are participating in the network can force the chain to halt with well crafted block production. With a fully deterministic choice rule the opportunity to halt is diminished. Attackers still have the same opportunities to flood the network with multiple blocks at the same height. A deterministic rule based on the lowest hash reduces the impact of such a flooding attack. A malicious validator could exploit this deterministic rule to produce a replacement block. Such an attack exists in current implementations but a deterministic hash rule makes such replacements more likely. However the impact of such an attack seems low to trivial at this time.

Copyright


Copyright and related rights waived via CC0.