EIP-2124: Fork identifier for chain compatibility checks


Metadata
Status: Final
Standards Track: Networking
Created: 2019-05-03
Authors
Péter Szilágyi (peterke@gmail.com), Felix Lange (fjl@ethereum.org)

Simple Summary


Currently nodes in the Ethereum network try to find each other by establishing random connections to remote machines "looking" like an Ethereum node (public networks, private networks, test networks, etc), hoping that they found a useful peer (same genesis, same forks). This wastes time and resources, especially for smaller networks.

To avoid this overhead, Ethereum needs a mechanism that can precisely identify whether a node will be useful, as early as possible. Such a mechanism requires a way to summarize chain configurations, as well as a way to disseminate said summaries in the network.

This proposal focuses only on the definition of said summary - a generally useful fork identifier - and its validation rules, allowing it to be embedded into arbitrary network protocols (e.g. discovery ENRs or eth/6x handshakes).

Abstract


There are many public and private Ethereum networks, but the discovery protocol doesn't differentiate between them. The only way to check if a peer is good or bad (same chain or not) is to establish a TCP/IP connection, wrap it with RLPx cryptography, then execute an eth handshake. This is an extreme cost to bear if it turns out that the remote peer is on a different network, and it's not even precise enough to differentiate Ethereum and Ethereum Classic. This cost is magnified for small networks, where a lot more trial and error is needed to find good nodes.

Even if the peer is on the same chain, during non-controversial consensus upgrades, not everybody updates their nodes in time (developer nodes, leftovers, etc). These stale nodes put a meaningless burden on the peer-to-peer network, since they just latch on to good nodes, but don't accept upgraded blocks. This causes valuable peer slots and bandwidth to be lost until the stale nodes finally update. This is a serious issue for test networks, where leftovers can linger for months.

This EIP proposes a new identity scheme to both precisely and concisely summarize the chain's current status (genesis and all applied forks). The conciseness is particularly important to make the identity useful across datagram protocols too. The EIP solves a number of issues:

  • If two nodes are on different networks, they should never even consider connecting.
  • If a hard fork passes, upgraded nodes should reject non-upgraded ones, but NOT before.
  • If two chains share the same genesis, but not forks (ETH / ETC), they should reject each other.

This EIP does not attempt to solve the clean separation of 3-way-forks! If at the same future block number, the network splits into three (non-fork, fork-A and fork-B), separating the forkers from one another will need case-by-case special handling. Not handling this keeps the proposal pragmatic, simple and also avoids making it too easy to fork off mainnet.

To keep the scope limited, this EIP only defines the identity scheme and validation rules. The same scheme and algorithm can be embedded into various networking protocols, allowing both the eth/6x handshake to be more precise (Ethereum vs. Ethereum Classic) and the discovery to be more useful (reject surely incompatible peers without ever connecting).

Motivation


Peer-to-peer networking is messy and hard due to firewalls and network address translation (NAT). Generally only a small fraction of nodes have publicly routed addresses and P2P networks rely mainly on these for forwarding data for everyone else. The best way to maximize the utility of the public nodes is to ensure their resources aren't wasted on tasks that are worthless to the network.

By aggressively cutting off incompatible nodes from each other we can extract a lot more value from the public nodes, making the entire P2P network much more robust and reliable. Supporting this network partitioning at a discovery layer can further enhance performance as we avoid the costly crypto and latency/bandwidth hit associated with establishing a stream connection in the first place.

Specification


Each node maintains the following values:

  • FORK_HASH: IEEE CRC32 checksum ([4]byte) of the genesis hash and the fork block numbers that already passed.
    • The fork block numbers are fed into the CRC32 checksum in ascending order.
    • If multiple forks are applied at the same block, the block number is checksummed only once.
    • Block numbers are regarded as uint64 integers, encoded in big endian format when checksumming.
    • If a chain is configured to start with a non-Frontier ruleset already in its genesis, that is NOT considered a fork.
  • FORK_NEXT: Block number (uint64) of the next upcoming fork, or 0 if no next fork is known.

E.g. FORK_HASH for mainnet would be:

  • forkhash₀ = 0xfc64ec04 (Genesis) = CRC32(<genesis-hash>)
  • forkhash₁ = 0x97c2c34c (Homestead) = CRC32(<genesis-hash> || uint64(1150000))
  • forkhash₂ = 0x91d1f948 (DAO fork) = CRC32(<genesis-hash> || uint64(1150000) || uint64(1920000))
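
The checksum rules above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the helper name `fork_hash` is hypothetical, and the `MAINNET_GENESIS` constant is the mainnet genesis block hash as assumed here (it is not spelled out in this document). Python's `zlib.crc32` implements the IEEE CRC32 polynomial and accepts a running checksum as its second argument, which makes the incremental update straightforward.

```python
import struct
import zlib

# Assumption: the mainnet genesis block hash (not given in the text above).
MAINNET_GENESIS = bytes.fromhex(
    "d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3"
)


def fork_hash(genesis_hash: bytes, passed_forks) -> int:
    """Return FORK_HASH as an unsigned 32-bit integer.

    `passed_forks` holds the block numbers of forks that already passed.
    Duplicates are checksummed once, and forks at block 0 are skipped
    because a ruleset active at genesis is not considered a fork.
    """
    crc = zlib.crc32(genesis_hash)
    for block in sorted({b for b in passed_forks if b > 0}):
        # Block numbers are uint64 values in big endian encoding.
        crc = zlib.crc32(struct.pack(">Q", block), crc)
    return crc


if __name__ == "__main__":
    print(hex(fork_hash(MAINNET_GENESIS, [])))                  # Genesis
    print(hex(fork_hash(MAINNET_GENESIS, [1150000])))           # Homestead
    print(hex(fork_hash(MAINNET_GENESIS, [1150000, 1920000])))  # DAO fork
```

If the assumed genesis hash is correct, the three calls should reproduce forkhash₀ to forkhash₂ as listed above.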

The fork identifier is defined as RLP([FORK_HASH, FORK_NEXT]). This forkid is cross validated (NOT naively compared) to assess a remote chain's compatibility. Irrespective of fork state, both parties must come to the same conclusion to avoid indefinite reconnect attempts from one side.
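
As a rough illustration of the encoding, the sketch below hand-rolls just enough RLP to serialize a fork identifier. It assumes that FORK_HASH is encoded as a fixed 4-byte string and FORK_NEXT as an RLP quantity (big endian, no leading zeros, zero as the empty string). The function names `rlp_encode_short_string` and `encode_fork_id` are hypothetical and not taken from any client.

```python
def rlp_encode_short_string(data: bytes) -> bytes:
    """RLP-encode a byte string of fewer than 56 bytes."""
    if len(data) == 1 and data[0] < 0x80:
        return data  # a single small byte is its own encoding
    if len(data) < 56:
        return bytes([0x80 + len(data)]) + data
    raise ValueError("long strings never occur in a fork identifier")


def encode_fork_id(fork_hash: int, fork_next: int) -> bytes:
    """Return RLP([FORK_HASH, FORK_NEXT])."""
    if not 0 <= fork_hash < 2**32:
        raise ValueError("FORK_HASH must fit in 4 bytes")
    if not 0 <= fork_next < 2**64:
        raise ValueError("FORK_NEXT must fit in a uint64")
    # FORK_HASH is binary data: always exactly 4 bytes, leading zeros kept.
    hash_item = rlp_encode_short_string(fork_hash.to_bytes(4, "big"))
    # FORK_NEXT is a quantity: minimal big endian bytes, zero becomes b"".
    next_bytes = fork_next.to_bytes((fork_next.bit_length() + 7) // 8, "big")
    next_item = rlp_encode_short_string(next_bytes)
    payload = hash_item + next_item  # at most 5 + 9 = 14 bytes
    return bytes([0xC0 + len(payload)]) + payload


if __name__ == "__main__":
    print(encode_fork_id(0, 0).hex())
    print(encode_fork_id(0xDEADBEEF, 0xBADDCAFE).hex())
```

Under these assumptions an encoded fork identifier never exceeds 15 bytes, which is what keeps it cheap to carry in datagram protocols.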

Validation rules

    1. If local and remote FORK_HASH matches, compare local head to FORK_NEXT. The two nodes are in the same fork state currently. They might know of differing future forks, but that's not relevant until the fork triggers (might be postponed, nodes might be updated to match).
      • 1a) A remotely announced but remotely not passed block is already passed locally, disconnect, since the chains are incompatible.
      • 1b) No remotely announced fork; or not yet passed locally, connect.
    2. If the remote FORK_HASH is a subset of the local past forks and the remote FORK_NEXT matches with the locally following fork block number, connect.
      • Remote node is currently syncing. It might eventually diverge from us, but at this current point in time we don't have enough information.
    3. If the remote FORK_HASH is a superset of the local past forks and can be completed with locally known future forks, connect.
      • Local node is currently syncing. It might eventually diverge from the remote, but at this current point in time we don't have enough information.
    4. Reject in all other cases.
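
The four rules above, taken in order, can be sketched as a single function. This is a minimal illustration under stated assumptions, not the logic of any particular client: `fork_checksums` and `validate_fork_id` are hypothetical names, the local node is assumed to know its genesis hash, its full list of configured fork blocks (past and future) and its current head, and the genesis hash and fork blocks in the demo are made up.

```python
import struct
import zlib


def fork_checksums(genesis_hash: bytes, forks) -> list:
    """Return the FORK_HASH values after 0, 1, 2, ... forks have passed."""
    crc = zlib.crc32(genesis_hash)
    sums = [crc]
    for block in forks:
        crc = zlib.crc32(struct.pack(">Q", block), crc)
        sums.append(crc)
    return sums


def validate_fork_id(genesis_hash, local_forks, head, remote_hash, remote_next) -> bool:
    """Return True to connect, False to reject."""
    forks = sorted({b for b in local_forks if b > 0})
    sums = fork_checksums(genesis_hash, forks)
    passed = sum(1 for block in forks if head >= block)  # forks passed locally

    # Rule 1: same fork state. Reject only if the remote announces a fork
    # that we have already passed without applying it (1a), else connect (1b).
    if remote_hash == sums[passed]:
        return not (remote_next > 0 and head >= remote_next)

    # Rule 2: remote is on a subset of our past forks. It must announce
    # exactly the fork that followed that state locally, else it is stale.
    for j in range(passed):
        if remote_hash == sums[j]:
            return remote_next == forks[j]

    # Rule 3: remote is ahead of us, but on forks we already know about.
    if remote_hash in sums[passed + 1:]:
        return True

    # Rule 4: anything else is incompatible.
    return False


if __name__ == "__main__":
    genesis = bytes(32)   # made-up genesis hash for the demo
    forks = [100, 200]    # made-up fork blocks
    s0, s1, s2 = fork_checksums(genesis, forks)
    print(validate_fork_id(genesis, forks, 150, s1, 200))  # rule 1b
    print(validate_fork_id(genesis, forks, 150, s1, 120))  # rule 1a
    print(validate_fork_id(genesis, forks, 250, s1, 200))  # rule 2
    print(validate_fork_id(genesis, forks, 50, s2, 0))     # rule 3
```

Precomputing every checksum the local chain can ever advertise is one possible design: it turns the subset and superset checks into plain list lookups, so no fork list needs to be exchanged with the remote.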

Stale software examples

The examples below try to exhaust the fork combination possibilities that arise when nodes do not run matching software versions, but otherwise follow the same chain (mainnet nodes, testnet nodes, etc).

| Past forks | Future forks | Head  | Remote FORK_HASH | Remote FORK_NEXT | Connect   | Reason |
|------------|--------------|-------|------------------|------------------|-----------|--------|
| A          |              |       | A                |                  | Yes (1b)  | Same forks, same sync state. |
| A          |              | < B   | A                | B                | Yes (1b)  | Remote is advertising a future fork, but that is uncertain. |
| A          |              | >= B  | A                | B                | No (1a)   | Remote is advertising a future fork that passed locally. |
| A          | B            |       | A                |                  | Yes (1b)  | Local knows about a future fork, but that is uncertain. |
| A          | B            |       | A                | B                | Yes (1b)  | Both know about a future fork, but that is uncertain. |
| A          | B1           | < B2  | A                | B2               | Yes (1b)  | Both know about differing future forks, but those are uncertain. |
| A          | B1           | >= B2 | A                | B2               | No (1a)   | Both know about differing future forks, but the remote one passed locally. |
| [A,B]      |              |       | A                | B                | Yes (2)   | Remote out of sync. |
| [A,B,C]    |              |       | A                | B                | Yes¹ (2)  | Remote out of sync. Remote will need a software update, but we don't know it yet. |
| A          | B            |       | A ⊕ B            |                  | Yes (3)   | Local out of sync. |
| A          | B,C          |       | A ⊕ B            |                  | Yes (3)   | Local out of sync. Local also knows about a future fork, but that is uncertain yet. |
| A          |              |       | A ⊕ B            |                  | No (4)    | Local needs software update. |
| A          | B            |       | A ⊕ B ⊕ C        |                  | No² (4)   | Local needs software update. |
| [A,B]      |              |       | A                |                  | No (4)    | Remote needs software update. |

Note, there's one asymmetry in the table, marked with ¹ and ². Since we don't have access to a remote node's future fork list (just the next one), we can't detect that its software is stale until it syncs up. This is acceptable as 1) the remote node will disconnect from us anyway, and 2) this is a temporary fluke during sync, not permanent with a leftover node.

Rationale


Why flatten FORK_HASH into 4 bytes? Why not share the entire genesis and fork list?

Whilst the eth devp2p protocol permits arbitrarily much data to be transmitted, the discovery protocol's total space allowance for all ENR entries is 300 bytes.

Reducing the FORK_HASH into a 4-byte checksum ensures that we leave ample room in the ENR for future extensions; and 4 bytes is more than enough for arbitrarily many Ethereum networks from a (practical) collision perspective.
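
To put a rough number on the collision argument, the sketch below estimates the chance that at least one unrelated chain state happens to share a node's 4-byte checksum. It assumes checksums behave like independent, uniformly distributed 32-bit values, which is an idealization for CRC32, and the function name `collision_probability` is hypothetical.

```python
import math


def collision_probability(foreign_states: int, bits: int = 32) -> float:
    """Chance that at least one of `foreign_states` unrelated chain states
    shares our checksum, assuming uniform and independent checksums."""
    per_state = 2.0 ** -bits
    # 1 - (1 - p)^k, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(foreign_states * math.log1p(-per_state))


if __name__ == "__main__":
    # Even with a thousand unrelated networks and fork states in the wild,
    # the estimate stays far below one in a million.
    print(collision_probability(1000))
```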

Why use IEEE CRC32 as the checksum instead of Keccak256?

We need a mechanism that can flatten arbitrary data into 4 bytes, without ignoring any of the input. Any other checksum or hashing algorithm would work, but since nodes can lie at any time, there's no value in cryptographic hash functions.

Instead of just taking the first 4 bytes of a Keccak256 hash (seems odd) or XOR-ing all the 4-byte groups (messy), CRC32 is a better alternative, as this is exactly what it was designed for. IEEE CRC32 is also used by Ethernet, gzip, zip, png, etc., so support in every programming language should not be a problem.

We're not using FORK_NEXT for much, can't we get rid of it somehow?

We need to be able to differentiate whether a remote node is out of sync or whether its software is stale. Sharing only the past forks cannot tell us if the node is legitimately behind or stuck.

Why advertise only one next fork, instead of "hashing" all known future ones like the FORK_HASH?

As opposed to past forks that have already passed (for us locally) and can be considered immutable, we don't know anything about future ones. Maybe we're out of sync or maybe the fork didn't pass yet. If it didn't pass yet, it might be postponed, so enforcing it would split the network apart. It could also happen that we're not yet aware of all future forks (haven't updated our software in a while).

Backwards Compatibility


This EIP only defines an identity scheme; it does not define functional changes.

Test Cases


Here's a full suite of tests for all possible fork IDs that Mainnet, Ropsten, Rinkeby and Görli can advertise given the Petersburg fork cap (time of writing).


Here's a suite of tests of the different states a Mainnet node might be in and the different remote fork identifiers it might be required to validate and decide to accept or reject:


Here's a couple of tests to verify the proper RLP encoding (since FORK_HASH is a 4 byte binary but FORK_NEXT is an 8 byte quantity):


Implementation


Geth: https://github.com/ethereum/go-ethereum/tree/master/core/forkid

Copyright


Copyright and related rights waived via CC0.