EIP-7801: eth/70 - Sharded Blocks Protocol

Replaces block range with a bitlist representing 1-million-block spans in the handshake, with probabilistic shard retention


Metadata
Status: DraftStandards Track: NetworkingCreated: 2024-10-30
Authors
Ahmad Bitar (@smartprogrammer93) (smartprogrammer@windowslive.com), Giulio Rebuffo (@Giulio2002)
Requires

Abstract


This EIP introduces a method enabling an Ethereum node to communicate its available block spans via a bitlist, where each bit represents a 1-million-block span. Nodes use this bitlist to signal which spans of historical data they store, enabling peers to make informed decisions about data availability when requesting blocks. This aims to improve network efficiency by providing a probabilistic snapshot of data locality across nodes.

The proposal extends the Ethereum wire protocol (eth) with version eth/70, introducing a blockBitlist field in the handshake. Nodes probabilistically retain certain block spans to support data locality across the network.

Motivation


With EIP-4444, nodes may begin pruning historical data while others continue to serve it. The current approach of connecting and requesting specific blocks to determine data availability is inefficient, consuming unnecessary bandwidth. This EIP addresses this by requiring nodes to retain at least one 1-million-block shard and retain additional shards probabilistically, enabling better-informed and targeted sync processes.

By using a bitlist in place of a range, nodes provide a snapshot of data locality across historical epochs, supporting efficient data requests.

Specification


  • Advertise a new eth protocol capability (version) at eth/70.

    • The existing eth/69 protocol will continue alongside eth/70 until sufficient adoption.
  • Modify the Status (0x00) message for eth/70 to add a blockBitlist field after the forkid:

    • Current packet for eth/69: [version: P, networkid: P, blockhash: B_32, genesis: B_32, forkid]
    • New packet for eth/70: [version: P, networkid: P, blockhash: B_32, genesis: B_32, forkid, blockBitlist], where blockBitlist is a bitlist, with each bit representing a 1-million-block span.
  • Define node behavior based on the bitlist as follows:

    • Bitlist Initialization: Upon startup, nodes MUST set at least one bit in the blockBitlist to on and backfill that 1-million-block span to ensure data availability for that shard.
    • Shard Retention Probability: Nodes SHOULD probabilistically retain new block spans, with an M% probability (e.g., 5–10%) to maintain network data locality and avoid frequent pruning of new ranges.
    • Active Shard Maintenance: Nodes MUST retain all blocks from shards that are still actively being created to support synchronization.

Upon connecting using eth/70, nodes exchange the Status message, including the blockBitlist. This single handshake message, which includes the bitlist, eliminates the need for additional message types.

Nodes should maintain connections regardless of a peer’s bitlist state, except when peer slots are full. In this case, nodes may prioritize connections with peers that serve more relevant shards.

If a node is interested in querying a specific block span, it can use the bitlist to determine which peers are more likely to have the data. If the peer they are connected to does not have the data, they can disconnect and connect to another peer, until they find what they have been looking for.

ENR variation

This EIP could be implemented without a new protocol version by adding a new field to the Ethereum Node Record (ENR) to advertise the bitlist. This would allow nodes to advertise their available block spans without requiring a handshake. However, this approach would not provide the same level of assurance as the handshake, as the ENR is not authenticated. Also, it would make the discovery process much simpler, as nodes just would need to take a look at the ENR to determine if a peer has the data they are looking for.

Rationale


The bitlist approach provides a flexible means to represent and retain block data with support for data locality, especially as nodes probabilistically retain certain shards. This mechanism aligns with the pruning proposed in EIP-4444, while ensuring that historical data spans remain available across the network.

The bitlist approach is already used in the Consensus Layer for attestation subnets, making it a familiar and efficient method for representing data spans. Additionally, communicating shards this way is broadly used in many distributed systems, making it a natural fit for Ethereum.

Backwards Compatibility


This EIP introduces eth/70, extending the handshake in a backward-incompatible manner. However, devp2p allows concurrent versions of the same protocol, so nodes can continue using older protocol versions (e.g., eth/69, eth/68, eth/67) if they lack support for eth/70.

This EIP does not affect the consensus engine or require a hard fork.

Security Considerations


There are some considerations:

  • The size of the network is not taken into account, so for an extended history, shard sizes may need to increase. However, this is not a major issue for Ethereum, as many nodes are active on the network at all times.
  • The more blocks in the network, the more shards there will be, potentially diluting peers across multiple shards. This could make querying specific shards more challenging in the future. However, with sufficiently large shard sizes, this should not pose a significant problem. For instance, if Ethereum has 200,000,000 blocks and a shard size of 1,000,000 blocks, and a node has 32 peers, then the probability that a peer has a specific shard is around 15%, which is manageable.

Copyright


This document is CC0-licensed; rights are waived through CC0.