EIP-234: Add `blockHash` to JSON-RPC filter options.

Metadata

Status: FinalStandards Track: InterfaceCreated: 2017-03-24

Authors

Micah Zoltu (@MicahZoltu)

Requires

EIP-1474

Links

GitHub page Discussion on Ethereum Magicians

Simple Summary

Add an option to JSON-RPC filter options (used by eth_newFilter and eth_getLogs) that allows specifying the block hash that should be included in the results. This option would be an alternative to fromBlock/toBlock options.

Abstract

This addition would allow clients to fetch logs for specific blocks, whether those blocks were in the current main chain or not. This resolves some issues that make it difficult/expensive to author robust clients due to the nature of chain reorgs, unreliable network connections and the result set not containing enough details in the empty case.

Specification

The filter options used by eth_newFilter would have an additional optional parameter named blockHash whose value is a single block hash. The Ethereum node responding to the request would either send back an error if the block hash was not found or it would return the results matching the filter (per normal operation) constrained to the block provided. Internally, this would function (presumably) similar to the fromBlock and toBlock filter options.

Rationale

A client (dApp) who needs reliable notification of both log additions (on new blocks) and log removals (on chain reorgs) cannot achieve this while relying solely on subscriptions and filters. This is because a combination of a network or remote node failure during a reorg can result in the client getting out of sync with reality. An example of where this can happen with Websockets is when the client opens a web socket connection, sets up a log filter subscription, gets notified of some new logs, then loses the web socket connection, then (while disconnected) a re-org occurs, then the client connects back and establishes a new log filter. In this scenario they will not receive notification of the log removals from the node because they were disconnected when the removals were broadcast and the loss of their connection resulted in the node forgetting about their existence. A similar scenario can be concocted for HTTP clients where between polls for updates, the node goes down and comes back (resulting in loss of filter state) and a re-org also occurs between the same two polls.

In order to deal with this while still providing a robust mechanism for internal block/log additional/removal, the client can maintain a blockchain internally (last n blocks) and only subscribe/poll for new blocks. When a new block is received, the client can reconcile their internal model with the new block, potentially back-filling parents or rolling back/removing blocks from their internal model to get in sync with the node. This can account for any type of disconnect/reorg/outage scenario and also allows the client (as an added benefit) to talk to a cluster of Ethereum nodes (e.g., via round-robin) rather than being tightly coupled to a single node.

Once the user has a reliable stream of blocks, they can then look at the bloom filter for the new block and if the block may have logs of interest they can fetch the filtered logs for that block from the node. The problem that arises is that a re-org may occur between when the client receives the block and when the client fetches the logs for that block. Given the current set of filter options, the client can only ask for logs by block number. In this scenario, the logs they get back will be for a block that isn't the block they want the logs for and is instead for a block that was re-orged in (and may not be fully reconciled with the internal client state). This can be partially worked around by looking at the resulting logs themselves and identifying whether or not they are for the block hash requested. However, if the result set is an empty array (no logs fetched) then the client is in a situation where they don't know what block the results are for. The results could have been legitimately empty (bloom filter can yield false positives) for the block in question, or they could be receiving empty logs for a block that they don't know about. At this point, there is no decision the client can make that allows them a guarantee of recovery. They can assume the empty logs were for the correct block, but if they weren't then they will never try to fetch again. This creates a problem if the block was only transiently re-orged out because it may come back before the next block poll so the client will never witness the reorg. They can assume the empty logs were for the wrong block, and refetch them, but they may continue to get empty results putting them right back into the same situation.

By adding the ability to fetch logs by hash, the client can be guaranteed that if they get a result set, it is for the block in question. If they get an error, then they can take appropriate action (e.g., rollback that block client-side and re-fetch latest).

Backwards Compatibility

The only potential issue here is the fromBlock and toBlock fields. It wouldn't make sense to include both the hash and the number so it seems like fromBlock/toBlock should be mutually exclusive with blockHash.

Test Cases

{ "jsonrpc": "2.0", "id": 1, "method": "eth_getLogs", params: [{"blockHash": "0xbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0c"}] } should return all of the logs for the block with hash 0xbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0c. If a topics field is added to the filter options then a filtered set of logs for that block should be returned. If no block exists with that hash then an error should be returned with a code of -32000, a message of "Block not found." and a data of "0xbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0cbl0c".