EIP-7745: Trustless log and transaction index

An efficient, light client and DHT friendly replacement for block header bloom filters


Metadata
Status: Draft
Standards Track: Core
Created: 2024-07-17
Authors
Zsolt Felföldi (@zsfelfoldi)
Requires

Abstract


Replace the fixed 2048 bit log event bloom filters with a new lookup index data structure that can adapt to the changing number of events per block and consistently guarantee a sufficiently low false positive ratio, allowing efficient trustless proofs of log event queries, canonical block hash and transaction hash lookups.

The proposed structure maps all index entries (log events, transaction and block markers) onto a global linear index space and hashes them into a binary Merkle tree based on that index. It also contains a filter map for every fixed length section of the index space. These are two dimensional sparse bit maps that provide an efficient probabilistic method for looking up indexed values or query patterns of values, yielding potential matches in the form of exact positions in the linear index space. Unlike the per-block bloom filters, they allow searching for specific events by accessing only a small portion of the entire dataset which can also be proven with a Merkle proof, making the search both efficient and light client friendly.

The proposed structure can be efficiently used both for local search and for remote proof generation/verification, thereby simplifying implementation of provers and verifiers. It also allows validators that are not interested in either searching or proving logs to generate the index root hash by maintaining a minimal index state with a relatively small (hard capped) size.

Motivation


Adding logs has a significantly lower gas cost and should accordingly be less resource consuming than writing to the state. The original design of per-block bloom filters achieves this goal as there is no complex data structure like the state to update; the set of logs emitted in each block is entirely contained within the header and receipts belonging to that block. Logs mostly just have long term storage costs. On the other hand, searching logs over a long range of blocks is very expensive and the current bloom filters do not really help.

Bloom filters are only useful as long as they are sufficiently sparse. The false positive ratio rises rapidly with the number of events per filter and the density of 1 bits in the filter bit vector. In the current bloom filter each log address and topic sets 3 out of a fixed 2048 bits, which resulted in sufficiently sparse filters in the beginning, but with the increase of block gas limits the false positive ratio soon made the filter practically useless. Mainnet blocks currently add over 1000 log addresses and topics on average, so the bloom filter size would need to increase about tenfold in order to achieve acceptable false positive rates again. This would raise the block header size to about 3 kilobytes. Even if the per-block bloom filters were enlarged to a sufficient size, log search would still require accessing the entire header chain. Searching just the most recent one year of history would cost over 6 gigabytes of data, not counting the access of the actual logs where the bloom filters have a match. The current situation is even worse, requiring a large portion of the full block receipt sets to be accessed due to the high false positive rate of the bloom filters. Transaction inclusion lookup is also very expensive, requiring all block bodies in the searched range.

While full nodes and RPC servers can and usually do build additional index structures to aid certain lookups, these structures cannot be verified remotely. While it is important to improve the transaction processing capabilities of Ethereum, trustless provability is required throughout the entire stack, from end user applications signing transactions, to the same or other end user applications getting the results they need. Scaling transaction processing only makes sense as long as there is also a trustless and scalable way to access the relevant results of those transactions. Relying on trusted servers and indexers defeats the whole point of Ethereum.

Specification


Terms and definitions

  • index entry: a single entry in the index_entries tree associated with an indexed event. It is either a log entry, a transaction entry or a block entry. An index entry generates one or more map values. Each log entry adds one address value and 0..4 topic values (in this order). Each transaction entry adds one transaction value and each block entry adds one block value.
  • map value: a searchable value associated with an index entry. It is either an address value, a topic value, a transaction value or a block value. Each map value is represented by a 32 byte hash (the map value hash) which is calculated as sha2(address), sha2(topic), sha2(tx_hash + b"\x01") or sha2(block_hash + b"\x02") (see the sketch following this list).
  • map value index: a single position in the global linear index space. A new map value index is assigned to each indexed map value.
  • map entry index: the position where an index entry is located in the index_entries tree. Each index entry is located at the position corresponding to the map value index of the first map value it generates.
  • filter map: a MAP_WIDTH by MAP_HEIGHT sized sparse bit map assigned to a VALUES_PER_MAP length section of the map value index space. Each map value assigned to this range is marked on the map at a row and column that depends on the map value index, mapping layer and the map value hash. Rows are sparsely encoded as a list of marked column indices (in strictly ascending order, which also coincides with the order of occurrence). Each map contains at most VALUES_PER_MAP marks and therefore the chance of false positives is kept at a constant low level.
  • mapping layer: each layer defines a (map index, map value hash) => row index mapping and imposes a limit on the number of map values mapped into each row using that particular mapping. Multiple mapping layers can be applied simultaneously on the same filter map. Each map value is mapped using the lowest possible mapping layer where the number of marks in the assigned row is lower than the limit. On layer 0 the row index mapping only changes once per index epoch. On higher layers the mapping changes more frequently and also the row size limit is higher.
  • index epoch: a MAPS_PER_EPOCH sized group of consecutive filter maps stored in the hash tree in a way so that multiple rows of adjacent filter maps with the same row index can be efficiently retrieved in a single Merkle multiproof.
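As a minimal illustration of the map value hash derivation defined above (assuming sha2 denotes SHA-256; the helper names are ours and not taken from the reference implementation):

```python
import hashlib

def sha2(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

# Map value hashes as defined above; the suffix bytes distinguish
# transaction and block values from address and topic values.
def address_value(address: bytes) -> bytes:        # 20 byte log address
    return sha2(address)

def topic_value(topic: bytes) -> bytes:            # 32 byte log topic
    return sha2(topic)

def transaction_value(tx_hash: bytes) -> bytes:    # 32 byte transaction hash
    return sha2(tx_hash + b"\x01")

def block_value(block_hash: bytes) -> bytes:       # 32 byte block hash
    return sha2(block_hash + b"\x02")
```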

Consensus data format

Block headers

Beginning at the execution timestamp FORK_TIMESTAMP, execution clients MUST replace the logs_bloom field of the header schema with log_index_root which is the root hash of the LogIndex structure after adding the logs emitted in the given block.

The resulting RLP encoding of the header is therefore:
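The exact field list depends on which forks are active at FORK_TIMESTAMP; the sketch below assumes the current (post-Prague) header layout, with log_index_root occupying the slot previously held by logs_bloom:

```
rlp([
    parent_hash,
    ommers_hash,
    coinbase,
    state_root,
    transactions_root,
    receipts_root,
    log_index_root,  # replaces logs_bloom
    difficulty,
    number,
    gas_limit,
    gas_used,
    timestamp,
    extra_data,
    prev_randao,
    nonce,
    base_fee_per_gas,
    withdrawals_root,
    blob_gas_used,
    excess_blob_gas,
    parent_beacon_block_root,
    requests_hash,
])
```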


Container types

These definitions use the ProgressiveList and ProgressiveByteList SSZ types as defined in EIP-7916.

Note that the LogIndex and IndexEpoch container definitions below are "naive" definitions suitable to define SSZ merkleization. The entire log index is typically too big to be represented in memory as a whole. A more efficient implementation can be found here: log_index.py, binary_tree.py (taken from the EELS implementation). This code keeps only the currently processed parts of the tree in memory, expanding untouched empty subtrees on demand and collapsing finalized subtrees.

Also note that the containers defined here (including FilterRow) have a non-initialized default value, represented as a null leaf in the Merkle tree. Initialization of these containers is explicit and happens at a well defined point. IndexEpoch is initialized when the first index entry is added to the epoch. All rows of a map are initialized when the first index entry is added to the map. IndexEntry is initialized when added; entry positions belonging to unused map entry indices are left uninitialized. The Log container is only initialized in case of log entries. If it is initialized then topics and data lists in the Log container are always initialized.


The entry_meta vector has a fixed format but its interpretation depends on the type of index entry, and the type of entry can also be determined based on its contents (and based on whether log_entry is initialized or not). Whenever IndexEntry is initialized, entry_meta contains one of these containers:


Index entry types and corresponding map values

Three types of index entries are added during state transition: log, transaction and block. For each processed transaction, a transaction entry is added first, then log entries are added in the order of EVM execution. The block entry referring to the processed block is not added during the state transition because it is supposed to reference the block hash which is not known yet. It can be added either after block building/validation or at the beginning of the next state transition. In either case, the block entry of block N first appears in the log index state of block N+1 as the first new index entry.

The following table shows an example of mapping all index entry types onto the map value index space:

| Index entry | Map entry index | Map values | Map value indices |
|-------------|-----------------|------------|-------------------|
| Block #0 | 0 | Block | 0 |
| Tx #1/0 | 1 | Transaction | 1 |
| Log #1/0/0 | 2 | Addr, 3x Topic | 2, 3, 4, 5 |
| Log #1/0/1 | 6 | Addr, 3x Topic | 6, 7, 8, 9 |
| Tx #1/1 | 10 | Transaction | 10 |
| Log #1/1/0 | 11 | Addr, 2x Topic | 11, 12, 13 |
| Log #1/1/1 | 14 | Addr, Topic | 14, 15 |
| Log #1/1/2 | 16 | Addr, 2x Topic | 16, 17, 18 |
| Block #1 | 19 | Block | 19 |
| Tx #2/0 | 20 | Transaction | 20 |
| Log #2/0/0 | 21 | Addr, 4x Topic | 21, 22, 23, 24, 25 |

Note that the missing map entry indices are represented in the index_entries tree with uninitialized IndexEntry containers (zero value leaves in the index_entries vector). Most of these empty entries correspond to the topic value indices. Also, the last few entries of each map might be empty because the map values generated by a single index entry never cross a filter map boundary. So, for example, when VALUES_PER_MAP is 0x10000 and next_index is 0x2FFFE and a log entry with one address and two topics is added, the last two positions of the map are left empty and the new log entry is added starting from position 0x30000.
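A sketch of this boundary rule, assuming a next_index counter tracking the next free position in the map value index space (names are illustrative, not taken from the reference implementation):

```python
VALUES_PER_MAP = 2**16

def assign_value_indices(next_index: int, value_count: int) -> tuple[int, list[int]]:
    # If the values generated by one index entry would cross a filter map
    # boundary, skip ahead to the start of the next map and leave the last
    # few positions of the current map unused.
    if next_index // VALUES_PER_MAP != (next_index + value_count - 1) // VALUES_PER_MAP:
        next_index = (next_index // VALUES_PER_MAP + 1) * VALUES_PER_MAP
    indices = list(range(next_index, next_index + value_count))
    return next_index + value_count, indices

# Example from the text: a log entry with one address and two topics added
# at next_index == 0x2FFFE starts at 0x30000, leaving two positions empty.
assert assign_value_indices(0x2FFFE, 3)[1] == [0x30000, 0x30001, 0x30002]
```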

Filter map row encoding

Each row of the filter map is encoded as a series of little endian binary encoded column indices. For encoding simplicity the column indices are always encoded as 32 bit values, even though MAP_WIDTH is less than 2**32 and the indices would fit into fewer bytes. This only applies to consensus encoding and hashing though; local storage and proof encoding can use a more tightly packed format.
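A sketch of the consensus row encoding (illustrative helper names):

```python
MAP_WIDTH = 2**24

def encode_row(column_indices: list[int]) -> bytes:
    # Consensus encoding: every column index as a 32 bit little endian value,
    # even though MAP_WIDTH < 2**32 would allow a tighter packing.
    assert all(0 <= c < MAP_WIDTH for c in column_indices)
    return b"".join(c.to_bytes(4, "little") for c in column_indices)

def decode_row(encoded: bytes) -> list[int]:
    assert len(encoded) % 4 == 0
    return [int.from_bytes(encoded[i:i + 4], "little") for i in range(0, len(encoded), 4)]
```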

Proposed constants

| Name | Value |
|------|-------|
| MAP_WIDTH | 2**24 |
| MAP_HEIGHT | 2**16 |
| VALUES_PER_MAP | 2**16 |
| MAPS_PER_EPOCH | 2**10 |
| MAX_EPOCH_HISTORY | 2**24 |
| MAX_ROW_LENGTH | [8, 168, 2728, 10920] |
| MAPPING_FREQUENCY | [2**10, 2**6, 2**2, 1] |

Note that MAX_ROW_LENGTH and MAPPING_FREQUENCY parameters are specified as lists because the applicable values depend on the used mapping layer. For layers higher than the index of the last elements of these lists the last element is applicable. Also note that the MAX_ROW_LENGTH values correspond to the cumulative capacity of ProgressiveList tree levels; a layer 0 row fits into the first level, a layer 1 row fits into the first three levels and so on.
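The correspondence with ProgressiveList level capacities can be checked with a few lines of arithmetic, assuming the 1, 4, 16, ... chunk-per-level progression of EIP-7916 and 8 four-byte column indices per 32 byte chunk:

```python
ELEMENTS_PER_CHUNK = 32 // 4   # 8 column indices per 32 byte chunk
cumulative = []
chunks = 0
for level in range(6):
    chunks += 4 ** level       # levels hold 1, 4, 16, 64, ... chunks
    cumulative.append(chunks * ELEMENTS_PER_CHUNK)

print(cumulative)              # [8, 40, 168, 680, 2728, 10920]
# MAX_ROW_LENGTH = [8, 168, 2728, 10920] matches the cumulative capacity
# after 1, 3, 5 and 6 levels respectively.
```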

Constructing the filter map

For each VALUES_PER_MAP long section of the map value index space a filter map is generated. These are fixed size MAP_WIDTH by MAP_HEIGHT sparse bit maps and each map value is marked on the map with a single bit being set to one. The number of marks in a row (the length of the sparse encoded row) is referred to as "row length", not to be confused with the constant MAP_WIDTH.

Epochs and mapping layers

In order to allow efficient search of a certain map value in a long historical range, filter maps are organized into index epochs, each consisting of a fixed MAPS_PER_EPOCH number of maps. In the most usual case (when row density is around average or below) row mapping stays the same during an entire epoch. The hash tree is organized in a way that instead of putting all rows of a single map close to each other, rows of the same row index throughout an entire epoch are close to each other and therefore are cheap to access and/or prove with a Merkle proof.

In order to mitigate collisions in densely populated rows, the concept of mapping layers is introduced: if a certain row already has a certain number of entries then the row mapping is changed and very frequent map values are mapped into multiple rows. Initially, when a map is empty, every map value is mapped using mapping layer 0 or the "base layer". If a row reaches MAX_ROW_LENGTH[0] then any further map values mapped onto that row in the base layer mapping will use layer 1 mapping instead. On layer 1 a different row is assigned to the same map value and the row length limit is increased to MAX_ROW_LENGTH[1]. If this row also reaches its limit, layer 2 mapping is used and so on. Note that a row filled at a certain mapping layer can be grown further on a higher layer. Different map values colliding in the same row on a certain layer are probably mapped into different rows on the next layer, which means that a very popular value might populate multiple rows (also very long ones) but an unlucky, less popular one colliding with it on the base layer will probably just have to move one layer up. The search process is similar: if the searcher finds that the row belonging to the searched value is full according to the base layer limit then it also has to check the next layer and so on, until it finds a non-full row.

If a row is longer than the limit according to the layer the searcher is looking at then it can safely ignore the extra entries assuming that they were added by another value on a higher layer. The ProgressiveList container makes it efficient to prove row data belonging to the lower layer even if there is much more data in the same row added on a higher layer.
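The layer selection described above can be sketched as follows; get_row_length and get_row_index stand in for the row lookup and the row mapping of the next subsection, and all names are illustrative:

```python
MAX_ROW_LENGTH = [8, 168, 2728, 10920]

def row_limit(layer: int) -> int:
    # For layers beyond the end of the list the last element applies.
    return MAX_ROW_LENGTH[min(layer, len(MAX_ROW_LENGTH) - 1)]

def select_row(map_index: int, value_hash: bytes, get_row_length, get_row_index) -> tuple[int, int]:
    # Use the lowest mapping layer whose assigned row is not yet full;
    # a row filled at one layer can still grow further on a higher layer.
    layer = 0
    while True:
        row_index = get_row_index(value_hash, map_index, layer)
        if get_row_length(map_index, row_index) < row_limit(layer):
            return layer, row_index
        layer += 1
```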

Row mapping

While the base layer row mapping stays the same for an entire epoch, higher layer mappings change more frequently. Each mapping change has a cost in terms of database access overhead and Merkle proof size overhead, and the epoch size is chosen so that these overheads stay sufficiently low compared to the cost of accessing the actual useful data. On higher layers where the rows are longer, more frequent remapping is affordable because the useful data size per map is also higher. It is also desirable because a less frequent map value will then only suffer from colliding with a more popular one for a shorter time.

The row index is calculated as follows:
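The normative definition lives in log_index.py; the sketch below is one plausible formulation consistent with the constants above, where the row depends on the map index only through the mapping period of the given layer, so it stays constant for MAPPING_FREQUENCY[layer] consecutive maps (the exact hash input layout is an assumption):

```python
import hashlib

MAP_HEIGHT = 2**16
MAPPING_FREQUENCY = [2**10, 2**6, 2**2, 1]

def get_row_index(value_hash: bytes, map_index: int, layer: int) -> int:
    # On layer 0 the mapping changes once per epoch (every 2**10 maps),
    # on higher layers it changes more frequently.
    freq = MAPPING_FREQUENCY[min(layer, len(MAPPING_FREQUENCY) - 1)]
    period_start = map_index - map_index % freq
    h = hashlib.sha256(
        value_hash + layer.to_bytes(4, "little") + period_start.to_bytes(4, "little")
    ).digest()
    return int.from_bytes(h[:4], "little") % MAP_HEIGHT
```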


The following figure shows how map values are mapped to rows on different mapping layers. Each dot represents a map row and the numbers indicate the mapping layer on which the row has been assigned to the given map value. Note that it might happen that a higher layer mapping coincides with a lower layer mapping for the same value; this does not cause any problem though as the row is simply grown further on the higher layer. The search algorithm can also simply revisit the same row in a higher layer iteration if necessary and process the rest of the row that it first ignored.


Column mapping

Column mapping assumes that MAP_WIDTH is a multiple of VALUES_PER_MAP. The column index is calculated as follows:
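As with the row mapping, the normative definition is in log_index.py; the sketch below reflects the structure described in the next paragraph: the high bits of the column index encode the position of the value within the map, while the low bits carry a quasi-random collision filter derived from the map value and its exact index (the hash input layout is an assumption):

```python
import hashlib

MAP_WIDTH = 2**24
VALUES_PER_MAP = 2**16
COLUMNS_PER_VALUE = MAP_WIDTH // VALUES_PER_MAP   # width of each value's own "rectangle"

def collision_filter(value_hash: bytes, value_index: int) -> int:
    # Quasi-random part, different for every (map value, map value index) pair.
    h = hashlib.sha256(value_hash + value_index.to_bytes(8, "little")).digest()
    return int.from_bytes(h[:4], "little") % COLUMNS_PER_VALUE

def get_column_index(value_hash: bytes, value_index: int) -> int:
    # Position of the value inside its map, scaled up, plus the collision
    # filter; the map value index stays recoverable from map and column index.
    return (value_index % VALUES_PER_MAP) * COLUMNS_PER_VALUE + collision_filter(value_hash, value_index)
```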


As shown in the figure below, this mapping practically assigns a MAP_WIDTH // VALUES_PER_MAP by MAP_HEIGHT rectangle to each map value index and ensures that each map value places exactly one mark in its own rectangle (the letters A-D represent different map values). This property also ensures that the map value index can be restored from the map index and column index; column indices never collide and keep the original order of map value indices, allowing efficient Merkle exclusion proofs of certain column indices in long rows.


Updating the log index and calculating the root hash

See the following functions in log_index.py:

  • log_index_add_log_entries
  • log_index_add_transaction_entry
  • log_index_add_block_entry
  • log_index_root

Note that the block entry may be added either after building/verifying the given block or before processing the next one, but log_index_root should always be calculated after adding transaction and log entries and before adding the block entry for the same block (block building would not even be possible otherwise as the log index root is referenced in the block header).

Finding potential matches

Determining whether a column index found in the appropriate row is relevant for the searched map value is possible by restoring the map value index and then calculating the column index from it again in order to check whether the quasi-random collision filter part matches the expected value for the given map value:
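A sketch of this check, reusing the illustrative get_column_index helper from the column mapping sketch above:

```python
VALUES_PER_MAP = 2**16
COLUMNS_PER_VALUE = 2**24 // VALUES_PER_MAP   # MAP_WIDTH // VALUES_PER_MAP

def restore_value_index(map_index: int, column_index: int) -> int:
    # Invert the position part of the column mapping.
    return map_index * VALUES_PER_MAP + column_index // COLUMNS_PER_VALUE

def is_potential_match(map_index: int, column_index: int, value_hash: bytes) -> bool:
    # Recompute the column index the searched value would have produced at the
    # restored position; only matching collision filter parts survive.
    return get_column_index(value_hash, restore_value_index(map_index, column_index)) == column_index
```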


Iterating through all relevant mapping layers and corresponding rows is similar to how new map values are added. Filtering all potential matches from all relevant rows can be done with the following function:
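The reference function is part of log_index.py; a simplified illustrative sketch, building on the helpers of the previous sketches (get_row returns the decoded row for a given map and row index):

```python
def potential_matches(map_index: int, value_hash: bytes, get_row) -> list[int]:
    # Walk up the mapping layers just like the insertion logic: marks beyond the
    # current layer's limit are ignored (they belong to other values mapped here
    # on higher layers) and the iteration stops at the first non-full row.
    matches, layer = set(), 0
    while True:
        row_index = get_row_index(value_hash, map_index, layer)
        row = get_row(map_index, row_index)
        limit = row_limit(layer)
        for column_index in row[:limit]:
            if is_potential_match(map_index, column_index, value_hash):
                matches.add(restore_value_index(map_index, column_index))
        if len(row) < limit:
            return sorted(matches)
        layer += 1
```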


Initialization, minimal state and wire protocol extension

The log index tree has the property that over time most of the tree nodes get finalized, which also means that they are no longer needed in order to be able to add new entries. At every single moment there is one filter map being updated, and the few most recently completed maps might still be affected by a reorg, but if the chain finalizes properly then all but the few most recent maps are also finalized. Internal tree nodes with all finalized children are also finalized and can be collapsed into the lowest finalized ancestor. Similarly, internal nodes covering a range that has not been touched yet can be stored without having their children in memory, and they can be expanded on demand.

Certain types of clients (those only interested in validation and not in serving users) have the option to just keep a minimal log index state. Those who are interested in event history should probably also only keep the minimal state in memory and store the finalized and collapsed subtrees on disk. Clients can also initialize their log index with such a minimal state and then optionally regenerate the older parts from blocks and receipts locally. When the log index state of a finalized block is represented and no room for rollback is needed, the minimal tree representation simplifies into a set of Merkle branches leading to the map rows and log entry corresponding to next_index (with everything on the left side of these branches collapsed and nothing on their right side expanded yet). The LogIndexMinimalState container contains this minimal amount of information required to add further events (but not to roll back). Once the log index is added to consensus, the Ethereum wire protocol should also be extended with a message pair that requests/sends this container as a snapshot of the log index after the latest finalized block (in order to be able to serve this data, clients have to save a snapshot at each beacon chain epoch boundary and keep it until it becomes finalized).


Assuming an efficient encoding of the current filter_map_rows where the variable length of each filter row is encoded in two bytes and the column indices tightly packed into 3 bytes each, the minimal log index state can be serialized into 21300808 bytes. This number can be considered as a worst case estimate of the amount of data required to initialize the log index data structure at any point.

Note that initialization is also possible on an epoch boundary, in which case the log index is very cheap to initialize (only requires the epoch branch and the last block entry branch from the last epoch) but in this case all blocks and receipts between the boundary and the current head are required to generate the last unfinished epoch.

Rationale


Design goals

The proposed data structure is intended to realize a balance between the cost of adding items and accessing old ones. In a search structure over a constantly growing dataset there is typically a tradeoff between the cost of adding new data and the cost of searching the existing dataset. One extreme is just linearly storing the data, which is practically the case now with logs, with the bloom filters being mostly useless. The other extreme is one big Merkle tree with all map values ever used as keys and the list of all occurrences (possibly in a further merkleized format) as values. With billions of unique map values, adding new entries here is expected to have costs similar to that of the state, with multiple lookups and modifications/insertions at random places in a database on the order of magnitude of hundreds of gigabytes. Another issue where this is similar to the state is that removing old entries is hard and expensive. Adding logs is supposed to be cheaper than writing the state, so solutions between these two extremes were considered as potentially practical, with multiple smaller structures generated periodically.

The filter maps have a fixed tree size and an efficient tree hashing scheme. Filter entries are sorted into rows based on content and position in a way that allows quick linear database access and size efficient Merkle proofs. The difficulties arising from certain types of events being much more frequent than others are also mitigated. Update and maintenance costs are also limited as tree nodes are eventually finalized and the number of non-finalized non-empty nodes is always hard capped, ensuring moderate memory requirements. Initialization costs of the data structure at any point of the chain are also capped. The additional database storage cost of filter maps is about 15-20% of the size of the actual logs, depending on certain implementation tradeoffs. The Merkle tree structure also makes it easy to discard entire epochs along with the corresponding Merkle subtrees, making the implementation of history expiry of the log index simple.

Map value index space

In each block a varying number of map values are emitted. In addition to inefficient search, another drawback of per-block fixed size bloom filters is the varying filter utilization leading to over-utilized filters giving many false positives in some blocks and/or wastefully under-utilized filters in some blocks. Block gas limits also tend to change significantly over the long term so any future-proof solution has to be able to adapt to the varying number of map values per block.

Mapping map values on their own linear index space ensures uniform filter utilization of identically structured filter maps. Compared to the alternative of constantly changing adaptive filter size, this approach greatly improves the efficiency of database access and Merkle proofs. It also allows efficient search pattern filtering as each potential match provides exact position information.

Index entries tree

Hashing every event into the index_entries tree keyed by map entry index is additional work and complexity but it simplifies the proving process significantly. Proving the events based only on map value indices would be possible if receipts had a map value index pointer hashed into them (so the client could also verify that the received receipt actually covers the map value index derived from the filter map); in this case though an extra mechanism would be necessary to prove that the receipt is referenced in a block that is part of the same canonical chain where the log index root was referenced. Proving old canonical blocks is currently theoretically possible through the beacon chain but painful and would also require extra infrastructure. On the other hand, the index_entries tree not only solves the problem of proving the matching canonical receipts; with the block entries it also provides a convenient way to prove canonical EL blocks through EL data structures only, and even provides provable canonical block hash to block number lookups.

Note though that storing the entire index_entries trees directly in their proposed merkleized format on disk is not really efficient. This is not an issue for validators that want to maintain a minimal state; for them, updating the index_entries subtrees is really cheap as they only need to maintain a single Merkle branch pointing to the next map value index. Provers can implement index_entries efficiently by storing a subset of the Merkle tree nodes (the ones at least a few levels above leaf level in order to save space) and generating the rest on demand based on the receipts that encode the same data in a more compact form.

Alternative filter structures considered

One question considered was whether to periodically build lookup trees of events with each unique map value emitted in a given period being a separate key under which only the occurrences of that specific event are listed, or to use a more compressed fixed size tree format where different map values might collide (though preferably not too many of them). This decision mostly boiled down to data access efficiency, both in terms of local disk access and remote Merkle proof size. Identically structured trees can be efficiently arranged in larger units (called index epochs here), with values belonging to the same key in subsequent trees of an epoch located close to each other. This improves database access speed. It also allows smaller Merkle proofs with a series of leaves encoded together in an efficient format and internal nodes on only two boundary branches. Database writes are also efficient as the order of adding tree entries is not random and all the non-finalized parts of the tree can be kept in memory with a hard capped memory requirement.

The other design decision considered here was whether to hash entire index entries into the list of map value occurrences or to just store position info and have a separate tree of log entries. Though the separate filter_maps and index_entries structures do present some additional complexity, the second option was chosen because of the size of Merkle proofs. Looking up a rarely used map value means encountering more irrelevant entries than relevant ones. By adding probabilistic filter information to position information, the vast majority of the irrelevant entries can be filtered out after accessing/proving 3 bytes of data instead of an entire map value (32 bytes). When proving matches of multiple map value patterns this 10x advantage exists even with more frequent individual lookup values. Tests have shown that realistic log searches often yield a lot more matches for the individual map values themselves than for the pattern itself. Pattern matching can be performed on the position information of individual potential matches and actual index entries only need to be proven at the potential pattern matches.

In conclusion, for the given application the fixed tree size approach with separate position info plus a probabilistic collision filter seemed to be the most appropriate.

False positive rate

From the filter maps a set of potential matches can be derived for any block range and map value or pattern of map values. These matches can then be looked up in the corresponding index_entries trees and actually matching events can be added to the set of results. The design guarantees that the set of potential matches includes all actual matches while also having a consistently low false positive rate.

False positives can happen when the quasi-random collision filter part of a column index accidentally matches the expected value even though it was generated by a map value other than the searched one. The chance of this happening is VALUES_PER_MAP / MAP_WIDTH per colliding entry in a row that is relevant for the search. Assuming that most entries in a map are different from the searched one and assuming a uniform random distribution of entries, the average number of colliding entries found in a relevant row is VALUES_PER_MAP / MAP_HEIGHT, giving an average false positive rate of VALUES_PER_MAP ** 2 / MAP_WIDTH / MAP_HEIGHT per filter map.

Though certain map values might be emitted a lot more than others and therefore the row index distribution might not be entirely uniform, periodical remapping of rows and using multiple mapping layers ensures that over a long enough search period random collisions with more frequent map values do even out. Mapping layers do have another consequence though; if any row has at least MAX_ROW_LENGTH[0] entries then the search logic requires looking into another row that is mapped to the searched map value on the next mapping layer. The maximum possible number of such rows is VALUES_PER_MAP / MAX_ROW_LENGTH[0] and therefore the chance of randomly hitting one is VALUES_PER_MAP / MAX_ROW_LENGTH[0] / MAP_HEIGHT in the worst case. In this case an extra row has to be processed, with extra chance of finding false positives. A collision with a frequent value at a certain mapping layer does not indicate a collision on the next layer though, therefore the expected number of entries in that row is no different from the first one. Having to process a third row would presume that the second one had at least MAX_ROW_LENGTH[1] entries. The chance of this happening after the first coincidence is practically negligible in the context of expected false positives.

The expected number of false positives for a single map value search can be estimated as VALUES_PER_MAP ** 2 / MAP_WIDTH / MAP_HEIGHT * (1 + VALUES_PER_MAP / MAX_ROW_LENGTH[0] / MAP_HEIGHT) per filter map. With the proposed constants this roughly equals 0.0044 false positives per map. As of March 2025 the average number of map values emitted in a mainnet block is slightly over 1000 while a filter map consists of 65536 map values. This gives a rough estimate of one false positive per 14000 blocks, which costs the searcher an extra lookup in an index_entries tree. The expected number of false positives in the entire chain history is around 1200.
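The numbers above follow directly from the proposed constants:

```python
MAP_WIDTH, MAP_HEIGHT, VALUES_PER_MAP = 2**24, 2**16, 2**16
MAX_ROW_LENGTH_0 = 8

base_rate = VALUES_PER_MAP**2 / MAP_WIDTH / MAP_HEIGHT      # 2**-8 ~ 0.0039 per map
extra_row = VALUES_PER_MAP / MAX_ROW_LENGTH_0 / MAP_HEIGHT  # 0.125 chance of a second relevant row
per_map = base_rate * (1 + extra_row)                       # ~0.0044 false positives per map
per_block = per_map * 1000 / VALUES_PER_MAP                 # ~1000 map values per mainnet block
print(per_map, 1 / per_block)                               # ~0.0044, on the order of one false positive per 14-15 thousand blocks
```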

Note that this is only true for a single value search while a typical pattern search requiring certain values on multiple positions has an exponentially lower false positive rate. For example if the pattern is [Addr, Topic1, Topic2] then three map value searches are performed and an actual log lookup is only necessary if the first search yields N, the second N+1 and the third N+2 simultaneously. If necessary, the rate can be easily reduced by using a higher MAP_WIDTH, at the cost of growing the size of encoded rows.

Backwards Compatibility


The existing log filter API (eth_getLogs, eth_newFilter, eth_getFilterLogs, eth_getFilterChanges) can be implemented with the new filter data structure. Applications relying on this API can operate without any change, benefiting from higher search performance. Repricing the LOG opcode might be considered after performing benchmarks, but the extra processing cost is not significant while the extra storage cost is around 15%. Other than that the EVM is not affected in any way as it only emits logs but does not directly access them.

Security Considerations


Safe access with a remote prover

In order to prove a complete set of matches matching a given search pattern in a given block range, the prover needs to

  • prove the map value index range that corresponds to the searched block number range by proving the block entries of first_block - 1 and last_block
  • prove the relevant rows of filter maps based on map index and row index (verifier can determine the relevant rows in advance based on the map values in the search pattern and the relevant map value index range)
  • prove the actual index entry belonging to any potentially matching map value index and also the block entry with the same block number if the blockHash of the log position info is needed

Since all three steps can be realized with Merkle proofs of the same LogIndex structure referenced in the block headers, any search with a remote prover is as safe as the client's knowledge about the chain head.

Deliberate false positive attacks

The design guarantees that false positive rates even out statistically over several epochs, even in case of random collisions with very frequent values, ensuring that an excessive amount of false positives will not make the bandwidth and processing costs of the search prohibitively high. All of this is true for random collisions only though, not deliberately created collisions. A deliberate attack on a certain important map value in order to raise its false positive rate cannot be ruled out entirely since, with a low amount of filter data generated per map value, it is always possible to "mine" another value that generates colliding filter data. The column mapping used here makes this attack a lot harder though, since the column index depends on both the map value and the exact map value index, making this attack only possible for block builders, who are probably offered MEV rewards for much more lucrative manipulations of the transaction set than making the search of certain events slightly more expensive.

Copyright


Copyright and related rights waived via CC0.