ERC-7841: Cross-chain Message Format and Mailbox

A mailbox API and message format to provide a unified cross-chain developer experience

Metadata

Status: Draft
Standards Track: ERC
Created: 2024-12-12

Authors

Ellie Davidson (@elliedavidson), Alex Xiong (@alxiong), Philippe Camacho (@philippecamacho), and Ben Fisch (@benafisch)

Links

GitHub page Discussion on Ethereum Magicians

Abstract

This ERC proposes a mailbox API and message format for sending and receiving data between L2s. This standard makes it easier for developers to build cross-chain applications that work over a variety of VMs, chain settlement mechanisms (e.g., ZK or optimistic settlement), and messaging protocols (e.g., synchronous or asynchronous protocols). It does so by (1) defining a mailbox interface through which cross-chain messages can be sent and received independent of their payload; (2) defining a VM-agnostic message format and address type to make the interface compatible with many VMs; and (3) keeping the mailbox interface minimal to make it compatible with many kinds of cross-chain communication. Example applications include an intent-based bridge, a liquidity-unifying DEX operating across multiple chains, and a cross-chain lending application.

Motivation

L2s have scaled Ethereum and unlocked new avenues for innovation, but they have left the ecosystem fragmented. A variety of cross-chain communication protocols aim to make L2s composable with each other, but each implements its own message format, which is incompatible with the others. This ERC proposes a neutral, standard format for sending and receiving cross-chain messages. By standardizing the interface that chains use for messaging, we achieve:

Unified developer experience: This standard abstracts away the low-level details of message passing from applications. This allows application developers to achieve the following, even among chains with different VMs, coordination protocols, or settlement logic:
- send/receive messages to/from many chains using the same interface.
- deploy their application across multiple chains with little-to-no code changes.
- focus on their application’s design instead of cross-chain infrastructure.
Modularity: This ERC standardizes only the low-level information required for sending and receiving messages between chains, similar to the Internet Protocol. This cleanly separates the interface for sending/receiving messages (this ERC) from the coordination protocol or settlement mechanism. Chains can therefore adopt this standard with minimal changes and keep the flexibility to choose the specific protocols they need, instead of all chains being forced to agree on a single coordination protocol or settlement mechanism.
Shared Infrastructure: This standard allows applications and chains to reuse/repurpose infrastructure for different use cases. Applications can leverage existing library contracts for common operations such as encoding the message payload for a token transfer, while relayer networks can serve multiple purposes without significant modifications. This shared foundation simplifies the development and deployment of new chains, applications, and protocols.

Specification

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 and RFC 8174.

Terminology

A Messaging Protocol consists of mailbox contracts and a coordination (sub)protocol. A chain sends a message through a transaction that writes to its outbox and receives a message through a transaction that reads from the inbox.
Coordination Protocol: A mechanism that relays messages between source and destination mailboxes, commonly through off-chain mechanisms. Examples include shared sequencers or intent relayers. A settlement mechanism that ensures the integrity of cross-chain messages is often part of a coordination protocol, although such a mechanism is not always required for asynchronous messaging.
In synchronous messaging protocols, chains have synchronized blocks (e.g., a block for each chain is produced every t seconds) and messages are received on the destination chain within the same block timeslot in which they were sent from the source chain. In particular, a single transaction on one chain may send a message and read the response within the same block.
In asynchronous messaging protocols the restrictions of synchronous protocols do not apply. In particular, the chains sending and receiving messages may not have synchronized block timeslots and there may be a delay (measured in elapsed blocks) between the transaction on the source chain that sends a message and the transaction on the destination chain in which it is received.

There is a wide range of ways in which both synchronous and asynchronous protocols may operate. For example, some protocols (particularly for asynchronous messaging) may require the source chain's block in which the message was sent to be finalized (e.g., settled on Ethereum) before it can be included within a block produced for the destination chain. Other protocols may allow for messages to be optimistically included in blocks with consistency checks delayed until settlement time (reminiscent of speculative execution).

Message Format

All messages MUST follow this format:

Implementations SHOULD use a global rollup registry service that supports registration, deregistration, and efficient lookup of a rollup's chain ID. Such a registry is outside the scope of this ERC, however. Our standard is compatible with any ERC defining chain-specific addresses to better display the sender and receiver of a Message.
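The struct definition that belongs under "All messages MUST follow this format:" is not present in this copy. The sketch below is only a hypothetical model of what such a message might carry, assembled from fields mentioned elsewhere in this document (bytes32 sender and recipient, chain IDs, a sessionId, and an opaque payload). All field names and widths are assumptions, not the normative format, and Python is used instead of Solidity so the model can run stand-alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageHeader:
    """Hypothetical metadata; field names and widths are assumptions."""
    src_chain_id: int    # chain the message is sent from
    dest_chain_id: int   # chain the message is addressed to
    src_address: bytes   # 32-byte, VM-agnostic sender account
    dest_address: bytes  # 32-byte, VM-agnostic recipient account
    session_id: int      # application-chosen identifier

    def __post_init__(self):
        # Mirror the fixed-width bytes32 account type described in the text.
        for name in ("src_address", "dest_address"):
            if len(getattr(self, name)) != 32:
                raise ValueError(f"{name} must be exactly 32 bytes")


@dataclass(frozen=True)
class Message:
    header: MessageHeader
    payload: bytes       # arbitrary, application-defined data


header = MessageHeader(
    src_chain_id=1, dest_chain_id=2,
    src_address=b"\x11" * 32, dest_address=b"\x22" * 32,
    session_id=7,
)
# A one-byte payload, e.g. a boolean acknowledgment flag.
ack = Message(header, payload=b"\x01")
```

Keeping the metadata in a separate header object reflects the text's point that messages are looked up by metadata alone, independent of the payload.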

Mailbox APIs

Each chain SHOULD have two canonical Mailbox contracts, one for synchronous and the other for asynchronous messaging, responsible for managing both incoming and outgoing messages. The following APIs are RECOMMENDED to provide the minimal required functionality. Implementations of these APIs MAY include additional functions to support customized or more complex workflows.

The Mailbox contract SHOULD keep track of an inbox of incoming messages. The concrete data structure used to store the inbox queue SHOULD be a hash-map-like mapping in Solidity to enable efficient lookup by the dApps with payload-independent query keys. Being able to read/receive messages based on their metadata only, rather than their actual payload, is critical in achieving synchronous messaging with a dynamic payload known only at runtime. It is OPTIONAL to track the full outbox messages in the contract storage — an outbox digest may be sufficient for some cases.

While a single Mailbox contract could support both synchronous and asynchronous messaging protocols, this would require applications to specify the messaging mode for each interaction since each mode may require different settlement logic. Such a design would significantly complicate the Mailbox contract API, its implementation, and related infrastructure components like settlement logic. A more practical approach is to deploy separate, canonical Mailbox contracts for each mode. This allows applications to switch between synchronous and asynchronous messaging simply by pointing to the appropriate contract address.

Messages received via Mailbox.recv() are authenticated because they must first be populated in the inbox, with their integrity verified before the receiving action finalizes. This verification is performed either through the aux field during the populateInbox() call or via an external settlement layer.
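As a rough illustration of the behaviour described in this section, the following Python model mimics a mailbox with an outbox digest and a map-based inbox keyed by a metadata digest. Only the names recv, populateInbox, and aux come from the text; the send method, the Header fields, and the digest helper are assumptions made for this sketch. SHA3-256 from the standard library stands in for the EVM's keccak256 (they are different hash functions), and the aux data is accepted but not verified.

```python
import hashlib
from dataclasses import dataclass


def digest(*parts: bytes) -> bytes:
    # SHA3-256 is a stand-in for keccak256; parts are length-prefixed.
    h = hashlib.sha3_256()
    for part in parts:
        h.update(len(part).to_bytes(4, "big") + part)
    return h.digest()


@dataclass(frozen=True)
class Header:  # hypothetical metadata fields
    src_chain: int
    dest_chain: int
    src_addr: bytes
    dest_addr: bytes
    session_id: int

    def key(self) -> bytes:
        """Payload-independent query key: a digest of all metadata."""
        return digest(
            self.src_chain.to_bytes(32, "big"),
            self.dest_chain.to_bytes(32, "big"),
            self.src_addr,
            self.dest_addr,
            self.session_id.to_bytes(32, "big"),
        )


class Mailbox:
    def __init__(self, chain_id: int):
        self.chain_id = chain_id
        self.inbox = {}                 # metadata key -> payload
        self.outbox_digest = bytes(32)  # only a digest of the outbox is kept

    def send(self, header: Header, payload: bytes) -> None:
        if header.src_chain != self.chain_id:
            raise ValueError("message must originate from this chain")
        self.outbox_digest = digest(self.outbox_digest, header.key(), payload)

    def populate_inbox(self, messages, aux: bytes = b"") -> None:
        # A real mailbox would verify `aux` here or defer to settlement.
        for header, payload in messages:
            if header.dest_chain != self.chain_id:
                raise ValueError("message is addressed to another chain")
            self.inbox[header.key()] = payload

    def recv(self, header: Header) -> bytes:
        key = header.key()
        if key not in self.inbox:
            raise KeyError("message not in inbox")
        return self.inbox[key]


chain_a, chain_b = Mailbox(1), Mailbox(2)
hdr = Header(1, 2, b"\xaa" * 32, b"\xbb" * 32, session_id=0)
chain_a.send(hdr, b"hello")
chain_b.populate_inbox([(hdr, b"hello")])  # done by a coordinator
received = chain_b.recv(hdr)               # lookup needs metadata only
```

The receiver never supplies the payload to recv, which is what allows a payload known only at runtime to be read in the synchronous setting.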

Rationale

Multiple VM Support

Notice that our standard requires no changes to a chain's VM (i.e., no new opcodes or precompiles) and only proposes smart contract interfaces. Specifically, the sender/recipient account type is a generic bytes32 rather than the EVM-specific address type, which allows this standard to support a much wider class of VMs (e.g., SolanaVM, which uses Ed25519 public keys as its accounts).

An optional MultiplexABIEncoder contract can abstract away the VM-specific encoding when preparing the message payload: a generic encode(chainId) function, as opposed to EVM’s abi.encode, that takes in the destination chain ID and decides the corresponding encoder logic so that the receiving party can decode natively. If the apps only care about interoperating with EVM chains, they can safely use abi.encode() as is and not go through any general encoder.

On the receiving side, EVM chains would type cast address(parsedAddress) on the bytes32 parsedAddress from the received message.
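The conversion between a 20-byte EVM address and the generic bytes32 account type can be modelled as below. The left-padding convention (twelve zero bytes in front of the address) is an assumption of this sketch, and both function names are hypothetical. In recent Solidity versions the cast mentioned above is typically spelled address(uint160(uint256(parsedAddress))), which silently drops the upper 12 bytes; the model instead rejects values whose upper bytes are non-zero.

```python
def evm_address_to_bytes32(address: bytes) -> bytes:
    """Left-pad a 20-byte EVM address to the 32-byte account type."""
    if len(address) != 20:
        raise ValueError("EVM addresses are 20 bytes")
    return address.rjust(32, b"\x00")


def bytes32_to_evm_address(value: bytes) -> bytes:
    """Recover a 20-byte EVM address from a 32-byte account value."""
    if len(value) != 32:
        raise ValueError("expected a 32-byte value")
    if value[:12] != bytes(12):
        # e.g. a 32-byte public key from a non-EVM chain
        raise ValueError("upper 12 bytes are non-zero: not an EVM address")
    return value[12:]


evm_addr = bytes(range(1, 21))           # arbitrary 20-byte example
word = evm_address_to_bytes32(evm_addr)  # value carried in the message
round_trip = bytes32_to_evm_address(word)
```

Rejecting non-zero upper bytes is a design choice of this sketch: it keeps a 32-byte key from another VM from being misread as a truncated EVM address.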

Arbitrary message payload

The Message.payload field is designed for maximum flexibility, capable of encoding arbitrary data, including application-specific structures like the CrossChainOrder from ERC-7683 (See Example Usage section below for details of this integration). In contrast to some existing bridge designs that restrict the payload to function calls, the message payload in our design can also represent simpler data types, such as a boolean status flag for acknowledgments. This flexibility enables a wider range of use cases and simplifies integration across various applications.

Use Metadata Digest Instead of `sessionId` for Message Query Key

For message lookups in the mailbox, the query key is derived from the hash of all message metadata rather than relying on a single sessionId field. This is because the sessionId derivation is customizable and may not adequately bind to key metadata like source and destination addresses. In contrast, the wrapping messaging protocol may enforce permissions for sending or receiving based on these metadata fields, so the query key must bind to the entire set of metadata.

Observe that when applications allow users to define sessionId, these values may not be unique across messages. Depending on the implementation of the inbox, this could be a concern. For example, a mapping-based inbox needs an additional nullifier set to enforce the uniqueness of message metadata and avoid message overwrites in the case of a colliding map-key.
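The two points above can be sketched as follows: the query key binds to every metadata field, so two senders reusing the same sessionId get different keys, and a separate nullifier set rejects a second message with identical metadata even after the first has been read and deleted. The field list, the function and class names, and the delete-on-read behaviour are assumptions for illustration; SHA3-256 stands in for keccak256.

```python
import hashlib


def metadata_key(src_chain, dest_chain, src_addr, dest_addr, session_id):
    """Digest over all (fixed-width) metadata fields, not just session_id."""
    if len(src_addr) != 32 or len(dest_addr) != 32:
        raise ValueError("addresses must be 32 bytes")
    h = hashlib.sha3_256()  # stand-in for keccak256
    h.update(src_chain.to_bytes(32, "big"))
    h.update(dest_chain.to_bytes(32, "big"))
    h.update(src_addr)
    h.update(dest_addr)
    h.update(session_id.to_bytes(32, "big"))
    return h.digest()


class Inbox:
    def __init__(self):
        self.messages = {}       # key -> payload
        self.nullifiers = set()  # every key ever inserted

    def insert(self, key, payload):
        if key in self.nullifiers:
            raise ValueError("message metadata already used")
        self.nullifiers.add(key)
        self.messages[key] = payload

    def read(self, key):
        # Deleting on read frees storage; the nullifier remains.
        return self.messages.pop(key)


alice, mallory, bob = b"\xa1" * 32, b"\xee" * 32, b"\xb0" * 32
key_alice = metadata_key(1, 2, alice, bob, session_id=42)
key_mallory = metadata_key(1, 2, mallory, bob, session_id=42)  # same session

inbox = Inbox()
inbox.insert(key_alice, b"from alice")
inbox.insert(key_mallory, b"from mallory")  # no overwrite: keys differ
first = inbox.read(key_alice)
```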

Pre-filled Inbox

In contrast to other asynchronous bridge designs, our standard explicitly separates inbox filling from reading, enabling a unified interface for message retrieval. This separation allows specialized parties like builders or coordinators to handle protocol-specific message authentication when writing to the inbox, while applications can fetch messages directly. Additionally, message retrieval requires only metadata rather than the complete message, which is valuable in synchronous settings where message payloads are determined at execution time.

As mentioned above, pre-filling the inbox of the destination chain incurs additional gas costs which are manageable on L2s. We list some advantages of our approach:

Eliminates Merkle inclusion proof verification when receiving every message. Some messaging protocols obviate Merkle proofs completely, even during populateInbox(), and delay the mailbox check to the settlement layer.
Allows upgrades of the coordination protocol (e.g., the internals of populateInbox() and/or the settlement layer logic) without requiring changes to connected apps.
Enables user-signed transactions on the destination chain instead of shared sequencer-signed transactions in synchronous messaging. This flexibility simplifies gas payment handling and ensures a consistent msg.sender. Detailed explanations are omitted for brevity (extended answers on the website).

Related Proposals

In comparison to Inter-Blockchain Communication (IBC)-like standards, this ERC is designed to work in a stateless manner. Messages do not need to carry a proof from the source chain at the time they are consumed on the destination chain. This allows use cases such as synchronous composability and intra-block messaging, since messages don’t need to include finalized state from the source chain. Additionally, this ERC does not require multiple steps to establish a link between two chains: messages can be sent directly from one chain to another in a single step.

ERC-7683 standardizes intent-based systems by defining structs for orders and interfaces for settlement smart contracts. This standard is application-specific and aimed at designers of cross-chain intent systems, while our proposal is more general and targets developers implementing arbitrary cross-chain applications. However, an intent system based on ERC-7683 can be built on top of our standard due to its modularity. An application implementing ERC-7683 could use the Mailbox API defined in this proposal to send originData from event messages between the source chain (where user funds are deposited) and the destination chain(s) (where intents are solved). We provide more details in the Example Usage section.

Backwards Compatibility

No backward compatibility issues were found. Since this is an opt-in protocol, L2s that do not opt in will not be affected. Furthermore, this protocol can operate in existing cross-chain flows today, such as in intent-based bridges or native bridging of assets between L2s.

Reference Implementation

We show a possible implementation of the mailbox contract for a synchronous messaging protocol and explain how it can be modified to support asynchronous protocols as well.

The high-level flow is as follows:

Sending a message simply updates the outbox digest.
A prefilling transaction populates the inbox by inserting the messages sequentially, in the same order as they appear in the source chain's outbox. Each insertion updates the corresponding inbox digest.
- The prefilling transaction is likely created by a coordinator in the coordination protocol, which monitors messages sent from all rollups and relays them to the intended destination chain. The rollup sequencer includes this transaction in the block, ensuring that it precedes any mailbox "read" operations and that there is a single prefilling per block.
Applications can then read a message by querying the key-value map inbox with the (hashed) message metadata.
External to these transactions, the settlement layer receives the new inbox digest and outbox digest (with storage proofs against a proven new rollup state) and checks that chain_i.inboxDigest[chain_j] == chain_j.outboxDigest[chain_i] for all i != j.
- Note: For brevity, the description above ignores the slight complication of resetting the mailbox at the beginning of each block using a nested mapping; the reference code handles it.
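The digest bookkeeping and the settlement check in the flow above can be modelled as follows. This is an illustrative Python sketch rather than the reference contract: the Chain class, its method names, and the hash-chain accumulator are assumptions, SHA3-256 stands in for keccak256, and the per-block reset with a nested mapping is not modelled.

```python
import hashlib

ZERO = bytes(32)


def extend(current: bytes, message: bytes) -> bytes:
    """Hash-chain accumulator: order-sensitive digest of a message list."""
    return hashlib.sha3_256(current + message).digest()


class Chain:
    def __init__(self, chain_id: int):
        self.chain_id = chain_id
        self.outbox_digest = {}  # destination chain id -> digest
        self.inbox_digest = {}   # source chain id -> digest

    def send(self, dest: int, message: bytes) -> None:
        prev = self.outbox_digest.get(dest, ZERO)
        self.outbox_digest[dest] = extend(prev, message)

    def populate_inbox(self, src: int, messages) -> None:
        for message in messages:
            prev = self.inbox_digest.get(src, ZERO)
            self.inbox_digest[src] = extend(prev, message)


def settle(chains) -> bool:
    """Check chain_i.inboxDigest[chain_j] == chain_j.outboxDigest[chain_i]."""
    for ci in chains:
        for cj in chains:
            if ci is cj:
                continue
            inbox = ci.inbox_digest.get(cj.chain_id, ZERO)
            outbox = cj.outbox_digest.get(ci.chain_id, ZERO)
            if inbox != outbox:
                return False
    return True


# Honest run: every sent message is prefilled in the same order.
a, b = Chain(1), Chain(2)
a.send(2, b"ping")
b.populate_inbox(1, [b"ping"])
b.send(1, b"pong")
a.populate_inbox(2, [b"pong"])
consistent = settle([a, b])

# Faulty run: the coordinator prefilled the messages out of order.
c, d = Chain(1), Chain(2)
c.send(2, b"m1")
c.send(2, b"m2")
d.populate_inbox(1, [b"m2", b"m1"])
reordered = settle([c, d])
```

Because the accumulator is a hash chain, a dropped, forged, or reordered message changes the inbox digest and the equality check fails at settlement time.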

To extend the synchronous mailbox to support asynchronous protocols, we only need these modifications:

Remove the first layer of mapping from inbox, inboxDigest, and outboxDigest, since the “domain-separation from block number” requirement is gone (e.g., inbox becomes mapping(bytes32 => bytes)).
At the settlement layer, enforce that the destination chain’s inbox is a subset of the source chain’s outbox messages, rather than performing a full equality check.
- Consequently, change the accumulator algorithm used to compute inboxDigest and outboxDigest to one that supports efficient subset proofs.
Remove the _reset() logic, since all messages in the asynchronous case are stored permanently.
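The subset condition for the asynchronous case can be stated compactly. The sketch below checks it directly on lists of message identifiers using a multiset comparison; it deliberately uses plain Python collections in place of a succinct accumulator with subset proofs, so it illustrates only the condition being enforced, not how a settlement layer would verify it efficiently. The function names are hypothetical and SHA3-256 stands in for keccak256.

```python
import hashlib
from collections import Counter


def message_id(message: bytes) -> bytes:
    return hashlib.sha3_256(message).digest()


def inbox_is_subset(inbox_ids, outbox_ids) -> bool:
    """True if every inbox message was sent at least as often as received."""
    available = Counter(outbox_ids)
    needed = Counter(inbox_ids)
    return all(available[m] >= count for m, count in needed.items())


outbox = [message_id(m) for m in (b"m1", b"m2", b"m3")]

# m2 has not been delivered yet: still a valid asynchronous state.
partial_inbox = [message_id(b"m3"), message_id(b"m1")]
partial_ok = inbox_is_subset(partial_inbox, outbox)

# A message that was never sent must be rejected.
forged_inbox = [message_id(b"m1"), message_id(b"forged")]
forged_ok = inbox_is_subset(forged_inbox, outbox)
```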

Note on Gas Cost

The most costly operations are SSTORE during Mailbox.populateInbox(), which writes to the inbox mapping in contract storage, and SLOAD during Mailbox.recv(), which reads from the inbox in storage. Fortunately, mailbox costs on L2s are much lower. For more gas-sensitive chains and applications, we suggest these potential optimizations:

delete inbox[key] during .recv() to get gas refunds for clearing storage
- Synchronous messages are cleaned up at the end of the same block in which they are populated. L2s can optionally implement gas optimizations for such block-ephemeral storage.
utilize the EIP-2930 access list to “pre-warm” predictable storage slots for lower execution cost
batch-populate inbox messages and cluster them under fewer keys (bucketed mapping), e.g., mapping(bytes32 bucketKey => mapping(bytes32 => bytes))
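For a sense of scale, the storage cost of one inbox entry can be estimated with back-of-the-envelope arithmetic. The sketch below assumes Ethereum mainnet gas constants as of the Berlin upgrade (EIP-2929 and EIP-2930) and Solidity's storage layout for a bytes value. L2s may price storage differently, and the estimate ignores calldata, hashing, memory, refunds, and the per-address access-list charge, so treat it as a rough lower bound rather than a measurement. The function names are hypothetical.

```python
# Assumed mainnet constants (Berlin); L2 gas schedules may differ.
SSTORE_SET = 20_000              # write a zero slot to a non-zero value
COLD_SLOAD = 2_100               # first access to a slot in a transaction
WARM_ACCESS = 100                # access to an already-warm slot
ACCESS_LIST_STORAGE_KEY = 1_900  # EIP-2930 charge per pre-warmed slot


def bytes_storage_slots(payload_len: int) -> int:
    """Slots used by a Solidity `bytes` value of the given length."""
    if payload_len <= 31:
        return 1                          # data and length share one slot
    return 1 + (payload_len + 31) // 32   # length slot + data slots


def populate_cost(payload_len: int) -> int:
    """Storage gas to write one new inbox entry (all slots cold and empty)."""
    return bytes_storage_slots(payload_len) * (SSTORE_SET + COLD_SLOAD)


def recv_cost(payload_len: int, prewarmed: bool = False) -> int:
    """Storage gas to read one inbox entry, optionally via an access list."""
    if prewarmed:
        per_slot = ACCESS_LIST_STORAGE_KEY + WARM_ACCESS
    else:
        per_slot = COLD_SLOAD
    return bytes_storage_slots(payload_len) * per_slot


slots_100 = bytes_storage_slots(100)
write_100 = populate_cost(100)
read_100 = recv_cost(100)
read_100_prewarmed = recv_cost(100, prewarmed=True)
```

Under these assumptions the write in populateInbox() dominates the read in recv() by roughly an order of magnitude, which is why batching and deleting consumed entries are the more significant optimizations.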

Example Usage - Sync and Async Cross-chain Transfers

An ERC token contract wishing to allow cross-chain transfers would need to add the functions xTransfer and xReceive. The logic of a single-chain transfer (e.g., Token.send) must be split into two functions, Token.xTransfer and Token.xReceive, which respectively burn and mint the same amount of assets and interact with the Mailbox contract.
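A toy Python model of this split is shown below. Only the names xTransfer and xReceive come from the text; the Mailbox stand-in, the relay helper, the tuple used as the message key, and the use of the user as message sender are simplifying assumptions of this sketch, not the reference code.

```python
class Mailbox:
    """Toy mailbox: an outbox list and an inbox keyed by message metadata."""

    def __init__(self):
        self.outbox = []
        self.inbox = {}

    def send(self, key, payload: bytes) -> None:
        self.outbox.append((key, payload))

    def recv(self, key) -> bytes:
        return self.inbox.pop(key)  # consume, so a replay fails


def relay(src: Mailbox, dst: Mailbox) -> None:
    """Stand-in for the coordination protocol that prefills the inbox."""
    for key, payload in src.outbox:
        dst.inbox[key] = payload
    src.outbox.clear()


class Token:
    def __init__(self, chain_id: int, mailbox: Mailbox):
        self.chain_id = chain_id
        self.mailbox = mailbox
        self.balances = {}

    def x_transfer(self, sender, dest_chain, recipient, amount, session_id):
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount  # burn on the source chain
        key = (self.chain_id, dest_chain, sender, recipient, session_id)
        self.mailbox.send(key, amount.to_bytes(32, "big"))

    def x_receive(self, src_chain, sender, recipient, session_id):
        key = (src_chain, self.chain_id, sender, recipient, session_id)
        amount = int.from_bytes(self.mailbox.recv(key), "big")
        # mint on the destination chain
        self.balances[recipient] = self.balances.get(recipient, 0) + amount


mailbox_a, mailbox_b = Mailbox(), Mailbox()
token_a, token_b = Token(1, mailbox_a), Token(2, mailbox_b)
token_a.balances["alice"] = 100

token_a.x_transfer("alice", 2, "bob", 30, session_id=1)
relay(mailbox_a, mailbox_b)
token_b.x_receive(1, "alice", "bob", session_id=1)
```

The amount burned on the source chain equals the amount minted on the destination chain, so the combined supply across both chains is unchanged.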

Example Usage - Cross-chain function calls

In this example we show how to implement a cross-chain function call using the Mailbox abstraction: The logic of cross-chain execution is handled by a contract RemoteExecuter deployed on both source and destination chains.

On the source chain A, a user wanting to call a function fun of a contract Foo on the destination chain B can invoke RemoteExecuter.remoteCall with the address of the contract Foo and other parameters including the function name and its arguments. This generates a message that is sent to chain B.

On the destination chain B, a call to RemoteExecuter.execute fetches the message sent from the source chain A, parses it, and executes the corresponding function of the local contract Foo. The RemoteExecuter contract also takes care of preventing message replays.

Note that in this example gas on the destination chain is paid by the caller of RemoteExecuter.execute. In practice this participant can be the same user who called RemoteExecuter.remoteCall on the source chain. More advanced gas management policies can be implemented in which another party calls RemoteExecuter.execute and pays on behalf of the user.
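The remoteCall/execute pair described above can be modelled in a few lines of Python. The names RemoteExecuter, remoteCall, execute, Foo, and fun follow the text; everything else is an assumption of this sketch, including the JSON payload encoding (a real contract would more likely use ABI encoding), the tuple used as the message key, and the direct copy from outbox to inbox that stands in for the coordination protocol.

```python
import json


class Foo:
    """Example target contract on the destination chain."""

    def __init__(self):
        self.total = 0

    def fun(self, amount: int) -> int:
        self.total += amount
        return self.total


class RemoteExecuter:
    def __init__(self, chain_id: int, contracts=None):
        self.chain_id = chain_id
        self.contracts = contracts or {}  # name -> local contract object
        self.outbox = []                  # (key, payload) pairs
        self.inbox = {}                   # key -> payload
        self.executed = set()             # replay protection

    def remote_call(self, dest_chain, target, function, args, session_id):
        key = (self.chain_id, dest_chain, session_id)
        call = {"target": target, "function": function, "args": args}
        self.outbox.append((key, json.dumps(call).encode()))
        return key

    def execute(self, key):
        if key in self.executed:
            raise ValueError("message already executed")
        call = json.loads(self.inbox[key])
        self.executed.add(key)  # mark before making the call
        contract = self.contracts[call["target"]]
        return getattr(contract, call["function"])(*call["args"])


foo = Foo()
executer_a = RemoteExecuter(chain_id=1)
executer_b = RemoteExecuter(chain_id=2, contracts={"Foo": foo})

key = executer_a.remote_call(2, "Foo", "fun", [5], session_id=0)
executer_b.inbox.update(executer_a.outbox)  # stand-in for message relay
result = executer_b.execute(key)
```

Whoever calls execute on the destination chain pays for the call, which matches the gas model described above.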

Security Considerations

Security concerns for the messaging format and mailbox APIs themselves are minimal, as the protocol specification focuses on providing a rich and expressive interface for various message passing protocols. Security responsibilities lie with the underlying messaging protocol design and the application utilizing it. This specification ensures the interfaces are flexible enough to support the majority of cross-chain messaging protocols, while security within those protocols is outside the scope of this proposal.

Copyright
