CAIP-19: Asset Type and Asset ID Specification

Metadata

Status: Review
Type: Standard
Created: 2020-06-23

Authors

Antoine Herzog (@antoineherzog), Pedro Gomes (@pedrouid), Joel Thorstensson (@oed)

Requires

CAIP-2

Links

GitHub page
Discussion on Ethereum Magicians

Simple Summary

CAIP-19 defines a way to identify a type of asset (e.g. Bitcoin, Ether, ATOM) with an optional asset identifier suffix (for individually-addressable tokens like NFTs) in a human-readable, developer- and transaction-friendly way.

Abstract

Often you need to reference an asset type, or an asset type plus an asset identifier that picks out a specific token from that set (if non-fungible). For example, precise specifications of assets exchanged in an atomic swap (within or between blockchains) require this kind of unambiguous addressing, as do dashboards for tracking assets held by a given address or in a given collection.

Motivation

Currently, each wallet or exchange needs to maintain its own registry of asset types and their associated metadata, as Trust-Wallet and CoinMarketCap do. Giving developers a unique Asset Type and, where applicable, a type-specific Asset ID for each asset can reduce the risk of confusion between different assets.

Specification of Asset Type

The Asset Type is a string designed to uniquely identify the types of assets in a developer-friendly fashion.

Syntax

The asset_type is a case-sensitive string in the form

Note that -, % and . characters are allowed in asset_references, which include on-chain addresses like those specified in [CAIP-10][], but no other non-alphanumerics such as :, / or \. Implementers are recommended to use "URL encoding" (% + 2-character codes, canonically capitalized) as per Section 2 of RFC 3986 to escape any further non-alphanumeric characters, and to consider homograph attack surfaces in the handling of any non-alphanumerics.
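The character rules above can be sketched as a small validator. This is a minimal sketch under stated assumptions: it takes the form to be chain_id + "/" + asset_namespace + ":" + asset_reference, with chain_id formatted as in CAIP-2, and the length bounds in the pattern as well as the `parse_asset_type` helper are illustrative assumptions rather than something stated in the text above.

```python
import re
from typing import NamedTuple

# Assumed grammar (not stated in the text above):
#   asset_type = chain_id "/" asset_namespace ":" asset_reference
# chain_id is assumed to follow CAIP-2 (namespace ":" reference).
# All length bounds below are assumptions made for this sketch.
ASSET_TYPE_RE = re.compile(
    r"(?P<chain_id>[-a-z0-9]{3,8}:[-_a-zA-Z0-9]{1,32})"
    r"/(?P<asset_namespace>[-a-z0-9]{3,8})"
    r":(?P<asset_reference>[-.%a-zA-Z0-9]{1,128})"
)


class AssetType(NamedTuple):
    chain_id: str
    asset_namespace: str
    asset_reference: str


def parse_asset_type(value: str) -> AssetType:
    """Split an asset type string into its parts, or raise ValueError."""
    match = ASSET_TYPE_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid asset type: {value!r}")
    return AssetType(**match.groupdict())
```

Because the match is case-sensitive and anchored to the whole string, a reference containing `:`, `/`, `\` or `_` is rejected rather than silently truncated.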

Specification of Asset ID

The optional addition of an asset ID suffix separated by / uniquely identifies an addressable asset of a given type in a developer-friendly fashion. In the case of non-fungible tokens or other collections, this address is called a token_id (commonly referred to as a "serial number" since they are often sequentially numbered). Note: [ERC721][] defines identifiers for specific tokens as uint256 values (i.e. an integer ranging from 0 to 2^256-1) and recommends but does not require them to be serially assigned.
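The ERC721 range mentioned above can be checked mechanically. The sketch below assumes an asset ID is an asset type followed by `/` and a `token_id`, and that ERC721 token identifiers are written in decimal; `split_asset_id` and `is_uint256` are hypothetical helper names.

```python
UINT256_MAX = 2**256 - 1  # largest value a uint256 token identifier can take
MAX_DECIMAL_DIGITS = len(str(UINT256_MAX))  # number of decimal digits in UINT256_MAX


def split_asset_id(asset_id: str) -> tuple[str, str]:
    """Return (asset_type, token_id) for a string with exactly two '/' separators."""
    parts = asset_id.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected chain_id/asset/token_id: {asset_id!r}")
    chain_id, asset, token_id = parts
    return f"{chain_id}/{asset}", token_id


def is_uint256(token_id: str) -> bool:
    """True if token_id is an ASCII decimal integer within the uint256 range."""
    return (
        token_id.isascii()
        and token_id.isdigit()
        and len(token_id) <= MAX_DECIMAL_DIGITS  # bound the length before converting
        and int(token_id) <= UINT256_MAX
    )
```

Bounding the length before calling `int` keeps the check cheap for oversized inputs.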

Syntax

The asset_id is a case-sensitive string in the form

Note that -, % and . characters are allowed, but no other non-alphanumerics such as :, / or \. Implementers are recommended to use "URL encoding" (% + 2-character codes, canonically capitalized) as per Section 2 of RFC 3986 to escape any further non-alphanumeric characters, and to consider homograph attack surfaces in the handling of any non-alphanumerics.
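The escaping recommendation above can be sketched as follows. The example percent-encodes every byte outside letters, digits, `-` and `.`, using uppercase hex digits; encoding the input as UTF-8 first is an assumption of this sketch, and `escape_reference` is a hypothetical helper name. The standard library's `urllib.parse.quote` is deliberately not used, because it always leaves `_` and `~` unescaped.

```python
def escape_reference(raw: str) -> str:
    """Percent-encode a raw (unescaped) value for use as a reference or token_id.

    Letters, digits, '-' and '.' pass through; every other byte of the
    UTF-8 encoding (including '%', ':', '/', '\\' and '_') becomes %XX.
    """
    out = []
    for byte in raw.encode("utf-8"):  # UTF-8 is an assumption of this sketch
        ch = chr(byte)
        if byte < 0x80 and (ch.isalnum() or ch in "-."):
            out.append(ch)
        else:
            out.append(f"%{byte:02X}")  # uppercase hex, two digits
    return "".join(out)
```

Since a literal `%` in the raw input is itself encoded as `%25`, the function should be applied once to unescaped input; applying it to an already escaped string would double-encode it.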

More constrained character sets per namespace may be specified in each namespace's CAIP-19 profile, which outlines some common asset types.

Canonicalization

Note that for smart contract addresses used in some Asset Types (like ERC721 and its equivalents), some namespaces like the EVM offer canonicalization schemes that use capitalization (e.g. EIP-55), an optional suffix (e.g. HIP-15), or some other transformation. At present, this specification does NOT require canonicalization; implementers are advised to consider deduplication or canonicalization when consuming CAIP addresses. CAIP-19 profiles in CASA namespaces may contain additional information per namespace.
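One way to deduplicate without mandating a canonical form is to compare identifiers through a normalized key. The sketch below assumes that `0x`-prefixed references on `eip155` chains differ only in EIP-55 checksum capitalization, so lowercasing them is safe for comparison; identifiers in other namespaces are returned unchanged. `dedup_key` is a hypothetical helper, and the key it returns is meant for lookups, not for display.

```python
def dedup_key(identifier: str) -> str:
    """Return a comparison key for an asset type or asset ID.

    Assumption: only 0x-prefixed references on eip155 chains are case-folded.
    """
    chain_id, sep, rest = identifier.partition("/")
    if not sep:
        raise ValueError(f"missing '/' after chain_id: {identifier!r}")
    if not chain_id.startswith("eip155:"):
        return identifier  # unknown namespace: leave untouched
    namespace, colon, remainder = rest.partition(":")
    reference, slash, token_id = remainder.partition("/")
    if reference.startswith("0x"):
        reference = reference.lower()  # drop checksum capitalization
    return f"{chain_id}/{namespace}{colon}{reference}{slash}{token_id}"
```

A consumer could, for example, index holdings in a dictionary keyed by `dedup_key(identifier)` while keeping the original string for presentation.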

Semantics

Each asset_namespace covers a class of similar assets. Usually, it describes an ecosystem or standard, such as slip44 or erc20. One asset_namespace should include as many assets as possible. The asset_reference identifies an asset within a given asset_namespace.

To date, the only cross-chain/multi-namespace standard incorporated into the CAIP system is SLIP-44, described in [CAIP-20][]; SLIP-44 offers a registry for native fungible tokens across namespaces. Namespace-specific standards are profiled in CAIP-19 profiles in the CASA namespaces registry; the erc20 addressing on EVM chains, for example, is defined in namespaces/eip155/caip19.
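To illustrate how the slip44 namespace addresses native tokens, the sketch below builds asset types from SLIP-44 coin type numbers. The three coin types come from the SLIP-44 registry; the chain IDs in the usage note assume CAIP-2 formatting, and `native_asset_type` is a hypothetical helper.

```python
# A small excerpt of SLIP-44 coin types, keyed by ticker symbol.
SLIP44_COIN_TYPES = {
    "BTC": 0,
    "ETH": 60,
    "ATOM": 118,
}


def native_asset_type(chain_id: str, symbol: str) -> str:
    """Build the slip44 asset type for a chain's native token."""
    coin_type = SLIP44_COIN_TYPES[symbol]  # raises KeyError for unknown symbols
    return f"{chain_id}/slip44:{coin_type}"
```

For example, `native_asset_type("eip155:1", "ETH")` yields `eip155:1/slip44:60`, and `native_asset_type("cosmos:cosmoshub-3", "ATOM")` yields `cosmos:cosmoshub-3/slip44:118`.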

Rationale

The goals of the general asset type and asset ID format are:

Uniqueness within the entire asset ecosystem
To some degree human-readable and helpful for basic debugging
Restricted in a way that it can be stored on-chain
Character set basic enough to display in hardware wallets as part of a transaction content

The following secondary goals can easily be achieved:

Can be used unescaped in URL paths
Can be used as a filename in a case-sensitive UNIX file system (Linux/git).

The following secondary goals were given up along the way:

Can be used as a filename in a case-insensitive UNIX file system (macOS).
Can be used as a filename in a Windows file system.

Test Cases

This is a list of manually composed examples
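As an illustration, the hand-composed identifiers below follow the shape this document describes. They are examples chosen for this sketch (the chain IDs assume CAIP-2 formatting), and `kind` is a hypothetical helper that only counts `/` separators to tell an asset type from an asset ID.

```python
# Illustrative identifiers, keyed by a human-readable label.
EXAMPLES = {
    "Ether": "eip155:1/slip44:60",
    "Bitcoin": "bip122:000000000019d6689c085ae165831e93/slip44:0",
    "ATOM": "cosmos:cosmoshub-3/slip44:118",
    "DAI": "eip155:1/erc20:0x6b175474e89094c44da98b954eedeac495271d0f",
    "CryptoKitties collection": "eip155:1/erc721:0x06012c8cf97BEaD5deAe237070F9587f8E7A266d",
    "CryptoKitties collectible": "eip155:1/erc721:0x06012c8cf97BEaD5deAe237070F9587f8E7A266d/771769",
}


def kind(identifier: str) -> str:
    """Classify by separator count: one '/' is an asset type, two is an asset ID."""
    slashes = identifier.count("/")
    if slashes == 1:
        return "asset_type"
    if slashes == 2:
        return "asset_id"
    raise ValueError(f"unexpected number of '/' separators: {identifier!r}")


for label, identifier in EXAMPLES.items():
    print(f"{label}: {identifier} ({kind(identifier)})")
```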

Changelog

2022-10-23:
- expanded charset to include -, ., and %
- added canonicalization section and links
- better language for use cases, wider character-set syntax, etc.
2022-05-12: regex for token_id expanded to include entire uint256 range
2021-06-25: regex max lengths raised and test cases updated accordingly
2020-06-23: added distinction between asset type and asset ID
