EIP-7939: Count leading zeros (CLZ) opcode
Opcode to count the number of leading zero bits in a 256-bit word
Abstract
Introduce a new opcode, CLZ(x)
, which pops x
from the stack and pushes the number of leading zero bits in x
to the stack. If x
is zero, pushes 256.
Motivation
Count leading zeros (CLZ) is a native opcode in many processor architectures (even in RISC architectures like ARM).
It is a basic building block used in math operations, byte operations, compression algorithms, data structures:
- lnWad
- powWad
- lambertW0Wad
- sqrt
- cbrt
- byte string comparisons
- generalized calldata compression/decompression
- bitmaps (for finding the next/previous set/unset bit)
- post quantum signature schemes
Adding a CLZ
opcode will:
- Lead to cheaper compute.
- Lead to cheaper ZK proving costs. The fastest known Solidity implementation uses several dynamic bitwise right shifts
shr
, which are very expensive to prove. In SP1 rv32im, a 32-bit right shift costs 1.6x more than a 32-bit mul. - Lead to smaller bytecode size. The fastest known Solidity implementation contains several large constants and is often inlined for performance.
Specification
A new opcode is introduced: CLZ
(0x1e
).
- Pops 1 value from the stack.
- Pushes a value to the stack, according to the following code:
Or in Python,
Or in C++,
The cost of the opcode is 3, the same as ADD
.
Rationale
The special 0 case
256 is the smallest number after 255. Returning a small number allows the result to be compared with minimal additional bytecode.
For byte scanning operations, one can get the number of bytes to be skipped for a zero word by simply computing 256 >> 3
, which gives 32.
Preference over CTZ
(count trailing zeros)
Computing the least significant bit can be easily implemented with CLZ
by isolating the smallest bit via x & -x
.
However, it is not possible to implement CLZ
with CTZ
.
Gas cost
We have benchmarked the CLZ
implementation against the ADD
implementation in the intx library. CLZ
uses approximately the same amount of compute cycles as ADD
.
The SP1 rv32im optimized variant uses less compute cycles than ADD
, in the average and worst cases.
In SP1 rv32im, a 256-bit CLZ
is cheaper to prove than ADD
.
Backwards Compatibility
This is a new opcode not present prior.
Test Cases
Security Considerations
CLZ
is a stateless opcode that has a low worst-case constant cost in memory usage, compute and proving costs. It is therefore safe from being exploited for denial of service attacks.
Copyright
Copyright and related rights waived via CC0.