EIP-7939: Count leading zeros (CLZ) opcode

Opcode to count the number of leading zero bits in a 256-bit word


Metadata
Status: DraftStandards Track: CoreCreated: 2025-04-28
Authors
Vectorized (@Vectorized), Georgios Konstantopoulos (@gakonst), Jochem Brouwer (@jochem-brouwer), Ben Adams (@benaadams), Giulio Rebuffo (@Giulio2002)

Abstract


Introduce a new opcode, CLZ(x), which pops x from the stack and pushes the number of leading zero bits in x to the stack. If x is zero, pushes 256.

Motivation


Count leading zeros (CLZ) is a native opcode in many processor architectures (even in RISC architectures like ARM).

It is a basic building block used in math operations, byte operations, compression algorithms, data structures:

  • lnWad
  • powWad
  • lambertW0Wad
  • sqrt
  • cbrt
  • byte string comparisons
  • generalized calldata compression/decompression
  • bitmaps (for finding the next/previous set/unset bit)
  • post quantum signature schemes

Adding a CLZ opcode will:

  • Lead to cheaper compute.
  • Lead to cheaper ZK proving costs. The fastest known Solidity implementation uses several dynamic bitwise right shifts shr, which are very expensive to prove. In SP1 rv32im, a 32-bit right shift costs 1.6x more than a 32-bit mul.
  • Lead to smaller bytecode size. The fastest known Solidity implementation contains several large constants and is often inlined for performance.

Specification


A new opcode is introduced: CLZ (0x1e).

  • Pops 1 value from the stack.
  • Pushes a value to the stack, according to the following code:

Or in Python,


Or in C++,


The cost of the opcode is 3, the same as ADD.

Rationale


The special 0 case

256 is the smallest number after 255. Returning a small number allows the result to be compared with minimal additional bytecode.

For byte scanning operations, one can get the number of bytes to be skipped for a zero word by simply computing 256 >> 3, which gives 32.

Preference over CTZ (count trailing zeros)

Computing the least significant bit can be easily implemented with CLZ by isolating the smallest bit via x & -x.

However, it is not possible to implement CLZ with CTZ.

Gas cost

We have benchmarked the CLZ implementation against the ADD implementation in the intx library. CLZ uses approximately the same amount of compute cycles as ADD.

The SP1 rv32im optimized variant uses less compute cycles than ADD, in the average and worst cases.

In SP1 rv32im, a 256-bit CLZ is cheaper to prove than ADD.

Backwards Compatibility


This is a new opcode not present prior.

Test Cases








Security Considerations


CLZ is a stateless opcode that has a low worst-case constant cost in memory usage, compute and proving costs. It is therefore safe from being exploited for denial of service attacks.

Copyright


Copyright and related rights waived via CC0.