Block roots permanent cache (EIP-4788 plugin)

GM everyone, it is @madlabman. Recently, I attempted to adopt EIP-4788 in one of the projects I’m currently working on and found that there’s a problem with the length of time during which CL block roots are available for use on the EL. In this post, I would like to propose a solution that helps overcome this issue and improves EIP-4788’s usefulness for DeFi protocols like Lido.

TLDR: Introduce a contract with persistent storage of block roots pulled from the EIP-4788 buffer. In addition, the storage should provide the ability to back-fill any missing root using a proof. Eventually, the storage will allow any 3rd-party contract to query any required root.

Problem statement

EIP-4788 made consensus layer block roots available on the execution layer. It has a variety of use cases, especially for liquid staking protocols such as Lido. But the current state of things makes this EIP less applicable than one might imagine. The problem lies in the way the EIP stores the roots mentioned above: it uses a buffer with a limited capacity, meaning only a subset of recent-enough roots is available. While the window is more than 27 hours at the moment and covers most use cases, under certain conditions it might not be enough. For example, if a staking protocol allows its operators to report withdrawals of validators by themselves, there is no guarantee that the operators will do it in time. Even for protocol-driven permissioned oracles, there is room for accidents that result in a missed opportunity to deliver a report within the time frame.
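
For reference, a back-of-the-envelope check of that window (a sketch assuming mainnet’s 12-second slots and the 8191-slot ring buffer defined by EIP-4788):

# Rough size of the EIP-4788 ring buffer window (mainnet parameters assumed).
HISTORY_BUFFER_LENGTH = 8191   # slots retained in the 4788 buffer
SECONDS_PER_SLOT = 12

window_seconds = HISTORY_BUFFER_LENGTH * SECONDS_PER_SLOT
print(window_seconds / 3600)   # ~27.3 hours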

That’s why it’s crucial to have a fallback mechanism for verifying proofs of CL structures, even for blocks whose roots have already been overwritten in the buffer.

Proposed solution

To extend the availability of block roots on the EL, a separate long-term storage of roots is proposed. The key difference of this storage is that it accumulates not every root produced, but only selected ones, hence called “checkpoints”. Implemented as a standalone contract, the storage will accept a call to a permissionless method, which will query the EIP roots buffer and store the latest root as a “checkpoint”.
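
For illustration, this is roughly how the EIP-4788 buffer is queried today: the predeploy at 0x000F3df6D732807Ef1319fB7B8bB8522d0Beac02 is called with a 32-byte big-endian timestamp and returns the parent beacon block root for the block with that timestamp. Below is a minimal off-chain sketch in Python (web3.py, node URL assumed); the proposed storage contract would make the same call on-chain inside its permissionless pull method and persist the result as a “checkpoint”:

from web3 import Web3

# EIP-4788 predeploy address (from the EIP itself).
BEACON_ROOTS_ADDRESS = "0x000F3df6D732807Ef1319fB7B8bB8522d0Beac02"

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))  # assumed EL node endpoint

def read_parent_root(timestamp: int) -> bytes:
    # The contract takes the timestamp as 32 bytes of calldata and returns
    # the parent beacon block root for the block at that timestamp (or reverts).
    data = Web3.to_hex(timestamp.to_bytes(32, "big"))
    return bytes(w3.eth.call({"to": BEACON_ROOTS_ADDRESS, "data": data}))

latest_root = read_parent_root(w3.eth.get_block("latest")["timestamp"])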

These “checkpoints” play a crucial role in the storage. They are not only a trustless source of roots that are no longer present in the buffer, but also a way to add further “checkpoints”. Anyone can bring a proof linking a block’s root to one of the “checkpoints” known to the storage and back-fill the missing root. Pulling roots from the buffer frequently enough makes it cost-efficient to back-fill any historical root in the storage. Ultimately, this leads to dense, full coverage of CL block roots on the execution layer.
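
As a sketch of how back-filling could work: a beacon block root commits to the header’s parent_root field (SSZ generalized index 10 in the BeaconBlockHeader container), so given a known “checkpoint” one can verify a 3-node Merkle branch for parent_root and record the parent’s root as well; repeating this walks the chain backwards one block per proof. The helper below mirrors the verify_merkle_branch logic from the consensus specs; the backfill function and its storage argument are only illustrative, not the actual contract interface:

from hashlib import sha256

def hash_pair(a: bytes, b: bytes) -> bytes:
    return sha256(a + b).digest()

# parent_root is field #2 of the 5-field BeaconBlockHeader, padded to 8 leaves: 8 + 2 = 10.
PARENT_ROOT_GINDEX = 10

def verify_merkle_branch(leaf: bytes, branch: list, gindex: int, root: bytes) -> bool:
    node = leaf
    for sibling in branch:            # branch is ordered bottom-up
        if gindex % 2:                # current node is a right child
            node = hash_pair(sibling, node)
        else:                         # current node is a left child
            node = hash_pair(node, sibling)
        gindex //= 2
    return gindex == 1 and node == root

def backfill(known_roots: set, checkpoint_root: bytes, parent_root: bytes, branch: list) -> None:
    # `known_roots` stands in for the proposed contract's storage of checkpoints.
    assert checkpoint_root in known_roots, "proof must anchor to a known checkpoint"
    assert verify_merkle_branch(parent_root, branch, PARENT_ROOT_GINDEX, checkpoint_root)
    known_roots.add(parent_root)      # the older root becomes queryable too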

While providing long proofs directly to end-user contracts to prove arbitrarily old data is still possible, using a single storage contract for this purpose makes the UX much smoother. Additionally, wide adoption of the storage across protocols leads to deeper coverage and reduced costs of using proofs.

I’m seeking comments and suggestions from the community regarding the proposed design. You can delve into the technical details of the proposed solution here.


I’d actually suggest making this an EIP in itself; a common canonical cache for block roots sounds valuable for the ecosystem.


Great idea @madlabman ! I have a few questions.

It would be interesting to first understand why 4788 doesn’t already do that (or at least integrate/offer a “fuller” option) and why the devs opted not to. Creating and championing EIPs is pretty onerous, and if the devs already decided against doing something similar, it’ll probably result in a lot of wasted effort :frowning:

Can you explain a bit more about the use-case here? I’m not understanding when NOs would need to report full withdrawals (these are available on the EL anyway, no?).

What makes doing it frequently more cost-efficient than… (I’m not sure what the alternative is: obtaining the root yourself from the CL once it’s “aged out” and then also providing the proof? Doesn’t a proof need to be provided anyway, or is it just cheaper to check against the current buffer than to validate a proof?)

Are there any adverse effects on the costs of interacting with this contract as part of a write method due to the potential “state bloat” of eventually having more roots in the contract than any one protocol/use-case may need (e.g. eventually making it not cost-effective to use)?


Hey, @Izzy ! Nice questions, let me express my thoughts.

Nobody probably needs all the block roots available, so it would be a waste of space on Ethereum. That’s why I speak about the proposed solution as a “fallback” mechanism.

By the availability of withdrawals on the EL, I understand the minting of new ether to the withdrawal credentials (WC) address. Correct me if I’m wrong. So multiple withdrawals become indistinguishable for protocols with a single WC. That’s why reporting these withdrawals is still required if a protocol relies on that data.

Given that we want to prove an aged block root, you must provide a proof against some trusted block root. It doesn’t matter whether it comes from the buffer or the proposed storage. What does matter is how old the root you’re bringing is: the proof grows linearly with that distance, and so does the cost of verification. So, the older the root, the higher the cost of validating against the EIP buffer. At the same time, the closer a “checkpoint” in the proposed storage is, the cheaper it is to prove a root against it.
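
To make the linear growth concrete, here is a rough estimate assuming the naive parent-root chaining sketched earlier (one 3-sibling branch and 3 SHA-256 calls per block of distance):

# Cost of proving a root N blocks older than a trusted one via chained parent_root branches.
def chained_proof_cost(blocks_back: int) -> tuple:
    siblings = 3 * blocks_back           # 32-byte proof words
    hashes = 3 * blocks_back             # on-chain SHA-256 invocations
    return siblings * 32, hashes         # (proof size in bytes, hash count)

print(chained_proof_cost(7200))          # ~1 day back: ~675 KiB of proof, 21,600 hashes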

Following from my previous reply, the more roots there are in the storage, the cheaper the addition of a new one. I hope I understood your question right.

My understanding is that the specific indices are included in the EL payload but perhaps they’re lost when it’s transformed into a root?

It’s not accessible on-chain (via smart contracts), but it should be via nodes, so it should suffice for our purposes, as you can calculate against the root and prove.

Got it, thank you!

Hmm, not exactly. I’m wondering whether the number of roots in storage increases the cost of reading from this contract when that read is part of another write operation, and if so, whether that might make it not cost-effective in the long run.

They don’t want unbounded state growth (so they don’t want to store all the hashes forever), and there’s no good way to introduce demand-based storage right in the protocol.

Standards for execution environment constructs that are permissionless to deploy and do not require any changes to client code are a lot less onerous to push than consensus/evm changes.


Took a look at the tech design and it seems there’s an opportunity to improve efficiency:

  • for blocks after 4788 is introduced, you can skip up to HISTORY_BUFFER_LENGTH blocks instead of proving them one by one (at any point in time, block roots for up to HISTORY_BUFFER_LENGTH blocks are available in storage and can be proved)
  • for blocks before that, collapsing computations into a zk proof is a very clear tradeoff of efficiency vs complexity

Did I get you right that roots for blocks up to HISTORY_BUFFER_LENGTH old can simply be obtained using the EIP contract, and the proposed contract should just re-route requests for such blocks?

That too, but it’s possible not just for the current block and its buffer. You will be able to prove up to HISTORY_BUFFER_LENGTH block roots from the state of any block, not just the parent block.

Yeah, essentially a caller can do that in two hops: pull a new checkpoint by a timestamp close to the back of the buffer (because they don’t know when their transaction will be included), and then provide a proof against this new checkpoint. It’s hard to implement on-chain because of missed slots and hence an unknown timestamp.
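
A minimal off-chain sketch of those two hops, reusing the read_parent_root helper from above; the safety margin is an assumption, and the loop steps forward over missed slots until the 4788 call succeeds:

SECONDS_PER_SLOT = 12
HISTORY_BUFFER_LENGTH = 8191
SAFETY_MARGIN_SLOTS = 600          # assumed headroom for transaction inclusion delay

def pick_checkpoint_timestamp(now: int) -> int:
    # Start near the oldest still-buffered slot, leaving room for inclusion delay.
    ts = now - (HISTORY_BUFFER_LENGTH - SAFETY_MARGIN_SLOTS) * SECONDS_PER_SLOT
    while True:
        try:
            read_parent_root(ts)   # reverts (raises) for timestamps of missed slots
            return ts
        except Exception:
            ts += SECONDS_PER_SLOT # step over a missed slot and try again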


For older blocks you can also use state proofs against the block headers buffer, like here: GitHub - lidofinance/curve-merkle-oracle: Trustless oracle for a Curve ETH/stETH pool using MPT proofs


Quick question/suggestion - how difficult would it be to expose an additional method to get block roots by slot number, rather than timestamp? Basically,

def parent_root(ts: uint64) -> bytes32: view
def parent_root_by_slot(slot: uint64) -> bytes32: view

I think this would be quite handy for trustless Oracle implementations, such as


Hi! The short answer is that it’s easy, because it’s a matter of a simple calculation from GENESIS_TIMESTAMP (at least for now), and end-user contracts such as the oracles you mentioned can do that internally. Since the slot interval could be subject to change, I would say it works better to have a separate library deployed for that.
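
For reference, the calculation in question, with mainnet values assumed (GENESIS_TIMESTAMP here is the beacon chain genesis time):

GENESIS_TIMESTAMP = 1606824023     # mainnet beacon chain genesis time (assumed)
SECONDS_PER_SLOT = 12

def timestamp_at_slot(slot: int) -> int:
    return GENESIS_TIMESTAMP + slot * SECONDS_PER_SLOT

# One possible mapping: parent_root_by_slot(slot) == parent_root(timestamp_at_slot(slot + 1)),
# since the 4788 buffer stores the *parent* root under the child block's timestamp
# (modulo missed slots).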


Thanks! Yes, doing that slot => timestamp transformation internally is possible, so it’s mostly a matter of API convenience. I think calling a dedicated method would be a bit more convenient than importing a library and transforming slot => timestamp - it would be great if that could be added (and ok if not :slight_smile: )


@vsh thank you for your finding! It really cuts the gas costs.
I’ve added more details to the HackMD note.


You’re right, it’s possible to create an MPT proof to prove a withdrawal against an execution block hash. The caveat here is the need for a trusted source of the block hash, and the EVM gives access to the 256 most recent block hashes only. Using Merkle proofs for withdrawals in CL blocks seems easier and provides deeper history even with the vanilla EIP-4788 buffer.

A read from the storage is the same as a plain read from contract storage, just with the overhead of a call, so it doesn’t depend on the size of the storage.

@Izzy, sorry for the late response.


hi @madlabman ! thanks for the proposal!

I want to point out that there is already a historical buffer committed to in the beacon state.

Refer to the block_roots, state_roots, and historical_roots fields in the BeaconState in the capella fork of the consensus-specs. (I tried to include a link but was forbidden by the forum rules.)

With these fields, the only constraint the ~1d of 4788 history imposes is the window between when a user makes a proof (against whatever historical state they desire) and when it expires with respect to on-chain verifiability.

Useful things to “cache” in the execution layer would be object-level questions like “is this validator with this index slashed?” or similar things where the answer does not change once answered and the first prover of this fact on-chain could cache the query in some smart contract state to save the cost for all subsequent queries.

In light of this, I don’t see the need to try to cache all block roots within the execution state itself.

Also note that withdrawals are committed to in the ExecutionPayload, so any individual withdrawal can be proven from the block_root (from 4788) to the state_root to the execution_payload_header to the withdrawal data.
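
For anyone wiring such a proof up, the generalized indices along that path compose via the concatenation rule from the SSZ merkle proof spec; a small helper, with the deeper, fork-specific field indices deliberately left out since they depend on the container layouts:

def concat_gindices(*indices: int) -> int:
    # SSZ generalized-index concatenation: descend through nested containers.
    result = 1
    for i in indices:
        high_bit = 1 << (i.bit_length() - 1)
        result = result * high_bit + (i - high_bit)
    return result

# e.g. state_root is field #3 of the 8-leaf BeaconBlock(Header) merkleization, gindex 11;
# the indices for execution_payload_header and the withdrawal itself depend on the fork.
print(concat_gindices(11, 1))   # concatenating with 1 (the root itself) is a no-op -> 11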


Hi @ralexstokes ! Thank you for your awesome comment!

I see your point, and I agree it doesn’t make sense to store all block roots on the EL; it seems reasonable to use the beacon state object and its internal accumulators instead. As I understand it, the historical_roots field was deprecated and frozen at the Capella fork and superseded by historical_summaries.
Given that, we have several possible ways to bring on-chain verifiable proofs of, say, some execution payload property such as withdrawals:

  1. One brings a proof against a state_root related to an available 4788 block_root:
     4788 block_root → state_root → execution_payload_header → property,
     or even 4788 block_root → execution_payload → property if brought in time.
  2. One brings a proof against a state_root from the state_roots field of the BeaconState. In this case it’s possible to verify a state_root up to ~2 eras (~512 epochs) old: the oldest 4788 block_root plus the oldest state_root reachable from it (see the sketch after this list):
     4788 block_root → state_root → state_roots → old_state_root → execution_payload_header → property.
  3. One brings a proof against a state_root from a HistoricalSummary entry accumulator:
     4788 block_root → state_root → historical_summaries → state_summary_root → old_state_root → execution_payload_header → property.
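
A quick sanity check of the window the 2nd option gives (mainnet parameters assumed):

SLOTS_PER_HISTORICAL_ROOT = 8192   # length of BeaconState.state_roots
HISTORY_BUFFER_LENGTH = 8191       # EIP-4788 ring buffer
SLOTS_PER_EPOCH = 32
SECONDS_PER_SLOT = 12

reachable_slots = HISTORY_BUFFER_LENGTH + SLOTS_PER_HISTORICAL_ROOT
print(reachable_slots // SLOTS_PER_EPOCH)            # ~511 epochs, i.e. ~2 "eras"
print(reachable_slots * SECONDS_PER_SLOT / 3600)     # ~54.6 hours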

The 1st option is the most straightforward and the cheapest; the 3rd option is the most complicated (it takes 3 full states to construct a full proof, and the proof is longer in general).
In my opinion, an end-user contract should support the 1st option as the most convenient way to bring a proof. In addition, the contract may accept proofs constructed via the 2nd option to verify aged facts.

For the 2nd option to be applicable most of the time, we need a reliable source of state roots. So, what if we re-frame the proposal: instead of storing all block roots, make a lightweight storage of state_roots populated once per era (and only once)?

In this case, there’s no need for the end-user contract to explicitly bring a block_root-to-state_root proof, as an external actor handles incorporating it into the cache. This not only streamlines the process but also makes for a smoother experience for the end-user contract.

@ralexstokes I’m looking for your review and for any mistakes you can spot in my reasoning.


yeah, I’d just focus on option 1

the only reason that wouldn’t work is if you could not get a transaction on-chain within ~1 day of making the proof

I would assume for most applications this is very achievable
