On Friday, December 8th we noticed occasional spikes in our relay’s processing time on getPayload. Upon investigation, we discovered the following:
- Increased validator registration load near epoch boundaries was consuming enough resources to impact header/payload request processing.
- Some consensus clients (e.g. Prysm, Teku) hardcode a 3-second timeout on getPayload, versus the 4-second MEV-Boost timeout, so our understanding of timeout budgets was incomplete.
- On slots where we saw the spike in getPayload, validators running consensus clients with a 3-second timeout experienced a timeout error raised by their consensus client because the Blocknative relay didn't deliver within 3 seconds, causing them to miss the slot.
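
To make the timeout budget concrete, here is a minimal sketch of how the tightest timeout on the request path decides whether a slot is missed. It uses only the two figures quoted above (3 seconds for the consensus client, 4 seconds for MEV-Boost). The function names are hypothetical, and the model is a simplification that ignores network latency between hops; it is not code from any client or from MEV-Boost.

```python
# Timeouts in seconds, taken from the figures quoted in this post.
CONSENSUS_CLIENT_TIMEOUT_S = 3.0  # e.g. Prysm, Teku
MEV_BOOST_TIMEOUT_S = 4.0         # MEV-Boost getPayload timeout


def effective_budget(*timeouts_s: float) -> float:
    """Return the shortest timeout on the path (hypothetical helper).

    Whichever component gives up first bounds the whole request.
    """
    return min(timeouts_s)


def misses_slot(relay_response_s: float, *timeouts_s: float) -> bool:
    """Return True if the relay responds after the tightest timeout.

    Simplified model: assumes every timeout starts at the same instant
    and ignores per-hop network latency.
    """
    return relay_response_s > effective_budget(*timeouts_s)


if __name__ == "__main__":
    # A 3.5 s response fits the MEV-Boost budget alone ...
    print(misses_slot(3.5, MEV_BOOST_TIMEOUT_S))
    # ... but not once the consensus client's 3 s timeout is included.
    print(misses_slot(3.5, CONSENSUS_CLIENT_TIMEOUT_S, MEV_BOOST_TIMEOUT_S))
```

Under these assumptions, a relay that plans against the 4-second figure alone can still cause a missed slot for any validator whose consensus client enforces 3 seconds.
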
On Monday, Dec 12th at ~1800 UTC, Blocknative deployed a relay update that better handles validator registration congestion at the edge of each epoch. This allows the Blocknative Relay to scale as we onboard more validators and to sustain a higher getPayload rate. Since this release, we have seen significantly improved getPayload response times.
We continue to make improvements to our relay implementation and supporting infrastructure to ensure accurate and timely responses on all our APIs.
We have started a discussion in the Flashbots forum about the getPayload timeout discrepancy between consensus clients and MEV-Boost, to reduce confusion about where timeouts are happening. Feel free to let us know if you have any questions.