Although I am sure most already know, and maybe Lido has already established contact, I’d suggest actively engaging with new relays/builders in these early days of the market, in order to help smaller players gain traction quickly and ideally enrich the whole ecosystem.
E.g. https://relayooor.wtf/ by https://twitter.com/builder0x69 (some details in “First 10,000 blocks and new features” by builder0x69, Medium, Oct 2022), started Oct 26th.
Thank you for this input and great that we’re having this discussion openly. I hope that manifold replies here so that we can constructively move forward on this.
I think the license issue is important but murky. Manifold have raised an interesting point re: the CLA that Flashbots has employed and the apparent reticence by FB to accept direct upstream contributions from Manifold. One can argue that just pushing their changes to a forked repo is probably enough here to “comply”, and I think they should do it (and if they have and we’ve missed it, we should update accordingly), but the behavior by FB here is also kinda weird. However, I don’t think it’s productive to litigate license compliance, so instead, what if we discuss what we think an acceptable way forward may be here:
- Open sourcing of current derivative changes to fb relay & builder-geth
- Clear plan for open sourcing relay implementation (if own is to be built)
- Update on external builder integration and reward simulation
I agree with your points re: communication and speed of action.
That said, I believe that the over-arching concern here is “what’s best for the ecosystem” and having an additional relay is net positive, because ultimately what we want is relay and builder diversity.
Freddy makes a good point that we can see the “allow list” usage as an expression of dissatisfaction with the status quo but also allowing for a speedier path to amelioration and best overall outcome. While there’s room for improvement, we’re still early days here and I’d suggest that we consider working through this stuff optimistically rather than conservatively.
Definitely! We have reached out to relayooor and 2 other relay operators who have expressed a desire to run independent/neutral relays, and we are figuring out how fast we can get them connected to testnet validators. If anyone else is bumping into relay operators, please ask them to post in the “Lido on Ethereum: Call for Relay Providers” topic!
It’s a good thing to be Easy-Track-ed, indeed. The contract was designed with the ability to be integrated into the Easy Track governance process.
Speaking about the capacity of some better-to-be-involved contributors, I may suggest putting it on the table for Q1 '23. As far as I know, we expect huge deployments of the regular DAO payroll committees (having the higher priority, basically).
It looks like we have a rough consensus here. I’d like to see as many relays on the must-include list as possible, on the sole principle that the community is decently sure they won’t steal or lose MEV from our stakers. I think that Manifold has to do a few things to get to the “decently sure” level: compensate the loss incurred and address the comms issue.
Detailed response for the Lido Community
This response is in two main parts:
- a detailed summary with some commentary
- a personal statement
FOSS Relay
We have a monorepo; our entire infrastructure is designed, built, deployed and managed that way. This argument is a strawman, though: just because the code is open source does not mean that it is being run 1:1 (i.e. that it is not being run with modifications, etc).
Post Mortem response
The issue with the response was entirely my fault. I had taken a work break after Devcon and was traveling. I did not understand the issue at the time, and had mistaken it for a previous service disruption that had occurred earlier in the week, which was not comparable in magnitude. We had already provided the operations team at Lido with a summary of that earlier event (which we voluntarily disclosed to them). I confused this incident for a repeat of the same event and dismissed the need for a timely public statement.
If it is not clear, we are providing 100% restitution. We are currently working with authorities and an external investigator in this matter. The restitution can be coordinated with Lido’s operations team. Note that this does not mean we admit fault in this matter.
Relay and Network Reliability
Building a reliable, robust service often means building something that can keep working when some parts fail. A service where not every feature is available is often better than one that’s entirely offline.
Doing this in a meaningful way is not obvious.
The usual response is to hire more engineering, more support, and even more managers. Error handling, or making components that can recover from faults, often feels like the option of last resort—especially in blockchain networks.
The usual response to error handling is optimism. Unfortunately, the other choices aren’t exactly clear, and often difficult to choose from too. If you have two services, what do you do when one of them is offline: Try again later? Give up entirely? Or just ignore it and hope the problem goes away?
This was my thinking in making those frustrated comments. I no longer interact on Twitter either personally or professionally, nor care to.
Communications and Engagement
We have been considering for more than two months how best to approach this role, and have found someone we think will be invaluable to the entire team. We suggest participating in the next regularly scheduled Lido community call or, if desired, holding a one-off meeting at whatever time the community prefers.
Summary
- We have created a new dedicated Statuspage strictly for SecureRpc.
- Create a dedicated Ethereum Ecosystem RSS aggregation and notification alert page for defect response and alerts. This is meant to provide notifications to node operators of any potential issue affecting them as it relates to software they are running themselves.
- We have recruited a new business and community lead for coordinating and facilitating our engagements across communities and different projects.
- Attend and interact with the Lido community during a live community call/gathering within the next 7-14 days. We can attend at whatever time is scheduled.
- 100% Restitution, within the next 7-14 days, as was originally implied in the post mortem.
- In the coming days we will be making public a few on-chain and off-chain solutions that should provide not only improved recovery but also a type of insurance bond for validators, covering both potential losses and surplus reward payouts to them.
- Manifold is also (and has been) ready to operate as a node operator; we see no meaningful reason for the relay/operator separation.
- A 3rd party forensic auditor has been engaged and is assisting authorities and our internal investigation.
Personal Remarks
This was an offensive attack. An attack on a service that most Ethereum developers will never even work on. This service also happens to offer Ethereum a non-censoring transaction relay that is operating against a potential counter party that is primarily concerned with maintaining its international rules-based system by financial controls for purposes of managing the global economy. We consider this event as strong evidence of battle-space preparation. We know the potential outcomes, we have seen the tactics being used thus far.
Cheers,
Sam
Also, Manifold has run on average the most profitable relay since the merge till 10/22/2022, this is from a 3rd party that will be publishing a rigorous analysis of relay performance:
Is it this: https://status.manifoldfinance.com/?
Would it be valuable in your mind to have an open-source repo to serve as a devops primer or knowledge base to collect learnings & best practices from various MEV relay runners + to help prospective relay operators bootstrap & get started?
This is good news and congrats. And yes that is a great idea. I would also suggest inviting the ethstaker community to that call or perhaps holding a separate one with them, to the extent the Lido community would prefer to keep it within Lido.
Is the idea behind this that it would be an aggregated feed covering all known MEV relays, public RPC endpoints, and execution & consensus client updates? I like the idea and would like something like that (more surprised it doesn’t exist already)
Funded presumably out of treasury? Or would it be via a portion of your builder’s profits?
Do you mean relay operator, builder, proposer, or all of those things?
Would be eager to see results of their investigation once finished, especially if the Manifold team is comfortable publishing in its entirety.
Same comment as above, would be eager to see if the 3P auditor can corroborate this, in which case alarm bells should be going off for all other relay operators (particularly the “unregulated” ones).
As someone who runs blockchain services: Troubleshoot the failing service, find root cause, fix whatever went wrong, and think on how to improve things so it doesn’t fail again. If that’s an option. Sometimes it’s an upstream github issue to fix a bug.
I’m not quite sure what the message is, here. “Ops is hard, sometimes it’s best to just do nothing”? Or, “we don’t have enough staff”? Both would require action and a change in operational behavior, in my opinion.
I am uneasy with this answer. This sounds entirely dismissive: “No, we won’t open-source; and anyway, if we did, what guarantee do you have that we’d actually run that code?”
No guarantee but your word. No evidence but that of an open-source repo and issues and PRs opened upstream. Relays are trusted entities, more so than other ecosystem participants. That requires behavior that goes above and beyond when it comes to transparency, community engagement and good faith.
We are weird humans here at Flashbots
I’m curious to hear more about this. We are coming back from conference+meeting+rest, with a stronger commitment, better structure, and more time to engage with our community. If there are issues around our free software, I’m happy to get them resolved.
I want to bring back attention to this: “What makes a relay trust-worthy?” (Relays, The Flashbots Collective)
In particular, that after a brief discussion in the PBS developers roundtable in Bogotá I’ve added:
- In case of missing blocks, does not retroactively pay to the affected validators.
When designing the side-car approach for mev-boost we took into account that getting extra MEV rewards comes with extra risks. I didn’t like when bloxroute paid the affected validators after an incident. I don’t like this to become a thing, because it introduces weird economic dynamics that we haven’t understood yet. If you have thoughts about this, please share them.
While I agree that the idea of restitution can have weird economic dynamics, I think the current trust assumptions unfortunately skew risk in such a way that restitution is one of the few viable mechanisms for trust to be regained when incidents occur.
I think the risk here is disproportionate to rewards, at least in the current status quo, especially when you consider the damage that might be caused by a mal-performing relay (or set of relays) to the network.
Validators need to implicitly trust relays until such a time that payout is guaranteed (e.g. via enshrined PBS) and/or validators are able to effectively operationally reduce the impact of operational failure of the MEV Boost stack at a relatively quick speed (which to be honest isn’t really feasible currently or practical – features like circuit breakers are still not implemented and even in such a case they’re “local” (i.e. if a relay has an issue then each validator needs to process that issue on its own via issues somehow being communicated across validator and operator sets)).
Especially when taken into consideration with the coming of new and less eponymous relays (which we obviously want to support), consider that something goes wrong with a relay and a lot of validators are affected. What reason would these validators have to re-trust that relay if they are not made whole?
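To make the “circuit breaker” idea mentioned above a bit more concrete, here is a minimal sketch of what a local, per-validator breaker might look like. The class name, the failure threshold, and the cooldown length are all hypothetical and not taken from mev-boost or any client. Note that it illustrates the limitation raised above as well: each validator only trips on failures it has observed itself.

```python
from dataclasses import dataclass


@dataclass
class RelayCircuitBreaker:
    """Hypothetical local breaker: stop using a relay after repeated failures."""

    max_failures: int = 3      # assumed threshold of consecutive failures
    cooldown_slots: int = 32   # assumed cooldown (one epoch) before retrying
    failures: int = 0
    open_until_slot: int = -1  # relay is skipped while slot < open_until_slot

    def allows(self, slot: int) -> bool:
        """Return True if the relay may be queried for this slot."""
        return slot >= self.open_until_slot

    def record(self, slot: int, ok: bool) -> None:
        """Record the outcome of a relay interaction at the given slot."""
        if ok:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            # Trip the breaker: fall back to local building for a while.
            self.open_until_slot = slot + self.cooldown_slots
            self.failures = 0


def choose_block_source(breaker: RelayCircuitBreaker, slot: int) -> str:
    """Use the relay unless the breaker is open, else build locally."""
    return "relay" if breaker.allows(slot) else "local"


breaker = RelayCircuitBreaker()
for failed_slot in (100, 101, 102):
    breaker.record(failed_slot, ok=False)
```

After three consecutive failures the validator builds locally until the cooldown expires, without any coordination with other validators or operators.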
I hear you. I know I’m not making an economic argument here, which might be naive. I still like my point.
Two facts are that there will be bugs and that the reward boost is tempting.
With that, I think that we can make sure we never lose trust by following:
- Takes down the server when a high-severity bug is found, so validators switch to local building and don’t miss slots.
- Publishes a post-mortem after every incident.
(from “What makes a relay trust-worthy?”, Relays, The Flashbots Collective)
If a team fails to do that, I don’t think it should be re-trusted, ever. In the Lido case, I see that as down-grading from “must” to “allow”. Currently, I don’t see a way from “must” to “allow” to “must” again.
I think this will change once the relay monitor is operating, because then we can quantify the risk and the reward of every relay. At that time, maybe getting into the “must” list means having a high score in the monitor, and it’s no longer subjective.
This is deliberately misleading. Manifold counts regular mempool transactions to the proposer feeRecipient as “additional value provided”, using such transactions to inflate the bid value. In one instance, there was a 170 ETH transaction to the proposer, which Manifold counts towards “being the most profitable relay”, but in reality that value would have been part of any mempool block as well, and other relays wouldn’t have counted that as part of the bid value at all.
There are currently ongoing discussions around the block-scoring mechanism, and full balance diff is a possible option, but it’s disingenuous to claim to be the most profitable relay while comparing inflated numbers (concentrated in only a few blocks, too).
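As a rough illustration of why the two scoring approaches diverge, here is a small sketch. The 170 ETH transfer is the figure mentioned above; the 0.1 ETH builder payment and the function names are hypothetical, chosen only to show the arithmetic.

```python
from decimal import Decimal


def score_by_builder_payment(builder_payment: Decimal) -> Decimal:
    """Count only what the builder itself pays the proposer."""
    return builder_payment


def score_by_balance_diff(builder_payment: Decimal,
                          mempool_transfers: list[Decimal]) -> Decimal:
    """Count every increase of the proposer's feeRecipient balance,
    including ordinary mempool transfers that any block would include."""
    return builder_payment + sum(mempool_transfers, Decimal(0))


builder_payment = Decimal("0.1")       # hypothetical builder payment, in ETH
mempool_transfers = [Decimal("170")]   # plain transfer to the feeRecipient

payment_score = score_by_builder_payment(builder_payment)
diff_score = score_by_balance_diff(builder_payment, mempool_transfers)
inflation = diff_score - payment_score

print(payment_score, diff_score, inflation)
```

Under these assumed numbers the same block scores 0.1 ETH by builder payment and 170.1 ETH by full balance diff, so comparing relays that use different methods is not like-for-like.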
We welcome upstream contributions, but haven’t seen any attempts to do so from Manifold.
More importantly, Manifold’s claim that the CLA is an issue is a straw-man argument! Our relay is released under the AGPL license, in the hope that it’s useful to others, with the simple requirement that users make their changes available to the public again. Manifold has chosen to use our codebase but deliberately refuses to make their fork public, which would be enough to satisfy the AGPL license requirements.
Sigma Prime will use the “must-include” list of relays.
We do not believe that Ethereum should achieve censorship resistance by pressuring validators to potentially expose themselves to the legal risks of other entities on the network. We believe that Ethereum is capable of specifying and implementing novel forms of cryptography which clearly place the responsibility of transacting on those that are transacting, rather than the block-signing “middle-men” (middle-people). This is not evasion of the law, rather ensuring that individuals are solely responsible for their own actions. This is what we will work towards at Sigma Prime.
We fully support node operators that wish to run only non-censoring relays and it is our wish that Lido makes room for these operators, too. We believe that if Lido wishes to engage with operators that have a high degree of moral integrity, then it should also make allowances for those operators to exercise their own moral judgement. We believe that Lido can survive perfectly well in a world where some operators use exclusively non-censoring relays and it would speak volumes to Lido’s credibility to allow those operators the freedom to operate within their own moral framework.
Looking forward to the outcomes of your efforts here. Neutral infrastructure is critical and something we aim to build and your work here will help immensely.
One factor worth bringing up is relay diversity. Relay diversity, like client diversity, is critical to the health of the Ethereum ecosystem. It is one thing to be connected to multiple relays, but if they all run the same source code, a single bug can leave you exposed to invalid or empty blocks from all of them. Source code diversity is critical.
To date there are two verifiably distinct codebases: Flashbots and Blocknative’s Dreamboat. For operational redundancy, validators should ensure that they have, at a minimum, one relay on Flashbots source code and one on Dreamboat source code hooked up to their MEV-boost. Transparency in this is something we think all validators (and relays) should consider, because code diversity ensures operational redundancy.
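A validator operator could check this kind of coverage mechanically. The sketch below assumes the operator maintains their own mapping from relay to codebase; the relay names are placeholders, not real endpoints, and nothing here is part of MEV-boost itself.

```python
# Codebases named in the post above; the labels are this sketch's own.
REQUIRED_CODEBASES = {"flashbots", "dreamboat"}


def missing_codebases(relay_codebases: dict[str, str]) -> set[str]:
    """Return the required codebases not covered by any configured relay."""
    return REQUIRED_CODEBASES - set(relay_codebases.values())


# Hypothetical configuration: three relays, all on the same codebase.
same_code = {
    "relay-a.example": "flashbots",
    "relay-b.example": "flashbots",
    "relay-c.example": "flashbots",
}

# Hypothetical configuration: fewer relays, but both codebases covered.
diverse = {
    "relay-a.example": "flashbots",
    "relay-d.example": "dreamboat",
}

print(missing_codebases(same_code))
print(missing_codebases(diverse))
```

The first configuration has more relays but still reports a missing codebase, which is the point being made: relay count alone does not give redundancy.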
Strong agree. Do you think relay codebase diversity should be held up to the same standard as execution and consensus client diversity? Or should we demand more/less of the MEV relay ecosystem (which is quite small and concentrated at the moment)?
Hi everyone, Aidan from ChainSafe here.
The very limited discussion here on the impact of forcing censorship on a significant portion of the network is highly concerning. ChainSafe stands firmly alongside Sigma Prime in recognizing that this is an unnecessary action that does not contribute to or align with the open values of the Ethereum community.
For some Node Operators being compliant is a necessity, and we fully support those operators in doing so. For those who have the choice, we urge you to consider the implications of censorship.
We would like to see further discourse on this topic, with the hope that it will lead to a healthy dialogue surrounding censorship at the protocol layer, and foster further options for validators.
I’m also strongly opposed to requiring operators to use censoring relays.
I would prefer there to be no censoring relays whatsoever and for LIDO to require this. Failing that, it should give operators the choice.
I’ve discussed this internally with Paul and we disagree on how SigP’s validators should be managed, but have agreed to carry divergent opinions. Personally I would prefer SigP to cease functioning as a LIDO node operator than to engage in censorship.
The goal of Lido policy as it is at the moment is to emulate PBS as closely as possible. You won’t have an option to censor Flashbots’ blocks in PBS, and I think we should want to know how the landscape of MEV block building will end up if we continue as is, while Ethereum can still reconsider.
Having Ethereum’s censorship resistance hang on assumptions that will not hold true post-PBS (“node operators can exclude some block builders they don’t like”) is, I think, not the right choice.
Thanks @aidan and @michaelsproul for your input. I’d like to reiterate the objective and thrust of the policy, which is to attempt to implement a mechanism which mirrors enshrined PBS as closely as possible.
In enshrined PBS (based on current designs), the validator (block proposer) will not be able to differentiate between the contents of blocks, so they will effectively propose whichever block pays the most, with the possible addition of txs via crLists. (crLists can, and should, be implemented in out-of-protocol PBS as well, and I don’t see why something like this wouldn’t be suggested for adoption by Lido at the earliest possibility, in keeping with the same principle.) This is mimicked by hooking up a proposer to as many relays (and thus builders) as possible, with the caveat that, because MEV Boost-based PBS requires trust assumptions between relays and proposers, there’s a vetting process that otherwise wouldn’t be necessary.
The point of using as many relays as possible is to foster relay and especially builder diversity; it is builder diversity and builder dominance that ultimately determine whether transactions are censored, and for how long.
Using “non-censoring” relays doesn’t actually prevent potential censorship from happening, because “non-censoring” is not “anti-censoring”. For example, if the best builder 6/10 times is a tx-filtering builder, and that builder is sending blocks to non-filtering relays, then those are the bids that the relay will forward and that will be proposed.
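The point can be sketched in a few lines. This is a simplified model that assumes the relay forwards the highest-value bid it receives; the builder names and bid values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bid:
    builder: str
    value_eth: float
    filters_txs: bool  # True if the builder excludes certain transactions


def best_bid(bids: list[Bid]) -> Bid:
    """A non-filtering relay applies no content rules: highest value wins."""
    return max(bids, key=lambda b: b.value_eth)


# Hypothetical bids submitted to a "non-censoring" relay for one slot.
bids = [
    Bid(builder="filtering-builder", value_eth=0.12, filters_txs=True),
    Bid(builder="neutral-builder", value_eth=0.10, filters_txs=False),
]

winner = best_bid(bids)
print(winner.builder, winner.filters_txs)
```

Here the relay itself filters nothing, yet the block that gets proposed still comes from the filtering builder, because that builder simply bid more.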
Until such a time that anti-censorship mechanisms (crLists and/or ideally encrypted transactions) can be made a reality, I think that the best way forward is to test the system “as it will be” so that the right solutions can then be designed and appropriately prioritized. These are failings at the protocol level (both from a design and incentive perspective) – they should be remedied as such.