Discussion - Treatment of Potentially Harmful Incidents

Izzy · June 23, 2023, 8:10am

To revive this thread a little… I will work with the NOM workstream to put together a discussion document for interested DAO stakeholders & NOs that roughly outlines what a “guardrails” approach to the treatment of potentially harmful incidents (or things like sustained malperformance) could look like, in order to advance discussion on having a more concrete approach. I will aim to have this draft ready the week of July 3rd.

I would love to see more initiative/collaboration amongst NOs for something like this. I’m not sure what the best way to foster this is (grants? putting everyone in a (virtual) room to workshop it out?), but I’m open to suggestions.

Because the above might take some time and the recent slashing itself still needs to be addressed, I think the DAO should have a more concrete conversation about how to treat these specific types of incidents and what we should do in this specific case. It’s been quite a while since the event, and all stakeholders would benefit from clarity.

I think slashings should be (and have been) rare enough that decisions can be made ad hoc, but would benefit from a general framework for them. With that in mind, my thinking is below (trying to establish a set of static constraints, but with enough allowance for context). IMO it’s incredibly important that we get input from as many parties as possible here (especially NOs), so the below is just my 0.02 to serve as a starting point.

Factors around the incident itself that might be considered:

was the act malicious or not
the proximate cause of the event
whether any best practices, infrastructure setups or configurations, common safety measures, or reasonable processes / mechanisms could have prevented the slashing from happening
how quickly was the issue identified and by whom
how quickly was the issue resolved

Consequences could depend on:

Which “module” did this happen in; or, more broadly, what are the trust assumptions associated with the affected validators (e.g. are the validators unbonded)
impact of the event (in the case of a slashing, the financial impact can be small but the damage to trust can be high; for example, how does the event affect the trust assumptions between stakers, the DAO, and the NO)
extenuating circumstances (what are the pros/cons of this decision, and what other substantial things may need to be taken into account – e.g. does the NO somehow bring key value to the protocol?)
what other options there are for the node operator to participate
what is the status of remediation of the issue
is there a way to gauge likelihood of something like this happening again, and if so what is the assessed likelihood
how can remediation be assessed and if it can, is the remediation deemed satisfactory

For example, in the current “curated operator set” on Lido on Ethereum, the options may be:

Do nothing
Warning (do nothing, on the condition that a repeat incident results in one or more of the consequences below)
Limit the Node Operator’s key count for a certain period of time
Decrease the Node Operator’s key count (by prioritizing those keys for exit)
Offboard the operator (with the ability to rejoin the permissioned set at a later time)
Offboard the operator (without the ability to rejoin the permissioned set at a later time)
