A proposal for partnering with Nethermind to design a mechanism for good validator set maintenance

TL;DR

This proposal is to fund Nethermind to deliver a Systematization of Knowledge for Decentralized Identities and Verifiable Credentials. During the project, a dedicated team will investigate the state of the art, which solutions are used or planned to be used in practice, and how. The project is one of the steps toward allowing Lido to onboard new operators in a permissionless manner.
The project will take 6 weeks, and its cost of 150,000 DAI will be covered by Lido DAO.

Proposer

Michał Zając on behalf of Nethermind.

Terminology

  • Operator: A party that runs, or participates in running, one or many Ethereum validators. Operators, solely or jointly, have access to validators’ signing keys but do not know validators’ withdrawal keys. Operators can be divided into nodes.
  • Node: A virtual sub-party (a piece of hardware and software) controlled by an operator that performs the operator’s jobs with respect to a concrete validator. Since an operator may control multiple validators, a node is the operator’s representation for a concrete validator.
  • Committee: With DVT (Distributed Validator Technology), multiple operators may jointly run a validator in a distributed manner. We call the set of all nodes assigned to such a validator a committee.
  • White-label operators: If an operator delegates its tasks to another party, we call the latter a white-label operator.

Ideal mechanism overview

An ideal mechanism evaluates the Lido DAO validator set according to the operator & validator set strategy described in this note by Lido. The mechanism has methods for improving the validator set if there is an option to do so. It has zero input from permissioned roles (i.e., there are no admins/committees), and it accepts only input of low to zero impact from LDO, stETH, and ETH token holders.

The mechanism has to be capital efficient: Collateral for operators can be used, but it can’t be the single or primary mechanism; it has to function mainly by staking with other people’s money.

The mechanism has to account for the bull-bear cycle so that operators can stop validating if it becomes too expensive for them, and so that the protocol can contract the number of operators in bear markets and expand it in bull markets.

The mechanism has to prevent the set of operators from becoming worse. This includes but is not limited to avoiding the following:

  • reduced performance,
  • offline time,
  • slashable offenses,
  • reduced geodiversity,
  • reduced Ethereum client diversity and other diversity vectors,
  • giving up independence (e.g., in a merger),
  • destructive MEV.

Additionally, delegating operation to another party has to reduce the amount of stake an operator can get, potentially down to removal from the set altogether.

Improving operational quality should increase an operator’s revenue (by increasing the stake or the commission).

The stake should be distributed flat-ish. No operator should control more than 1% of total ETH staked (globally).

The mechanism can’t overfit on any one parameter, but most importantly, it can’t overfit on performance: super-performant operators often cut corners or sacrifice certain attributes for others. There has to be a “good enough” level of performance.
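
To make the last two constraints concrete, here is a minimal sketch in Python. The 1% cap comes from the paragraph above; the performance threshold is a made-up placeholder, not a value proposed anywhere in this document.

```python
# Illustrative sketch only: a "good enough" performance bar combined with a
# hard cap on any single operator's share of total stake.
GOOD_ENOUGH_PERFORMANCE = 0.95  # hypothetical placeholder threshold
MAX_OPERATOR_SHARE = 0.01       # no operator above 1% of total ETH staked

def allocatable_share(performance: float, requested_share: float) -> float:
    """Share of total stake an operator may receive under the two rules above."""
    if performance < GOOD_ENOUGH_PERFORMANCE:
        return 0.0  # below the bar: receive no additional stake
    # Above the bar, extra performance is not rewarded further (no overfitting):
    return min(requested_share, MAX_OPERATOR_SHARE)
```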

The mechanism should allow for a new operator to enter the set of operators with essentially no collateral or reputation and work its way to an optimal position within the network of operators. That should be possible, although it may take a long time, if the operator has a “good enough” performance and is ecosystem aligned, independent, and runs its own hardware in non-concentrated geographical/jurisdictional areas. There might be a need for an insurance pool or collateral to enter at zero or to rise to the top, but it shouldn’t be an important requirement in the middle.

Objectives

We offer to help Lido with maintaining a high-quality validator set. This entails:

  1. Designing and implementing methods for assuring that validators are run by a high-quality set of operators. In particular, each operator performs its duties on its own and does not cede them to an external party (i.e., it does not hire a white-label operator), is proficient at DevOps, and ensures that its hardware and software run performantly.
  2. Conducting economic analysis to understand how market changes, or changes in the Ethereum protocol itself, can impede the system’s security and how to secure the system against unfavorable market changes.

The project will be divided into four phases:

  • Phase 1: We survey the literature and state-of-the-art approaches to identity and attestation schemes. See the Roadmap for Phase 1 below for specific details. The proposal focuses solely on this phase.
  • Phase 2: During this phase, we will survey the literature and state-of-the-art approaches to oracles, token-curated assets, and prediction markets.
  • Phase 3: Next, we will proceed to design solutions for assuring a good quality set of operators and economic security of the protocol. We will also describe the resources required to implement the solutions proposed in Phases 1, 2, and 3.
  • Phase 4: This phase is mainly concerned with implementing the solutions designed during Phases 1, 2, and 3. Additionally, we will research some extra topics and problems, as done in the previous phases, and afterward, we will implement them. Further information on this phase will be provided later, by the end of Phase 3.

About Nethermind

Nethermind is a team of world-class builders & researchers. Our work touches many parts of the industry, from our Nethermind node to fundamental cryptography research and application-layer protocol development.

Motivation

Permissionless operator set

Our research will first focus on ensuring a high-quality set of operators for the Lido network. A carelessly assembled set of operators could put users’ stakes at risk or even threaten the security of the Ethereum network. Hence, a method for permissionless and secure operator evaluation, addition, and removal is crucial.

Although the whole set of operators should be of high quality, we also need to allow newcomers to join the network. Newcomers may not have track records strong enough to be considered high quality, but they should be able to work their way up to high-quality status.

Another problem to study is how and when to arrange operators into committees. Since the network allows newcomers, who may be of lower quality, it is essential from the network’s performance and security perspectives to ensure that high-quality operators hold the required majority of voting power in every committee, as the sketch below illustrates.
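
A minimal sketch of that constraint; the 2/3 threshold is illustrative only, since the actual required majority depends on the DVT scheme used:

```python
# Hypothetical committee-formation check: high-quality operators must hold
# the required majority of voting power in every committee.
def committee_is_safe(nodes_high_quality: list[bool], threshold: float = 2 / 3) -> bool:
    """nodes_high_quality[i] is True if node i belongs to a high-quality operator."""
    return sum(nodes_high_quality) >= threshold * len(nodes_high_quality)

assert committee_is_safe([True, True, True, False])        # 3 of 4 high quality
assert not committee_is_safe([True, False, False, False])  # only 1 of 4
```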

Since on-chain data may not be enough to assure a high-quality set of operators, it is important to design a mechanism that pulls off-chain data on-chain. Here we differentiate two sources of data: issuer data and community data. The former comes from official institutions, trusted issuers, etc.; the latter comes from distributed communities. It is crucial for the data to be obtainable and verifiable: the quality of the data determines the quality of any reputation system built on it.
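
A minimal sketch of the two data sources, with hypothetical issuer names and an illustrative endorsement threshold (real verification of signatures or proofs is omitted):

```python
from dataclasses import dataclass

TRUSTED_ISSUERS = {"government-registry", "hardware-auditor"}  # hypothetical names

@dataclass
class Attestation:
    subject: str   # the operator the claim is about
    claim: str     # e.g., "runs own hardware in a non-concentrated region"
    source: str    # who produced the attestation

def is_usable(att: Attestation, community_endorsements: int = 0) -> bool:
    """Issuer data is accepted by origin (signature checks omitted here);
    community data is accepted by weight of endorsements."""
    if att.source in TRUSTED_ISSUERS:
        return True
    return community_endorsements >= 10  # illustrative threshold
```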

Economic analysis

Another crucial part of assuring a good validator set is creating a robust incentive mechanism that ensures rational actors behave honestly and in a manner that helps shape the operator set according to our design goals (e.g., having operators be as diverse as possible), by making that behavior the one with the greatest payoff. To that end, an in-depth analysis of liquid staking economics and incentivization methods is required.

We also note that an incentive mechanism is needed to obtain good quality data — both to have data pulled on-chain and to have it verified.

We also propose to analyze how users’ and operators’ incentives change if proposer-builder separation is adopted in Ethereum.

General work mindset

The following principles will drive the development of the protocols:

  • All the design considerations and risk analysis will be done with the consent of the Lido DAO.
  • Nethermind will set up a dedicated team for this effort.
  • All proposed solutions will come with a security analysis. Where possible, the protocols’ security will be formally proven.
  • Milestones and deliverables will be kept small to give a clear view of the team’s progress.

Project Objective

Phase 1. Decentralized identity and verifiable credentials. Systematization of knowledge.

  • We will start by investigating classical results in Decentralized Identity Schemes and Verifiable Credentials.
  • Then we will discuss the recent advancements in these two areas.
  • Finally, we will investigate which solutions are used in practice (or planned to be used), how projects use them, what their security assumptions and properties are, and what the known roadblocks are.

The deliverable will be a systematization of knowledge research survey.

The deliverable will be completed within 6 weeks from the date of the agreement.

Organization, Funding, and Budget

Nethermind will create a dedicated team to run this project.

The project will be funded by Lido DAO. The DAO will pay Nethermind 150,000 DAI on delivery.

At the end of the project, the LEGO council will decide whether the provided systematization of knowledge meets the agreed requirements and, if that is the case, proceed with the payment.

The payment will be made to the address eth:0x237DeE529A47750bEcdFa8A59a1D766e3e7B5F91.

Next steps

We would like to put this proposal to a vote in 7 days. The voting will remain open for 7 days.


Defining what a “good validator set” specifically means requires understanding & communicating a surprising amount of nuance. Really happy the stellar Nethermind team agreed to lend a hand here & help Lido with this research, looking forward to the proposal going live!


Snapshot vote started: **A proposal for partnering with Nethermind to design a mechanism for good validator set maintenance**
:clock2: The vote ends Oct 17, 2022, 5:00 PM UTC


Hi! It’s my pleasure to inform you that we have finished the first phase of our project, which was a systematization of knowledge for decentralized identities and verifiable credentials. Please see the details below. The deliverable can also be found here.

Systematization of Knowledge for Decentralized Identities and Verifiable Credentials

On behalf of Nethermind Research, in fulfillment of Phase I of Research for Lido DAO.

I. Introduction

In this systematization of knowledge, we examine the current innovations and approaches in the field of decentralized identity (also called self-sovereign identity), as well as its relevant foundations. In the words of the Ethereum Foundation, decentralized identity is “the idea that identity-related information should be self-controlled, private, and portable.” Accordingly, an introductory article by Dock Labs defines decentralized identity as “a type of identity management that allows people to control their own digital identity without depending on a specific service provider.”

Users can construct their decentralized identities from various data sources, whether from interactions happening on a blockchain, information gathered from major social networks or centralized websites, or even an ID issued by a government or an educational institution. Self-sovereign identity implementations then store this data (encrypted or not) in a document on a distributed ledger (such as a blockchain) and associate this document with a set of keys controlled by the user, which can be used to assert ownership of the data. Thereafter, a unique address pointing to this document is generated to facilitate access and communication. These addresses are known as decentralized identifiers (DIDs).
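
For illustration, here is a minimal DID document written as a Python dict; field names follow the W3C DID Core data model cited in Section IV, and all values are placeholders:

```python
# Minimal DID document, shown as a Python dict purely for illustration.
did_document = {
    "@context": ["https://www.w3.org/ns/did/v1"],
    "id": "did:example:123456789abcdefghi",  # the DID that resolves to this document
    "verificationMethod": [{
        "id": "did:example:123456789abcdefghi#key-1",
        "type": "Ed25519VerificationKey2020",
        "controller": "did:example:123456789abcdefghi",
        "publicKeyMultibase": "z-PLACEHOLDER",  # the user-controlled public key
    }],
    # References the key(s) the controller uses to prove ownership of the identity:
    "authentication": ["did:example:123456789abcdefghi#key-1"],
}
```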

In order to make use of this identity, users are able to request or create verifiable credentials: cryptographically verifiable claims made to a third party about the data that constitutes their identity. Verifiable credentials give the user control over exactly which pieces of information are shown; they also give information requesters tools to combat forgery and fraud.
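
And a sketch of a verifiable credential a holder might present, following the W3C VC Data Model v1.1 cited in Section IV; the credential type, claim, and proof values are hypothetical placeholders:

```python
# Illustrative verifiable credential, shown as a Python dict.
node_operator_credential = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiableCredential", "NodeOperatorCredential"],  # hypothetical type
    "issuer": "did:example:issuer",
    "issuanceDate": "2022-10-01T00:00:00Z",
    "credentialSubject": {
        "id": "did:example:123456789abcdefghi",  # the holder's DID
        "runsOwnHardware": True,  # only the claims the holder chooses to disclose
    },
    "proof": {  # the issuer's signature over the claims above
        "type": "Ed25519Signature2020",
        "verificationMethod": "did:example:issuer#key-1",
        "proofValue": "z-PLACEHOLDER",
    },
}
```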

Understanding current solutions in the landscape of self-sovereign identity is relevant to Lido’s aim of increasing the quality of its validator set in a distributed fashion. A mechanism that decentralizes Lido’s validator set will require robust methods for identity management and authentication. The compilation and analysis of the research sources herein represents the first step towards a state-of-the-art-informed design of a mechanism fulfilling Lido’s goals.

Transferring identity data to Web3

We have mentioned how inputs from Web2 may be used in order to construct a decentralized identity. As examples, one may consider a reputation score on Reddit, the number of stars on repositories created by a user on GitHub, or even the existence of an open TLS connection with a government website, which a user presents as evidence of citizenship from a given country.

In the context of Web3/blockchain, one reason to be interested in bridging Web2 data to Web3 is that it may provide some degree of resistance to a *Sybil attack*: that is, the ability of a malicious entity on a decentralized protocol to create an arbitrary number of identities and gain disproportionate influence over it. If, for example, we require a decentralized identity to bridge a reputation score from Web2 that is valuable enough, then this mechanism can complicate the creation of numerous identities by a single entity.

Besides Web2, there are alternative sources of off-chain information that one may attempt to transfer to Web3 in order to create an identity. Among these, we count government IDs, institutional credentials, and even biometrics. The goal behind using this data remains the same: using sources that are valuable enough to facilitate identification and deter the creation of Sybils.

The main technical challenge when following this approach is: how do we pull this data in a verifiable way, so that the system is not likely to be exploited? For example: are oracles to be used? If so, what incentivization mechanism is used in order to enforce their honesty?

Due to their potential for building Sybil-resistant solutions (and for making decentralized identities more meaningful in general), we will pay special attention to implementations which explore transferring identity data to Web3 in a way that is trustless, or verifiable.

Additional introductory reading

The interested reader who is not previously familiar with decentralized identities may benefit from the following introductory posts:

II. Preliminaries

In order to read the systematization of knowledge, familiarity with some technical concepts in blockchain and cryptography is advised. For review purposes, as well as to standardize the concepts used, we have prepared the section below.

:books: Preliminaries


III. Paper database

The following database organizes the results of our work.

:card_file_box: Decentralized Identity and Verifiable Credential systems. Paper database

In it, a collection of 70 papers and protocols has been selected and analyzed as follows:

  • A summary note was prepared for each paper, which can be accessed by clicking on each paper’s title.

  • Papers were rated from 1 to 5 according to their quality and originality. This is reflected in the “quality score” column.

  • Papers were rated according to how relevant they are to Lido’s mechanism design problem. This is reflected in the “relevance score” column.

IV. Selected papers

Finally, we highlight a selection of papers which were rated as highly relevant. Readers are advised to study these first.

Classical papers

The Sybil Attack

Work of Camenisch and Lysyanskaya on Anonymous Credentials

Decentralized Identity and Verifiable Credentials

W3C’s Decentralized Identifiers data model v1.0

W3C’s Verifiable Credentials Data Model v1.1

[Decentralized Society: Finding Web3’s Soul (Soulbound tokens)](https://nethermindeth.github.io/lido_phase_1/Database%2011b9b21206a8466191f8587fb73edf58/Decentralized%20Society%20Finding%20Web3’s%20Soul%20(Soulbou%20d1a47c2d4f334040a09ba956b90fc9f8.html)

Decentralized Anonymous Credentials

Zero-knowledge credentials with deferred revocation checks

Web2 to Web3 data

CanDID

DECO

TLSNotary Proof

Project implementations

Polygon ID

Sismo

[EBSI (joint initiative from the European Commission and the European Blockchain Partnership)](https://nethermindeth.github.io/lido_phase_1/Database%2011b9b21206a8466191f8587fb73edf58/EBSI%20(joint%20initiative%20from%20the%20European%20Commissio%20ba190ea5d9c64af18d7d3558b09f4d25.html)

Interep (by PSE)


Thank you @mpzajac and to the rest of the team for the serious amount of work you put in to put together this knowledge base. It would be great to get your team on one of the upcoming Community Calls to tell us a little more.

Are there any takeaways from the work you’ve done so far, especially regarding the papers that you’ve rated highly in terms of quality and originality and found “most relevant” for the research questions at hand?


Hi @Izzy, sorry for the late reply. Let me deliver the takeaways in batches :slight_smile:

Points on DIDs/VCs

  • The most common problem in the verifiable credential literature is the centralized issuer. Namely, there is a party that is mutually trusted by the user (credential holder) and the verifier. There needs to be more discussion on how to efficiently instantiate such issuers in a decentralized manner. See Decentralized Anonymous Credentials.
  • Several papers in the database have pointed towards the need for standardization when implementing DIDs and VCs. Fortunately, we have strong recommendations from W3C, which are being followed by a multitude of teams. Some of these recommendations are less specific when it comes to combining zero-knowledge proofs and credentials, however. See Towards a standardized model for privacy-preserving Verifiable Credentials.
  • There is an excellent line of work by Camenisch and Lysyanskaya, who showed how to practically build anonymous credentials that allow entities to claim their properties without revealing anything else about their identity.

On getting Web2 data to Web3

  • With regard to the problem of getting Web2 data to Web3: there is a lot of data in Web2, yet most of it is hidden behind the TLS protocol. That is, it is accessible to users, but they cannot show it in a verifiable manner to other parties, because TLS establishes a connection between the user and the server based on symmetric cryptography. We have found a series of protocols (CanDID, DECO, TLSNotary) which aim to solve that problem by allowing the user to include, in the communication with a server, a verifier who can check the correctness of this data; see the toy sketch after this list.
  • On the other hand, quite little has been written about establishing identities using sources of data other than governmental data. Most approaches assume that the input data is good and trusted. Few projects discuss how to set up a Web3 identity using Web2 data, or data that may be, to some extent, coerced. Among these, we count Interep (by PSE).
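
To illustrate the first point, here is a toy sketch (plain HMAC, not real TLS) of why a raw TLS transcript is not self-authenticating: record integrity relies on a symmetric key shared by both endpoints, so the client could have produced any “server” message itself. The session key and messages below are, of course, made up.

```python
# Toy sketch, not real TLS: a valid integrity tag does not prove the server
# sent a message, because the client holds the same symmetric key.
import hashlib
import hmac

SESSION_KEY = b"key-derived-during-the-tls-handshake"  # held by BOTH endpoints

def record_tag(message: bytes) -> bytes:
    # Either endpoint can compute this tag, since both hold SESSION_KEY.
    return hmac.new(SESSION_KEY, message, hashlib.sha256).digest()

# The client never received this from the server; it forges the record itself.
forged = b"account balance: 999,999"
forged_tag = record_tag(forged)

# A third party checking the transcript can only recompute the same shared-key
# tag, so the forgery verifies perfectly: the transcript proves nothing.
# Protocols like DECO or TLSNotary address this by, roughly, preventing the
# client from learning the full MAC key during the session.
assert hmac.compare_digest(forged_tag, record_tag(forged))
```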

Other sources of identity data

On Sybil resistance

  • Virtually no papers try to solve the problem of Sybil or white-label identities: users are usually allowed to create as many identities as they wish. This is often a desirable feature, but it is not in the case of identifying prospective operators.
  • Although our research on Sybil resistance has not fully launched yet, we have found papers like The Sybil Attack, which show fundamental restrictions on mechanisms for detecting and discarding Sybils. For example, we have seen that a malicious party can easily spin up an unlimited number of nodes if the only requirement to access the network is possession of resources that are checked only from time to time.

Implementations

  • Finally, some interesting implementations to pay attention to (which could be used as part of a final solution) include PolygonID, Sismo, Interep, EBSI, and Coconut.

Thank you for the great research & summary!

Hey @mpzajac thank you for this huge work being done! :partying_face:

The DAO previously approved this proposal via Snapshot, and the LEGO council also reacted very well to its results.
We have created an EasyTrack motion to top up the LEGO multisig so that your grant can be disbursed to you.


@mpzajac the grant was disbursed in two transactions: test tx, main tx.

Thank you once again!
