Announcement: Onboarding for Terra (Wave 3)

Hey Spaydh,

Thanks for the explanation. I think you should also be clear about what you expect from validators in advance.

You just mentioned that relying on AWS is considered to be a negative factor. In our case, for example, we are happy to re-think our infra setup if there is an expectation that no cloud providers should be used. Or we can use datacenters in specific preferred regions if that helps decentralisation.

We strongly believe that participation in Lido’s and similar programs should be mutually beneficial for both validators and organisations. But that is only possible if the interests of both parties are communicated in advance. Otherwise, we are just submitting the form blindly, without even knowing what the most important priorities are.

Thanks

First of all, thank you for all the effort you have put in so far.

You had 60+ applications, and there is no ‘standard’ for what makes a good validator, so you’re going to get people upset when the criteria they believe in aren’t present or weighted as heavily as they expect.

So let me give you my 2c :slight_smile:

For me the core job for a validator in any network is to ensure that blocks are valid, and that there is consensus so that the network can move forward.

So the first thing we need to measure is the risk to the network.

So I usually look at these similarly to how an insurance company would, with the ‘eggs all in one basket’ approach (i.e. if something outside of anyone’s control happened, how would that affect the network?).

Could you imagine if 70% of all servers were in a country that had its internet shut off? Or if a law was passed similar to the one in China banning Bitcoin mining?

So criteria that diversify aspects like regulatory body oversight, hosting provider, and physical geographic placement are important (and I believe you have done a good job on that).

It would be cool if you had a tool to measure where you are today and what effect adding a validator would have on the overall decentralization score of your validator network, both for the chain being onboarded and for all the chains you participate in.
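This isn’t Lido’s actual methodology, but a rough sketch of what such a tool could compute, assuming concentration is measured with something like a Herfindahl index over attributes such as hosting provider and country (the field names and sample data below are made up):

```python
from collections import Counter

# Hypothetical data: the attribute names ("provider", "country") and the sample
# validators below are illustrative assumptions, not Lido's actual data model.
current_set = [
    {"provider": "AWS", "country": "DE"},
    {"provider": "OVH", "country": "FR"},
    {"provider": "AWS", "country": "US"},
]

def concentration(validators, key):
    """Herfindahl index over one attribute: near 0 when well spread, 1 when all eggs are in one basket."""
    counts = Counter(v[key] for v in validators)
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())

def decentralization_score(validators, keys=("provider", "country")):
    """Average diversity (1 - concentration) across attributes; higher is better."""
    return 1 - sum(concentration(validators, k) for k in keys) / len(keys)

def marginal_effect(validators, candidate):
    """How much adding one candidate would move the overall score."""
    return decentralization_score(validators + [candidate]) - decentralization_score(validators)

print(decentralization_score(current_set))
print(marginal_effect(current_set, {"provider": "Hetzner", "country": "SG"}))
```

The same idea extends to other attributes (regulatory jurisdiction, DC operator) and to multiple chains by averaging per-chain scores.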

The second thing we need to measure is the risk of the validator itself.

Things like failover, operational & security capability, & track record are important factors.

For these, you have access to information supplied by applicants that the general public doesn’t.

We see things like ‘uptime’ and ‘slashing’, but to be honest you can achieve 100% uptime by simply never rebooting the machine, giving it a big drive, and relying on your internet provider to do their job.

An interesting test you could do is the ‘mystery shopper approach’ where you attempt to contact a validator, and see how responsive they are to your concerns,
or potentially ask them to do a failover test to see how good their recovery plans actually are.

And lastly I would look at things above and beyond: services they perform for the network beyond their core role as a validator, such as relaying, participation in forums, Twitter, onboarding new users, etc.

As for governance voting as a metric, I think it’s important, but too easily gamed.

So I would put the most weight on scores around the network, then the validator’s operational capability, and lastly their differentiators.

(note I am on Shortlist A)

5 Likes

Seems like there is a lot of ‘emphasis’ on how well applicants ‘write’ about how good their infrastructure is vs the true historical performance of the node.

Should we start giving olympic medals to participants based on how well they write about their training instead of how well they performed?

What’s the point of writing down that you have 1,000 backup nodes or 1,000 people monitoring the node if nobody can prove it?

Does the above even matter when there are transparent statistics on how well the node performs?

In fact, anyone serious about the security of their node infrastructure will be wary of giving away too much information on their setup to a third party, because this poses an enormous security risk.

1 Like

I am Do Kwon, founder of Terraform Labs

In my experience, just because a validator is “professional” doesn’t mean it’s actually good - for example, when I reached out personally about a “professional validator”’s node being down, their CEO lashed out at me for pinging the CEO directly about something so trivial.

Whereas there were multiple community validators that are 100% focused on Terra, are responsive, and have dev capabilities, but were sidelined in this selection process.

Ultimately there are two dimensions:

  1. picking validators that will minimize overall slashing risk and maximizing aggregate uptime - whether the selection optimized for this function well is debatable, but there are fair arguments on both sides. But I feel like the decision criteria here should be driven by empirical data (previous history of uptime, slashing events, and voting correctly on oracle votes), and not things like “24/7 monitoring”

  2. community support - the community will act to defend its own interests, and it will gravitate towards supporting other staking protocols if its most influential validators are sidelined by Lido.

Validators are rewarded for contributions they make towards the network, and #2 should be weighted heavily in consideration as well.

7 Likes

Historical data is important as a baseline estimate, but an established org structure helps reduce tail risks like burnout, being hit by a bus, etc. We have to weigh both, and we do, but the implicit weights in this assessment were skewed.

2 Likes

I agree that we can be a bit clearer about what we expect in advance; however, we avoid being prescriptive on purpose, so there is a delicate balance we are trying to find here. For one, being prescriptive will lead to people tailoring their responses; for two, it will create an indirect centralizing effect as potential applicants all try to conform to what they think we want, and we end up with a very homogeneous infrastructure setup, which is definitely undesirable. If you have a certain infra setup but you are flexible in changing it, that’s something you can (and should) include in your application. Given the amount of interest we receive, and what applicants are basically applying for here, I believe it’s reasonable to expect that applicants apply their professional judgment and are able to discern this without relying on us dictating it to them (after all, if we spell it out, then we lose the additional value we get when it’s clear to us that an applicant has considered this on their own).

Regarding AWS specifically: using AWS or a cloud provider on its own is not a negative factor. What we try to do is find a balance in each cohort of applicants – so we may end up with some applicants who are using cloud, other operators who are running in smaller DCs, operators running in “common” DCs, etc. This is what we mean by varied setups – having some operators on AWS isn’t bad, an overall reliance on AWS (or any other cloud or DC provider) across our operator set is bad, and that’s what we try to avoid.

1 Like

Great discussion so far.

With decentralization being one of Lido’s core principles, the LNOSG is doing a tremendous job in finding a good balance in every aspect.

@Izzy “Importance of having decently sized teams to mitigate “bus factor” risk.”

Could you share what the LNOSG views as a decently sized team to mitigate these risks?

Thank you!

2 Likes

It’s a hard question to answer definitively because it depends on the staking mechanisms and risks related to the specific chain we’re onboarding for, and how many chains/protocols the operator is working on. For example, “bus factor risk” (or rather the possible impact if this happens) is higher for chains like Ethereum in the context of “pre-merge, pre-withdrawals, pre-triggerable exits”, and lower for chains like Terra where in the worst-case scenario you can delegate stake away from a validator that for some reason ends up “non-responsive”, or Polygon which doesn’t have slashing currently.

In general we try to be cognizant of the operational burden intrinsic to “small shops” and the risk of something going wrong in such cases. Smaller orgs (let’s say < 5 people) are not automatically disqualified of course (not even one-man shops). I think we just miscalculated the Terra community’s comfort level with (well-performing + active) smaller orgs and overcompensated for possible risk.

5 Likes

Ok now at the very least you gotta remove this validator

It’s almost negligent not to

3 Likes

We’ll be releasing a second shortlist proposal for the DAO to consider. Delegations from Lido are also subject to the DAO’s approval as well as specific performance requirements, which were recently formalized in the Lido on Terra baseline. At the time of the Validators Registry update (to add operators), validators that do not meet the baseline in terms of block signing and oracle voting won’t be whitelisted.

The idea is to enforce baseline requirements at the time of the on-chain changes to the Validators Registry, and on an ongoing basis. We are currently preparing an update regarding some of the validators from the current set who are falling short of the baseline.
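For illustration only, a check of that kind could look roughly like the sketch below; the thresholds, field names, and data shape are placeholders I’ve assumed, not the actual figures from the Lido on Terra baseline:

```python
# Placeholder thresholds; the real values come from the Lido on Terra baseline document.
BLOCK_SIGNING_MIN = 0.99  # assumed: minimum share of blocks signed over the window
ORACLE_VOTE_MIN = 0.95    # assumed: minimum share of oracle votes submitted correctly

def meets_baseline(stats: dict) -> bool:
    """True if a validator clears both placeholder thresholds."""
    signed = stats["blocks_signed"] / stats["blocks_expected"]
    voted = stats["oracle_votes_ok"] / stats["oracle_votes_expected"]
    return signed >= BLOCK_SIGNING_MIN and voted >= ORACLE_VOTE_MIN

# Hypothetical per-validator stats gathered over the evaluation window.
candidates = {
    "validator-a": {"blocks_signed": 99_800, "blocks_expected": 100_000,
                    "oracle_votes_ok": 14_400, "oracle_votes_expected": 14_500},
    "validator-b": {"blocks_signed": 97_000, "blocks_expected": 100_000,
                    "oracle_votes_ok": 13_000, "oracle_votes_expected": 14_500},
}

whitelist = [name for name, stats in candidates.items() if meets_baseline(stats)]
print(whitelist)  # validators that would be whitelisted under these placeholder thresholds
```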

1 Like

Folks, a short update regarding Shortlist B. We’ve gathered the data, made all required evaluations, and are currently working on finalizing the list.

We’re taking time to come up with the final list and double-check everything; the ETA is to have Shortlist B out by Wednesday, Apr 6 at the latest.

5 Likes

Seems good!
Although the baseline you linked does not seem finalized: the last update, on February 15 (Baseline for Lido validators on Terra - #5 by kai), made in response to the comments on the governance participation threshold, still isn’t reflected in the main guideline.
Could you confirm which version is used and post an update there?

(Also, side note, still interested: where can I find any information about the LNOSG and its composition?)

2 Likes

Hello, my name is Liviu from Easy 2 Stake.

We fully understand that, in this area and type of service, any issue may look critical from the outside, but we want to explain in more depth how things actually unfolded.
Before going further we’d like to make some things clear:

  • We do appreciate those who take a proper approach and point out issues that can damage the community, but we will direct our efforts into answering only objective matters.

  • We do NOT appreciate unfounded jokes and comments made for other individuals’ personal reasons. If you consider yourself a professional, try to act like one; ego is irrelevant when offering services.

We have a FREE load-balanced LCD/RPC public service that we had also used up until now for the oracle submissions, because it was highly reliable. What happened and why did it start to fail? There are multiple factors that we’ll try to briefly list. We host what we consider a decent pool of nodes at https://terra-lcd.easy2stake.com for the community to use when needed. Even though it is being spammed by bots, we trusted this service to be unbreakable due to frontend and backend redundancy, the protections used, and HW power. We trusted the service so much that we thought it was a good idea to point the feeder’s LCD endpoint at it. Obviously not a best practice, but we were overconfident and therefore we did so. And this is how our journey started: a few days later the entire pool died due to a “wrong Block.Header.LastResultsHash” error.

Therefore, there are two consecutive unfortunate events we are talking about: two oracle outages, each consisting of a few continuous hours of interrupted service.

  1. First outage: the entire pool failed around midnight, most probably due to an unknown bug that we think may be related to the memory leak issues introduced with v0.5.16. This was documented and support was requested the day after it happened.
    With the entire pool dead, both the feeder and alerting were down. We did not have a critical alert configured for this scenario, so the issue was detected the next day. An alert has since been configured for this scenario as well.

  2. Second outage: around 1:20 AM the pool didn’t fail and everything else was working, but the feeder froze. Why did it freeze? We’re guessing that one or more request timeouts led to this, but we don’t know yet; it’s still under investigation. OK, so where was the alerting system? Did it work? In fact it did, and the response came 3 hours later, around 4:30 AM in our on-call TZ. It could have been better/faster, we know and agree with this, but it was a human error and a process will be implemented to cover this kind of scenario.

Our relationship with Terra is 3 years old and Easy 2 Stake has been supporting Terra since genesis (Countdown to Terra Mainnet Launch | by Terra | Terra Money | Medium). In this long period, no such issues were reported.
In the past three years, our validator had 99.99% uptime, was never jailed, was always on time with updates, and remained active after every hard fork this network has seen. Regarding some comments inside the Lido community, we’d like to call upon reason to stop the bullying. We notice that there are some voices spreading hate by using these events in their favor. We have nothing against these issues being pointed out, and we will comply with any decision that is taken. For those who think trashing others is the way up, we want to remind them that competition is about character and quality of services.
A complex architecture is often a big burden until everything is optimized and working correctly. It’s continuous work, which we take very seriously.

On a personal note, we are utterly surprised by Do’s position on this, and we want to point out that this wasn’t negligence in operating our services. A quick, judgmental verdict without the full context does more harm than providing a chance for improvement. This turned into a serious blow for the Easy 2 Stake team, and we feel this response from Do Kwon is disproportionate in relation to our track record.

As he mentioned a few days ago, above in this thread, we fully comply with his criteria:

  1. “picking validators that will minimize overall slashing risk and maximizing aggregate uptime - whether the selection optimized for this function well is debatable, but there are fair arguments on both sides. But I feel like the decision criteria here should be driven by empirical data (previous history of uptime, slashing events, and voting correctly on oracle votes), and not things like “24/7 monitoring”
    Using “things like 24/7 monitoring” as an argument against us (while fixing a service at 4:30 AM proves we have it), when the Lido criteria do not include any of this, is out of context and looks like aggression. We would remind everyone that the Lido criteria are the following:

https://research.lido.fi/t/baseline-for-lido-validators-on-terra/1656

As a conclusion, we did our due diligence in coming up with this factual explanation, focusing on how things really took place. While we take full responsibility for this, the system is still prone to human error, and we have absolutely converted this situation into a learning opportunity.

In the end, we will continue doing what we do with continuous improvements and providing good services, as the community benefits more from progress than from this type of situation.

Cheers,
Liviu

5 Likes

Hi Liviu,

Thanks for your input here. I really appreciate this level of detail.

First of all, as part of the community, if it’s not clear already, I have no intention to damage anything, but rather to improve it. Hence we are having this entire discussion and questioning Lido’s criteria for choosing what’s best for Terra and Anchor, based on facts, not feelings.

No one is directly questioning your way of doing things or how you have it set up. In fact, we are only questioning Lido’s criteria, as I mentioned. The 24/7 monitoring was one of Lido’s justifications when doing the final shortlist; it’s not a joke at your expense and definitely not an attack!

As you well described, you had outages that unfortunately impacted your oracle voting (check the past 60 days). Running a public LCD in the same layer as your validator (validator != node) is surely not good practice and, if you ask me, might be the root cause of why you are missing oracle votes daily, but I don’t know! However, we are not here to discuss your setup or implementation, or even to point errors at each other. Once again, I’m here to question what sort of metrics Lido used, and their transparency.

As you know, missing oracle votes directly impacts Anchor and its performance. The negligence was from Lido, which did not check any of these things in the first place. I guess you are taking things a little out of context; it’s not personal.

We are all learning, it’s a constant. No one knows it all. There’s no place for ego, as you said, and I’m sorry if I ever made you feel that way; it surely wasn’t my intention.

To wrap up, I really appreciate you coming here, explaining yourself, and saying there is always room for improvement, and I agree with that 100%. However, you missed the point that what we are doing is not directly related to you and is not personal. You are just one of many that did not perform well but were surprisingly selected by Lido.

I came in peace!

2 Likes

Hi Vini,

Thanks for reminding everyone that we’re all indeed here to discuss facts rather than feelings.

Easy 2 Stake has been a long-standing validator, doing tremendous work for Terra and hosting a public RPC/LCD node. The outage you mention happened on the 29th and 30th of March. The LNOSG evaluation, on the other hand, concluded one day prior, on the 28th of March. We therefore had no reason to exclude Easy 2 Stake from Shortlist A at the time it was put together.

Hi everyone, I wanted to contribute to this discussion and share our view as a young validator.

These kinds of discussions are not easy to have. It is great to see a structured discussion on this. In general, I think everybody acknowledges that selection is always a difficult process.

For us, as danku_zone, we would appreciate full transparency in the evaluation process, as we could use this as a benchmark for improving our current service level. Also, I firmly believe that everyone will have a better time accepting certain results, once the results are shared.

This doesn’t mean that results need to be released publicly. Sharing directly with all applicants would be a great step. Validators eager to share could still compare their results afterwards.

Best
danku

3 Likes

Hey, Danku! Thank you for your feedback. Really appreciate your understanding of the complexity of this kind of process.

Sharing all the evaluation details was not planned from the beginning, so the whole process isn’t built in a way that lets us share all the details beyond the decision-making principles we’ve already shared. That’s definitely something we will put effort into in the next rounds.

Still, if there are validators that want to have extended feedback — we are always open to discussing it directly via Telegram (or any other communication channel) or setting up a call.

2 Likes

Announcing Shortlist B

Background

On the 28th of March, 2022, the LNOSG released a list of 21 shortlisted applicants (and 2 waitlisted applicants) for the DAO to consider onboarding onto Lido’s validator set on Terra. Following the release, we received a lot of constructive feedback from the community, much of which questioned our focus on infrastructure and setup, arguably at the expense of community representation.

To address these concerns, as announced on the 30th of March, 2022, we set out to release an alternative list (dubbed ‘Shortlist B’) whereby a lot more weight would be given to metrics measuring a validator’s commitment to the Terra ecosystem.

Today, we’d like to provide an update on Shortlist B, explain our methodology and the three lists it produced, and present our proposal for Shortlist B.

Methodology for the Ecosystem Alignment Score

While the LNOSG’s methodology already evaluated a validator’s participation in the ecosystem, it arguably gave these aspects too little weight compared to the community’s expectations. Measuring a validator’s participation in the community is necessarily a subjective process. To remove as much subjectivity as possible, we limited the scope of our analysis to the content of the applications themselves and created a four-pronged set of criteria which we believe gives a clear picture of a validator’s commitment to the Terra ecosystem.

  • Governance participation: we measured the participation rate of all applicants using our own data as well as the Governance Participation Score from Smartstake.
  • Seniority: we valued experience and long-term involvement in Terra using validator age data for all applicants.
  • Public services: we created a composite score that takes into account open-source development efforts, IBC relaying, maintaining public RPC/LCD endpoints, and other efforts such as maintaining alternative front-ends, faucets, etc.
  • Content creation, community building and user-facing tools: we rated the involvement of applicants in the production of Terra-focused educational content or entertainment in the form of videos, threads, podcasts, Spaces and articles, their active involvement in maintaining Terra-focused Discord or Telegram communities, and the development of user-friendly tools such as alert bots, data dashboards, etc.

Note that this methodology is purely focused on the involvement of a validator within the community, and that it does not evaluate performance or reliability. So, in order to produce a ranking that reflects a validator’s setup and performance as well as its alignment with the community, we combined the alignment score with the LNOSG scores.

In our search for a well-balanced list, we calculated results for three sets of weights (a rough sketch of how such a combination might be computed follows the list):

  1. 25% Ecosystem Alignment Score / 75% LNOSG Score

  2. 50% Ecosystem Alignment Score / 50% LNOSG Score

  3. 75% Ecosystem Alignment Score / 25% LNOSG Score
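For illustration, combining the two scores for each set of weights could look like the sketch below; the applicant names, raw scores, and the assumption that both scores sit on the same 0–10 scale are made up, since the post does not specify how the scores are normalized:

```python
# Hypothetical applicants and scores; assumes both scores share a 0-10 scale.
applicants = {
    "validator-x": {"lnosg": 8.7, "alignment": 6.2},
    "validator-y": {"lnosg": 7.1, "alignment": 9.4},
    "validator-z": {"lnosg": 9.0, "alignment": 3.5},
}

WEIGHT_SETS = {
    "1 (25% alignment)": 0.25,
    "2 (50% alignment)": 0.50,
    "3 (75% alignment)": 0.75,
}

def combined(scores: dict, alignment_weight: float) -> float:
    """Weighted blend of the Ecosystem Alignment Score and the LNOSG score."""
    return alignment_weight * scores["alignment"] + (1 - alignment_weight) * scores["lnosg"]

for label, weight in WEIGHT_SETS.items():
    # Rank applicants by the blended score for this set of weights.
    ranking = sorted(applicants, key=lambda name: combined(applicants[name], weight), reverse=True)
    print(label, ranking)
```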

Unfortunately, we had to remove two candidates from the third list, despite excellent alignment scores, because said validators were not running on their own infra and/or recently faced a slashing event, both of which we consider red lines.

We also had to remove Easy 2 Stake and InfStones from all lists, including Shortlist A, because of recent outages that brought their performances below the baseline requirements.

Results

Below is a table with the results of our modeling. The second column contains Shortlist A, and is intended as a reference. The third, fourth and fifth columns contain the list of the top ranking validators for each set of weights (25/75, 50/50 or 75/25).

| # | Reference: Shortlist A | 1 (25% Alignment Score) | 2 (50% Alignment Score) | 3 (75% Alignment Score) |
|---|---|---|---|---|
| 1 | Chainlayer | Stakely | Stakely | PFC |
| 2 | Stakely | Chainlayer | PFC | Stakely |
| 3 | RockX | Terran one | Terran one | larry stakehouse :cut_of_meat: |
| 4 | BridgeTower Capital | RockX | Chainlayer | Orbital Command |
| 5 | Allnodes Inc | BTC.Secure | BTC.Secure | Terran one |
| 6 | Blockdaemon | Allnodes Inc | Orbital Command | Setten |
| 7 | RBF Staking | PFC | MissionControl | MissionControl |
| 8 | Terran one | BridgeTower Capital | Solva Blockchain Solutions (CryptoCrew Validators) | BTC.Secure |
| 9 | BTC.Secure | MissionControl | Setten | Solva Blockchain Solutions (CryptoCrew Validators) |
| 10 | SkillZ | Solva Blockchain Solutions (CryptoCrew Validators) | Delight Labs | Delight Labs |
| 11 | Coinhall | AUDIT.one (Persistence Staking Pte. Ltd) | larry stakehouse :cut_of_meat: | Chainlayer |
| 12 | AUDIT.one (Persistence Staking Pte. Ltd) | Coinhall | AUDIT.one (Persistence Staking Pte. Ltd) | AUDIT.one (Persistence Staking Pte. Ltd) |
| 13 | Cosmic Validator | Orbital Command | Coinhall | SynergyNodes |
| 14 | Solva Blockchain Solutions (CryptoCrew Validators) | Blockdaemon | RockX | Luna Station 88 - No legal entity |
| 15 | MissionControl | RBF Staking | BlockNgine | BlockNgine |
| 16 | moonshot | Delight Labs | 0Base | 0Base |
| 17 | Delight Labs | BlockNgine | Kytzu | Kytzu |
| 18 | PFC | Cosmic Validator | moonshot | Coinhall |
| 19 | BlockNgine | Setten | Allnodes Inc | Stakebin |
| 20 | Orbital Command | moonshot | Fresh luna | moonshot |
| 21 | 0Base | 0Base | SynergyNodes | Staker Space |
| 22 | | Fresh Luna | Stakebin | AuraStake |
| 23 | | Kytzu | MANTRA DAO | High Stakes Switzerland |

In the accompanying infographic, each column represents a proposed list, with Shortlist A on the left and the third set of weights for Shortlist B on the right. Validators are positioned and color-coded according to the lists they appear in. For example, PFC was selected for Shortlist A as well as by weights 1, 2 and 3 of Shortlist B, whereas Fresh Luna was only selected by weights 2 and 3 of Shortlist B.

Proposal and rationale for Shortlist B

Taking a closer look at the data behind the scores, we can see that while the second set of weights (50/50) expectedly produced balanced results (alignment and performance scores are high across the list), the first and third sets produced quirky results:

  • 22 out of 23 selected applicants appear in both Shortlist A and the first list (25% Alignment / 75% Performance), albeit at different rankings. We therefore deem this set of weights unsatisfactory, as it fails to propose a meaningful alternative to Shortlist A.
  • The distribution of scores within the third list (75% Alignment) is interesting: at the top is a cluster of about a dozen validators with excellent performance scores as well as the highest Ecosystem Alignment Scores reached across all 65 applicants. Below this cluster, though, there is a cliff, after which validators score relatively low on both ecosystem alignment and performance.
  • Beyond the four validators excluded for crossing redlines, at least two more validators from the third list had such low performance ratings that they would not normally be considered for shortlisting.

Consequently, we recommend adopting the second option (50% alignment score) as Shortlist B, as it significantly promotes ecosystem alignment, without introducing significant discrepancies in performance within the validator set.

What’s next?

Now that the proposal for Shortlist B has been released, we invite everyone for review. To ensure the community has ample time for consideration, we will be letting the proposal sit until Monday, April 11.

Once this window closes, we will be carefully reviewing the community’s feedback and adjusting the proposal if needed. Shortlists A and B will then be submitted to the Lido DAO for review and discussion.

NB: shortlisting is contingent on an affirmative response from the identified applicant that they wish to continue with onboarding.

Finally, the DAO will be called upon to participate in a Snapshot vote with the following options:

  • Onboard validators from Shortlist A,
  • Onboard validators from Shortlist B,
  • None of the above.

12 Likes

Could you please share the full list with all validators that applied? As we didn’t make it onto any list, we are wondering where we stand compared to others.

The reason I am asking is that our performance in terms of missed pre-commits and oracle votes is better than the performance of 21 of 23 shortlisted validators. And yet we didn’t make it.

Thank you Kai. This looks far better than Shortlist A. Much more solid. Unfortunately, SCV did not make it this time. We appreciate all the hard work and extra effort put into this, and thank you for gathering the Terra community’s feedback.

5 Likes