Execution & Consensus Client Bootnodes

Summary

There is currently a (lengthy) discussion taking place (which I started) on ethresear.ch about the centralisation of current (execution and consensus) bootnodes and their reliance on third-party cloud services. To avoid repeating the whole discussion thread and feedback here, I recommend that everyone quickly read through the ethresear.ch thread.

This post is intended to draw attention to the ongoing discussion and to engage Lido Node Operators (NOs) in a possible solution: providing community-based execution and consensus bootnodes in addition to the client teams' bootnodes. The preferred solution is to distribute as many bare-metal bootnodes as possible across different jurisdictions.

Overview of Execution Clients

Go-Ethereum

Nethermind

  • Mainnet bootnodes: nethermind/foundation.json.
  • 32 bootnodes running. 4 of the 32 bootnodes are the Geth bootnodes running on AWS (2 out of 4) and Hetzner (2 out of 4).
  • For the remaining 28 bootnodes, I could not determine the hosting locations. They are the same bootnodes as in the original Parity client (see trinity/constants.py), but again there is no information on where they are hosted.
  • In the commit Remove deprecated EF bootnodes (#5408), the 4 Azure bootnodes were removed.

Erigon

Besu

  • Mainnet bootnodes: besu/mainnet.json.
  • 10 bootnodes running. 4 of the 10 bootnodes are the Geth bootnodes running on AWS (2 out of 4) and Hetzner (2 out of 4). Additionally, 5 legacy Geth and 1 C++ bootnode are listed, again without information on where they are hosted.
  • In the commit Remove deprecated EF bootnodes (#5194), the 4 Azure bootnodes were removed.

Overview of Consensus Clients

Lighthouse

  • Mainnet bootnodes: lighthouse/boot_enr.yaml.
  • 13 bootnodes running. The 2 Lighthouse (Sigma Prime) bootnodes are currently hosted on Linode in Australia (information via this comment). Additionally, 4 EF, 2 Teku, 3 Prysm and 2 Nimbus bootnodes are listed, again without information on where they are hosted.

Lodestar

  • Mainnet bootnodes: lodestar/mainnet.ts.
  • 13 bootnodes running. The 2 Lighthouse (Sigma Prime) bootnodes are currently hosted on Linode in Australia (information via this comment). Additionally, 4 EF, 2 Teku, 3 Prysm and 2 Nimbus bootnodes are listed, again without information on where they are hosted.

Nimbus

  • Mainnet bootnodes (pulled via submodule): eth2-networks/bootstrap_nodes.txt.
  • 13 bootnodes running. The 2 Lighthouse (Sigma Prime) bootnodes are currently hosted on Linode in Australia (information via this comment). Additionally, 4 EF, 2 Teku, 3 Prysm and 2 Nimbus bootnodes are listed, again without information on where they are hosted.

Prysm

  • Mainnet bootnodes: prysm/mainnet_config.go.
  • 13 bootnodes running. The 2 Lighthouse (Sigma Prime) bootnodes are currently hosted on Linode in Australia (information via this comment). Additionally, 4 EF, 2 Teku, 3 Prysm and 2 Nimbus bootnodes are listed, again without information on where they are hosted.

Teku

  • Mainnet bootnodes: teku/Eth2NetworkConfiguration.java.
  • 13 bootnodes running. The 2 Lighthouse (Sigma Prime) bootnodes are currently hosted on Linode in Australia (information via this comment). Additionally, 4 EF, 2 Teku, 3 Prysm and 2 Nimbus bootnodes are listed, again without information on where they are hosted.
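All five consensus clients above list the same 13 bootnodes, and Nimbus pulls them from the shared eth-clients/eth2-networks file. For anyone who wants to check the current state of that shared list, here is a minimal sketch; the raw file path is an assumption on my side, so please verify where bootstrap_nodes.txt currently lives in the repository before relying on it.

```python
# Sketch: fetch the shared mainnet bootstrap list and count the ENRs in it.
# The raw URL below is an assumption -- check eth-clients/eth2-networks for
# the current location of bootstrap_nodes.txt before relying on it.
import urllib.request

URL = ("https://raw.githubusercontent.com/eth-clients/eth2-networks/"
       "master/shared/mainnet/bootstrap_nodes.txt")

with urllib.request.urlopen(URL, timeout=10) as resp:
    lines = resp.read().decode().splitlines()

# Keep only the ENR entries; the file may contain comments, quotes or
# YAML-style "- " prefixes depending on its current format.
enrs = []
for raw in lines:
    idx = raw.find("enr:")
    if idx != -1:
        enrs.append(raw[idx:].strip().strip('",'))

print(f"{len(enrs)} bootnode ENRs in the shared list")
for enr in enrs:
    print(enr[:40] + "...")
```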

Discussion

I would like to see Lido NOs providing community-based execution and consensus bootnodes in addition to the client teams' bootnodes. The underlying reason is that, IMHO, it is important to have numerous globally distributed bootnodes available to preserve censorship resistance as a core value and the resilience of Ethereum as a whole. The exact governance process around the "vetting" of such community-based bootnodes is yet to be defined. Please share your thoughts and possible approaches for how Lido NOs can become part of this undertaking.


+1 on this

Also including a list of current Lido NOs per rated.network for reference.


We could certainly be part of that solution. We currently run Lighthouse, Teku, Erigon, Nethermind and Besu, in AMERS, EMEA and APAC, on OVH.


This sounds like an initiative that the DAO could support, both through the participating node operators and perhaps also via LEGO (grants to cover the expenses of running the nodes).

In the ethresear.ch thread, however, there seems to be a desire by the EF to tightly control the list of official bootnodes. Even if NOs were willing to run them, I don't see a clear process for how they would be allowed to be included.


Thanks for pointing that out. I don't think there is any clear process behind the vetting of the bootnodes at the current stage. Also, the client teams have to be part of this governance process as well. I'm pretty sure the EF would be open to hearing suggestions from your side. I'm tagging @Souptacular, who was responsible for security at the EF until (I think) 2021 and now works at Polygon. Maybe he can make the connection here?


I don't see a desire for the EF to tightly control the list; I just think it isn't an issue worth addressing (in my opinion). That doesn't mean they wouldn't be in favor of clients adding bootnodes to their lists, and in general the technical teams at the EF make autonomous decisions.

holiman from the ethresear.ch thread is on the geth team, and he seems to have indicated it isn't a priority. My suggestion would be to focus on the other EL clients. That way, even if your attack scenario happens, it would only take the geth bootnodes down temporarily (likely for less than 12 hours) while the other clients remain unaffected, and geth nodes that are already synced and have gossiped peers will be fine.


Thanks for the feedback! Yes, I guess that's the way forward for now. @Izzy (or anyone else), in case you know folks from the other client teams, I think it would be good to bring them into the discussion here for coordination.


I’m not particularly concerned about the state of boot nodes, personally.

When it comes to what CL clients are including out-of-the-box, it seems like we can choose between:

  1. A handful of boot nodes by already-trusted entities like the EF and client teams.
  2. A larger set of organizationally-diverse boot nodes.

(2) seems nice from a “decentralize everything” perspective, but I’m not really sure what it achieves in practice. We already trust the entities from (1), and if we don’t trust them then we shouldn’t be paying attention to the boot node lists they curate. A downside of (2) is that it gives a larger set of actors a guarantee that they’ll be one of the first nodes contacted by new nodes; this gives a potential attacker an advantage in 0-day attacks or data harvesting (e.g. mapping all nodes).

I agree that we (CL teams) could do better when it comes to cloud/regional diversity with boot nodes. We'll look into this on our end and see if Sigma Prime can spread out a bit.

The scenario where the current list of boot nodes gets compromised doesn't rank highly on my threat matrix. In terms of probability, it seems unlikely that a majority of the EF and client teams get hacked; they're a diverse group of experienced and security-minded engineers. Not impossible, but unlikely IMO. In terms of impact, CLs remember peers between reboots, so it only affects new nodes. It seems likely that we would notice such an attack pretty quickly, and it wouldn't be difficult for us to create messaging for users to start using custom boot nodes via CLI until releases can be published.

On a final note, I appreciate that Lido wants to try and help Ethereum with this initiative, but I respectfully note that there is some concern in the community over how much power Lido presently wields. I wonder if becoming client-enshrined boot nodes is something that will add to this concern.


The discussion here and on ethresear.ch is about ensuring that the current infrastructure can survive an extreme censoring event. For obvious reasons the probability is low, but it is not zero. One such example could be a supranationally coordinated censorship attack via ISPs. In that case you might lose your peers during a restart and need the bootnodes. Also, I want to generally emphasise that the argument "If we are in such an extreme situation, the world has bigger problems anyway" is not satisfactory (at least to me).

Sounds good - if possible, I’d love an update here on the forum once you’ve made some progress.

While I also see and share the concerns in general, my personal goal (and I hope you agree on this as well) is to have a multitude of players (like EthStaker, etc.) so that no single player's "power" becomes overwhelming. Anyone who volunteers to secure the base layer should get a spot as long as they adhere to the core values.


For the sake of transparency, I’m also cross-posting here my analysis of the CL bootnode locations.


The summary below (on a per-ENR basis) can also be found here. It seems AWS (mostly US) is currently ensuring that no liveness failure happens on Ethereum ;). I'm pretty sure we can all do a better job here when it comes to geographic and provider diversity.

IPs and Locations

Teku team’s bootnodes

  • 3.19.194.157 | aws-us-east-2-ohio
  • 3.19.194.157 | aws-us-east-2-ohio

Prylab team’s bootnodes

  • 18.223.219.100 | aws-us-east-2-ohio
  • 18.223.219.100 | aws-us-east-2-ohio
  • 18.223.219.100 | aws-us-east-2-ohio

Lighthouse team’s bootnodes

  • 172.105.173.25 | linode-au-sydney
  • 139.162.196.49 | linode-uk-london

EF bootnodes

  • 3.17.30.69 | aws-us-east-2-ohio
  • 18.216.248.220 | aws-us-east-2-ohio
  • 54.178.44.198 | aws-ap-northeast-1-tokyo
  • 54.65.172.253 | aws-ap-northeast-1-tokyo

Nimbus team’s bootnodes

  • 3.120.104.18 | aws-eu-central-1-frankfurt
  • 3.64.117.223 | aws-eu-central-1-frankfurt
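
If anyone wants to reproduce or extend this analysis, below is a rough sketch (not the exact tooling I used) that resolves the hosting organisation and location for the IPs above via the public ipinfo.io endpoint. The unauthenticated endpoint and its rate limits are assumptions you should verify before relying on the output.

```python
# Sketch: geolocate the bootnode IPs above via ipinfo.io's unauthenticated
# JSON endpoint (rate-limited -- verify current limits before heavy use).
import json
import urllib.request

BOOTNODE_IPS = [
    "3.19.194.157",    # Teku
    "18.223.219.100",  # Prysm
    "172.105.173.25",  # Lighthouse
    "139.162.196.49",  # Lighthouse
    "3.17.30.69",      # EF
    "18.216.248.220",  # EF
    "54.178.44.198",   # EF
    "54.65.172.253",   # EF
    "3.120.104.18",    # Nimbus
    "3.64.117.223",    # Nimbus
]

def lookup(ip: str) -> dict:
    """Return the ipinfo.io record (org, city, country, ...) for one IP."""
    with urllib.request.urlopen(f"https://ipinfo.io/{ip}/json", timeout=10) as resp:
        return json.load(resp)

for ip in BOOTNODE_IPS:
    info = lookup(ip)
    print(f"{ip:<16} | {info.get('org', '?')} | "
          f"{info.get('city', '?')}, {info.get('country', '?')}")
```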

Just to be clear, we (Lighthouse and I believe all other CLs) retain the list of discovered peers after a restart.

We’re spinning up some APAC OVH nodes this week, I’ll post further updates. Thanks for raising awareness on this.

I think having a group of players, including Lido, would help with the point I raised.

I would suggest putting some consideration into the privacy concerns of untrusted boot nodes. Although it's possible to build crawlers now, they require significant technical effort to be effective (in fact, I don't think there's any crawler out there today that doesn't frequently have doubt cast over its ability to discover the entire network). Geo-locating all (new) nodes would be much, much easier if you can get on that list of boot nodes (or compromise one of them).

I don’t oppose expanding the set of boot nodes at all, but I do have concerns about expanding it too far.


What I actually meant by “you might lose the peers during restart and need the bootnodes” is that you can no longer access the retained list of discovered peers due to censorship by the ISPs.

Thanks for the heads-up!

Very valid point - geo-locating would require the assumption that the IPs or GPS (or any other measurement data) used for triangulation are honest, which is a challenging assumption. I know this is a difficult question to answer, but is there a specific way you could (perhaps technically) define "too far" in your own words?

Just to be clear, we (Lighthouse and I believe all other CLs) retain the list of discovered peers after a restart.

Prysm does not, just to clarify; we would discover a new set of peers again on a restart.

On the point brought up in this discussion: in the event of a coordinated censoring event across regions (where all bootnodes are taken down), a straightforward solution would be to simply share a new list of ENRs for nodes to boot from.

Also, if a node has allowed inbound connections through its firewall, then a restart of the client will not cause issues with discovering new peers, because it will still receive inbound discovery pings, which should allow it to kickstart discovery again. I do not think this is an unrecoverable threat; it is easy enough to start up nodes with a new set of ENRs.

That being said, I will ping our infrastructure team about hosting our bootnodes with a greater diversity of operators. It would also be fine to expand the current set of bootnodes in our client defaults to add new entities (not client teams, the EF, etc.).


Interesting. Is there a specific reason why you don't retain a list at all?

I think "simply" holds in theory, but it is not that easy in practice. Are there any predefined steps/processes in place that would need to be followed to publish those new bootnode ENRs? Where do you publish them if, for example, the GitHub repositories (including eth-clients/eth2-networks) are taken down? In such moments you don't know whether you can trust, e.g., a Twitter account. I don't want to sound too paranoid, but emergency plans are critical in such situations and I think this matter is worth discussing.

Happy to hear this!

@Nishant_Das cross-posting here the feedback from Micah from the ethresear.ch thread:


I don’t have an account over there so I’ll reply here:

On the point brought up in this discussion: in the event of a coordinated censoring event across regions (where all bootnodes are taken down), a straightforward solution would be to simply share a new list of ENRs for nodes to boot from.

I think the attack vector of interest here isn't that the bootnodes are offline/unavailable, but that they are controlled and give the attacker the ability to partition the network. It certainly seems like the right thing for Prysm to do is to retain its peer list from the previous session, use it as a sort of "updated list of bootnodes", and then discover a new set of peers from there (which, on restart, would yield a new set of bootnodes).

IMO, the hard-coded bootnodes should only be used on first run to establish an initial connection to the network; from that point on the bootnode list should be dynamic, based on prior runs. This makes it much harder to meaningfully capture the network.
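
To make that concrete, here is a rough sketch of what I mean; the file name, the helper names and the ENR strings are all hypothetical placeholders, not Prysm's (or any client's) actual code.

```python
# Sketch of "hard-coded bootnodes on first run only, dynamic list afterwards".
# PEER_CACHE, HARDCODED_BOOTNODES and the ENR strings are illustrative
# placeholders, not real client internals or real ENRs.
import json
from pathlib import Path

PEER_CACHE = Path("peer_cache.json")
HARDCODED_BOOTNODES = ["enr:-...bootnode1", "enr:-...bootnode2"]  # placeholders

def load_bootstrap_set() -> list:
    """Use peers remembered from previous runs if any exist; fall back to the
    compiled-in bootnodes only on the very first start (or an empty cache)."""
    if PEER_CACHE.exists():
        cached = json.loads(PEER_CACHE.read_text())
        if cached:
            return cached
    return list(HARDCODED_BOOTNODES)

def save_discovered_peers(enrs) -> None:
    """Persist this session's discovered peers so the next start can use them
    as its 'updated list of bootnodes'."""
    PEER_CACHE.write_text(json.dumps(sorted(set(enrs))))
```

Whether the compiled-in list should ever be consulted again (e.g. when every cached peer turns out to be unreachable) is exactly the kind of trade-off worth debating.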


For a concrete example attack, imagine someone has two 0-days in their pocket (neither of these is particularly far-fetched for a state actor):

  1. They have the ability to take over/control bootnodes.
  2. They have the ability to crash Prysm clients on the network (causing them all to restart).

This attacker now has the ability to eclipse all Prysm nodes. If our client diversity numbers are high enough this isn't too big of a problem, but if they aren't then this could lead to a fork, should the attacker desire one. Keep in mind, once you have successfully eclipsed a node, you need not reveal this immediately. You can sit on your eclipse and not leverage it until an opportunity presents itself (for example, you gain the ability to eclipse Teku as well, and then you attack once you have 66% of the stake eclipsed).

Below is an excerpt from an internal memo from Manifold Finance that we authored in May 2022.

The TL;DR: try to own IP address space from RIPE, plus CA certificate authority hosting via ENS. For the record, I suggested this to ENS and was told to go speak to their lawyers (I'm being sued for something else right now, kek).

Additionally, here is a data center map that we use to discern which data centers provide the desired connection attributes we are looking for: GitHub - manifoldfinance/datacenter-map: AS/IP Interchange/Datacenter Exchange Mapping

Cheers

The impending escalation of further weaponizing internet infrastructure

Recently, this working paper, signed by a wide spectrum of key organizations responsible for maintaining portions of the internet, was published. It lays out the groundwork needed for future global inter-jurisdictional enforcement of sanctions at the BGP routing level.

Below are selected quotes (emphasis mine) from its conclusion on which actions should be taken in the future and the case made for supporting such actions:

We believe the time is right for the formation of a new, minimal, multistakeholder mechanism, similar in scale to NSP-Sec or Outages, which after due process and consensus would publish sanctioned IP addresses and domain names in the form of public data feeds in standard forms (BGP and RPZ), to be consumed by any organization that chooses to subscribe to the principles and their outcome.

We call upon our colleagues to participate in a multistakeholder deliberation using the mechanism outlined above, to decide whether the IP addresses and domain names of the Russian military and its propaganda organs should be sanctioned, and to lay the groundwork for timely decisions of similar gravity and urgency in the future.

The two key mechanisms identified for enforcing such global censorship are:

  • Blocklisting

  • Revocation of certificates associated with domain names

What's this all for?

We (the greater Ethereum publica) cannot buy our security, our freedom from the threat of incarceration or violence by committing an immorality so great as saying to the future generations that will come after us, “Abandon any hope of freedom because to save our own skins, we’re willing to make a deal with your future slave masters.”

Our solutions cannot be limited to asking these platforms to do a better job of meeting their moral obligations – we must consider other modalities at the very least. Which is why I bring up this trust business.

Negative Events

IP Address level sanctioning

Ukraine invasion: We should consider internet sanctions, says ICANN ex-CEO

Multistakeholder Imposition of Internet Sanctions

LINX suspends peering

RIPE NCC Response to Request from Ukrainian Government

https://www.ripe.net/publications/news/announcements/ripe-ncc-response-to-request-from-ukrainian-government

NOMINET suspends Russian registrars for co.uk

Ukraine asks ICANN to kill all russian domains

In response to Prykhodko, Erich Schweighofer, a professor at the University of Vienna and ICANN community participant, wrote:

ICANN is a neutral platform, not taking a position in this conflict but allowing States to act accordingly, e.g. blocking all traffic from a particular state

https://www.icann.org/en/system/files/correspondence/marby-to-fedorov-02mar22-en.pdf

However, CENTR, the Council of European National Top-Level Domain Registries, did choose a side. The Belgium-based non-profit, which focuses on legal, administrative, and technical policies and best practices for ccTLD registries, on Tuesday suspended the membership of the Coordination Center for TLD .RU/.РФ – administrator for those ccTLDs.

ICANN, Ukraine and Leveraging Internet Identifiers

RIPE NCC: RIPE NCC Executive Board Resolution on Provision of Critical Services

https://www.ripe.net/publications/news/announcements/ripe-ncc-executive-board-resolution-on-provision-of-critical-services

https://eump.org/media/2022/Goran-Marby.pdf

[At-Large] [ccwg-internet-governance] UA asking ICANN to introduce sanctions targeting Russian Federation’s access to the Internet: nabok at thedigital.gov.ua to ICANN: “Ukraine urgently need ICANN’s support”

https://atlarge-lists.icann.org/pipermail/at-large/2022q1/007816.html

Contribute to the revoking for SSL certificates for the abovementioned domains.

Shut down DNS root servers situated in the Russian Federation, namely:

  • Saint Petersburg, RU (IPv4 199.7.83.42)
  • Moscow, RU (IPv4 199.7.83.42, 3 instances)

Apart from these measures, I will be sending a separate request to RIPE NCC asking to withdraw the right to use all IPv4 and IPv6 addresses by all Russian members of RIPE NCC (LIRs - Local Internet Registries), and to block the DNS root servers that it is operating.

All of these measures will help users seek for reliable information in alternative domain zones, preventing propaganda and disinformation. Leaders, governments and organizations all over the world are in favor of introducing sanctions towards the Russian Federation since they aim at putting the aggression towards Ukraine and other countries to an end. I ask you kindly to seriously consider such measures and implement them as quickly as possible. Help to save the lives of people in our country.

Also, the above was signed by the Deputy Prime Minister of Ukraine - Minister of Digital Transformation, the appendix is attached to this email.

Ukraine Ministry Letter

https://atlarge-lists.icann.org/pipermail/at-large/2022q1/007816.html



I can certainly see how this could be a good approach, but I don't think it is that simple, to be honest. We have many different legal frameworks around the world, supranational agreements, different ways of enforcing the law (guilty vs. innocent or innocent vs. guilty), etc. that have to be taken into account. Just take the Tornado Cash case as an example. Just because you have a certain address space from RIPE NCC does not mean you can no longer be censored. So I still think appropriate geographical distribution is important, because the world is not static, it is highly dynamic, and we don't know how the different legal, geopolitical, and regulatory situations will evolve over time. But one thing must be clear: the less US law, the better.

I think the attack vector of interest here isn't that the bootnodes are offline/unavailable, but that they are controlled and give the attacker the ability to partition the network. It certainly seems like the right thing for Prysm to do is to retain its peer list from the previous session, use it as a sort of "updated list of bootnodes", and then discover a new set of peers from there (which, on restart, would yield a new set of bootnodes).

I don't think doing this is helpful, as it makes the average node more vulnerable to being eclipsed by malicious peers during restarts. This has been covered in earlier work on eclipse attacks on bitcoin-core:

A key component of the attack is having the node restarted and dialing previously connected peers. Using previously connected peers as a new 'bootstrap' on a restart can lead to your local table being filled with malicious peers. There are many more peers looking for outbound connections than peers who have configured their network to accept inbound connections. It is possible to take advantage of this asymmetry to fill up the local table with malicious peers.
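
As a rough illustration of that asymmetry, here is a toy simulation (all numbers are made up for illustration, not taken from the bitcoin-core papers): it refills a small peer table from a pool in which the attacker controls most of the reachable addresses and counts how often the restarted node ends up connected only to the attacker.

```python
# Toy Monte Carlo: how often does a restarted node's peer table end up
# entirely malicious when the attacker supplies most of the connectable
# addresses? All numbers are illustrative, not measured.
import random

HONEST_REACHABLE = 200     # honest nodes that accept inbound connections
MALICIOUS_REACHABLE = 800  # addresses the attacker controls
TABLE_SIZE = 16            # outbound slots the node refills on restart
TRIALS = 10_000

def run_trial(rng: random.Random) -> bool:
    """Return True if every refilled slot points at an attacker address."""
    pool = ["honest"] * HONEST_REACHABLE + ["malicious"] * MALICIOUS_REACHABLE
    table = rng.sample(pool, TABLE_SIZE)
    return all(peer == "malicious" for peer in table)

rng = random.Random(1337)
eclipsed = sum(run_trial(rng) for _ in range(TRIALS))
print(f"fully eclipsed in {eclipsed / TRIALS:.1%} of {TRIALS} simulated restarts")
```

Even with these arbitrary numbers, a measurable fraction of restarts ends up fully eclipsed, which is the asymmetry being exploited.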

If the main goal is to deal with the attack vector of client bootnodes being compromised, then it is a better idea to have a diverse set of bootnodes, which is more resilient to this vector.

For the sake of transparency, I'm sharing Micah's feedback (he doesn't participate in this forum, FYI) on the above, @Nishant_Das:

A reasonable concern, but the best solution would be a mix, where on restart you try to connect both to bootnodes and to previous peers. That way someone controlling the bootnodes can’t eclipse you, and also you can escape a peering eclipse with a restart.

Also, keeping track of all nodes rather than only those selected can help mitigate this further. If you had a list of “all of the nodes I have ever seen” (perhaps pruned if you fail to connect to them at some point) then you are maximally protected against eclipse.
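
For illustration only, here is a sketch of that "mix" in the same hypothetical style as the earlier snippet (placeholder names, not any client's real discovery code): the restart set always contains the hard-coded bootnodes plus a sample from an "every node I have ever seen" store, and entries that repeatedly fail to connect are pruned from that store.

```python
# Sketch of the "mix" strategy: dial hard-coded bootnodes AND a sample of
# previously seen peers on every restart, pruning peers that keep failing.
# All names and the ENR strings are illustrative placeholders.
import random

HARDCODED_BOOTNODES = {"enr:-...bootnode1", "enr:-...bootnode2"}  # placeholders
MAX_FAILURES = 3   # prune a remembered peer after this many failed dials
SAMPLE_SIZE = 16   # how many remembered peers to mix in at each restart

# "all of the nodes I have ever seen" -> consecutive failed-dial count
seen_peers = {}

def bootstrap_set(rng: random.Random) -> set:
    """Peers to dial at startup: the bootnodes plus a random sample of history,
    so neither compromised bootnodes nor a poisoned peer cache can eclipse the
    node on its own."""
    remembered = list(seen_peers)
    sample = rng.sample(remembered, min(SAMPLE_SIZE, len(remembered)))
    return set(HARDCODED_BOOTNODES) | set(sample)

def record_dial(enr: str, success: bool) -> None:
    """Track every node ever seen; drop it after repeated connection failures."""
    if success:
        seen_peers[enr] = 0
        return
    seen_peers[enr] = seen_peers.get(enr, 0) + 1
    if seen_peers[enr] >= MAX_FAILURES:
        del seen_peers[enr]
```

Sampling from the history store, rather than taking it wholesale, keeps a poisoned cache from dominating the dial set, while the hard-coded bootnodes guarantee at least some known entry points.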

This is exactly what I'm trying to achieve with this discussion :-D. I think the retention of previous peers and its attack vector is a related but distinct discussion, for which each client team has its own strategy. Also, I actually think having various strategies makes the network even more resilient at the end of the day.
