[Post-Mortem] Launchnodes DC Outage - 14 February 2026

Node Operator: Launchnodes

Status: Resolved

Severity: High

Incident Date: 14 February 2026

Duration: Approx. 3 hours 15 minutes (20:09-23:24 GMT)

Service recovery began: 21:50 GMT

Published: 17 March 2026

Summary

On Saturday 14 February 2026, our third-party data centre provider conducted maintenance on Juniper network core routers. During this maintenance, a previously unknown software bug within the Juniper operating system caused a failure in the core routing infrastructure, unexpectedly affecting part of Launchnodes’ hosted infrastructure.

Launchnodes’ 24x7 monitoring systems detected the issue at approximately 20:09 GMT. The data centre provider’s on-site Systems Administrators responded immediately, identified the root cause within their infrastructure, and applied the required software patch. Services began recovering at approximately 21:50 GMT, with a period of intermittent packet loss during stabilisation. Full connectivity was confirmed at approximately 23:24 GMT.

This incident affected 5572 Lido validators operated by Launchnodes, resulting in missed attestations during the outage window.

Timeline (all times GMT)

  • ~18:40: Data centre provider begins maintenance on its Juniper network core routers; no customer service impact was expected

  • ~20:09: Juniper OS software defect triggered during configuration change, causing core connectivity interruption; Launchnodes’ 24x7 monitoring systems detect the outage

  • ~20:09: Data centre provider’s on-site Systems Administrators begin immediate investigation

  • ~21:50: Data centre provider applies software patch; Launchnodes services begin recovering; intermittent packet loss during stabilisation

  • ~23:24: Full connectivity confirmed; incident resolved

Total outage duration: ~3 hours 15 minutes
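
The intervals implied by the timeline can be cross-checked with a few lines of Python. This is a minimal sketch that uses only the timestamps listed above; the event labels and helper names are shorthand chosen for the example and are not taken from any Launchnodes tooling.

```python
from datetime import datetime

# Timestamps copied from the timeline above (all GMT, 14 February 2026).
# The dictionary keys are illustrative labels, not official event names.
DAY = "2026-02-14"
timeline = {
    "maintenance_start": "18:40",
    "outage_detected": "20:09",
    "recovery_began": "21:50",
    "fully_resolved": "23:24",
}


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day HH:MM timestamps."""
    fmt = "%Y-%m-%d %H:%M"
    t0 = datetime.strptime(f"{DAY} {start}", fmt)
    t1 = datetime.strptime(f"{DAY} {end}", fmt)
    return int((t1 - t0).total_seconds() // 60)


def format_hm(minutes: int) -> str:
    """Format a minute count as 'Xh YYm'."""
    hours, mins = divmod(minutes, 60)
    return f"{hours}h {mins:02d}m"


total_outage = minutes_between(timeline["outage_detected"], timeline["fully_resolved"])
until_recovery = minutes_between(timeline["outage_detected"], timeline["recovery_began"])
stabilisation = minutes_between(timeline["recovery_began"], timeline["fully_resolved"])

print("Detection to full resolution:", format_hm(total_outage))
print("Detection to start of recovery:", format_hm(until_recovery))
print("Start of recovery to full resolution:", format_hm(stabilisation))
```

On these timestamps the window from detection to full resolution is 195 minutes, which matches the stated duration of approximately 3 hours 15 minutes. Roughly 1 hour 41 minutes of that elapsed before recovery began, and the remaining 1 hour 34 minutes covers the stabilisation period.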

Root Cause

During its maintenance activities, the data centre provider applied a scheduled configuration change to its Juniper network core routers. The change triggered an unexpected software bug in the Juniper operating system, which put the data centre’s core routing infrastructure into a failure state and interrupted connectivity to hosted systems. The data centre provider had assessed the maintenance as having no potential to impact customer services.

This incident originated entirely within the data centre provider’s network infrastructure and was not related to any configuration changes or operational actions taken by Launchnodes. The bug was a previously unknown defect in the Juniper OS version in use at the data centre at the time.

Impact on Lido

  • Validators affected: 5572 Lido validators operated by Launchnodes

  • Impact period: Approximately 20:09-23:24 GMT, 14 February 2026

  • Financial loss: Calculated to be 3.836 ETH, based on missed attestations during the outage window. Launchnodes has committed to cover this loss of earnings on behalf of Lido users.
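
To put the reported figures in proportion, the sketch below rescales them per validator and per epoch. It is a back-of-envelope illustration only: this post does not describe how the 3.836 ETH figure was calculated, so the code does not attempt to reproduce that method. It assumes Ethereum mainnet timing (32 slots of 12 seconds per epoch) and treats the full 20:09-23:24 window as affected, which is an upper bound because services began recovering at 21:50. All variable names are illustrative.

```python
# Figures taken from this post-mortem.
VALIDATORS_AFFECTED = 5572
REPORTED_LOSS_GWEI = 3_836_000_000        # 3.836 ETH expressed in gwei
OUTAGE_SECONDS = (3 * 60 + 15) * 60       # approx. 3 hours 15 minutes

# Ethereum mainnet timing: 32 slots per epoch, 12 seconds per slot.
SECONDS_PER_EPOCH = 32 * 12

# Integer arithmetic in gwei avoids floating-point rounding on ETH amounts.
per_validator_gwei = REPORTED_LOSS_GWEI // VALIDATORS_AFFECTED

# Each active validator has one attestation duty per epoch, so the number of
# full epochs in the window bounds the missed duties per validator.
full_epochs = OUTAGE_SECONDS // SECONDS_PER_EPOCH
max_missed_duties = full_epochs * VALIDATORS_AFFECTED

# Average reported loss per validator per epoch, under the assumptions above.
per_validator_per_epoch_gwei = per_validator_gwei // full_epochs

print("Loss per validator (gwei):", per_validator_gwei)
print("Full epochs in outage window:", full_epochs)
print("Upper bound on missed attestation duties:", max_missed_duties)
print("Loss per validator per epoch (gwei):", per_validator_per_epoch_gwei)
```

Under these assumptions the reported loss works out to roughly 0.00069 ETH per validator across about 30 epochs. The actual number of missed attestations per validator may be lower, since some validators may have resumed attesting during the stabilisation period.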

Detection & Response

The outage was detected at approximately 20:09 GMT by Launchnodes’ automated 24x7 monitoring systems. The data centre provider’s on-site Systems Administrators were immediately engaged to investigate and remediate the issue within their infrastructure. They identified the root cause as a software bug in the Juniper operating system and applied the required patch.

Resolution

The data centre provider’s team applied a software patch to the affected Juniper OS, resolving the bug. Launchnodes services began recovering at approximately 21:50 GMT. A short period of intermittent packet loss was observed during network stabilisation. Full connectivity was confirmed at approximately 23:24 GMT.

What Went Well

  • Launchnodes’ 24x7 monitoring systems detected the outage promptly, at approximately 20:09 GMT

  • The data centre provider’s Systems Administrators responded immediately and resolved the issue successfully

  • Root cause was identified and patched without requiring a further extended maintenance period

What Could Be Improved

  • The data centre provider should implement pre-maintenance OS version validation to identify known bugs before applying changes to production infrastructure

  • The data centre provider should provide advance notification of all maintenance activities, even where customer impact is assessed as unlikely

  • The data centre provider should communicate faster with hosted customers (including Launchnodes) during active incidents

  • The data centre provider should have rollback procedures for core router maintenance, to enable quicker recovery if a change triggers unexpected behaviour

  • The data centre provider should review maintenance scheduling to avoid weekend evening windows, when impact on hosted services is harder to manage

Action Items

1. Request full incident report from data centre provider, including Juniper bug reference number (Owner: Launchnodes, Priority: High, Status: Open)

2. Request data centre provider implements pre-maintenance OS version validation as standard (Owner: Launchnodes, Priority: High, Status: Open)

3. Establish faster incident communication protocol with data centre provider (Owner: Launchnodes, Priority: Medium, Status: Open)

4. Request data centre provider reviews rollback capability for future core router maintenance (Owner: Launchnodes, Priority: Medium, Status: Open)

5. Request data centre provider reviews maintenance scheduling to minimise impact on hosted services (Owner: Launchnodes, Priority: Low, Status: Open)

Lessons Learned

This incident was caused by a software defect in the data centre provider’s network infrastructure, triggered during their network maintenance. Launchnodes had no involvement in the change that caused the outage. Key takeaways:

Pre-change validation

Data centre providers should validate OS versions against known bug lists before applying changes to production infrastructure.

Clear communication channels

Hosted customers should receive timely, proactive updates during active incidents within the data centre.

Resilience planning

Rollback procedures and contingency plans should be in place for all core infrastructure changes carried out by third-party providers.

Launchnodes will engage with the data centre provider to ensure that stronger pre-maintenance checks are implemented and that incident communication protocols are improved.

This post-mortem was prepared by Launchnodes for the Lido Node Operator governance forum.

Launchnodes has sent 3.836 ETH to the Lido Execution Layer Rewards Vault to reimburse rewards that were missed due to this incident. Ethereum transaction hash: 0xb4224c9876... | Etherscan
