drykxs/Reliability


Reliability to the Edge: Telco Teaching Cloud Native

📌 Overview

Reliability to the Edge: Telco Teaching Cloud Native is a talk that connects two worlds that have traditionally evolved in parallel: the operational maturity of telecommunications and the fast-moving innovation of the cloud-native ecosystem.

Telecommunications has always operated under one non-negotiable rule: uptime is everything. As cloud-native platforms expand toward the Edge, many classic Telco challenges (latency, redundancy, fault isolation, and operational resilience) are returning to the forefront.
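To make "uptime is everything" concrete, the arithmetic behind an availability target can be sketched as below. This is a minimal illustration, not material from the talk: the function names are hypothetical, the example targets (such as 99.999%) are assumptions chosen for illustration, and the redundancy formula assumes replica failures are fully independent, which real deployments rarely achieve.

```python
# Hypothetical helpers for illustration only; names and targets are assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability: float) -> float:
    """Yearly downtime (in minutes) permitted by an availability target in (0, 1]."""
    if not 0.0 < availability <= 1.0:
        raise ValueError("availability must be in (0, 1]")
    return (1.0 - availability) * MINUTES_PER_YEAR


def parallel_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` redundant copies when one survivor is enough.

    Assumes failures are independent, which is an idealization.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - single) ** replicas


if __name__ == "__main__":
    for target in (0.99, 0.999, 0.9999, 0.99999):
        print(f"{target:.5f} -> {downtime_budget_minutes(target):8.2f} min/year")
    print(f"two 99% replicas -> {parallel_availability(0.99, 2):.4f}")
```

Under these assumptions, each extra "nine" cuts the yearly downtime budget by a factor of ten, which is why redundancy and fault isolation matter so much once a platform stretches out to many Edge sites.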

This session explores how technical practices and, more importantly, cultural and operational lessons from the Telco industry can inspire better ways to design, operate, and evolve modern cloud-native platforms.


🎯 Talk Objective

To demonstrate that true reliability is not achieved through tools, automation, or architecture alone, but through:

  • Operational discipline
  • Clear ownership and accountability
  • Strong teamwork across organizational boundaries
  • Engineering systems to survive the unpredictable

The talk invites the audience to rethink reliability in the context of Kubernetes, Edge Computing, and highly distributed platforms.


🧠 Target Audience

  • Platform Engineers
  • Site Reliability Engineers (SREs)
  • Cloud Architects
  • Engineers and technical leaders working with Kubernetes, Edge, and distributed systems

Level: Intermediate


🧩 Key Topics

  • What the Telco world teaches about reliability at scale
  • Parallels between Edge Computing and traditional Telco infrastructures
  • Reliability as culture, not just technology
  • Operating distributed systems under partial failure
  • The importance of clear operational boundaries and ownership
  • Engineering for worst-case scenarios

✅ Key Takeaways

  • Reliability is culture, not just tooling
  • Distributed systems must be engineered for unpredictable failure
  • Clear ownership and operational discipline enable uptime at scale

⏱️ Session Format

  • Type: Presentation
  • Duration: 25 minutes
  • Speakers: 1 or 2

🌱 Benefits to the Ecosystem

This talk bridges the gap between Telco operational maturity and cloud-native innovation.

By sharing real lessons from sustaining mission-critical telecom environments, the session helps platform and reliability engineers:

  • Reduce downtime
  • Improve collaboration across teams
  • Build more human-centered operational practices
  • Strengthen reliability culture across Kubernetes and Edge platforms

🔓 Open Source Projects Referenced

  • Kubernetes
  • OpenShift
  • OpenStack
  • OVN / OVS
  • Ansible
  • Prometheus

📄 License

This talk's content is shared for educational purposes and community knowledge sharing.

About

Telco reliability lessons learned
