Kubernetes Shake-ups, Platform Reality, and AI-Native SRE

About this audio

In this episode of Ship It Weekly, Brian digs into three big themes for anyone running Kubernetes or building internal platforms.

First, Kubernetes is officially retiring Ingress NGINX: the project will get only best-effort maintenance until March 2026. We talk about what that actually means if you’re still using it and how to think about choosing and rolling out a replacement ingress controller.

Second, we look at how CNCF is defining platform engineering and what “platform as a product” looks like in practice, plus some hard-earned lessons from running Kubernetes in production.

Third, we talk about AI as a first-class workload on Kubernetes. CNCF’s new Certified Kubernetes AI Conformance Program aims to standardize how AI runs on K8s, and recent writing on SRE in the age of AI looks at what reliability means when systems learn and drift.

In the lightning round, we hit good reads on database migrations, Postgres upgrades, and a distributed priority queue on Kafka. We wrap with the human side of incidents: fixation during incident response and using incidents as landmarks for the tradeoffs you’ve been making over time.

If you’re on a platform team, responsible for SLOs, or the person people ping when “Kubernetes is weird,” this one should give you concrete questions to take back to your roadmap and runbooks.

Links from this episode

https://kubernetes.io/blog/2025/11/11/ingress-nginx-retirement/

https://www.haproxy.com/blog/ingress-nginx-is-retiring

https://www.cncf.io/blog/2025/11/19/what-is-platform-engineering/

https://www.cncf.io/announcements/2025/11/11/cncf-launches-certified-kubernetes-ai-conformance-program-to-standardize-ai-workloads-on-kubernetes/

https://devops.com/sre-in-the-age-of-ai-what-reliability-looks-like-when-systems-learn/

Lightning round

https://www.cncf.io/blog/2025/11/18/top-5-hard-earned-lessons-from-the-experts-on-managing-kubernetes/

https://www.tines.com/blog/zero-downtime-database-migrations-lessons-from-moving-a-live-production

https://palark.com/blog/postgresql-upgrade-no-data-loss-downtime/

https://klaviyo.tech/building-a-distributed-priority-queue-in-kafka-1b2d8063649e

https://sreweekly.com/sre-weekly-issue-497/

https://ferd.ca/ongoing-tradeoffs-and-incidents-as-landmarks.html
