Ship It Weekly - DevOps, SRE, and Platform Engineering News

Author(s): Teller's Tech

About this audio

Ship It Weekly is a short, practical recap of what actually matters in DevOps, SRE, and platform engineering.

Each episode, your host Brian Teller walks through the latest outages, releases, tools, and incident writeups, then translates them into “here’s what this means for your systems” instead of just reading headlines. Expect a couple of main stories with context, a quick hit of tools or releases worth bookmarking, and the occasional segment on on-call, burnout, or team culture.

This isn’t a certification prep show or a lab walkthrough. It’s aimed at people who are already working in the space and want to stay sharp without scrolling status pages and blogs all week. You’ll hear about things like cloud provider incidents, Kubernetes and platform trends, Terraform and infrastructure changes, and real postmortems that are actually worth your time.

Most episodes are 10–25 minutes, so you can catch up on the way to work or between meetings. Every now and then there will be a “special” focused on a big outage or a specific theme, but the default format is simple: what happened, why it matters, and what you might want to do about it in your own environment.

If you’re the person people DM when something is broken in prod, or you’re building the platform everyone else ships on top of, Ship It Weekly is meant to be in your rotation.

Teller's Tech 2025
Politics
Episodes
  • IBM Buys Confluent, React2Shell, and Netflix on Aurora
    Dec 12 2025

    In this episode of Ship It Weekly, Brian powers through a cold and digs into a very “infra grown-up” week in DevOps.

    First up, IBM is buying Confluent for $11B. We talk about what that means if you’re on Confluent Cloud today, still running your own Kafka, or trying to choose between Confluent, MSK, and DIY. It’s part of a bigger pattern after IBM’s HashiCorp deal, and it has real implications for vendor concentration and “plan B” strategies.

    Then we shift to React2Shell, a 10.0 RCE in React Server Components that’s already being exploited in the wild. Even if you never touch React, if you run platforms or Kubernetes for teams using Next.js or RSC, you’re on the hook for patching windows, WAF rules, and blast-radius thinking.

    We also look at Netflix’s write-up on consolidating relational databases onto Aurora PostgreSQL, with big performance gains and cost savings. It’s a good excuse to step back and ask whether your own Postgres fleet still makes sense at the scale you’re at now.

    In the lightning round, we hit OpenTofu 1.11’s new language features, practical Terraform “tips from the trenches,” Ghostty becoming a non-profit project, and two spec-driven dev tools (Spec Kit and OpenSpec) that show what sane AI-assisted development might look like.

    For the human side, we close with “Your Brain on Incidents” and what high-stress outages actually do to people, plus a few concrete ideas for making on-call less brutal.

    If you’re on a platform team, own SLOs, or you’re the person people ping when “something is wrong with prod,” this one should give you a mix of immediate to-dos and longer-term questions for your roadmap.

    Links:

    IBM + Confluent https://www.confluent.io/blog/ibm-to-acquire-confluent/ https://newsroom.ibm.com/2025-12-08-ibm-to-acquire-confluent-to-create-smart-data-platform-for-enterprise-generative-ai

    React2Shell (CVE-2025-55182) https://react.dev/blog/2025/12/03/critical-security-vulnerability-in-react-server-components

    Netflix on Aurora PostgreSQL https://aws.amazon.com/blogs/database/netflix-consolidates-relational-database-infrastructure-on-amazon-aurora-achieving-up-to-75-improved-performance/

    Tools & tips https://opentofu.org/blog/opentofu-1-11-0/ https://rosesecurity.dev/2025/12/04/terraform-tips-and-tricks.html https://mitchellh.com/writing/ghostty-non-profit https://github.com/github/spec-kit https://github.com/Fission-AI/OpenSpec

    Human side https://uptimelabs.io/your-brain-on-incidents/

    16 min
  • AWS re:Invent for Platform Teams, GKE at 130k Nodes, and Killing Staging
    Dec 4 2025

    In this episode of Ship It Weekly, Brian looks at re:Invent through a platform/SRE lens and pulls out the updates that actually change how you design and run systems.

    We talk about regional NAT Gateways and Route 53 Global Resolver on the networking side, ECS Express Mode and EKS Capabilities as new paved roads for app teams, S3 Vectors GA and 50 TB S3 objects for AI and data lakes, Aurora PostgreSQL dynamic data masking, CodeCommit’s return to full GA, and IAM Policy Autopilot for AI-assisted IAM policies. This was recorded mid–re:Invent, so consider it a “what matters so far” pass, not a full recap.

    Outside AWS, we get into Google’s 130,000-node GKE cluster and what actually applies if you’re running normal-sized clusters, plus the “It’s time to kill staging” argument and what responsible testing in production looks like with feature flags, progressive delivery, and solid observability.
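As a rough illustration of the feature-flag side of testing in production, here is a minimal sketch of a percentage rollout. It is not taken from the episode or the linked article; the function names, the flag name, and the choice of SHA-256 bucketing are all assumptions made for the example.

```python
import hashlib


def rollout_bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, percent: int) -> bool:
    """Return True if the user falls inside the rollout percentage.

    Hashing keeps the decision deterministic, so a given user stays in
    (or out of) the rollout as long as the percentage does not shrink.
    """
    return rollout_bucket(flag, user_id) < percent


if __name__ == "__main__":
    # Hypothetical flag and user IDs, for illustration only.
    for user in ("alice", "bob", "carol"):
        print(user, is_enabled("new-checkout", user, 10))
```

Because each bucket is fixed per user, raising the percentage only adds users to the rollout, which is what makes a gradual ramp paired with good observability workable.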

    In the lightning round, we hit Zachary Loeber’s Terraform MCP server and terraform-ingest (letting AI tools speak your real Terraform modules), Runs-On’s EC2 instance rankings so you stop picking instance types by vibes, and Airbnb’s adaptive traffic management for their key-value store. We close with Nolan Lawson’s “The fate of small open source” and what it means when your platform quietly depends on one-maintainer libraries.

    Links from this episode:

    AWS highlights:

    https://aws.amazon.com/about-aws/whats-new/2025/11/aws-nat-gateway-regional-availability

    https://aws.amazon.com/blogs/aws/introducing-amazon-route-53-global-resolver-for-secure-anycast-dns-resolution-preview

    https://aws.amazon.com/about-aws/whats-new/2025/11/announcing-amazon-ecs-express-mode

    https://aws.amazon.com/about-aws/whats-new/2025/12/amazon-s3-vectors-generally-available/

    Other topics:

    https://cloud.google.com/blog/products/containers-kubernetes/how-we-built-a-130000-node-gke-cluster

    https://thenewstack.io/its-time-to-kill-staging-the-case-for-testing-in-production/

    https://blog.zacharyloeber.com/article/terraform-custom-module-mcp-server/

    https://go.runs-on.com/instances/ranking

    https://medium.com/airbnb-engineering/from-static-rate-limiting-to-adaptive-traffic-management-in-airbnbs-key-value-store-29362764e5c2

    https://nolanlawson.com/2025/11/16/the-fate-of-small-open-source/

    22 min
  • Kubernetes Config Reality Check, EKS Control Planes, and GitHub Guardrails
    Nov 26 2025

    In this episode of Ship It Weekly, Brian digs into what’s new for people actually running infra: Kubernetes config, EKS control planes and networking, and GitHub’s latest CI/CD and Copilot updates.

    We start with Kubernetes’ new configuration good practices post and how to turn it into a checklist to clean up Helm/Kustomize and kill off “hotfix from my laptop” manifests.

    Then we hit AWS: EKS Provisioned Control Plane to size control plane capacity for big or noisy clusters, plus new network observability so you can see who’s talking to what across clusters and AZs instead of guessing from node metrics.

    On the GitHub side, Actions OIDC tokens now include a check_run_id for tighter access control, and Copilot adds instructions files and custom agents so you can encode platform and security expectations directly into reviews and workflows.
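To make the access-control idea concrete, here is a hedged sketch of how a service might use the new claim. It assumes the OIDC token's signature, issuer, and audience have already been verified elsewhere and that the claims arrive as a plain dictionary. The `check_run_id` claim name comes from the changelog; the `repository` claim is part of GitHub's OIDC tokens; the function name and the example values are hypothetical.

```python
from typing import Any, Mapping


def is_token_allowed(
    claims: Mapping[str, Any],
    expected_repo: str,
    expected_check_run_id: str,
) -> bool:
    """Check already-verified OIDC claims against an expected repo and check run.

    This does NOT validate the token signature; it only compares claims.
    """
    run_id = claims.get("check_run_id")
    if run_id is None:
        # Tokens without the claim are rejected rather than waved through.
        return False
    return (
        claims.get("repository") == expected_repo
        and str(run_id) == str(expected_check_run_id)
    )


if __name__ == "__main__":
    # Hypothetical claim values, for illustration only.
    claims = {"repository": "example-org/example-repo", "check_run_id": "12345"}
    print(is_token_allowed(claims, "example-org/example-repo", "12345"))
```

Comparing the run ID as a string avoids depending on whether the claim is encoded as a number or a string, which this sketch does not assume either way.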

    In the lightning round, we touch on Terrascan being archived, Microsoft’s write-up of a 15.72 Tbps Aisuru DDoS attack against Azure, and AWS flat-rate CloudFront plans that bundle CDN and security into more predictable pricing.

    We close with Lorin Hochstein’s “Two thought experiments” and what it looks like to write incident reports as if an AI (and your future teammates) will rely on them to debug the next outage.

    If you run Kubernetes in prod, this one should give you a few concrete ideas for your roadmap.

    Links from this episode:

    https://kubernetes.io/blog/2025/11/25/configuration-good-practices/

    https://aws.amazon.com/about-aws/whats-new/2025/11/amazon-eks-provisioned-control-plane/

    https://aws.amazon.com/blogs/aws/monitor-network-performance-and-traffic-across-your-eks-clusters-with-container-network-observability/

    https://github.blog/changelog/2025-11-13-github-actions-oidc-token-claims-now-include-check_run_id/

    https://github.blog/ai-and-ml/unlocking-the-full-power-of-copilot-code-review-master-your-instructions-files/

    https://docs.github.com/en/copilot/how-tos/use-copilot-agents/coding-agent/create-custom-agents

    Lightning Round

    https://github.com/tenable/terrascan

    https://www.bleepingcomputer.com/news/microsoft/microsoft-aisuru-botnet-used-500-000-ips-in-15-tbps-azure-ddos-attack/

    https://aws.amazon.com/about-aws/whats-new/2025/11/aws-flat-rate-pricing-plans/

    https://sreweekly.com/sre-weekly-issue-498/ (Lorin's Article)

    17 min