Making sense of SRE and observability, one week at a time.
What is site reliability engineering (SRE) really about? How can I make sense of it in my organisation? How do I cut through the buzzwords and actually improve the lives of my colleagues and customers?
Latest episode
How could AI help human beings negotiate the mountains of telemetry we collect to get simple and fast insight? This week I'm joined by Ottermon AI CEO and founder Checo Pacheco to talk about the lifecycle of observability coverage and tooling within organisations, and how AI is helping to find signals amongst the noise and reduce cognitive load for SREs.
We discuss...
🎂 The need for a layer of logic on top of our telemetry data
🚲 The observability lifecycle of a DevOps team
🎶 How most orgs have many observability tools, and how we might make that work
🤯 Reaching the limits of what humans can comprehend as a reason for AI
📕 How poor documentation may become AI's downfall in the future
...and much more.
You can find Checo on:
LinkedIn: https://www.linkedin.com/in/checopacheco/
You can find more about Ottermon AI on their website: https://www.ottermon.ai/ or on LinkedIn: https://www.linkedin.com/company/ottermon/
You can find Stephen on:
LinkedIn: https://www.linkedin.com/in/stephentownshend/
Bluesky: https://bsky.app/profile/slightreliability.bsky.social
YouTube: https://www.youtube.com/c/SlightReliability
Instagram: https://www.instagram.com/slight_reliability/
TikTok: https://www.tiktok.com/@the_kiwi_sre

About the host
Stephen has a background in SRE and performance engineering. He has worked in the industry for 15 years as both an external consultant and an internal engineer.
Our industry is full of buzzwords and exaggerations, and it can be hard to know what is real. Stephen strives to simplify these complex technical concepts and present them in a way everyone can understand and apply (and to call out when something is too good to be true).
Stephen lives in Auckland, New Zealand, and currently works as a Developer Advocate for SquaredUp, where he also promotes and improves observability and SRE practices internally.
