There is no SRE without Team


This recap covers the following episodes and guests.
Guest titles reflect each guest's position at the time of the episode.

Recipes:

Design for Reliability Starts with Empathy
Levelling Up Together
Build It In, Don't Bolt It On
The Art of Asking for Help

Bill Higgins, VP, AI Open Innovation, IBM Research


Karel Vredenburg, Global VP, Client Insights & Research | Industry Professor | Non-Profit Board VP | Podcast host of Habits for a Better World


📖 Art of Collaboration - There is no SRE without Team

If you’ve ever tried to make an omelette with just one egg, you know it’s not quite enough. That’s how it is with SRE too—no matter how skilled you are, reliability doesn’t happen in isolation. In this chapter of the Making of the SRE Omelette cookbook, we crack open two conversations that explore what it really takes to build resilient systems and strong teams.

In Episode 13, I sat down with Karel Vredenburg to explore how enterprise design thinking can help us design for reliability from the start. It’s not just about sticky notes and whiteboards—it’s about deeply understanding the people behind the problems and building solutions that actually work for them.

Then in Episode 14, Bill Higgins joined me to talk about the human side of engineering—how asking for help, building trust, and creating psychological safety are just as important as writing clean code. His stories reminded me that collaboration isn’t just a buzzword—it’s a skill we can all get better at.

Together, these episodes offer a recipe for something powerful: a culture where design and operations aren’t silos, but partners. Where teams don’t just coordinate—they collaborate. And where reliability is everyone’s responsibility, baked in from the very beginning. What emerged from both conversations was a shared belief: great outcomes come from great collaboration, and that starts with how we design, how we work together, and how we ask for help.

🧠 Design Thinking Meets SRE - Design for Reliability Starts with Empathy

🎯 Goal: Embed SRE principles into the design process to reduce rework, improve user experience, and prevent operational issues before they occur.

Karel Vredenburg described how enterprise design thinking can be a game-changer for SRE. Karel broke it down simply: “It’s truly like learning how to read, learning how to write… this creative problem solving is really what this is.” He emphasized that design thinking isn’t just for designers—it’s a universal skill that helps teams deeply understand problems before jumping to solutions.

Enterprise Design Thinking can be applied to any problem, including those in the SRE domain.

What stood out was how this mindset can be applied to operational challenges. Karel shared that “If you’ve got a problem… trying to get to the cause of what that problem was from a variety of different perspectives” is where design thinking shines. It’s not just about fixing outages; it’s about reimagining how we approach them. And yes, even SREs can benefit from sticky notes and empathy maps. Whether it’s a one-off outage or a recurring incident, the process of mapping perspectives, ideating quietly, and prototyping early can lead to better outcomes. As Karel put it, “You don’t just do that out loud… the quiet person might have the best solution.” That’s a reminder to make space for every voice in the room, especially in high-stakes environments like SRE.

🎮 Levelling Up Together - There’s No SRE Without Team

🎯 Goal: Shift from transactional coordination to proactive collaboration by building trust, shared ownership, and a culture of appreciation.

Bill Higgins reminded us that SRE is a team sport. He drew a clear line between coordination and collaboration: “Coordination is when I reach out to you because I have to. Collaboration is when I reach out to you because I think… there will be a better outcome.” That mindset shift—from obligation to opportunity—is what transforms teams.

The episodes also highlighted how to move from a support model to a true partnership. Karel encouraged us to “ideate without the loudest person in the room speaking”—a reminder that the best ideas often come from the quietest voices. Bill added that great teams are built on “psychological safety, inclusiveness, and appreciation.”

One of my favorite takeaways? Bill’s team celebrates “Thankful Thursdays” in Slack. “It’s contagious,” he said. Recognition isn’t just nice—it’s strategic. It builds trust, which is the foundation of any high-performing team.

Learning and growth were recurring themes. Bill compared skill-building to video games: “Pick one or two skills in a six-month period that you want to level up.” Whether it’s emotional intelligence or better code reviews, deliberate practice and mentorship are key.

Karel wrapped it up perfectly: “Everybody should have this skill… it should be taught in elementary school.” Whether you’re designing a product or responding to an incident, the art of collaboration is what makes the difference.

🛠️ Build It In, Don’t Bolt It On - Design for Reliability and Operability

🎯 Goal: Integrate reliability, observability, and operability into the product lifecycle from day one—not as an afterthought.

Both guests made it clear: reliability isn’t something you add at the end—it’s something you design for from the start. Karel reminded us that “We don’t have to wait until something is coded to get feedback.” This is a call to bring SREs into the room during design reviews, not just post-incident reviews.

Bill echoed this with his story of a high-stakes GitHub migration, which showed how deliberate practice and preparation can turn a risky operation into a smooth success. The team ran 50+ test migrations, pulling power plugs and simulating failures. The result? A smooth, ahead-of-schedule rollout. “It was scary silence… nothing went wrong.” That’s what happens when operability is baked in from the start.

Whether it’s through design thinking, cross-functional collaboration, or simply asking for help, the message is the same: reliability is a team effort. And the earlier we bring everyone to the table (SREs, designers, product managers, engineers), the better the outcome.

🙋 The Art of Asking for Help

🎯 Goal: Normalize help-seeking behavior to reduce risk, accelerate progress, and foster psychological safety across teams.

One of the most powerful takeaways from Bill’s episode was about knowing when to ask for help. He shared a story about a project that went off the rails and how his manager’s response changed his perspective: “I’m a general manager… I ask for help when I need it.” That moment reframed asking for help not as a weakness, but as a leadership skill.

Bill explained that there’s a balance: don’t ask too early without trying, but don’t wait until it’s too late. “If you’re feeling stuck, that’s when you go to your manager.” It’s a simple rule, but one that can save projects—and careers. And as leaders, we need to model that behavior too. “When somebody asks me for help, I say thank you. I’m glad you trust me enough.”

💬 Let’s Talk About It

Here are a few questions to spark conversation with your team or community:

🧩 How might we embed design thinking into our incident response or postmortem processes?
🛠️ What’s one way we can shift from coordination to collaboration in our current workflows?
🎯 Are we creating enough psychological safety for team members to ask for help early?
