This recap includes the following episodes and guests.
Guest title reflects the guest's position at the time of the episode.
Recipes:
The Origin Story
SRE Fundamentals
Career Path
People First - The Core Ingredient
ROI - Measuring what Matters
Welcome to the first crack at our SRE Omelette! I had the pleasure of speaking with two powerhouse voices at IBM: Ingo Averdunk, IBM Distinguished Engineer, IBM Garage for Cloud and IBM SRE profession leader, and Kareem Yusuf, now SVP, Ecosystems, Strategic Partners & Initiatives. Together, we peeled back the layers of what SRE means, how it evolved at IBM, and why it’s more than just a role: it’s a mindset, a movement, and a recipe for success.
Goal: Reframe culture as an outcome.
The omelette metaphor wasn’t just about clever branding. It was about shifting the conversation from reactive firefighting to intentional transformation. Kareem explained that culture isn’t something you wait for—it’s something you create. “When you think about culture and behavior,” he said, “you are really driving for an outcome.” That outcome, he emphasized, must be rooted in the customer’s reality. “None of our customers wake up every single day going, ‘Ooh, I want to use your software today.’ They’re trying to get an actual task done.”
This clarity—about purpose, value, and impact—became the foundation for IBM’s SRE evolution. It wasn’t about adding another layer of process. It was about aligning engineering with outcomes that matter.
“When you think about culture and behavior, you are really driving for an outcome.” - Kareem Yusuf
Goal: Shift from reactive ops to full lifecycle ownership.
Ingo Averdunk, IBM’s Global SRE profession leader, has long believed that the real challenge in software isn’t building it—it’s running it well. His background in service management gave him a front-row seat to the “day two” experience, where systems either deliver value or fall short. That’s why SRE, to him, was a natural evolution.
“We are moving away from projects to products, and SRE is very much in support of understanding the full life cycle.” - Ingo Averdunk
Ingo emphasized that SRE isn’t just about uptime—it’s about ownership. He encouraged engineers to build depth before breadth, advising, “Pick one area and go deep. Build the muscle of being able to decode the system.” That depth, he believes, is what builds confidence—and ultimately, resilience.
But he also challenged the idea that automation is the end goal. Sharing a story about a server that crashed every Monday, he noted, “If my application server dies every Monday, I could automate a restart, but the problem still exists.” True SRE, with the engineering rigor we want to arrive at, he said, is about engineering the problem away, not just masking it with scripts.
Goal: Create space for SREs to grow, thrive, and be recognized.
One of the most transformative decisions IBM made was to formally recognize SRE as a profession. This wasn’t just a title change—it was a cultural shift. Ingo helped lead the effort, and he speaks about it with pride. “It’s not just a role—it’s a profession with a curriculum, a roadmap, and a career path.”
This structure gave SREs a place to grow, learn, and be seen. It also helped IBM attract and retain top talent. “Ultimately,” Ingo said, “we want to attract, nurture, and retain the right skill and the right talent.” It was a recognition that SRE isn’t a temporary trend—it’s a long-term investment in people and practice.
Kevin echoed this sentiment, reflecting on his own journey. “I’ve always worked in performance, scalability, and reliability,” he said, “but I often had to explain and justify my contributions. Having a formal SRE profession gives future engineers a clearer path.”
Ingo also offered advice for aspiring SREs: “Don’t try to do it all. Pick one area and go deep. Build the muscle of being able to decode the system.” Whether you’re a dev or a sysadmin, there’s a path into SRE if you’re curious, collaborative, and committed to resilience.
Goal: Center the SRE profession around human skill and collaboration.
Both episodes closed by emphasizing that the real secret sauce is people.
For Kareem, the most important part of building a world-class SRE function isn’t the tooling—it’s the people. “The number one ingredient,” he said, “it’s got to be people and their skills.” He spoke about the importance of meaningful work, professional pride, and creating environments where people can thrive. “Everything we do is human capital. It’s us who bring our brains to work to generate this IP every day.”
Ingo reinforced this idea, especially when it comes to incident response. “SRE is a team sport,” he said. “Knowing someone has your back at 2 a.m. matters.” It’s not just about solving problems—it’s about solving them together.
Goal: Define success through outcomes, not just uptime.
Kareem also tackled the tough topic of prioritization. His framework? Time, cost, and quality. But he added a twist: “Features are useless if customers cannot use them meaningfully to get their work done.”
When it comes to ROI, Kareem offered a refreshingly honest perspective: “A real simple measure of ROI for me is—I ain’t getting calls.” But behind that simplicity is a deeper vision. He’s not just looking for fewer incidents—he’s looking for transformation. “The real return on investment I’m looking for,” he said, “is this vision of being able to deliver our software consistently everywhere, no matter how it’s deployed.”
Ingo added that while metrics like MTTR and automation rates are useful, they’re not the whole story. “You need to find good performance indicators that measure velocity, quality, and efficiency,” he explained. Because in the end, SRE isn’t just about keeping systems up—it’s about helping teams move faster, build better, and deliver more value.
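For readers who want to see what one of those metrics looks like in practice, here is a minimal sketch of an MTTR calculation. It assumes MTTR means mean time to restore (the episode does not spell out the acronym), and the incident timestamps and the function name `mean_time_to_restore` are made up for illustration, not taken from IBM's tooling.

```python
from datetime import datetime, timedelta

# Hypothetical incident records as (detected_at, resolved_at) pairs.
# Illustrative data only; not from the episode.
incidents = [
    (datetime(2024, 1, 8, 2, 0), datetime(2024, 1, 8, 2, 45)),    # 45 minutes
    (datetime(2024, 1, 15, 2, 0), datetime(2024, 1, 15, 3, 30)),  # 90 minutes
    (datetime(2024, 1, 22, 2, 0), datetime(2024, 1, 22, 2, 15)),  # 15 minutes
]


def mean_time_to_restore(records):
    """Return the average of (resolved - detected) across all incidents."""
    if not records:
        raise ValueError("need at least one incident")
    # Sum the individual outage durations, starting from a zero timedelta.
    total = sum(
        (resolved - detected for detected, resolved in records),
        timedelta(),
    )
    return total / len(records)


mttr = mean_time_to_restore(incidents)
print(mttr)  # 0:50:00
```

A single average like this says nothing about whether the same incident keeps coming back, which is one way to read Ingo's point that such metrics are useful but not the whole story.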
Here are a few questions to spark deeper conversations with your team or community:
🧩 How do you define the boundaries of SRE in your organization - where does it start and stop?
🧠 How do you personally define the ROI of reliability work in your organization?
🧪 If SRE is a team sport, what role do you play—and how can you support your teammates better?