Slight Reliability

The Root Cause Fallacy with Andrew Hatch (Episode 98)

• Stephen Townshend • Season 2 • Episode 98

This week I'm joined by SRE leader Andrew Hatch from Cisco ThousandEyes to talk about a dirty word in the resilience community... root cause. In this excellent conversation, we explore...

🌌 Is the root cause of every incident the big bang?
🦖 How the value of root cause degrades as complexity increases
🫣 Why people will hide things if the culture is not blameless
🌳 Alternative approaches to root cause analysis such as branching timelines
🙋 Getting someone without skin in the game to facilitate your blameless post-mortems

...and much more.

You can find Andrew on:

LinkedIn: https://www.linkedin.com/in/hatchman76/

Check out Andrew's SREcon21 talk 'Learning from Complex Systems' which covers many of the topics introduced in this episode: https://www.youtube.com/watch?v=5pKGW61Ryvo

You can find Stephen on:

LinkedIn: https://www.linkedin.com/in/stephentownshend/
Bluesky: https://bsky.app/profile/slightreliability.bsky.social
YouTube: https://www.youtube.com/c/SlightReliability
Instagram: https://www.instagram.com/slight_reliability/
TikTok: https://www.tiktok.com/@the_kiwi_sre