The Cloudcast #301 - SRE and Infrastructure Operations
Jun 15, 2017
Aaron Delp and Brian Gracely
Brian talks with Rob Hirschfeld (@zehicle, Founder/CEO of @RackN) about the concepts of SRE (Site Reliability Engineering), the challenges of maintaining infrastructure software, emerging tools, and the next generation of operations.
Get a free eBook from O'Reilly Media or use promo code PC20CLOUD for a discount - 40% off print books and 50% off eBooks and videos
Topic 1 - Welcome back to the show. Let’s start by talking about the concept of SRE (Site Reliability Engineering). Give us the basics and maybe explain how it differs from what people define as DevOps.
Topic 2 - Application development has been moving faster for quite a while (agile development, etc.). But now infrastructure/operations teams have to deal with faster-moving software, especially around updates (e.g., Kubernetes releases every 3 months). How are companies managing this?
Topic 3 - Given that this pace of operations change may not slow down, how do you think about the challenge in terms of process/operations versus technology/tools?
Topic 4 - What are some of the steps that companies take to better prepare for this type of operational model? Tools, process, skills, etc.
Topic 5 - Do you see SRE as a progression for existing infrastructure/operations people, or is this more focused on sysadmins or developers who want to get away from building applications?