MIM® Podcast

Episode 1: Nick Coates, Symantec

MIM® Season 1 Episode 1

In this episode, MIM founder and CEO Adam Norman sits down with Nick Coates, Product Manager at Symantec. They talk about the 2.5-year journey Symantec has been on creating its first major incident function, which they call the Cloud Command Center. Nick shares the setup and structure of the team, some of the challenges they have faced, and the positive impact on their technical teams, their commercial customers and the business.

Watch the videocast of this episode
www.majorincidentmanagement.com
Subscribe to the MIM® newsletter

Adam Norman:

Welcome to the MIM podcast. I'm Adam Norman, founder and CEO at MIM, where we're on a mission to create the best major incident managers in the world. Our clients range from the largest, most well-known companies through to new, innovative companies across almost every industry, from more than 72 different countries. You're going to hear about the best practices, strategies and tactics, how companies deliver major incident management excellence, some of the challenges they face, and different views on major incident management from around the world, as well as how major incident management is constantly evolving. On this podcast we go deeper, interviewing professionals from the community. Our hope is that you will leave this podcast with some new ideas and a greater passion for major incident performance. In this first episode, I had the privilege to sit down with Nick Coates, a product manager from Symantec. We talk about the two and a half year journey that Symantec have been on in creating their first true major incident function, which they call the Cloud Command Centre. Nick talks about the setup and structure of the team, the tools and techniques they use, some of the challenges they've faced, and the positive impact on technical teams, their commercial customers and the business. We hope you enjoy the show.

Nick, how are you doing? All right, thank you. Yourself? Yeah, really good, thank you. Thanks ever so much for coming in — I'm really excited to talk to you. Yeah, thank you for having me. So it'd be really interesting — obviously you're a product owner at Symantec — for people who don't know you, just to hear a bit more about your background, your career and how you got to where you are. Yeah, sure. So I started my career in IT around almost 11 years ago. I was in my last year of sixth form and hadn't really decided what I wanted to do with my life or my career — university was an option — but I got headhunted by a local internet service provider, Star, and they essentially offered me a service desk role.

Nick Coates:

Just to get into IT. Before that I'd run two businesses when I was 15, 16. One of them was a news company — a news website for teenagers — and I think we had 22 students who volunteered and basically did the news. Oh wow. We reached over 1.5 million students across the globe. That's incredible. So yeah — and you were 15? I was 15, yeah. We had backing from the school, which was awesome, and we generated revenue just from adverts on the website, which was good. And then I also ran a hosting company, so we had over two and a half thousand customers who hosted their websites with us. We did things like TeamSpeak, which is a gaming voice-chat, podcasting type thing, so we hosted that; we did some gaming servers, and we did physical hosting servers in data centres. I was 16 when I started that. Oh, wow. So essentially, with that skill set — the expertise and everything I'd gained through those two companies — Star offered me the role as a service desk analyst, and it was really my start into a professional IT career. I worked for Star for three years, or just under, and I had two roles there, coming in as a service desk analyst and moving up to senior. Then I worked in the change management office, and basically that was where we were making configuration changes to switches, routers, firewalls, all under change control. That was my first step into the world of ITIL and into IT service management, and change management was what I enjoyed. Then an opportunity came up to become an infrastructure engineer for Symantec, who are a global cyber security company. I decided it was a great opportunity — it was a more senior position, more technical, and it would actually enable me to use my Linux skills, pick up some new Windows skills and do some more networking. So I joined Symantec and worked in the network operations centre as an infrastructure engineer. We basically monitored the email security and web security platforms. One of Symantec's products is the Email Security.cloud product, and essentially that's a cloud scanning solution for emails — scanning 2 billion emails a week for thousands and thousands of companies across the globe. The infrastructure, as you can imagine, is pretty vast; it was all on physical kit, hosted in physical data centres. So they needed engineers to monitor whether something had gone wrong with a server and it wasn't processing mail. Obviously, for a business, that's important, because an email may not be getting to one of your customers or one of your partners or suppliers. My role really was to watch these screens — we literally had these screens in the NOC, about eight of them, with all the monitoring up — so if the mail or web monitoring flashed red, you'd then go on and do your troubleshooting. I did that role for several years.
And then Symantec — particularly for the email product — were looking to obtain ISO 27001 certification, to be accredited for that, and a big part of that is how you operationally manage the infrastructure and the deployments, the service management side. So a team was brought in — they were called Optimus, after Transformers; it does stand for something but I can't remember what. Basically this team was brought in to look at our existing processes, all the way from event management to incident management and change management, and how we communicate during an outage as well — looking at what we actually had. And it was all in people's heads: it wasn't documented, and we didn't have one system that could do it. So they ran an RFP, went out to several vendors to look at a service management tool, and the tool that management chose was Marvel Service Management. They're a small software house in the UK, but they have over 400 customers, and it's an on-prem service management tool. Symantec being a security company, and with all this data going in there — the monitoring, the incidents, the deployment notes, and also our asset management or configuration management database with 30,000 servers, IP addresses, locations and serial numbers — for Symantec it needed to be on-prem, essentially. I was on a training course for the tool as an engineer and I just fell in love with it — I was like, this is really good. What basically happened was they needed somebody to manage it and be the technical owner of the tool, and I worked hand in hand with the team to build out all the processes: event management, hooking the system into our monitoring system. So when there was an issue and the engineers needed to look at a server — a mail server, for example — rather than having nothing documented, it would pop into Marvel and they would do all the work in there. We could track time, we could see how many alerts that specific mail server might be having, and if, say, we had a thousand alerts for all the servers in Europe, we needed to manage the outage through an incident, so we then used Marvel as the incident management tool too. Because of that, we could relate the events and the alerts together, and we could run the incident through Marvel. For the rest of the business we also communicated through Marvel: we told our senior stakeholders there was an issue, we told customer support there was an issue and that we were aware of it — the usual communication had a template we'd use, we used Marvel for that, and then it would go through a workflow. To begin with it was quite a complicated workflow. Over the years we refined it, and we now use a four-pillar approach for all of our service management processes. The idea is that there are four steps — you have different statuses within them, but it keeps things very simple: identify (create), investigate, resolve, close. Having those four pillars was the goal, but originally we didn't have that.
It was literally this whole thing with about 50 statuses, and the workflow was so overcomplicated that it wasn't great — but it was something, and better than what we'd had before, which was nothing. Alongside that we also had problem management and change management in there, so if an emergency change needed to happen, it was all logged within the tool. I basically led that for several years. Then, about two and a half years ago, Symantec had a major outage with two of our products, and the feedback we got from our customers was that we didn't communicate very well. And it was true, we didn't. With our email product, the portal was down for several days. The engineers were doing their best with a hastily put-together page with the information on it, but customers just were not getting the information they needed, which obviously increased the calls our customer support team were getting — they were queuing 500, 600, 700 calls; they were queuing a lot. Same with our VIP product — that's an identity management solution, like your two-factor authentication token, which Symantec provides for businesses. We had an issue whereby customers couldn't authenticate — some of the issues even hit us ourselves — and again, we had no communication. So I was approached by my vice president at the time, and he said, is there a way that Symantec can better manage how we handle incidents and communicate to our customers? I said, yeah, there are tools out there that could do this. In the end we chose Atlassian Statuspage. And because the business decided they didn't want to automate the communication, we built a team called the Cloud Command Centre. They're currently a team of seven incident communication managers, and their responsibility — which I think we may talk about later — is to take an incident from our engineering teams and communicate it to our customers: to make the communication make sense for customers rather than being full of technical jargon. Yeah, of course, you don't want that. So we built that team, and that's where I am today. I moved from engineering into our customer success group, and I'm basically product owner for Statuspage — Symantec Status. Over my ten years I've gone through various roles, all definitely technical, but now stepping away and doing more of the strategic planning for our incident communication strategy really excites me — seeing where we can go with it. Great. So what does your role as product owner actually involve? Sure — essentially I'm responsible for the growth and development of Symantec's status pages. Because of the way Symantec is structured, the products are all very different and don't share a lot of common features, so what we did was build Symantec Status, with Statuspage, across 24 pages. Twenty-four products have their own status page, and customers can go to our landing page, which I designed — I'm quite creative as well — so I basically designed that and we deployed it.
It's a static website, but our customers go on there and pick the product they want to see the status of. It gives them a very high-level overview — for the email product, for example, whether it's operational, in maintenance, or there's an outage. They click on the card and it takes them to the status page, where they can look at the incident, sign up to it, subscribe, et cetera. So my role really is to lead the technical implementation of those pages, and I work very closely with the engineering teams. I also act as a support person for the Cloud Command Centre, who are the folks who go in and actually post the updates to Statuspage. We adopt an agile methodology — these are essentially websites, web applications, so obviously we do deployments. When we have a new page to put up, there's a three-week sprint that we run: I work with another developer and we basically build those pages, I write the user stories, and I make sure we're on target to deploy them. I also make sure the Cloud Command Centre have the appropriate access they need, and that they're trained and know about the products we have. So it's a very hybrid role — it's quite technical development, but also working with the business and senior stakeholders to get these pages deployed. I work very closely with our partners as well; we get feedback from them on how we can improve the pages, what we're doing wrong and what we're doing right. And I run several playbook sessions with the Cloud Command Centre, where we look at past incidents — maybe from the past 30 to 60 days — and ask what went wrong, what went well, and what we need to do to improve in the future. That includes looking at the time the engineering or operations teams on those products alerted the Cloud Command Centre of an issue, which is done through PagerDuty. Essentially what happens is the engineering folks page the Cloud Command Centre, who are 24/7. They pick up the incident and then say, right, okay, can we get on a war room? We use WebEx, so they'll spin up a war room, and there'll be a Slack channel — we use Slack. And they'll help the incident manager — if that team has an incident manager, or whoever the engineer acting in that role is at the time — and work with them to get the communication out. So, yes.
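To make the hand-off Nick describes a little more concrete, here is a minimal sketch of that flow in Python: engineering pages the Cloud Command Centre via PagerDuty, responders are pointed at a WebEx war room and Slack channel, and a first customer-facing note goes up on a status page. All keys, IDs and URLs are placeholders, and while the endpoint shapes follow the vendors' public APIs as commonly documented, treat them as illustrative rather than a confirmed description of Symantec's integration.

```python
"""Illustrative sketch of the page -> war room -> status post hand-off."""
import requests

PAGERDUTY_ROUTING_KEY = "REPLACE_ME"          # Events API v2 integration key (placeholder)
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/REPLACE/ME/TOO"  # placeholder
STATUSPAGE_API_KEY = "REPLACE_ME"             # placeholder
STATUSPAGE_PAGE_ID = "REPLACE_ME"             # one of the per-product pages (placeholder)


def page_cloud_command_centre(summary: str, source: str) -> None:
    """Engineering raises the alarm: trigger a PagerDuty event."""
    requests.post(
        "https://events.pagerduty.com/v2/enqueue",
        json={
            "routing_key": PAGERDUTY_ROUTING_KEY,
            "event_action": "trigger",
            "payload": {"summary": summary, "source": source, "severity": "critical"},
        },
        timeout=10,
    )


def announce_war_room(webex_url: str, slack_channel: str) -> None:
    """Tell responders where to gather (WebEx war room plus Slack channel)."""
    requests.post(
        SLACK_WEBHOOK_URL,
        json={"text": f"War room open: {webex_url} — coordination in {slack_channel}"},
        timeout=10,
    )


def post_initial_status(incident_name: str, customer_message: str) -> None:
    """First customer-facing post: acknowledge the issue, no technical jargon."""
    requests.post(
        f"https://api.statuspage.io/v1/pages/{STATUSPAGE_PAGE_ID}/incidents",
        headers={"Authorization": f"OAuth {STATUSPAGE_API_KEY}"},
        json={
            "incident": {
                "name": incident_name,
                "status": "investigating",
                "body": customer_message,
            }
        },
        timeout=10,
    )


if __name__ == "__main__":
    page_cloud_command_centre("Email portal unavailable", "monitoring")
    announce_war_room("https://example.webex.com/meet/warroom", "#inc-email-portal")
    post_initial_status(
        "Email Security portal issue",
        "We are aware of an issue affecting portal access and are investigating.",
    )
```

The separation mirrors what Nick outlines: the paging, the war-room coordination and the customer-facing posting are distinct steps owned by different people, even if a tool could chain them together.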

Adam Norman:

This is quite a big shift for you guys then, so I'm really interested. Essentially, prior to you taking on this role and the status pages being developed, in terms of your major incident management — your worst type of incident — it sounds like predominantly it was down to the talented technical teams you had to deal with it, but perhaps you didn't have the coordination. Okay. And so that's where this Cloud Command Centre team comes in and handles most of the comms. So they issue comms not just to customers but internally as well? Yes. And do they handle the technical teams, particularly when there are multiple teams working on an incident, or are they still very much just —

Nick Coates:

They predominantly focus on the communication out to our customers, but they can also help with the management comms that may need to go to senior stakeholders and senior leaders. For an engineering team that might be very small — say a team of ten — they might act as the incident manager: spinning up the war room, getting everybody on the call, coordinating at the time. They can do that, and I think they have done it several times. But predominantly it's the communication out to our customers.

Adam Norman:

So at Symantec, essentially the technical teams — the technical resolving groups — are your leads as far as actually making decisions about what's going to happen: assessing action plans, deciding who can do what, and then following up with the other technical team members. That's held by your technical staff.

Nick Coates:

That's how it would be, yeah — by the engineering folks. Okay, interesting. But the root cause analysis — we do RCAs and postmortems — would be driven through the Cloud Command Centre. They work with the engineering teams to look at what the plan of action is and what we need to do, but obviously the engineering teams then drive the prioritisation of that, and they drive the release and when things will be fixed. Essentially, though, the RCA is developed by the Cloud Command Centre, and they really drive it. And if the customer — which does tend to happen — asks us for an incident report or an RCA report, obviously we want as much information as we can get from the engineering folks, but we don't want to be too technical; we don't want to go into a lot of detail about the ins and outs of our processes. They then build that report out for the customer and make it customer-friendly, essentially. Yeah — removing anything that's too technical. Yes, so it makes sense to someone who is not technical. Exactly. That's really interesting. So you talk about

Adam Norman:

RCAs — the root cause analysis. When the engineers are driving this, are they looking for workarounds just to get whatever product it may be functioning again, and then looking at the root cause analysis afterwards? Or are they actually always going for the root cause every single time? Or is that slightly undefined?

Nick Coates:

No — I think I'm safe to say that for most of the product teams it would be fixing the issue: do whatever you can to get the service back up and running, essentially a workaround, and then we'll work on a permanent fix and on what the actual RCA is. I think that's typical in an incident. Symantec predominantly provides enterprise services, so having large banks or government organisations unavailable or down is time-pressured. The priority right there is to get the service up — and I'm sure with any incident, at the back of everybody's mind is: we need to get the servers up and running, we need to implement the workaround or a temporary fix. Even when I was back in the NOC, I was on several incidents where we'd think, can we run this script? Should we just restart the mail service, for example? We'd do that to make sure we could get the service back up and running, and then we'd work with the development and engineering folks to look at what the permanent fix could be.

Adam Norman:

Okay, that's interesting — pretty standard for how a lot of companies run. I'll be fascinated to hear more as you guys have more time to settle in, and to see what changes, if indeed any, you decide to make to the structure of your teams. Obviously there's a real mixture out there. Lots of companies have dedicated major incident teams, where the major incident managers tend to lead with input from the technical teams; whereas you've got the technical teams leading, and I know there are some very well-known, famous companies who essentially have no major incident managers and purely run on technical teams. So I'm certainly not saying that can't work — it works very well for some companies — but that's really interesting, and I'd love to hear how you guys get on. How long has that Cloud Command Centre team been in place?

Nick Coates:

Two and a half years. When we were building the pages, I worked with one of our senior directors in support at the time. She's very passionate about customers and very customer-driven, but she also has a lot of experience with our products, and with incident management. She put her hand up and said, I'd like to build this team out — to have a dedicated team that can communicate to our customers. We have a lot of products and we have a lot of incidents, just due to the nature of technology. How many products? I can't tell you exactly how many products we have — 24 products are on Statuspage, but there are a lot more: on-prem products, our cyber security services, there's just so much more. We don't do anything with the consumer business — they drive their own status pages and how they communicate to consumers; they have a whole architecture and I think they have a team as well. But yeah, for us that team is two and a half years old, from when we built these pages, and it started with one or two incident communication managers, including our senior director. Since then the team has grown, we've taken on more products, and there's now more of a standard set across the business: when there is a service, if it's cloud-orientated or it fits in line with one of the top products that Symantec looks after, its service description will have a status page. It will say that there is a status page and that we have a duty to communicate during an outage, and to be proactive with scheduled maintenance. So it's now driven more as a standard — a product will have a status page. And that's certainly the feedback we get from customers: we know there are some products we haven't covered yet. At the moment I'm working on five new pages that we're onboarding, and one of those pages has actually come out of an RCA — it was specifically pointed out by customers that we don't have status pages for those particular products. And it's great: okay, yep, the product didn't follow the appropriate standard, but now we're remediating that, putting the page in, and we can close that action off and say, here we go, we've got the page. That's phenomenal — so it's obviously working well if you're getting requests for other products. Absolutely. And originally it was only two pages. Oh wow, only two pages. Over the two and a half years we've added more and more, to where we are today with 24, and there will be more to come in the future. We're always looking at rearchitecting and re-evaluating how we've structured our pages, particularly with the upcoming changes at Symantec — there'll be new opportunities as the products and the product categories change. So how can we build an incident communication strategy around that? For me it's very exciting: taking this now and saying, right, okay, how can we shape this, how can we make it even better? Yeah — you've transformed a major part of the customer success strategy, actually. Phenomenal; it sounds really interesting. What sort of volume of critical or major incidents are you seeing?
Oh gosh, I can't tell you exactly, to be honest — it's probably averaging one a day. Obviously anything we post on the status page is deemed a critical incident. It could be email delays, for example — that's obviously impacting, and it could be impacting several hundred customers, so we would still post. But if it was impacting a small subset of customers, we'd work directly with those customers; we wouldn't necessarily post on the status page. Do you have a clear definition for that? Yes — our Cloud Command Centre have a process they follow for each of the products: when they need to post, and it follows a classification, essentially. Off the top of my head I can't remember exactly what it is, but it's a process document, a version-controlled document that they manage. Obviously all the products are different — they're engineered differently and run on different platforms, so they're not all the same — and this document covers that. So it sounds like that would be linked more to the technology — what fails and what it's connected to — rather than purely the volume of end users affected; it sounds like you guys really understand how your infrastructure is connected, and that's what you look at. Yeah — although we do take into consideration the volume of customers being impacted as well, because we'll know, if a certain component fails, through monitoring and obviously through expertise and knowledge, how many customers will be impacted. But when I was in the service management team a few years ago, we actually built what we called the incident classification matrix. Within the tool we made it very easy for our incident managers, or those engineers acting in that incident-manager role, to literally select and prioritise an incident. So if it was a Sev 1, a Priority 1 incident, they could go into the tool and it would tell them, and at that point they'd know they need to do the communication and treat it as a major incident. We implemented that and it was adopted for several other products as well. So that's on the engineering side, not just the customer-facing communication side — and I think it should be rolled out further across the business; I think it would be fantastic for all engineering teams to have this matrix they can look at. Originally — gosh, when we started, we had a spreadsheet and I think it had something like 4,000 classifications. Wow, that's a lot. It was a lot — well, maybe it was 1,001, but it was a lot. We then had the job of putting it into the system, into Marvel, which was challenging, and we worked with the vendor to help us with that. But you'd have this spreadsheet, and it would literally say — so, we classify an incident into areas. There are four areas: availability, accuracy, performance and security. Obviously security is self-explanatory.
For us, with availability, essentially the engineers can say, right, this is totally unavailable — the portal is down, which is availability — and the matrix then classifies the severity for them, so they know it's a Sev 2 or Sev 1 incident, which would be a major incident. So for some of our products we classify major incidents as a Sev 1 or a Sev 2. Typically a Sev 1, a Priority 1 incident, is a major incident, but we include both one and two due to the nature of our products. Yup.
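As a rough illustration of the kind of classification matrix Nick describes — pick an area (availability, accuracy, performance, security) and an impact level, get back a severity and a major-incident flag — here is a minimal Python sketch. The impact levels, the mapping and the Sev 1/Sev 2 threshold are invented for illustration; the real matrix he mentions was far larger and product-specific.

```python
# Minimal sketch of an incident classification matrix: area + impact -> severity.
from dataclasses import dataclass

AREAS = {"availability", "accuracy", "performance", "security"}

# (area, impact) -> severity; impact here is "total", "partial" or "degraded".
CLASSIFICATION_MATRIX = {
    ("availability", "total"): 1,     # e.g. the portal is completely down
    ("availability", "partial"): 2,
    ("availability", "degraded"): 3,
    ("performance", "total"): 2,      # e.g. severe email delays
    ("performance", "partial"): 3,
    ("performance", "degraded"): 4,
    ("accuracy", "total"): 2,
    ("accuracy", "partial"): 3,
    ("accuracy", "degraded"): 4,
    ("security", "total"): 1,         # security issues escalate quickly
    ("security", "partial"): 1,
    ("security", "degraded"): 2,
}

MAJOR_INCIDENT_THRESHOLD = 2  # Sev 1 and Sev 2 treated as major, as in the interview


@dataclass
class Classification:
    severity: int
    is_major: bool


def classify(area: str, impact: str) -> Classification:
    """Return the severity and major-incident flag for an area/impact pair."""
    if area not in AREAS:
        raise ValueError(f"unknown area: {area}")
    severity = CLASSIFICATION_MATRIX[(area, impact)]
    return Classification(severity, severity <= MAJOR_INCIDENT_THRESHOLD)


if __name__ == "__main__":
    print(classify("availability", "total"))   # Classification(severity=1, is_major=True)
    print(classify("performance", "degraded")) # Classification(severity=4, is_major=False)
```

The point of encoding it this way is the one Nick makes: the engineer on call doesn't have to debate severity in the moment — they select the area and impact, and the matrix tells them whether communication and major incident treatment are required.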

Adam Norman:

Okay, really interesting. I know there'd be a lot of people who'd be really interested in talking with you about that. It's a real challenge for companies, and one of the things we talk to people about a lot is what they deem to be a major incident — I've had a lot of consulting engagements recently specifically around that. It sounds like a cop-out, but we always position it as: it depends what the business you are serving needs — and from your perspective, very much what the customers receiving service through your tools need and what they're trying to do with those tools. It makes sense because of the nature of Symantec — everyone will know who Symantec is, obviously — and the nature of those sorts of products: pretty much anything that goes wrong on a customer's side, as far as your end users are concerned, is going to be pretty high up there. Which is really interesting to hear — whether it was 1,400 or, yeah —

Nick Coates:

Well, whatever [inaudible] — I said a lot. Yeah.

Adam Norman:

For you to have whittled those down and been really precise — I know that's something a lot of companies really struggle with. I also know there are a lot of companies, interestingly enough predominantly in the financial sectors, who are working very hard and investing in major incident management teams and functions; in a sense it goes hand in hand with some of the cybersecurity incidents we're seeing on the news every day. Often they have difficulty securing investment or funding for major incident teams and cybersecurity teams — and the two go hand in hand. The second there's a real problem and customers are unhappy, or there's a huge impact that costs these companies a lot of money, they start to invest. So we're seeing a huge shift, with more and more companies understanding that they must have some form of major incident team, whatever that looks like — you almost can't afford not to nowadays. So there'll be a lot of people listening who will be really interested in this two-year journey — two and a half year journey, sorry — that you've been on and what you've achieved, and the starting point. I guess it'd be nice to dive into it a bit. It's quite a big question, but what were some of the major challenges you had when first starting that journey, and if possible, how did you overcome them? I think some people will find that really interesting.

Nick Coates:

So, specifically with Statuspage: obviously we'd had those two major incidents, and for us the biggest feedback was that we didn't communicate to our customers. Being able to point to those two major products, with the backing of customers, made it very easy for us to say that we needed to do something. The challenge came around what we classify as an incident, what we post on the status page, and what the content is. Being a security company — with a lot of intelligence, analysis, networking, AI and machine learning — we obviously wouldn't want to post anything that's going to alarm customers. So we formed this customer-facing Cloud Command Centre to work with our engineering teams and decide what we need to post. To this day we still sometimes post things that are either very basic or go into too much detail — it's trying to find that fine line, and it's very tricky because of the nature of the products we have. But we started very basic. We have templates — I think the team set up four templates for all the pages, and they're very similar; they've obviously been adapted as the products have used the pages more. The team really started basic, and those templates follow the flow from identifying — so we know there's a problem — to investigating, through to restoring the service and monitoring. Starting with those four pillars really helped the team, and it helped the engineering teams I'd been involved with when we were setting up the service management processes in the team I was working in. For us, we didn't have too many challenges: the business saw — well, it was demand from customers — that we needed to communicate and be better at how we manage incidents, so I think we were in a very fortunate position. Now, as time went on and we had more and more products needing a page, it started to become challenging for me as an architect, trying to think about how we could do that across the pages. The implementation we've got now — if I stepped back three years, I probably wouldn't have implemented it the way we have; I'd have done it slightly differently knowing where we are today. So, telling my future self: if I were doing this again, I'd do it slightly differently. What would you do differently? I wouldn't have as many pages — I'd be very streamlined with the pages, using the product categories that Symantec does have and building the pages out based on those categories or product families. I think that's what I would do. I'd also probably look at automating — helping our Cloud Command Centre by using these templates to actually post the initial Symantec note that there's a problem and we're working on it, rather than doing that initial post by hand.
Having that would obviously be much quicker than waiting for an engineer or our communication manager to get the information from the cloud teams or the engineering teams, draft a post, and post it. So that's probably, again, what —
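To picture the template-driven posting (and the automated first note Nick says he'd add with hindsight), here is a small sketch: one short, jargon-free template per phase of the lifecycle, plus a helper that produces the quick acknowledgement before any human drafting. The wording and placeholders are illustrative only, not Symantec's actual templates.

```python
# Sketch of phase-based customer communication templates for a status page.
from string import Template

TEMPLATES = {
    "identified": Template(
        "We are aware of an issue affecting $product and are investigating. "
        "We will provide an update by $next_update."
    ),
    "investigating": Template(
        "We are continuing to investigate the issue affecting $product. "
        "Next update by $next_update."
    ),
    "monitoring": Template(
        "Service has been restored for $product. We are monitoring closely "
        "to confirm the fix."
    ),
    "resolved": Template(
        "The issue affecting $product has been resolved. A summary will "
        "follow once the review is complete."
    ),
}


def render_update(phase: str, product: str, next_update: str = "") -> str:
    """Fill a phase template with customer-friendly details only."""
    return TEMPLATES[phase].substitute(product=product, next_update=next_update)


def automated_initial_post(product: str) -> str:
    """The quick acknowledgement that could go out before any human drafting."""
    return render_update("identified", product, next_update="30 minutes")


if __name__ == "__main__":
    print(automated_initial_post("Email Security.cloud portal"))
    print(render_update("monitoring", "Email Security.cloud portal"))
```

Keeping every post to a pre-agreed template is what lets the acknowledgement go out fast, which is exactly the point Adam picks up on next.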

Adam Norman:

Such a good point. One of the things we talk about in the best practice is that first piece: there are sub-objectives to the three phases of the major incident process, and that initial 15 minutes — we say 15 minutes because if you have a mature process and a slick team, 15 minutes is more than enough time from being notified of a major incident; if you're on that upward trend towards maturity, 20 minutes is a bit more common for lots of companies — that whole piece is about confidence. You're not going to have every bit of information; by their nature, major incidents are complex, there are lots of technologies involved, and you often don't understand all the assets that are affected or why it's occurred. So that first communication piece, exactly as you said, is purely about confidence. And the worst thing you can do with stakeholders, both internally and your customers externally, is to leave them in a scenario where they haven't heard a single thing from you — their assumption is naturally going to be the worst. It doesn't matter if you've got the best engineers and the whole operation mobilised; they don't see it, they have no idea what's happening. Really sorry to stop you, it's just such an important point. It's interesting that you raised it, because it's one of the things we talk about — that communication piece. The whole objective is just to let them know you've mobilised the operation. Exactly, literally —

Nick Coates:

It's the acknowledgement — acknowledging that yes, we have an issue. Be honest, don't hide. For us, with Symantec Status, one of the ethos points really is to be open and transparent — not to the point of being too detailed or too open with our technology stack and our products, but be open, because it gains trust with your customers and your stakeholders, and they will then rely on you. If you're selling services, they can trust you, they know you're honest, and they may invest more in terms of buying more products or adding more seats to the plan. For me, the key is just be open. Don't be scared about saying we have an issue, we don't know yet when it's going to be restored or what the full impact is, but we know there's an issue. Even as a consumer — take your bank, for example. In the age we're in now, everything is done online. You look at the app and you can't transfer money — I don't care what the issue is, all I want is an acknowledgement from the bank to say there's an issue, maybe with the payments or the Faster Payments API: there's an issue, we're looking into it. As soon as I have that, I'm confident — they know about it, they're working on it, I can trust them. I'll sit back, wait, and transfer the money later. Even with technology like Slack — sometimes it isn't working, but they don't acknowledge it straight away. For me, it should just be the fact that there is an issue; we shouldn't need to wait 30 minutes — I think that's too long. Yeah, I agree. Realistically, yes, you're not going to sit in your monitoring tools and instantly know there's an issue — you do want to validate exactly what it is — but you could be looking at five minutes, ten minutes. I think that's the aim that everybody —

Adam Norman:

— or any company should have: to communicate. You should be looking at ten minutes for just the initial posting, and then you can go into detail — that might be 30 minutes. It's so interesting you're talking about this, because I think the biggest thing now is that there's nowhere for companies to hide. I'm not suggesting companies are trying to hide, but social media is so prevalent, and particularly in our world — a lot of us are geeks, we love the technology side of things, we're very passionate about it and interested in it — people will make comments openly the second something doesn't work. So there are two ways you deal with this: either you control the conversation, from both a public relations standpoint and just a good customer service standpoint, or everyone else does. And as a company, it's also on you to educate. I think it's different with consumers — it's harder in a consumer-facing business, although there are elements that mix — but with your communications, consumers expect less: they just want to know that you know there's an issue and you're fixing it. Your business customers, in the commercial world, want far more specifics, and they're often holding SLAs and various contractual obligations — which is absolutely fine, that's what they're there for, to ensure a level of service. But you almost don't have a choice: companies can't afford not to put that initial comms piece out there now, because it is only going to look negative otherwise. What's interesting, though, when we talk to people, is when communication goes out — and you've touched on this — where perhaps it was over-eager, communicated too early, and it wasn't actually hitting your customers. But as we talk about in the global best practice in major incident management, you validate your major incident anyway, and if you've got a slick and robust process you can do that within the 15 minutes before you even issue your comms. That can be as simple as picking up the phone to key customers and saying, this is what we're seeing in event monitoring — is this the case for you, is this working? That's a really quick phone call. Yep. So you can avoid that, and it becomes a non-issue — not a reason for not issuing comms. I talk to a lot of heads of service about this as well, and one of the current trends is that they say, well, we would rather put out communications and then have to retract and say, actually, it was a blip in event monitoring. I appreciate that varies industry to industry, and with what you're communicating and the type of contract you have with your customer — those things really matter — but I think it just comes down to the fact that either you go out and tell people, or someone else will. Yep. So you almost can't afford not to now. But it's still pretty fascinating, the number of sophisticated and talented companies that can't get this right — or perhaps haven't spent the time, or invested the time, or had the time to invest in it, to be fair. Some of that is budgets — they're always an issue, because this isn't necessarily seen as the core piece of what most companies do; it's seen as an additional expense, which is a real challenge, I appreciate. Yep.
I mean, the cold fact is that now almost no company anywhere can operate without some form of technology — you just can't, even if it's taking card payments in your local shop, you're reliant on technology. In that case the communication might be far simpler — a sign saying unfortunately the card machine's down — but that just scales, doesn't it, up to larger companies: letting someone know in advance, so, okay, I'm not going to be able to pay here, that's fine, I'll go somewhere else, instead of them getting to that point and not being able to do it. Yup. It's just massively magnified in our worlds, where you're dealing with hundreds or thousands of customers and potentially massive knock-ons to revenue, et cetera. But it's really interesting that you talked about that. So, you mentioned templates — your team uses templates when they communicate with customers, and it sounds like that has evolved and alleviated some of the challenges around what you're communicating and when you're communicating. Any other major challenges you found throughout that role of implementing a form of major incident management?

Nick Coates:

There's probably one other thing — is it a challenge? I don't think we've hit too much; we've been very fortunate. With the templates, they've definitely evolved. We have hit some issues where we've over-communicated or said too little, so there's that learning process. These pages and our processes are evolving — your process will never just sit there. You should always be re-evaluating it, always going back and looking at your incidents, doing those playbooks, reviewing what you've done — just internally, not blaming anybody, looking at what you can do to improve. So putting out the initial templates and then learning from them — yeah, that was a challenge. Another challenge I've experienced personally is the adoption of the pages by product teams who already had technology to run their own pages — actually demonstrating to them that moving to our essentially standard incident communication platform is going to save them resource and obviously save the company money, because we have a standard tool that we're using, it's SaaS, and we have a team dedicated to communicating to customers. I've worked with two or three product teams where there has been a challenge, where they haven't fully appreciated what we're trying to do and they think we're coming in and telling them, you've got to do this. No, no, no — we want to work with you: let's take this portal, or this page you've built, and actually improve it — look at these features. So there's a marketing piece to what we're doing. That, for me, was a challenge, but only with two or three products, and it's about working with them rather than imposing. That's something I've learned: initially I went in and said, the business has said you have to do this — I took the tough-guy approach, and I almost burned the relationship with those product teams because I went in with this bullish tactic that the business had mandated it. I quickly realised that's not the approach to take. What you need to do is work with the stakeholders — work with them to really demonstrate the benefits to the business and to the customers of why we're doing this. I now have a standard slide deck that we use. It's very high level, and it gives them numbers: particularly for our support team, when we have an incident we initially see a spike in call volumes, live chats and cases being logged, and as soon as we do that initial posting we see a significant drop in calls, and an increase in page views and subscriptions to the pages. So I now embed that into our presentation to new product teams and senior stakeholders, and we show the impact it has, and the impact the Cloud Command Centre team has as well.
The fact is that they will do all the comms — you give them the information, provided they understand and have been trained on the product. They don't have to know all the ins and outs, just a high-level overview; as long as they have that, they'll get the comms across, they'll do that piece with our customers, with our partners, with the senior stakeholders, and let the engineering and ops teams focus on resolving the incident. So for me that was probably a challenge, but as I said, I've been very fortunate: the business sees this as massive for our customer experience and the customer journey — the project was essentially called the customer journey. Oh, that's really great. It's helping the customer — the customer has a lifecycle, and Statuspage, or Symantec Status, is just one part of that lifecycle. And it's not that the pages will simply stay where they are today; we're going to continue to grow them. As the business changes, I think it's going to give us an opportunity to adapt what we've done and keep growing the pages, and also to re-evaluate our processes — as the products change, we might need to look at our incident process and how we change that. There might be an opportunity for us now to help these product teams build their incident classifications, and maybe to standardise our incident processes, because — I think we've spoken about this before — Symantec is very siloed. Our product teams have their own engineering teams and their own processes, and their monitoring and tool stacks are very different. So I guess that is the challenge for us: because all of these product teams are siloed, they don't share the same processes, and it's very difficult. Product team A uses this tool, product team B uses that tool, and the Cloud Command Centre, our team, is left going, okay — two different tools, two different processes. One product team might classify an incident as a one, whereas another might say three, but actually, if you look at the customer impact, it really should be a one and we need to post. So moving forward, that's something I'm hoping we can adopt at Symantec: having standard processes across the siloed teams.

Adam Norman:

Well, it is really interesting talking about that. I'll introduce you to a couple of contacts who work in managed service providers delivering shared-service major incident management, because that's actually very similar to what you're doing. I know yours is internally facing, but a shared-service model for major incident management is where you've got, say, a team of 20 or 30 major incident managers essentially providing live major incident coverage for 10, 20, 30 different customers who rely on their infrastructure — and they might even be dealing with two major incidents at the same time for two different customers. Naturally, because of the requirements of each of those customers, they face very similar issues: certain things will impact business as usual and may be a major incident for one customer where they would not be for another, or have more severe impacts; they're also in different industries, doing really different things. So I actually think you'd find it really interesting to chat with some managed service providers who do shared-service work, because you'll have exactly the same issues. Yep. A long time ago now, I did some work at Computacenter, and they were very good at this. A chap called Chris Newby, who I used to work with, was doing a lot of take-ons of new business, and he got really, really good at it and built quite a robust structure for when a new customer was coming into the managed service: these are all the things we document, this is how we account for the differences, this is how we capture it, how we build it into our overall process documents, how we align that to the overall service being delivered, and how we train the people. So I'll introduce you — really great guy, he did an amazing job there, and [inaudible] have an excellent shared-service major incident function. And I think building this community is actually about sharing knowledge like that — that would definitely help me, and I'm sure many, many other companies as well, and it would be amazing to meet him, see what they do, and see how we can improve at Symantec, or adapt, moving forward. Yeah, absolutely — a big part of what we're doing here is exactly that, and you and I have spoken offline. Absolutely — bringing this community to the front. One of the really interesting things about this is the legacy: a major incident management community just didn't exist. There was nothing, and there was a real nervousness around talking about it. One of the things we've done since MIM as a company was born — we very much saw that, although it's not something we were selling as a product and therefore generating revenue to grow the company, we saw it as our responsibility, and really felt we needed to be the ones driving it forward and bringing the conversation about major incident management to the forefront, for a couple of reasons. One, so that we generate new major incident managers — if you're not in that world, and your organisation doesn't have one, or you haven't had exposure, you wouldn't even know that's a role you can take on.
So we actually need enough people knowing about it that there are always fresh, young, intelligent people ready to come in and interested — because it's a really exciting role, whether you're doing it, taking ownership of a major incident function, or owning the process. That's really important, and that's one aspect of the community. The other part is so that professionals can connect with one another — and like we're doing now, the things you're talking about will really help some people leapfrog some of the challenges they're going to have, and I'm sure plenty of people will reach out and want to have conversations with you. That's really important too. The legacy is that companies and professionals have had nowhere to go; we're hopefully filling that space and bringing our community to the forefront, but we still very much rely on people like yourself taking part. So thank you — I said thank you at the start, but it makes a difference — and also to your organisation for allowing you to come on here. While I know there's no endorsement from them in you being here, it does make a difference, and they obviously get it: these things happen, and the world knows about them now. In fact, some of the huge issues that companies experience are often all over the news, so you can't ignore the fact that they exist. So perhaps let's all just have a really honest conversation and try to make ourselves better professionals, make the operations slicker, and deliver better service for the customers. The legacy, I think, is that marketing departments and people — I've worked in that industry too, so I feel I can say this — are so far removed from the IT operation, the technology and what goes on there, that they don't necessarily understand what's being said, or they understand it only as a potential issue for the company that impacts customers — which I understand: if you don't understand what's going on, it could be a concern. So to them it's, we can't talk about it, shut it down, I don't want anything like this discussed externally. And it's just easier for them to say no than to actually try to understand — that takes time, and we're all really busy. So I get that; I'm not criticising. But I do think it's incumbent on all of us in the industry now to challenge our companies when they say, no, we'd rather you didn't talk about this. I'm not saying they have to talk specifically about their own companies, although I think that is very helpful for other professionals in the community — and you can talk about pretty much any other aspect of IT and it's fine; you've got presidents and vice presidents standing up there talking about what the company's doing, and even some of the challenges they have. I think we're in an age now where you need to accept that your professionals are going to have conversations in public, and you can't necessarily control all of that — which is really interesting — and the smarter companies accept and leverage it. You look at certain companies — they really recognise this, and they're trying to amplify the platform their professionals have, giving a lot of freedom to their senior people to talk, allowing them both to build the company's profile by doing that and to build their own profiles by doing it.
They understand it's a symbiotic relationship, and it really is something I think the whole industry needs to work on. When your marketing department, if that's indeed who it is, says no, I think it's incumbent upon you to push back and challenge and say, well, actually, this is why, and here are some examples of other professionals from other very well known companies who are having these conversations. And it's incumbent upon the management and leadership teams to support them when they're having that conversation with marketing, instead of just taking the easy option of saying no.

Nick Coates:

No, I agree. I'm very fortunate that we have a good leader in my organization and that we're able to come and talk. Obviously there are boundaries, and you do need those boundaries, and there are obviously things we can't talk about. But absolutely, what we're doing is not trade secrets; it's nothing like that. We want to share, and we want to learn as well. Being able to talk, as we're doing today, we can learn so much more just by talking together, and I think what you're doing and what the MIM organization is doing in building this community is going to help. We haven't had anything like this in the industry before, and now that we're building this community together, I think it's going to help massively and drive those conversations, and drive development and growth in the best practice of major incident management as well.

Adam Norman:

Yeah, yeah, absolutely. I think it's interesting; there's a unique ecosystem that I think isn't perhaps understood that well, and we're obviously trying to champion it and drive it forward. Partly because, a very long time ago, I had a background in IT service management, specifically as a major incident manager heading up functions, so I was really passionate and understood there was a gap. I was very fortunate that when I was at Fujitsu it was extremely well structured. It was on a government contract, a very large consortium of companies; HP were involved, a number of others, and BT were involved, but it was very well structured. The professionals who had put that together, who had decided how the services were going to operate, gave me such a good basis that even now I look back on some of the most sophisticated processes and slickest operations we developed. And don't get me wrong, I benefited massively from some very talented senior leaders who put time into me individually as well. But I think it really needs people willing to champion it. And I think the ecosystem is really interesting, because you've got us essentially driving forward things like this as a platform for people to talk, which hasn't really existed before. You've got major incident tool providers who provide specific software for major incident management, and often that plugs into wider IT service management software. So it's not a huge ecosystem, but there are large companies, some fantastic companies, providing specific tools. We were very fortunate when we put on the first major incident management virtual expo back in March: a couple of thousand attendees, CEOs, IT ops leaders, major incident managers, critical incident managers, technical staff, and we had some CTOs from these amazing companies. It was really humbling for us to see that quality of professional all come together, and also for some of those tool providers to be sponsors of that and help fund it, or at least support us in funding it. And it's going to be really interesting, because we're continuing to champion various events like that, both physical and virtual. The next layer that is currently missing, of companies supporting and getting involved, is the service management tool providers, those wider players. They do quite a lot on their own and hold lots of their own conferences, particularly the very large players. But I think there has to be a balance; it can't all be just for corporate profit. If you're truly going to serve companies and professionals for decades, you do have some obligation, I believe, to support those communities, not always for your own gain. So I'd really like to see some of those wider service management platforms start to get involved, particularly in the major incident community, outside of their own direct interests of essentially selling their own product. There is a balance, because no company can operate without profits and revenues, and you can't improve without additional revenue to reinvest; I'm certainly not advocating that companies shouldn't make money. But I do think it's going to be really interesting. So yeah, I definitely think the service management tool providers can add so much value to the community.
They've obviously got huge networks of people involved, and the expertise from building these tools as well, so they're seeing how people use them. Unfortunately, I think there's a tendency, and I think the managed service providers do this as well, a legacy tendency to believe that the process was the IP. Yes, if you're dealing with a company who has no IT operation, never has had one, and doesn't understand an IT operation at all, doesn't perhaps understand ITIL, doesn't understand how best to structure it, then fine. But the reality is we're now in the information age, and process isn't that big a deal. As part of the training and certification we provide, process is an aspect of it, of course, and there is a really good reason for looking at how the best companies in the world operate, how their process steps evolve, and what the roles and responsibilities are. But that's pretty easy to find. So to stay quiet as a result and assume that it's secretive IP doesn't hold up. I can remember being at managed service providers and everyone being concerned about an end client wanting information up front, particularly on processes and how you're going to do things, and essentially running off with your processes. And by the way, that was a valid concern; I know it used to happen, and I know it's not massively spoken about, but it did used to happen. Nowadays, though, it's not that hard for people to find these things. You're not holding anything special. It's more about execution: how are you executing those things? That typically comes down to, one, the technologies you're leveraging and how they simplify and support your people, but ultimately it's about the professionals who are doing it and the people who are working in it.

Your people are essentially your IP nowadays. Yeah, it really is: how good are you, how do you do things? That's perhaps most true in major incident management, because it's such a unique role and there's a lot of stress around how it works and getting things done. But yeah, that's me having a bit of a rant about needing more people to support the ecosystem and companies being willing to share. There are certainly enough customers to go around for all of these tool providers, some of them absolutely huge, and they would all benefit from this. About a year ago now, we did a really interesting survey; a lot of professionals took part, and one of the things that became really apparent, and I believe there might also have been a Gartner-commissioned report, is that something like 70% of major incident management professionals in senior roles, heading up functions, had no idea that specific major incident tools existed for communication. That's insane, because of the amount of work those tools can save an organization and the speed of mobilizing your engineers. I think that also goes hand in hand with a lot of companies either not understanding how to quantify downtime, and therefore struggling to demonstrate to the business why they should invest in tools, or why they should invest in training or staff or new functions. It's pretty simple, though: it just requires the input of the business, someone to champion it, and really understanding what's happening when a major incident occurs. Is revenue being affected? Does it affect share price? Which is common. Recently I've talked to a lot of people about the evolution of major incident management, and you do indeed see it in some companies now, and we're trying to drive this with a few companies, where it links into your marketing department, social media in particular. If you've got a very large company that does really well on social media, and eCommerce retailers are always the easiest examples, let's say your site goes down and customers can't purchase: social media, particularly if you use it as a channel for customer service, is going to get absolutely hammered. If there is no direct link between your major incident team and your marketing department, and there is no predefined process there, and you don't understand how to manage the PR position,

you're just going to run into trouble. So even if you've got a major incident team who do really well and manage the stakeholders really well, there's still a gap between that and what happens with your actual consumers. There's a challenge there.

Nick Coates:

And that's why at Symantec we have the Cloud Command Center. We don't have an input from PR or marketing; the communication managers know their stuff, they know the customers, they know the engineering teams, they know what to post. And these folks are a great team; they're very knowledgeable and they're experts in the field of what they do. For us at Symantec, we chose, or rather we adopted, our support members and trained them to become these communication managers, the gateway for our customers and engineering to talk to each other. Not marketing. Yes, we have boundaries; yes, there are things we don't post; and yes, if it was a high-impact incident where revenue may be affected or the press get involved, then we would obviously engage PR, but we would still be open and very transparent and we would communicate. So if you're looking for major incident managers, people who can communicate and manage stakeholders, don't look at just your technical people, don't look at marketing, don't look at PR. Look at customer-focused individuals who know the business, know the product, know your customers. I think if you can employ major incident managers who are customer driven, that can go a long way, and I think we've proven that having our Cloud Command Center is the right model for our business. I've done several talks at conferences, and both businesses and customers are amazed that we formed this team. It's because we had the backing, driven by the incidents we'd had, and we also had passionate individuals, myself included, and our senior director at the time, really passionate that we needed to do better at communicating and needed to manage our incidents better. So if you're watching or listening and you're passionate about something, do push, do fight for it, and think about what is the right thing for the business and for the customers; that's the approach we took in building what we've done. So yeah, I think customer driven is the focus, because those people understand, they know the urgency of an incident, and they work very closely with engineering teams, development teams and ops teams. For us, that has been a big one.

Adam Norman:

Oh yeah. I think having that layer, regardless of the amount of control it has, and obviously you guys handle it slightly differently in terms of the level of control your command center team have, and you still rely heavily on the technical guys, but regardless of that, having a layer that wraps around your technical staff, that almost protects them and protects their time so that their time stays focused on the technical resolution of that major incident, is so important. In whatever form that looks like, I think it's really, really important.

Nick Coates:

It's also about making sure they fix the issue and remediate it, which then allows your engineering team, your development team, your ops team to focus on new features for your product, actually helping grow that product, rather than having to battle all these incidents and do the communication as well. And I think you're right: wrapping your technical teams in this layer works for us. We can make sure our engineers are resourced in the appropriate areas, which is growing the products and adding new features, rather than tackling a fire. An incident is essentially a fire: you need to put it out, but you also need to stop it from happening again. Let them focus on that, take everything else out of their way, get this layer of incident managers to handle the action plan and the communication, and let your technical teams do their jobs.

Adam Norman:

Oh yeah, absolutely. So I'd love to very briefly delve into some of the tools you're using and the ones you're interested in. You've obviously talked at length about Atlassian Statuspage, and I've had a look; obviously we spent time together and had a look, and it's really impressive what you guys have built. I think there'll be a lot of major incident teams who would be really interested in Atlassian Statuspage who aren't currently using it, even very mature, well-run major incident teams who've been doing this for a long time. I think there's real value in what you guys have done and a lot of people will be interested, so it's probably worth checking out, and I'm sure you won't mind if anyone reaches out to have a conversation about it.

Nick Coates:

Absolutely, yeah. We want to promote what we're doing and promote Statuspage as well. So yes, we use Statuspage. I'm incredibly honored and very lucky to work very closely with the Atlassian Statuspage product team; we have a lot of input and we do share. I've been to their town hall and spoken about how we've deployed Statuspage at scale. I've worked very closely with them and built up that relationship, and I never see a supplier as just a supplier; to me they are a partner. Treat them as a partner; yes, you're their customer as well, but it's that partnership, and that's worked very well. Now, I've mentioned we use PagerDuty as well. Statuspage powers the external communication, but internally, the moment our engineering teams or ops teams have an incident, they need to inform our command center. They're a small team, but they're 24 by 7, so we use PagerDuty. We use the email functionality: for each of the 24 products we support, there's an email address associated with our PagerDuty account. We've then set up rules, schedules and escalation policies, so that when, say, our email product has an issue, it pings that PagerDuty email address and raises an incident within PagerDuty. The incident communication manager then acknowledges it, either through the app on their phone or through Slack. We also use Slack for the communication: from the PagerDuty page coming in, to posting and reviewing the incidents within Slack, to engaging with the engineering teams. Our IT team rolled out Slack across the business back in July, and it's just changed the way we collaborate and communicate. I think that's key for any incident or major incident: that collaboration and being faster. Having all these tools integrated is a fantastic workflow, and that's how I really see it, as this very simple workflow: an incident comes in, it pages, it's posted to Slack, the conversations can start and we can post on the status page. We also use Confluence; that's where the Cloud Command Center write the incident reports and the RCAs, and our engineering team use Jira to track their actions as well. So for me, that's our tooling, our incident communication and management stack. It's a really good stack, it doesn't take too long to stand these up, and because we're able to integrate them, it's great. Yes, we could automate a lot of the communication, but we don't feel that's right. I think you still need the human touch in communicating, but also in validating: your monitoring alone isn't going to save your business. Technology is going to fail, you're going to get false alarms, so you need a human to vet it. That's why we have that.
But yeah, our technology stack should, I think, be quite a common one; it just made sense for us to roll this out with the tech that we've got.
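For readers who want a concrete picture of the workflow described above, here is a minimal, hypothetical sketch of the kind of glue code that could connect these tools: an incident is triggered in PagerDuty via its public Events API v2, the team channel is notified through a Slack incoming webhook, and a customer-facing incident is opened on Statuspage via its REST API. All keys, IDs, names and messages below are placeholder assumptions, and this illustrates the general pattern rather than Symantec's actual implementation.

```python
"""Sketch of a PagerDuty -> Slack -> Statuspage incident workflow.

All credentials and IDs are hypothetical placeholders supplied via
environment variables; this is an illustration, not a production tool.
"""
import os
import requests

PAGERDUTY_ROUTING_KEY = os.environ["PAGERDUTY_ROUTING_KEY"]  # Events API v2 integration key
SLACK_WEBHOOK_URL = os.environ["SLACK_WEBHOOK_URL"]          # Slack incoming webhook URL
STATUSPAGE_API_KEY = os.environ["STATUSPAGE_API_KEY"]        # Statuspage API key
STATUSPAGE_PAGE_ID = os.environ["STATUSPAGE_PAGE_ID"]        # Statuspage page ID


def trigger_pagerduty(summary: str, source: str) -> str:
    """Raise an alert via the PagerDuty Events API v2 and return its dedup key."""
    resp = requests.post(
        "https://events.pagerduty.com/v2/enqueue",
        json={
            "routing_key": PAGERDUTY_ROUTING_KEY,
            "event_action": "trigger",
            "payload": {"summary": summary, "source": source, "severity": "critical"},
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["dedup_key"]


def notify_slack(text: str) -> None:
    """Post a short update into the incident channel via an incoming webhook."""
    requests.post(SLACK_WEBHOOK_URL, json={"text": text}, timeout=10).raise_for_status()


def open_statuspage_incident(name: str, body: str) -> str:
    """Create a customer-facing incident on Statuspage and return its ID."""
    resp = requests.post(
        f"https://api.statuspage.io/v1/pages/{STATUSPAGE_PAGE_ID}/incidents",
        headers={"Authorization": f"OAuth {STATUSPAGE_API_KEY}"},
        json={"incident": {"name": name, "status": "investigating", "body": body}},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["id"]


if __name__ == "__main__":
    # Example flow: monitoring detects an email-processing issue.
    dedup_key = trigger_pagerduty("Email product: elevated processing delays", "monitoring")
    notify_slack(f"PagerDuty incident triggered (dedup key {dedup_key}); command center investigating.")
    incident_id = open_statuspage_incident(
        "Email delivery delays",
        "We are investigating delays in email processing and will post updates here.",
    )
    notify_slack(f"Statuspage incident {incident_id} is now live.")
```

In practice, the email-based routing described above would typically live inside PagerDuty itself (per-product integration email addresses plus schedules and escalation policies), so a script like this would usually only cover the outward-facing steps once a page has been acknowledged by the communication manager.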

Adam Norman:

It's going to be really interesting. PagerDuty obviously IPO'd not long ago, and you've seen a lot of shifts in what they're doing; there's a real drive. And I appreciate they don't purely work in the major incident space; they cover other aspects, and they do a lot of crisis management for governments, et cetera. But it's going to be really interesting to see what resources, energy and effort go into developing the products even further, and they're already very good. It'll be really interesting to see where that goes with their IPO, the resources that come with it, and the focus. They've obviously got a fantastic CEO, and it's been really interesting following Jennifer's rise and what she's been doing there. So it's going to be really interesting, I think.

Nick Coates:

Definitely do check PagerDuty out. They have some new features coming, and again, because we use them, we've built up the relationship with PagerDuty and with the team, so I've definitely got some insight into what they're working on, and it is very exciting. Same for Atlassian; there's a lot of exciting stuff.

Adam Norman:

Yeah, so really interesting. There are obviously other providers of these tools out there, but those are some very big names we've talked about. I think how you set up your ecosystem to work together is really important as well, and most of the major names are very good at making sure they integrate with pretty much anything you would want them to integrate with. But yeah, it seems really exciting, and we were privileged that PagerDuty were one of the sponsors at the virtual expo, the first ever one, so they contributed significantly to that being put on. That first one is going to look tiny compared to the next, although it was over 2,000 attendees in the end, which for a first one, and from something like 78 different countries, and seeing the names of those companies, was something. We had obviously, as a company, been delivering the global best practice in major incident management for quite some time, so we had lots of links, but even for us it moved the customer base, the awareness and the conversations we were having. It was just exciting that there was a platform for people to talk, and I know we've already got a lot of people interested in speaking at next year's, which is going to be about five times the size of the last one, so that's going to be really exciting. But as I say, PagerDuty sponsored that; they came in kind of last minute, were really interested in what we were doing, took part, great team, and I know lots of people were very interested in them. Regardless of how large they are as an organization, there were a lot of major incident managers who, by the nature of the job, rarely get a chance to look up; you're normally very busy, you don't see what's going on in the outside world, and they hadn't heard of PagerDuty and weren't aware of just how good the tool is and what it can do for them. So really, really interesting. But thanks ever so much for coming in; I've really enjoyed talking with you. It's been really interesting hearing about the journey that you and the Symantec team have been on, and what it was like to essentially implement your first major incident function. It sounds like you've got a great team and great buy-in from management, and you're doing some pretty special things.

I'll look forward to speaking to you next time. Thanks ever so much, and I'll keep [inaudible].