WeCyberYou! Unlocked Podcast

The WeCyberYou! Unlocked Podcast breaks down cyber security, online safety and digital risks into clear, practical conversations anyone can understand.
Each episode is designed for a specific audience, ensuring the advice is relevant, accessible and grounded in real-world scenarios - not technical jargon.

Cyber Security Frameworks Demystified Part 8 - ISO/IEC 27031

March 26, 2026 • Season 1 • Episode 8

In this episode, we break down what ISO/IEC 27031 is, how it helps organisations prepare for cyber incidents and major disruptions, and why ensuring ICT readiness is critical to keeping businesses running when everything else fails.

Duration: 0:19:30

Visit https://www.wecyberyou.com for more cyber security education, resources and awareness content like this.

Thank you for listening.
WeCyberYou! Team

Like and follow us to be notified when a new episode is released on this channel.

SPEAKER_01 0:00

When you picture corporate cybersecurity, you probably imagine a giant digital fortress.

SPEAKER_00 0:04

Right, like massive impenetrable walls.

SPEAKER_01 0:06

Yeah, exactly. Heavy cryptographic gates and automated guards just keeping all the malicious actors out.

SPEAKER_00 0:11

Yeah.

SPEAKER_01 0:12

But, uh, what happens when those walls are breached?

SPEAKER_00 0:15

Or worse.

SPEAKER_01 0:16

Right. Or worse, what happens when the massive power grid running that entire fortress just completely fails? You know, the alarms go silent and the gates are stuck wide open.

SPEAKER_00 0:25

That is the nightmare scenario.

SPEAKER_01 0:27

It really is. And welcome to this deep dive on the WeCyberYou! Unlocked podcast. Before we get into today's topic, please take a quick second to follow the channel and remember to visit WeCyberYou.com for more content exactly like this.

SPEAKER_00 0:40

Highly recommend checking the site out.

SPEAKER_01 0:41

Definitely. So today, for you the listener, we are looking at the official documentation for ISO 27031. Our mission on this deep dive is to understand the actual physical mechanics of how organizations keep their digital lights on when absolute disaster strikes.

SPEAKER_00 0:57

Yeah, and it really is the ultimate playbook for the worst case scenario. I mean, we spend so much time in the tech industry discussing how to prevent an incident, right?

SPEAKER_01 1:05

Oh, all the time.

SPEAKER_00 1:06

But ISO 27031 is entirely focused on the mechanics of surviving one. It forces an organization to accept that the unthinkable is going to happen. And when it does, how do you mathematically and structurally ensure the business keeps functioning?

SPEAKER_01 1:22

Okay. Let's unpack this because before we can look at how to fix a disaster, we really need to understand the overarching framework of technology readiness.

SPEAKER_00 1:31

Right, the big picture.

SPEAKER_01 1:32

Yeah. So if we look at other frameworks, like say ISO 27001, you can think of that as the anti-lock brakes on a car.

SPEAKER_00 1:39

That's a good way to put it.

SPEAKER_01 1:40

Right. It's doing everything in its power to stop the crash from happening. But ISO 27031 is the crumple zone and the airbags. It accepts that the crash is currently happening.

SPEAKER_00 1:49

Yeah, you're already hitting the wall.

SPEAKER_01 1:50

Exactly. And its entire mechanism is designed to absorb the kinetic impact so the passengers, which is the business, actually survive the wreck. It specifically zooms in on technology readiness.

SPEAKER_00 2:00

What's fascinating here is how it acts as a mechanical bridge. It complements broader standards like ISO 27001 for information security. Right. And, uh, ISO 22301, which handles overall business continuity. But ISO 27031 carves out its own critical niche by focusing purely on the technology side of that equation. In the documentation, the core concept driving all of this is IRBC.

SPEAKER_01 2:28

IRBC. Wait, which stands for what again?

SPEAKER_00 2:30

Right. So that is ICT readiness for business continuity.

SPEAKER_01 2:33

ICT readiness for business continuity, which sounds like really heavy corporate jargon, but the underlying concepts are actually quite elegant.

SPEAKER_00 2:41

They are. I mean, IRBC is really the architectural heart of ISO 27031. It basically means engineering your IT systems to possess three specific qualities. They must be available, they must be resilient, and they must be recoverable.

SPEAKER_01 2:54

Hold on, let me push back on that for a second. Availability and resilience. Those sound like the exact same thing to me. Like if a system is available, isn't it inherently resilient?

SPEAKER_00 3:02

Not necessarily. And the mechanical difference between those two is actually where a lot of infrastructure design completely fails. Yeah. So availability is about redundancy. It means having a secondary load balancer or a backup power supply so that if component A fails, component B takes over.

SPEAKER_01 3:18

The system just stays up.

SPEAKER_00 3:19

Right. Resilience, on the other hand, is about how a system behaves under extreme stress or partial failure.

SPEAKER_01 3:26

Oh, okay.

SPEAKER_00 3:26

Like if a massive denial of service attack hits your network. A resilient system doesn't just crash, it degrades gracefully.

SPEAKER_01 3:34

Degrades gracefully.

SPEAKER_00 3:36

Yeah. So maybe the high resolution images stop loading on the application, but the core text-based transaction database keeps functioning. It bends instead of snapping.
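
The "bends instead of snapping" behaviour described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Feature` type, the feature names, and the capacity units are hypothetical and are not taken from ISO 27031.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    essential: bool  # True = must stay up (e.g. the transaction path)
    load_cost: int   # hypothetical units of capacity the feature consumes


def features_to_serve(features: list[Feature], capacity: int) -> list[str]:
    """Keep essential features first, then add optional ones while capacity allows."""
    served: list[str] = []
    used = 0
    # Essential features are always served, even if that exhausts capacity.
    for feature in features:
        if feature.essential:
            served.append(feature.name)
            used += feature.load_cost
    # Optional features are considered cheapest first and shed once capacity runs out.
    optional = sorted((f for f in features if not f.essential), key=lambda f: f.load_cost)
    for feature in optional:
        if used + feature.load_cost <= capacity:
            served.append(feature.name)
            used += feature.load_cost
    return served


catalog = [
    Feature("text_transactions", essential=True, load_cost=30),
    Feature("high_res_images", essential=False, load_cost=60),
    Feature("search_suggestions", essential=False, load_cost=15),
]

print(features_to_serve(catalog, capacity=120))  # normal conditions: everything is served
print(features_to_serve(catalog, capacity=50))   # under stress: the images are shed
```

The design choice is that the system never asks "up or down?" but "which features can I still afford?", so the core transaction path survives while the expensive extras drop away.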

SPEAKER_01 3:46

Well, that makes total sense. So availability is having a spare tire.

SPEAKER_00 3:49

Right.

SPEAKER_01 3:49

Resilience is having run-flat tires that let you keep driving at a slower speed. And recoverability is how fast you can get the car to a mechanic and replace the axle entirely if you hit a massive pothole.

SPEAKER_00 4:00

That is a highly accurate way to look at it. The ultimate goal here is to engineer an environment that prevents disruptions where possible, responds effectively when the environment is compromised, and recovers the underlying data quickly.

SPEAKER_01 4:11

Just maintaining critical operations during the chaos.

SPEAKER_00 4:14

Exactly.

SPEAKER_01 4:15

So we have our crumple zone, but how do we actually design it? Because wait, so not all systems are created equal, right? Definitely not. Like if my company gets hit by a massive power outage, surely the IT team isn't trying to save the employee cafeteria menu server at the exact same time they're trying to save the primary customer transaction database.

SPEAKER_00 4:36

No, they absolutely shouldn't be. You cannot save everything all at once. And honestly, if you try, you will just end up saving nothing.

SPEAKER_01 4:43

It's like emergency room triage.

SPEAKER_00 4:45

Exactly like that.

SPEAKER_01 4:46

If you walk into an ER, some patients need a heartbeat instantly. They get rushed straight to the back.

SPEAKER_00 4:52

Right.

SPEAKER_01 4:53

Other people with a sprained ankle can sit in the waiting room for hours. But here's my question. In a hospital, a doctor makes that call. In a massive corporation with a complex distributed architecture, every department head thinks their specific application is the patient bleeding out.

SPEAKER_00 5:09

Oh yeah. Everyone thinks their project is tier one.

SPEAKER_01 5:12

Right. So if every department claims their system is tier one, who breaks the tie?

SPEAKER_00 5:16

And that is exactly why the standard introduces two core mechanics: alignment with business continuity and risk management. An IT department cannot operate in a vacuum. That makes sense. Nor can they be the ones making the final triage decisions.

SPEAKER_01 5:40

So how does that actually work in practice? Do they just put the sales director and the IT architect in a room until they agree?

SPEAKER_00 5:47

No, it is much more empirical than that. A proper BIA relies on financial and operational modeling. The business units have to quantify the exact cost of an outage.

SPEAKER_01 5:57

Like literally assigning a dollar value to the downtime.

SPEAKER_00 5:59

Exactly. They calculate that if the CRM goes down, they lose $50,000 an hour in sales.

SPEAKER_01 6:04

Wow.

SPEAKER_00 6:04

But if the internal HR portal goes down, it costs them maybe a few hundred dollars in lost productivity. The numbers dictate the triage order.
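
The idea that "the numbers dictate the triage order" can be sketched as a simple sort. The $50,000 per hour figure for the CRM comes from the conversation; the HR portal value of $300 stands in for "a few hundred dollars", and the cafeteria menu figure is purely an assumption for illustration.

```python
def triage_order(hourly_outage_cost: dict[str, float]) -> list[str]:
    """Return system names ordered from most to least expensive outage per hour."""
    return sorted(hourly_outage_cost, key=lambda name: hourly_outage_cost[name], reverse=True)


# Hypothetical business impact figures, in dollars lost per hour of downtime.
costs = {
    "hr_portal": 300.0,
    "crm": 50_000.0,
    "cafeteria_menu": 5.0,
}

print(triage_order(costs))
```

A real business impact analysis would weigh more than one number per system (regulatory exposure, dependencies, seasonal peaks), so treat this as the smallest possible version of the idea.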

SPEAKER_01 6:11

Oh, so it completely removes the emotion from the equation.

SPEAKER_00 6:15

Yes. The technology serves the business, not the other way around.

SPEAKER_01 6:18

Okay. The documentation highlights two crucial metrics that come out of this business impact analysis. And I want to make sure you, the listener, really grasp the mechanics of these because they are basically the foundation of any recovery architecture. They are RTO and RPO.

SPEAKER_00 6:32

Yeah, let's start with RTO, the recovery time objective. This dictates how fast specific systems need to be back online before the business suffers unacceptable, irreversible damage.

SPEAKER_01 6:43

Okay.

SPEAKER_00 6:44

So for a critical financial transaction system, your RTO might be five minutes. For that internal HR portal, your RTO might be a week.

SPEAKER_01 6:52

Okay, so the BIA gives us our RTO. It's our math. We know the bank app needs to be up in five minutes. Now the second metric is RPO or recovery point objective.

SPEAKER_00 7:00

Right.

SPEAKER_01 7:00

If RTO is about time moving forward to get back online, RPO is about looking backward at your data. It defines how much historical data the business can afford to lose.

SPEAKER_00 7:09

Exactly. It dictates your backup point.

SPEAKER_01 7:11

So if my RPO is 24 hours, I am backing up my data once a day. If the system crashes, I lose yesterday's work, and that is deemed acceptable by the business.

SPEAKER_00 7:21

Correct. But if you are a major financial institution, an RPO of 24 hours is catastrophic. You cannot tell millions of customers that their transactions from Tuesday simply vanished into thin air.

SPEAKER_01 7:32

Oh yeah, that would be a nightmare.

SPEAKER_00 7:33

A total disaster. For those Tier 1 systems, the RPO must be near zero, meaning data is backed up continuously in real time.
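
The link between backup frequency and RPO discussed above can be checked with simple arithmetic. This is a minimal sketch, assuming periodic backups where a crash just before the next backup loses one full interval; the function names and the five-second threshold are illustrative assumptions.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """With periodic backups, the worst case is a crash just before the next backup runs."""
    return backup_interval


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True when the worst-case data loss stays within the recovery point objective."""
    return worst_case_data_loss(backup_interval) <= rpo


daily = timedelta(hours=24)

# A daily backup satisfies a 24-hour RPO (the example from the conversation)...
print(meets_rpo(daily, rpo=timedelta(hours=24)))
# ...but not a near-zero RPO (here assumed to be five seconds).
print(meets_rpo(daily, rpo=timedelta(seconds=5)))
```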

SPEAKER_01 7:42

Wait, an RPO of zero, backing up data continuously in real time sounds insanely expensive and resource heavy.

SPEAKER_00 7:48

It is incredibly expensive.

SPEAKER_01 7:49

How does a company even pull that off without slowing their entire network to an absolute crawl? Every time I save a file, it has to instantly copy to another server.

SPEAKER_00 7:57

Well, which is why the BIA is so vital, right? You only apply an RPO of zero to the most critical databases. Mechanically, to achieve near zero RPO without tanking network performance, organizations don't use standard backups.

SPEAKER_01 8:10

What do they use then?

SPEAKER_00 8:11

They use synchronous replication over dedicated high-speed fiber channels. When a customer makes a deposit, the primary server writes the data. And before it even confirms the transaction to the user, it waits for a confirmation from a secondary server miles away that the data was also written there.
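
The write path described here, where the primary waits for the secondary before confirming, can be modelled in memory. This is a toy sketch and not how any particular storage product works: the `Primary` and `Replica` classes and the `reachable` flag are hypothetical.

```python
class Replica:
    """Stand-in for the secondary server at the remote site."""

    def __init__(self, reachable: bool = True) -> None:
        self.log: list[str] = []
        self.reachable = reachable

    def write(self, record: str) -> bool:
        """Persist the record and acknowledge, or fail if the site is unreachable."""
        if not self.reachable:
            return False
        self.log.append(record)
        return True


class Primary:
    """Confirms a transaction only after the replica has acknowledged it."""

    def __init__(self, replica: Replica) -> None:
        self.log: list[str] = []
        self.replica = replica

    def commit(self, record: str) -> bool:
        self.log.append(record)
        if not self.replica.write(record):
            self.log.pop()  # no acknowledgement: undo the local write
            return False    # the user never sees a confirmation
        return True


primary = Primary(Replica())
print(primary.commit("deposit:100"))       # confirmed, both copies hold the record
print(primary.log == primary.replica.log)  # the two logs never diverge
```

The cost of this guarantee is latency: every confirmed write includes a round trip to the secondary site, which is why the conversation stresses dedicated high-speed links.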

SPEAKER_01 8:28

So they are literally writing the data in two places simultaneously.

SPEAKER_00 8:32

Exactly. Or they use asynchronous replication with continuous delta syncing.

SPEAKER_01 8:36

Delta syncing.

SPEAKER_00 8:37

Yeah. Where only the tiny block level changes in the data are streamed to the backup site milliseconds after they happen. It requires massive bandwidth and incredibly sophisticated storage arrays.
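
Block-level delta syncing can be illustrated by hashing fixed-size blocks and shipping only the ones that changed. This is a simplified sketch: the four-byte block size is chosen for readability, and the helper names are assumptions, not part of any real replication product.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(old: bytes, new: bytes) -> dict[int, bytes]:
    """Return only the blocks of `new` that differ from `old`, keyed by block index."""
    old_hashes = [hashlib.sha256(block).digest() for block in split_blocks(old)]
    delta: dict[int, bytes] = {}
    for index, block in enumerate(split_blocks(new)):
        is_new = index >= len(old_hashes)
        if is_new or hashlib.sha256(block).digest() != old_hashes[index]:
            delta[index] = block
    return delta


def apply_delta(old: bytes, delta: dict[int, bytes], new_length: int) -> bytes:
    """Rebuild the new data on the backup side from the old copy plus the delta."""
    blocks = split_blocks(old)
    for index, block in sorted(delta.items()):
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    return b"".join(blocks)[:new_length]


old = b"AAAABBBBCCCC"
new = b"AAAAXXXXCCCC"
delta = changed_blocks(old, new)
print(delta)  # only the middle block needs to travel to the backup site
print(apply_delta(old, delta, len(new)) == new)
```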

SPEAKER_01 8:48

Here's where it gets really interesting though. We've established the risk and we know our objectives with RTO and RPO. But math doesn't reboot a server.

SPEAKER_00 8:57

No, it certainly doesn't.

SPEAKER_01 8:59

What actually happens when the nightmare becomes reality and the data center is literally underwater? How do we move to the action phase?

SPEAKER_00 9:05

Right, the response.

SPEAKER_01 9:06

Yeah, the standard outlines the key components of an incident response. And one thing that really caught my attention was the communication plans. I find it fascinating that a highly technical IT standard explicitly demands communication protocols.

SPEAKER_00 9:21

Well, because a disaster is rarely just a technical failure. It is almost always a human coordination failure as well. Oh, for sure. Think about it. If a massive outage hits, the IT engineers are frantically digging through code to restore routing tables, the executives are demanding answers for the press.

SPEAKER_01 9:39

And the customer service team is giving clients completely inaccurate information.

SPEAKER_00 9:42

Exactly. The disaster multiplies exponentially due to chaos.

SPEAKER_01 9:46

But practically speaking, if the network is down, the company email server is down.

SPEAKER_00 9:56

That is exactly why ISO 27031 requires out-of-band communication plans. You cannot rely on the infrastructure you are trying to fix to communicate about fixing it.

SPEAKER_01 10:06

Oh, that's a great point.

SPEAKER_00 10:07

Mature organizations will have entirely separate cloud-hosted communication platforms, like a standalone Slack workspace or dedicated emergency cellular devices that do not touch their primary network.

SPEAKER_01 10:19

So it's totally isolated.

SPEAKER_00 10:21

Totally isolated. The communication plan dictates exactly who talks to whom on what isolated platform and at what intervals.

SPEAKER_01 10:28

So it's like the core switch is down, we are failing over to the secondary site, notify the client success team, we will be degraded for two hours. It basically manages the panic.

SPEAKER_00 10:38

This raises an important question about the actual restoration, though. Communication is just the wrapper. Inside that wrapper, the standard requires an ICT continuity strategy and ICT recovery plans.

SPEAKER_01 10:49

What's the mechanical difference between a strategy and a plan in this context?

SPEAKER_00 10:53

So the strategy is the overarching architectural approach. Are we using redundant physical systems in a hot site, or are we relying on cloud failover? Okay. The recovery plans are the tactical, step-by-step technical procedures. It is the literal runbook an engineer executes.

SPEAKER_01 11:08

When you say cloud failover as a strategy, what is actually happening mechanically? We hear that term all the time. Is there a literal heartbeat signal between the primary data center and the cloud backup?

SPEAKER_00 11:20

In many architectures, yes. Quite literally. You will have automated monitoring systems constantly sending a health check or a heartbeat to the primary application. If that primary application fails to respond to three consecutive heartbeats, the automated failover mechanism kicks in.
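
The "three consecutive missed heartbeats" rule can be sketched as a small state machine. The `FailoverMonitor` class and the site names are hypothetical; the threshold of three comes from the conversation. Failing back to the primary is deliberately left out and assumed to be a manual decision.

```python
class FailoverMonitor:
    """Switches traffic to the standby after N consecutive missed heartbeats."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.missed = 0
        self.active_site = "primary"

    def record_heartbeat(self, responded: bool) -> str:
        """Feed in one health-check result and return the site that should serve traffic."""
        if self.active_site == "standby":
            return self.active_site  # already failed over; failback is out of scope here
        if responded:
            self.missed = 0  # any successful heartbeat resets the counter
        else:
            self.missed += 1
        if self.missed >= self.threshold:
            self.active_site = "standby"
        return self.active_site


monitor = FailoverMonitor()
results = [True, False, False, True, False, False, False]
print([monitor.record_heartbeat(ok) for ok in results])
```

Requiring consecutive misses rather than a single one avoids failing over on a momentary network blip, at the price of a slightly slower reaction to a real outage.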

SPEAKER_01 11:36

And how does the user traffic know to go to the cloud instead of the dead server?

SPEAKER_00 11:39

That usually happens at the DNS level. The system automatically updates the domain name system records, essentially changing the internet's address book on the fly. Oh wow. It drops the time to live or TTL of the routing record to maybe 30 seconds. So all incoming internet traffic is instantly rerouted from the dead physical data center to the standby environment in AWS or Azure.

SPEAKER_01 12:01

That's incredibly fast.

SPEAKER_00 12:02

Yeah. When designed perfectly, the end user might just experience a slight page load delay.

SPEAKER_01 12:07

That is incredible. But you know, the sources also give us some very practical examples of real-world threats ISO 27031 prepares you for. And not all of them are simple hardware failures.

SPEAKER_00 12:20

No, definitely not.

SPEAKER_01 12:21

We are talking about ransomware attacks shutting down entire networks. And this is where I kind of get hung up. If you get hit by advanced ransomware, how do you even recover? If everything is connected and syncing continuously to hit those RPO targets we talked about, wouldn't the ransomware just sync to the backup and corrupt that too?

SPEAKER_00 12:38

That is the exact nightmare scenario. And it happens frequently. Wow. This is why the ICT continuity strategy must account for the specific nature of the threat. To combat ransomware, you cannot just rely on standard continuous replication. You need immutable backups.

SPEAKER_01 12:54

Immutable meaning they cannot be changed. Correct.

SPEAKER_00 12:56

The storage architecture is designed so that once a backup snapshot is written, it is cryptographically locked. For a set period of time, it physically cannot be modified or deleted by any user or administrator, even if they have top-level network credentials.
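
The write-once, time-locked behaviour described here can be modelled in a few lines. This is an application-level toy, assuming a hypothetical `ImmutableBackupStore` class; in a real deployment the lock is enforced by the storage system itself, not by code an attacker could bypass.

```python
class ImmutableBackupStore:
    """Snapshots can be written once and cannot be deleted until retention expires."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention_seconds = retention_seconds
        self._snapshots: dict[str, tuple[bytes, float]] = {}  # name -> (data, locked_until)

    def write(self, name: str, data: bytes, now: float) -> None:
        if name in self._snapshots:
            raise PermissionError(f"snapshot {name!r} already exists and cannot be overwritten")
        self._snapshots[name] = (data, now + self.retention_seconds)

    def delete(self, name: str, now: float, is_admin: bool = False) -> None:
        _, locked_until = self._snapshots[name]
        # The is_admin flag is intentionally ignored: credentials do not lift the lock.
        if now < locked_until:
            raise PermissionError(f"snapshot {name!r} is locked until t={locked_until}")
        del self._snapshots[name]

    def read(self, name: str) -> bytes:
        return self._snapshots[name][0]


store = ImmutableBackupStore(retention_seconds=3600)
store.write("monday", b"ledger-v1", now=0)

try:
    store.delete("monday", now=10, is_admin=True)  # ransomware with admin credentials
except PermissionError as error:
    print("blocked:", error)

print(store.read("monday"))  # the snapshot is still intact
```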

SPEAKER_01 13:09

So even if the ransomware encrypts the primary network and the administrative servers, it just hits a brick wall when it tries to infect the immutable storage array.

SPEAKER_00 13:18

Exactly.

SPEAKER_01 13:19

So the recovery plan runbook for ransomware wouldn't just be restore the backup. It'd be step one, isolate the network. Step two, verify the immutability of the backup repository. Step three, forensically scrub the hardware. And step four, initiate the restoration.
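
The four-step ransomware runbook outlined here can be sketched as an ordered list of checks that halts at the first failure, so restoration never starts on an unverified backup. The step functions below are placeholders that only return `True` or `False`; real steps would call actual tooling.

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]


def run_runbook(steps: list[Step]) -> list[str]:
    """Execute steps in order and stop at the first failure; return the completed step names."""
    completed: list[str] = []
    for name, action in steps:
        if not action():
            break  # later steps depend on earlier ones, so do not continue
        completed.append(name)
    return completed


# Hypothetical outcome: the immutability check fails, so nothing after it runs.
ransomware_runbook: list[Step] = [
    ("isolate the network", lambda: True),
    ("verify backup immutability", lambda: False),
    ("forensically scrub the hardware", lambda: True),
    ("initiate the restoration", lambda: True),
]

print(run_runbook(ransomware_runbook))
```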

SPEAKER_00 13:35

Exactly. It turns an existential company-ending crisis into a highly stressful but entirely manageable engineering workflow.

SPEAKER_01 13:43

So what does this all mean for day-to-day operations? Because a brilliant architectural strategy and an immutable backup on a piece of paper are completely useless if the engineer just freezes when the alarms actually go off.

SPEAKER_00 13:54

If we connect this to the bigger picture of the standard, that is why the final major component of ISO 27031 is testing and continuous improvement. You cannot assume your plan works. You have to actively prove it works through rigorous testing.

SPEAKER_01 14:09

What does that actually look like? Are IT directors running around like chaos agents, literally pulling fiber optic cables out of server racks to see what happens?

SPEAKER_00 14:16

Honestly, in the most advanced tech companies, yes.

SPEAKER_01 14:19

Wait, really?

SPEAKER_00 14:20

Yeah, that is a practice known as chaos engineering. Companies like Netflix famously developed software like Chaos Monkey that intentionally terminates production instances randomly during the workday.

SPEAKER_01 14:30

Just to see if the failover works.

SPEAKER_00 14:32

Exactly. To ensure the automated failover mechanisms we discussed actually work under real-world conditions. But for most standard enterprises, testing starts with tabletop exercises.
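
The core idea of a chaos experiment, terminating something at random and then checking that the service survives, can be sketched as follows. This is not how Chaos Monkey itself is implemented; the fleet dictionary, instance names, and helper functions are assumptions for illustration.

```python
import random


def terminate_random_instance(instances: dict[str, bool], rng: random.Random) -> str:
    """Mark one randomly chosen healthy instance as down and return its name."""
    healthy = [name for name, is_up in instances.items() if is_up]
    victim = rng.choice(healthy)
    instances[victim] = False
    return victim


def service_available(instances: dict[str, bool]) -> bool:
    """The service counts as available while at least one instance is still up."""
    return any(instances.values())


fleet = {"web-1": True, "web-2": True, "web-3": True}
victim = terminate_random_instance(fleet, random.Random(42))
print(f"terminated {victim}; service still up: {service_available(fleet)}")
```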

SPEAKER_01 14:42

Walk me through a tabletop exercise. How do you simulate a digital disaster in a conference room?

SPEAKER_00 14:48

The security leadership will draft a highly specific worst-case scenario. Let's say a novel ransomware strain has just compromised our Active Directory environment.

SPEAKER_01 14:58

Okay.

SPEAKER_00 14:59

Active Directory is the system that controls all user permissions and passwords. So they say the attackers have changed all administrative passwords, you cannot log into your laptops, you cannot access the digital runbooks, the manufacturing floor has halted. Go.

SPEAKER_01 15:13

Wow. That immediately exposes the human flaws. If the recovery plan is saved as a PDF on a server you can no longer log into, your plan is useless.

SPEAKER_00 15:22

Exactly. The tabletop exercise forces the team to realize they need physical printed copies of the runbooks stored in a secure offline safe. It forces them to realize they need an out-of-band communication method because their corporate email uses Active Directory to authenticate. Right. The entire goal of the simulation is to break the plan, identify the gaps, and continuously improve the architecture based on those lessons learned. Technology changes, cloud environments shift, and threat actors evolve daily. And an untested plan is an obsolete plan.

SPEAKER_01 15:54

So taking a step back and looking at everything we've covered, from crumple zones to immutable backups to tabletop simulations, why should the listener care?

SPEAKER_00 16:03

It's a valid question.

SPEAKER_01 16:04

Yeah, why does going through the rigorous, expensive process of aligning with ISO 27031 actually matter to the bottom line of a business?

SPEAKER_00 16:12

Well, the documentation makes it clear that this isn't just an IT checklist. It is a core business survival tool. First, it drastically reduces downtime and financial hemorrhage. When transaction systems are offline, the company is burning money by the second.

SPEAKER_01 16:25

Every single minute of an RTO has a dollar amount attached to it.

SPEAKER_00 16:29

Absolutely. Second, it fundamentally changes your posture against cyber attacks. If you have a tested, immutable backup architecture and a zero trust recovery plan, ransomware loses its leverage.

SPEAKER_01 16:42

Because you can just ignore the ransom.

SPEAKER_00 16:44

Right. You don't have to pay it because you can confidently rebuild the environment yourself. Furthermore, it supports strict regulatory and compliance requirements that governments are increasingly mandating for critical infrastructure and financial sectors.

SPEAKER_01 16:56

And I think the point that really hits home is that it protects reputation.

SPEAKER_00 16:59

Oh, massively.

SPEAKER_01 17:00

If an airline scheduling system goes down for an hour, it makes the evening news. If a bank app is offline for two days, people are moving their direct deposits to a competitor. Trust takes decades to build and a single poorly managed IT incident to destroy. Yep. ISO 27031 proves that disaster recovery is no longer an IT problem. It is a core business operation.

SPEAKER_00 17:23

Trust is the ultimate asset you're protecting when you implement these frameworks.

SPEAKER_01 17:27

So to summarize, the core takeaway from the source text. ISO 27031 is the standard that helps organizations engineer their IT systems to handle extreme disruptions and keep the business running no matter the circumstances. It moves an organization from hoping they survive a crash to mathematically engineering the crumple zone so they know they will.

SPEAKER_00 17:46

It does. But, uh, I want to leave you with one final, slightly provocative thought to ponder, based on one of the practical examples the standard addresses: cloud service outages.

SPEAKER_01 17:56

Oh, this is the cloud paradox.

SPEAKER_00 17:57

Exactly. We discussed cloud failover. The idea that if your local systems fail, the massive, highly redundant infrastructure of a major provider like AWS or Microsoft Azure will seamlessly take over. Right. And many modern organizations build their entire disaster recovery strategy around this exact assumption. But what happens when the disruption isn't local? Oh man. If a massive cascading infrastructure event takes out the major global cloud providers themselves, which we have seen happen in limited capacities due to routing errors or massive localized weather events, where is the ultimate fail-safe?

SPEAKER_01 18:32

Wow. If your backup plan relies entirely on someone else's computers, what is your strategy when their computers go down?

SPEAKER_00 18:41

It really forces us to question the extreme limits of external redundancy. At a certain point, the cloud is just another physical data center vulnerable to the laws of physics.

SPEAKER_01 18:51

That is a chilling thought. And definitely something to keep you up at night if you are a chief technology officer. Just when you think your digital fortress is secure and your crumple zones are perfect, you realize you might not control the actual ground the fortress is built on.

SPEAKER_00 19:05

A very unsettling but necessary realization.

SPEAKER_01 19:08

Well, on that slightly terrifying but incredibly important architectural note, we are going to wrap up this deep dive on the WeCyberYou! Unlocked podcast. Thank you so much for joining us. Please make sure you follow the channel so you never miss a deep dive. And head over to WeCyberYou.com right now to keep exploring the complex frameworks that keep our digital world turning. Until next time, keep your backups immutable and your plans ruthlessly tested.