Everyone Uses Attack Rates… But They Fail When Exposure Isn’t Shared

Data Science x Public Health

This podcast discusses the concepts of data science and public health, and then delves into their intersection, exploring the connection between the two fields in greater detail.


April 29, 2026 • BJANALYTICS


Attack rates are one of the most common tools in outbreak epidemiology. They seem to offer a quick answer to a simple question: how many exposed people got sick? But what if the exposed group was never truly sharing the same exposure in the first place?

In this episode, we break down why attack rates often fail when exposure is uneven, how denominator assumptions distort outbreak interpretation, and why summary measures can hide the real structure of transmission.

👉 Enjoyed the episode? Follow the show to get new episodes automatically.

If you found the content helpful, consider leaving a rating or review—it helps support the podcast.

For business and sponsorship inquiries, email us at:
📧 contact@bjanalytics.com

YouTube: https://www.youtube.com/@BJANALYTICS

Instagram: https://www.instagram.com/bjanalyticsconsulting/

Twitter/X: https://x.com/BJANALYTICS

Threads: https://www.threads.com/@bjanalyticsconsulting

SPEAKER_01 0:00

You ever look at a public health statistic on the news, like "30% of people at this event got sick," and just think, wow, that is a clean, hard fact?

SPEAKER_00 0:07

Yeah, it sounds so definitive.

SPEAKER_01 0:09

It really does. Welcome to the deep dive. Today our mission is exploring this really fascinating paper called The Illusion of Uniformity: Why Attack Rates Fail. And it argues that this exact metric, the attack rate, is often just, well, fundamentally flawed.

SPEAKER_00 0:26

Yeah, and I mean, the attack rate is the go-to metric for epidemiologists. When they need to summarize an outbreak fast, they use it. It's simply the number of people who got sick divided by the total number of people exposed.
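As a rough illustration of the definition just given, the ratio can be written out as below. The abbreviation AR and the 30-out-of-100 figures are assumptions chosen to match the "30%" headline from the opening, not numbers taken from the paper.

```latex
\[
\text{AR} = \frac{\text{number of people who became ill}}{\text{number of people counted as exposed}}
\qquad \text{for example} \qquad
\text{AR} = \frac{30}{100} = 0.30 = 30\%
\]
```

Everything that follows in the conversation is about the bottom of that fraction: who gets counted as exposed.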

SPEAKER_01 0:37

Okay, let's unpack this because on the surface, that sounds like exactly what you'd want, right? Like if there's a foodborne illness at a banquet or a virus spreading in a shelter, you just do the math and immediately know what you're dealing with.

SPEAKER_00 0:47

The illusion is hiding in the denominator, that pool of quote-unquote exposed people. The paper introduces this idea of false exposure uniformity: the attack rate treats everyone in that denominator as if they faced the exact same risk, but in reality, they didn't.

SPEAKER_01 1:03

So it's basically like measuring the attack rate of getting soaked in a rainstorm at a park, and you just group everyone who was at the park together.

SPEAKER_00 1:10

Yes, that is a perfect way to put it.

SPEAKER_01 1:13

Because you're completely ignoring like who had an umbrella or who stood under an awning or who maybe just showed up after the rain actually stopped.

SPEAKER_00 1:21

Right. What's fascinating here, to take your rainstorm analogy a step further, is that the attack rate mathematically averages everyone together. So it might conclude that a localized downpour was actually just like a light drizzle across the entire park. When you collapse all that variation into a single exposed group, you completely flatten the reality of the transmission.
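The averaging described here can be sketched with a few lines of Python. This is a minimal illustration, assuming made-up counts for the park analogy; the stratum names and every number are hypothetical and do not come from the episode or the paper. It shows that the pooled attack rate is just the stratum rates weighted by each stratum's share of the denominator.

```python
# Hypothetical counts for the park analogy; none of these numbers come from the episode.
strata = {
    "under the downpour": {"exposed": 20, "ill": 18},
    "rest of the park": {"exposed": 80, "ill": 2},
}


def attack_rate(ill: int, exposed: int) -> float:
    """Cases divided by the number of people counted as exposed."""
    if exposed <= 0:
        raise ValueError("exposed must be positive")
    return ill / exposed


total_exposed = sum(s["exposed"] for s in strata.values())
total_ill = sum(s["ill"] for s in strata.values())

# One number for the whole park.
pooled = attack_rate(total_ill, total_exposed)

# One number per stratum.
stratum_rates = {
    name: attack_rate(s["ill"], s["exposed"]) for name, s in strata.items()
}

# The pooled rate is the stratum rates weighted by each stratum's share
# of the denominator, which is why a large low-risk group dilutes a small
# high-risk one.
weighted = sum(
    stratum_rates[name] * s["exposed"] / total_exposed
    for name, s in strata.items()
)

print(f"pooled: {pooled:.1%}")
for name, rate in stratum_rates.items():
    print(f"{name}: {rate:.1%}")
```

With these invented counts the pooled figure is 20%, while the two groups sit at 90% and 2.5%: the "light drizzle" reading of what was really a localized downpour.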

SPEAKER_01 1:40

Right, because I mean, in a real indoor space, some people are sitting right next to the AC vent while others are crammed into some poorly ventilated corner.

SPEAKER_00 1:48

Think about that famous 2020 choir practice outbreak.

SPEAKER_01 1:52

Oh, the one in Washington State.

SPEAKER_00 1:54

That's the one. If you just look at the raw attack rate for that space, the virus looks almost supernaturally contagious.

SPEAKER_01 2:00

Right, because so many people got sick.

SPEAKER_00 2:02

But that single number completely obscures the actual mechanism. I mean, they were projecting their voices in a closed, unventilated room for what, two and a half hours? The attack rate doesn't tell you how the transmission happened, it just averages the room.

SPEAKER_01 2:16

But wait, if this is such a known problem, that averaging destroys nuance, why do highly trained public health investigators keep falling into the trap? Like, they know how ventilation and proximity work.

SPEAKER_00 2:28

They do, yeah. But it's really a matter of practical reality on the ground. [SPEAKER_01: What do you mean?] During an active crisis, investigators just don't have perfect granular data. They don't have this, like, magic 3D heat map of exactly where everyone stood or who breathed on whom.

SPEAKER_01 2:44

Right. They just have a guest list.

SPEAKER_00 2:45

So they just use the denominator they actually have.

SPEAKER_01 2:48

So what does this all mean then? If they're just using the best data available in a crisis, are we just splitting academic hairs here, or does this actually do damage?

SPEAKER_00 2:57

It does damage, yeah, because a fast, dirty summary measure quietly becomes a false certainty measure. When you dilute the severe risk of that unventilated corner by averaging it with the well-ventilated patio outside, you actively distort the whole narrative.

SPEAKER_01 3:12

Which means public health interventions might end up targeting the wrong things entirely.

SPEAKER_00 3:16

Precisely. It hides superspreading conditions by just burying them in a broader group. It makes one setting look wildly dangerous and another look safe, purely based on how arbitrarily the data was grouped. We end up fighting the average instead of the actual threat.

SPEAKER_01 3:30

So we're just asking too much of a compressed number. But if we can't get that perfect 3D heat map in real time, what does better practice actually look like?

SPEAKER_00 3:38

Well, it means investigators really have to interrogate the denominator. Instead of using a blanket "shared a building" exposure pool, they need to stratify the data by timing, proximity, airflow. We just have to start treating these rates as simplified snapshots, not the full architectural blueprint of how a disease moves.
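One way to picture stratifying the denominator is to group a line list by an exposure variable before computing any rate. The sketch below assumes a tiny invented line list; the field names (`zone`, `arrival`, `ill`), the helper `stratified_attack_rates`, and all twelve records are hypothetical, chosen only to echo the corner, vent, and patio examples from the conversation.

```python
from collections import defaultdict

# Hypothetical line list, one record per attendee. All values are invented.
line_list = [
    {"id": 1, "zone": "poorly ventilated corner", "arrival": "early", "ill": True},
    {"id": 2, "zone": "poorly ventilated corner", "arrival": "early", "ill": True},
    {"id": 3, "zone": "poorly ventilated corner", "arrival": "late", "ill": True},
    {"id": 4, "zone": "poorly ventilated corner", "arrival": "early", "ill": False},
    {"id": 5, "zone": "near AC vent", "arrival": "early", "ill": True},
    {"id": 6, "zone": "near AC vent", "arrival": "late", "ill": False},
    {"id": 7, "zone": "near AC vent", "arrival": "early", "ill": False},
    {"id": 8, "zone": "near AC vent", "arrival": "late", "ill": False},
    {"id": 9, "zone": "patio", "arrival": "early", "ill": False},
    {"id": 10, "zone": "patio", "arrival": "late", "ill": False},
    {"id": 11, "zone": "patio", "arrival": "late", "ill": False},
    {"id": 12, "zone": "patio", "arrival": "early", "ill": False},
]


def stratified_attack_rates(records, key):
    """Return {stratum: attack rate}, grouping records by the given field."""
    counts = defaultdict(lambda: {"exposed": 0, "ill": 0})
    for record in records:
        stratum = counts[record[key]]
        stratum["exposed"] += 1
        stratum["ill"] += int(record["ill"])
    return {name: c["ill"] / c["exposed"] for name, c in counts.items()}


# The blanket "shared a building" rate: everyone in one denominator.
overall = sum(r["ill"] for r in line_list) / len(line_list)

# The same people, split by where they were and by when they arrived.
by_zone = stratified_attack_rates(line_list, "zone")
by_arrival = stratified_attack_rates(line_list, "arrival")

print(f"overall: {overall:.1%}")
for label, rates in (("zone", by_zone), ("arrival", by_arrival)):
    for name, rate in rates.items():
        print(f"{label} = {name}: {rate:.1%}")
```

The same twelve people produce a different picture depending on the grouping variable, which is the point made earlier about arbitrary grouping. With strata this small the rates are very unstable, so in practice an investigator would also want to report the counts behind each rate.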

SPEAKER_01 3:56

That is such a massive takeaway for you listening. Like, the next time you see a clean percentage summarizing a risk or an outbreak, you really need to remember to ask: what variation is hiding inside that exposed group? You know, who actually had the umbrella in the rainstorm?

SPEAKER_00 4:11

Right. The number is just the start of the investigation, not the end. If we step back, we really have to stop letting comforting averages replace critical thinking.

SPEAKER_01 4:19

Absolutely. Which leaves us with this to mull over. If our most urgent public health summaries are this vulnerable to the flaws of averaging, what other daily statistics are doing the exact same thing? I mean, in business, in tech, in economics, where else is a comforting, clean number hiding a complex truth just to give us a false sense of certainty?