Data Science x Public Health

You’ve Been Using Secondary Attack Rates Wrong — Here’s What Actually Happens

BJANALYTICS

Secondary attack rates are often used to estimate how infection spreads among close contacts. They seem to provide a focused measure of transmission in households, schools, workplaces, and other settings. But what if the number is being shaped just as much by contact tracing and testing rules as by the pathogen itself? 

In this episode, we break down why secondary attack rates often mislead, how inconsistent contact definitions distort interpretation, and why outbreak metrics cannot be separated from investigation design.

👉 Enjoyed the episode? Follow the show to get new episodes automatically.

If you found the content helpful, consider leaving a rating or review—it helps support the podcast.

For business and sponsorship inquiries, email us at:
📧 contact@bjanalytics.com

YouTube: https://www.youtube.com/@BJANALYTICS

Instagram: https://www.instagram.com/bjanalyticsconsulting/

Twitter/X: https://x.com/BJANALYTICS

Threads: https://www.threads.com/@bjanalyticsconsulting

SPEAKER_00

You know, when we look at a photograph, we tend to just trust what's in the frame. Like if there are five apples on the table, we believe there are just five apples.

SPEAKER_01

Right. Yeah. You don't usually stop to ask if the photographer cropped out a whole orchard just outside the frame.

SPEAKER_00

And today we are bringing that exact idea into the world of public health. So we've gathered a stack of recent epidemiological papers, public health data reports, and research notes to bring you this deep dive into what we're calling the surveillance trap.

SPEAKER_01

Yeah, our mission today is really to unpack why a seemingly perfect public health metric, one that actually shapes everyday rules, is hiding a massive blind spot.

SPEAKER_00

And the metric at the center of all these papers is the secondary attack rate, or SAR, which sounds incredibly straightforward.

SPEAKER_01

It does. I mean, on paper, it sounds like a direct, unobstructed window into a pathogen's biology. It basically asks: if one person gets sick, how often does that infection spread to their close contacts?

SPEAKER_00

Okay, let's unpack this, because here is the trap. To understand why SAR is flawed, you really have to look at the math.

SPEAKER_01

Yeah, it's basically just a fraction.

SPEAKER_00

The numerator is the number of contacts who actually get infected, and the denominator is the total pool of close contacts. The assumption is that you are measuring the disease.
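
Editor's sketch: the fraction described here can be written out in a few lines of Python. The function name and the example counts are hypothetical, chosen only for illustration, and do not come from any of the papers discussed in the episode.

```python
def secondary_attack_rate(infected_contacts: int, total_contacts: int) -> float:
    """Share of identified close contacts who became infected."""
    if total_contacts <= 0:
        raise ValueError("total_contacts must be positive")
    if not 0 <= infected_contacts <= total_contacts:
        raise ValueError("infected_contacts must be between 0 and total_contacts")
    return infected_contacts / total_contacts


# Hypothetical numbers: 8 infected among 20 identified contacts.
example_sar = secondary_attack_rate(8, 20)
print(f"SAR = {example_sar:.0%}")  # SAR = 40%
```

Both inputs are produced by the investigation: who gets listed as a contact sets the denominator, and who gets tested sets the numerator.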

SPEAKER_01

But really, you are measuring the people conducting the investigation. What's fascinating here is that the first major crack actually appears in that denominator.

SPEAKER_00

Wait, so just in how we count the pool of contacts?

SPEAKER_01

Yeah, because to count them, you have to define what a contact is. And we see in these research notes that the definition is just wildly inconsistent across different health departments. One jurisdiction might define a contact as a full 15 minutes of exposure, while another uses a strict six-foot physical distance rule.

SPEAKER_00

And sometimes investigators literally just rely on human memory, like asking a patient who they remember standing next to last Tuesday.

SPEAKER_01

Which means the denominator is completely unstable.

SPEAKER_00

It's like comparing crime rates between two cities, but one city counts jaywalking as a crime, while the other only counts bank robberies.

SPEAKER_01

You just can't compare those two rates and declare one city is inherently more dangerous.
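
Editor's sketch: a small, entirely invented example of how the same outbreak can yield different rates under the two contact definitions mentioned above (a 15-minute exposure rule versus a six-foot distance rule). The `Contact` class, the thresholds as coded, and every data value are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    minutes_exposed: float
    distance_feet: float
    infected: bool


# Hypothetical contacts of a single index case; all values are invented.
contacts = [
    Contact(30, 3, True),
    Contact(20, 10, False),
    Contact(5, 2, True),
    Contact(45, 4, True),
    Contact(10, 5, False),
    Contact(60, 12, False),
    Contact(2, 3, False),
    Contact(15, 8, False),
]


def sar(pool: list[Contact]) -> float:
    """Infected share of whichever pool the definition produced."""
    if not pool:
        raise ValueError("no contacts meet this definition")
    return sum(c.infected for c in pool) / len(pool)


# Jurisdiction A: a contact is anyone exposed for at least 15 minutes.
by_duration = [c for c in contacts if c.minutes_exposed >= 15]
# Jurisdiction B: a contact is anyone who came within six feet.
by_distance = [c for c in contacts if c.distance_feet <= 6]

print(f"15-minute rule: {sar(by_duration):.0%}")  # 15-minute rule: 40%
print(f"six-foot rule:  {sar(by_distance):.0%}")  # six-foot rule:  60%
```

The pathogen and the people are identical in both runs; only the rule for admitting someone into the denominator changed.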

SPEAKER_00

And these discrepancies usually aren't malicious either. I mean, they often just come down to resource constraints, don't they?

SPEAKER_01

Oh, absolutely. An overwhelmed health department dealing with massive caseloads might adopt a much narrower definition of a contact simply because they don't have the staff to track down every brief interaction.

SPEAKER_00

So once you have that unstable foundation, the way investigators go out and build the numerator, actually finding the infected cases, warps the data even further.

SPEAKER_01

Yeah, the testing distortion is huge here.

SPEAKER_00

But hang on, isn't some data better than no data? I mean, even if a health department is overwhelmed, shouldn't we still rely on the transmission rates they report for, say, households?

SPEAKER_01

Well, it tells you something about a household when it is watched closely, but that's the catch.

SPEAKER_00

Like a 40% household infection rate still tells us something useful, even if it's slightly overindexed.

SPEAKER_01

Only to a point. The mechanical problem is how different testing protocols alter that final percentage. Imagine one group of contacts is tested universally, regardless of symptoms, but another group is only tested if they run a fever. If a disease spreads asymptomatically and you only test the sick people...

SPEAKER_00

All those asymptomatic cases are left out of your numerator entirely.

SPEAKER_01

Precisely. Suddenly the disease looks far less transmissible on paper. But because you are only counting the sickest people, it artificially looks much more severe.
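
Editor's sketch: the testing distortion described above, worked through with invented counts. It assumes 100 contacts, 40 true infections of which half are asymptomatic, and 4 severe cases that are all symptomatic; none of these figures come from the episode's sources.

```python
# Hypothetical outbreak; every number below is an assumption.
total_contacts = 100
true_infections = 40
asymptomatic_infections = 20
severe_cases = 4  # assumed to occur only among symptomatic infections


def observed(test_asymptomatic: bool) -> tuple[float, float]:
    """Return (observed SAR, observed share of detected cases that are severe)."""
    if test_asymptomatic:
        detected = true_infections
    else:
        # Symptom-triggered testing never sees the asymptomatic infections.
        detected = true_infections - asymptomatic_infections
    return detected / total_contacts, severe_cases / detected


universal_sar, universal_severity = observed(test_asymptomatic=True)
symptom_sar, symptom_severity = observed(test_asymptomatic=False)

print(f"universal testing:    SAR {universal_sar:.0%}, severe {universal_severity:.0%}")
print(f"symptom-only testing: SAR {symptom_sar:.0%}, severe {symptom_severity:.0%}")
```

Under these assumptions the symptom-only protocol halves the apparent attack rate (20% instead of 40%) and doubles the apparent severity (20% instead of 10%), even though nothing about the infection itself differs.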

SPEAKER_00

So you are no longer looking at the biological reality of the virus?

SPEAKER_01

No, you are looking at the investigation design.

SPEAKER_00

Wow. So what does this all mean for you listening at home? Think about the protocols at your office or the rules your kids' school sets up. This isn't just abstract math.

SPEAKER_01

No, not at all. These distorted numbers dictate whether a specific environment is labeled high risk or safe.

SPEAKER_00

Because a household often looks incredibly transmissible, registering a massive secondary attack rate.

SPEAKER_01

But we have to ask if it is actually that dangerous, or if public health officials just monitored that family relentlessly because a family unit is geographically captive and easy to track.

SPEAKER_00

Which makes sense. They are all right there in one place.

SPEAKER_01

Conversely, a busy factory floor might look statistically safe, but exposure ascertainment in a giant warehouse is incredibly difficult, so it tends to be weak.

SPEAKER_00

Yeah, the tracking is weak, so the resulting rate just looks low. We end up cracking down on the environments we watch closely, and we ignore the ones we don't.
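
Editor's sketch: one simple way to picture the household versus factory contrast. It assumes both settings have the same true attack rate and that only the share of infections investigators manage to detect differs; the detection fractions (0.9 and 0.3) and all counts are invented for illustration.

```python
def observed_sar(contacts: int, true_infections: int, detection_fraction: float) -> float:
    """Observed rate when only part of the true infections among listed contacts is found."""
    detected = round(true_infections * detection_fraction)
    return detected / contacts


# Same hypothetical truth in both settings: 30 infections per 100 contacts.
detection_by_setting = {
    "household": 0.9,      # closely monitored, easy to follow up (assumed)
    "factory floor": 0.3,  # hard to trace, weak follow-up (assumed)
}

observed_rates = {
    setting: observed_sar(100, 30, fraction)
    for setting, fraction in detection_by_setting.items()
}

for setting, rate in observed_rates.items():
    print(f"{setting}: observed SAR {rate:.0%} (true rate 30%)")
```

With these made-up inputs the household shows 27% and the factory floor 9%, a threefold gap produced entirely by how closely each setting was watched.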

SPEAKER_01

Major safety interventions are justified based on the illusions created by surveillance limits. Good epidemiology, and really just good critical thinking, requires us to ask how the pool of contacts was constructed before trusting what the final rate appears to say.

SPEAKER_00

Because the surveillance process itself is permanently baked into that denominator. It all goes back to the person holding the camera, which leaves you with this to mull over. If our perception of a high-risk environment is warped by how closely we watch it, how much of our everyday safety intuition is based on actual pathogen behavior, and how much is just a reflection of who happened to be holding the magnifying glass?