Data Science x Public Health
This podcast discusses the concepts of data science and public health, and then delves into their intersection, exploring the connection between the two fields in greater detail.
You’ve Been Using Secondary Attack Rates Wrong — Here’s What Actually Happens
Secondary attack rates are often used to estimate how infection spreads among close contacts. They seem to provide a focused measure of transmission in households, schools, workplaces, and other settings. But what if the number is being shaped just as much by contact tracing and testing rules as by the pathogen itself?
In this episode, we break down why secondary attack rates often mislead, how inconsistent contact definitions distort interpretation, and why outbreak metrics cannot be separated from investigation design.
👉 Enjoyed the episode? Follow the show to get new episodes automatically.
If you found the content helpful, consider leaving a rating or review—it helps support the podcast.
For business and sponsorship inquiries, email us at:
📧 contact@bjanalytics.com
YouTube: https://www.youtube.com/@BJANALYTICS
Instagram: https://www.instagram.com/bjanalyticsconsulting/
Twitter/X: https://x.com/BJANALYTICS
You know, when we look at a photograph, we tend to just trust what's in the frame. Like if there are five apples on the table, we believe there are just five apples.
SPEAKER_01Right. Yeah. You don't usually stop to ask if the photographer cropped out a whole orchard just outside the frame.
SPEAKER_00And today we are bringing that exact idea into the world of public health. So we've gathered a stack of recent epidemiological papers, public health data reports, and research notes to bring you this deep dive into what we're calling the surveillance trap.
SPEAKER_01Yeah, our mission today is really to unpack why a seemingly perfect public health metric, one that actually shapes everyday rules, is hiding a massive blind spot.
SPEAKER_00And the metric at the center of all these papers is the secondary attack rate, or SAR, which sounds incredibly straightforward.
SPEAKER_01It does. I mean, on paper, it sounds like a direct, unobstructed window into a pathogen's biology. It basically asks if one person gets sick, how often does that infection spread to their close contacts?
SPEAKER_00Okay, let's unpack this because here is the trap. To understand why SAR is flawed, you really have to look at the math.
SPEAKER_01Yeah, it's basically just a fraction.
SPEAKER_00The numerator is the number of contacts who actually get infected, and the denominator is the total pool of close contacts. The assumption is that you are measuring the disease.
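The fraction described here can be written out as a minimal Python sketch. The function name and the counts (8 infected out of 20 contacts) are hypothetical, chosen only to illustrate the arithmetic; they do not come from the papers discussed in the episode.

```python
def secondary_attack_rate(infected_contacts: int, total_contacts: int) -> float:
    """Return infected contacts divided by the total pool of close contacts."""
    if total_contacts <= 0:
        raise ValueError("total_contacts must be positive")
    if not 0 <= infected_contacts <= total_contacts:
        raise ValueError("infected_contacts must be between 0 and total_contacts")
    return infected_contacts / total_contacts


# Hypothetical counts: 8 of 20 listed contacts became infected.
sar = secondary_attack_rate(infected_contacts=8, total_contacts=20)
print(f"SAR = {sar:.0%}")  # prints "SAR = 40%"
```

Both inputs are products of the investigation: someone had to decide who belongs in total_contacts and who was tested to land in infected_contacts.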
SPEAKER_01But really, you are measuring the people conducting the investigation. What's fascinating here is that the first major crack actually appears in that denominator.
SPEAKER_00Wait, so just in how we count the pool of contacts?
SPEAKER_01Yeah, because to count them, you have to define what a contact is. And we see in these research notes that the definition is just wildly inconsistent across different health departments. One jurisdiction might define a contact as a full 15 minutes of exposure, while another uses a strict six-foot physical distance rule.
SPEAKER_00And sometimes investigators literally just rely on human memory, like asking a patient who they remember standing next to last Tuesday.
SPEAKER_01Which means the denominator is completely unstable.
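To see how an unstable denominator plays out, here is a small sketch that applies the two contact definitions mentioned above (at least 15 minutes of exposure, or within six feet) to the same set of exposure records. The records, the Exposure class, and the sar helper are invented for illustration, and real jurisdictional definitions are usually more detailed than a single threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exposure:
    minutes: float      # duration of the interaction
    distance_ft: float  # closest distance during the interaction
    infected: bool      # whether this person later tested positive


# Hypothetical exposure records around one index case.
exposures = [
    Exposure(30, 3, True),
    Exposure(20, 10, False),
    Exposure(5, 2, False),
    Exposure(45, 4, True),
    Exposure(10, 5, False),
    Exposure(60, 12, False),
    Exposure(2, 3, False),
    Exposure(25, 8, True),
]


def sar(records, is_contact):
    """SAR among the records that satisfy a given contact definition."""
    contacts = [r for r in records if is_contact(r)]
    if not contacts:
        raise ValueError("no records meet this contact definition")
    return sum(r.infected for r in contacts) / len(contacts)


# Same people, same infections, two different definitions of "contact".
by_duration = sar(exposures, lambda r: r.minutes >= 15)
by_distance = sar(exposures, lambda r: r.distance_ft <= 6)

print(f"15-minute rule: {by_duration:.0%}")
print(f"6-foot rule:    {by_distance:.0%}")
```

With these made-up records the 15-minute rule gives 3 infected out of 5 contacts, while the six-foot rule gives 2 out of 5, even though nothing about the pathogen changed.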
SPEAKER_00It's like comparing crime rates between two cities, but one city counts jaywalking as a crime, while the other only counts bank robberies.
SPEAKER_01You just can't compare those two rates and declare one city is inherently more dangerous.
SPEAKER_00And these discrepancies usually aren't malicious either. I mean, they often just come down to resource constraints, don't they?
SPEAKER_01Oh, absolutely. An overwhelmed health department dealing with massive caseloads might adopt a much narrower definition of a contact simply because they don't have the staff to track down every brief interaction.
SPEAKER_00So once you have that unstable foundation, the way investigators go out and build the numerator, actually finding the infected cases, warps the data even further.
SPEAKER_01Yeah, the testing distortion is huge here.
SPEAKER_00But hang on, isn't some data better than no data? I mean, even if a health department is overwhelmed, shouldn't we still rely on the transmission rates they report for, say, households?
SPEAKER_01Well, it tells you something about a household when it is watched closely, but that's the catch.
SPEAKER_00Like a 40% household infection rate still tells us something useful, even if it's slightly overindexed.
SPEAKER_01Only to a point. The mechanical problem is how different testing protocols alter that final percentage. Imagine one group of contacts is tested universally, regardless of symptoms, but another group is only tested if they run a fever. If a disease spreads asymptomatically and you only test the sick people...
SPEAKER_00All those asymptomatic cases are left out of your numerator entirely.
SPEAKER_01Precisely. Suddenly the disease looks far less transmissible on paper. But because you are only counting the sickest people, it artificially looks much more severe.
SPEAKER_00So you are no longer looking at the biological reality of the virus?
SPEAKER_01No, you are looking at the investigation design.
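The testing distortion can be worked through with simple arithmetic. The sketch below assumes a hypothetical cohort of 100 contacts with 40 true infections, half of them asymptomatic, and 4 severe cases, plus a perfectly accurate test; none of these figures come from the episode's sources.

```python
def observed_metrics(contacts, infected, asymptomatic, severe, test_everyone):
    """Return (observed SAR, share of detected cases that are severe)."""
    # Symptom-based testing never sees the asymptomatic infections.
    detected = infected if test_everyone else infected - asymptomatic
    if contacts <= 0 or detected <= 0:
        raise ValueError("need at least one contact and one detected case")
    return detected / contacts, severe / detected


# Hypothetical cohort: identical underlying infections under both policies.
cohort = dict(contacts=100, infected=40, asymptomatic=20, severe=4)

universal = observed_metrics(**cohort, test_everyone=True)
symptom_only = observed_metrics(**cohort, test_everyone=False)

print(f"Universal testing:    SAR {universal[0]:.0%}, severe share {universal[1]:.0%}")
print(f"Symptom-only testing: SAR {symptom_only[0]:.0%}, severe share {symptom_only[1]:.0%}")
```

Under these assumptions the observed SAR halves (from 40% to 20%) while the apparent share of severe cases doubles (from 10% to 20%), which is the pattern described above: less transmissible on paper, more severe on paper.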
SPEAKER_00Wow. So what does this all mean for you listening at home? Think about the protocols at your office or the rules your kids' school sets up. This isn't just abstract math.
SPEAKER_01No, not at all. These distorted numbers dictate whether a specific environment is labeled high risk or safe.
SPEAKER_00Because a household often looks incredibly transmissible, registering a massive secondary attack rate.
SPEAKER_01But we have to ask if it is actually that dangerous, or if public health officials just monitored that family relentlessly because a family unit is geographically captive and easy to track.
SPEAKER_00Which makes sense. They are all right there in one place.
SPEAKER_01Conversely, a busy factory floor might look statistically safe, but exposure ascertainment in a giant warehouse is incredibly difficult and weak.
SPEAKER_00Yeah, the tracking is weak, so the resulting rate just looks low. We end up cracking down on the environments we watch closely, and we ignore the ones we don't.
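One simplified way to picture the household versus factory contrast is to hold the true transmission rate fixed and vary only how completely contacts are followed up. The sketch assumes that infected contacts who are never followed up stay in the denominator but are recorded as uninfected, so the observed rate is the true rate times the follow-up fraction. The 30% true rate and both follow-up fractions are hypothetical, and real ascertainment bias can also act through the denominator.

```python
def observed_sar(true_sar: float, follow_up: float) -> float:
    """Observed SAR when only a fraction of infected contacts is ever detected."""
    if not (0 <= true_sar <= 1 and 0 <= follow_up <= 1):
        raise ValueError("true_sar and follow_up must be between 0 and 1")
    return true_sar * follow_up


TRUE_SAR = 0.30  # hypothetical, identical in both settings

# Hypothetical follow-up completeness: households are easy to track,
# a large factory floor is not.
settings = {"household": 0.95, "factory floor": 0.40}

observed = {name: observed_sar(TRUE_SAR, f) for name, f in settings.items()}

for name, value in observed.items():
    print(f"{name}: observed SAR {value:.1%} (true SAR {TRUE_SAR:.0%})")
```

In this toy setup the factory floor reports less than half the household rate purely because of weaker tracking, not because the setting is any safer.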
SPEAKER_01Major safety interventions are justified based on the illusions created by surveillance limits. Good epidemiology, and really just good critical thinking, requires us to ask how the pool of contacts was constructed before trusting what the final rate appears to say.
SPEAKER_00Because the surveillance process itself is permanently baked into that denominator. It all goes back to the person holding the camera, which leaves you with this to mull over. If our perception of a high-risk environment is warped by how closely we watch it, how much of our everyday safety intuition is based on actual pathogen behavior, and how much is just a reflection of who happened to be holding the magnifying glass?