Data Science x Public Health
This podcast introduces the concepts of data science and public health, then explores their intersection in greater detail.
Everyone Uses Incidence Rates… But They Fail When Time at Risk Is Wrong
Incidence rates are one of the most common measures in epidemiology. They are used to describe how quickly disease is appearing in a population and to compare risk across groups. But what if the rate looks correct while the underlying time at risk is completely wrong?
In this episode, we break down why incidence rates fail when person-time is misdefined, how denominator errors distort epidemiologic findings, and why this problem matters for surveillance, cohort studies, and public health decision-making.
👉 Enjoyed the episode? Follow the show to get new episodes automatically.
If you found the content helpful, consider leaving a rating or review—it helps support the podcast.
For business and sponsorship inquiries, email us at:
📧 contact@bjanalytics.com
Youtube: https://www.youtube.com/@BJANALYTICS
Instagram: https://www.instagram.com/bjanalyticsconsulting/
Twitter/X: https://x.com/BJANALYTICS
So what if I told you that a multimillion-dollar public health intervention could be deemed a massive success or a complete failure, not because of what the drug actually did, but because of a simple accounting error at the bottom of a fraction?
SPEAKER_00: Yeah. I mean, it happens way more than you'd think. Think about the last time you read a headline saying a new virus variant is spreading 50% faster.
SPEAKER_01: You see those alerts everywhere.
SPEAKER_00: Exactly. But that alarming rate depends entirely on getting the underlying math exactly right.
SPEAKER_01: Wow. Well, welcome to today's deep dive. We are exploring a really insightful paper called The Denominator Trap: Mastering Epidemiologic Time at Risk.
SPEAKER_00: So prevalence basically just tells you how much disease exists in a population right now. But incidence tracks how fast new cases are emerging over time. Incidence is really the go-to metric for tracking outbreaks.
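To make the distinction concrete, here is a minimal sketch with entirely hypothetical numbers (none of these figures come from the episode): prevalence divides existing cases by the population at one point in time, while an incidence rate divides new cases by the total person-time contributed while people were at risk.

```python
# Prevalence: existing cases / population at a single point in time.
existing_cases = 50
population = 1000
prevalence = existing_cases / population  # proportion with disease "right now"

# Incidence rate: new cases / total person-time at risk.
new_cases = 12
person_years_at_risk = 800.0  # sum of each person's observed time at risk
incidence_rate = new_cases / person_years_at_risk  # new cases per person-year

print(prevalence, incidence_rate)
```

Note the different denominators: prevalence uses a head count, incidence uses accumulated person-time, which is exactly where the "denominator trap" lives.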
SPEAKER_01: Okay, so it's sort of like a leaky roof. Prevalence is the puddle on the floor, and incidence is how fast the drips are actually falling. So why do we mess up counting the drips?
SPEAKER_00: Well, the problem isn't actually counting the drips. The numerator, the cases themselves, gets all the attention. The real mistake hides in the denominator.
SPEAKER_01: Wait, the bottom of the fraction?
SPEAKER_00: Yes. It's the time at risk. If people aren't truly observable, susceptible, or eligible to get sick at that specific moment, the whole rate is completely distorted.
SPEAKER_01: Okay, let's unpack this. Calculating the denominator sounds like it shouldn't be that hard, right? You just count the people?
SPEAKER_00: I mean, you'd think so. But real human lives don't fit neatly into a spreadsheet. Summing up what we call person-time is incredibly messy.
SPEAKER_01: Right, because people are constantly moving or changing jobs or whatever.
SPEAKER_00: People enter and leave cohorts, they move away, or they die of other causes. Plus, administrative reporting delays completely misalign with the actual biological exposure windows.
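The messiness described above can be sketched in a few lines. This is a toy example with made-up dates, assuming each participant contributes time only between their own entry and exit (whichever comes first: the event, loss to follow-up, death from another cause, or study end):

```python
from datetime import date

# Hypothetical cohort: staggered entry and exit, all dates invented.
cohort = [
    {"entry": date(2020, 1, 1), "exit": date(2020, 12, 31)},  # full year observed
    {"entry": date(2020, 7, 1), "exit": date(2020, 12, 31)},  # joined mid-year
    {"entry": date(2020, 1, 1), "exit": date(2020, 3, 31)},   # left the study early
]

# Person-time is the SUM of individually observed intervals,
# not (number of people) x (study duration).
person_days = sum((p["exit"] - p["entry"]).days for p in cohort)
person_years = person_days / 365.25

print(person_days, round(person_years, 2))
```

Naively multiplying 3 people by 1 year would claim about 3 person-years; the actual observed time here is well under 2, which is exactly the kind of denominator inflation the episode warns about.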
SPEAKER_01: But I mean, if we're tracking millions of people in these big databases, doesn't the law of large numbers eventually smooth this out? Or are we actively skewing the data here?
SPEAKER_00: Oh, we are absolutely skewing the data. It doesn't smooth out at all. It actually creates mathematical artifacts and introduces things like immortal time bias.
SPEAKER_01: Immortal time bias? Wait, what is that? That sounds like a sci-fi movie.
SPEAKER_00: Yeah, it sounds crazy, right? But it's basically when the math looks super precise on paper while completely violating epidemiologic logic, counting someone's survival time while they were just waiting to receive a treatment.
SPEAKER_01: Oh wow. So the drug gets credit for the time the person survived before even taking it.
SPEAKER_00: You got it. They had to survive long enough just to get the pill, so the math takes that waiting time and gives it to the drug's success rate.
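The arithmetic behind immortal time bias is simple enough to demonstrate directly. In this hypothetical sketch (all numbers invented), each treated patient waits one year before starting the drug, then is followed for four more years, and two deaths occur after treatment. Crediting the waiting year to the treated group inflates its person-time and deflates its death rate:

```python
deaths = 2
n_patients = 100
wait_years_per_patient = 1.0      # "immortal" time: no one can die as a
                                  # treated patient before treatment starts
followup_years_per_patient = 4.0  # time actually on the drug

# Biased analysis: the waiting year is counted as treated person-time.
biased_person_years = n_patients * (wait_years_per_patient + followup_years_per_patient)
biased_rate = deaths / biased_person_years

# Correct analysis: the at-risk clock starts at treatment initiation.
correct_person_years = n_patients * followup_years_per_patient
correct_rate = deaths / correct_person_years

print(biased_rate, correct_rate)
```

Same deaths, same patients, yet the biased denominator makes the treated group's death rate look 20% lower, which is the drug "getting credit" for time survived before the first pill.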
SPEAKER_01: That is wild. So this sloppy denominator isn't just some academic headache. It directly impacts real-world health interventions.
SPEAKER_00: Oh, 100%. A bad denominator will exaggerate or dilute intervention effects. It distorts seasonal trends, and it misattributes drug safety entirely.
SPEAKER_01: So what does this all mean then? Are we shaping massive public health funding and messaging campaigns based on mathematical illusions?
SPEAKER_00: If you connect this to the bigger picture, yeah, pretty much.
SPEAKER_01: So how do we actually fix the math?
SPEAKER_00: It requires really strict design discipline. You have to clearly define entry, exit, censoring, and true susceptibility. Strong epidemiology has to align the math with the real-world risk process.
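In code, that design discipline amounts to writing the rules down explicitly rather than leaving them implicit. A hedged sketch, with a hypothetical helper and an invented censoring policy, might look like this:

```python
from datetime import date

STUDY_END = date(2021, 12, 31)  # administrative censoring date (hypothetical)

def at_risk_interval(entry, event=None, lost=None, died_other=None):
    """Return the (start, stop) at-risk window under explicit rules:
    exit at the earliest of event, loss to follow-up, death from
    another cause, or study end; None if no observable time at risk."""
    candidates = [d for d in (event, lost, died_other, STUDY_END) if d is not None]
    stop = min(candidates)
    if stop <= entry:
        return None  # entered after follow-up ended: contributes nothing
    return entry, stop

# A participant who enrolled January 1 and was lost to follow-up June 30
# contributes exactly that half-year, no more.
print(at_risk_interval(entry=date(2021, 1, 1), lost=date(2021, 6, 30)))
```

Making each censoring rule an explicit argument forces the analyst to decide, per person, when the at-risk clock starts and stops, instead of letting a spreadsheet default decide for them.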
SPEAKER_01: Yeah, that makes a lot of sense. The real story in data often hides at the bottom of the fraction. Well, if the world's top health metrics can be skewed simply by miscounting when people are actually at risk, here is something for you to think about: how might this same denominator trap be secretly distorting the performance metrics you rely on every day in your own career?
SPEAKER_00: That is a great question to leave on.
SPEAKER_01: Are you looking at a clear picture of your success, or are you getting caught in the trap? Thanks for joining us on this deep dive, and we will catch you next time.