Data Science x Public Health
This podcast introduces data science and public health as individual fields, then explores their intersection in greater detail.
Everyone Uses Risk Scores… But They Fail When Care Is Unequal
Risk scores are used everywhere in healthcare and public health.
They are designed to identify who is most at risk and where interventions should be targeted.
But what if those scores are quietly reflecting unequal systems of care rather than true need?
In this episode, we break down how bias enters risk models through utilization, access, and data structure—and why even high-performing models can fail the populations that need help most. You will learn why fairness cannot be solved by metrics alone and what better model design and evaluation actually look like.
👉 Enjoyed the episode? Follow the show to get new episodes automatically.
If you found the content helpful, consider leaving a rating or review—it helps support the podcast.
For business and sponsorship inquiries, email us at:
📧 contact@bjanalytics.com
YouTube: https://www.youtube.com/@BJANALYTICS
Instagram: https://www.instagram.com/bjanalyticsconsulting/
Twitter/X: https://x.com/BJANALYTICS
SPEAKER_01: You know, when you, uh, punch numbers into a calculator, you just expect pure objectivity.
SPEAKER_00: Right. Yeah, it feels totally clean.
SPEAKER_01: Exactly. It's rational; you trust the result completely. But, um, when you apply that same calculator logic to healthcare algorithms, well, the math gets surprisingly twisted.
SPEAKER_00: Oh, it really does.
SPEAKER_01: Yeah. So today we are doing a deep dive into some excerpts from "The Hidden Bias in Algorithmic Risk Scoring." And the mission for you listening today is to uncover how, you know, mathematically sound algorithms can actually quietly inherit and amplify real-world inequalities.
SPEAKER_00: And it is such an urgent issue to look at. I mean, hospitals and insurers rely heavily on these risk scores to rank patients efficiently and figure out where to allocate limited resources.
SPEAKER_01: And they lean on the numbers because numbers just naturally feel fair to us. But if they feel so fair, why do these, you know, highly efficient scores fail so badly in practice?
SPEAKER_00: Well, it really comes down to what they're actually measuring. The research points out that a lot of these risk models rely heavily on prior utilization.
SPEAKER_01: Like claims history, that sort of thing.
SPEAKER_00: Yeah, claims history, past diagnosis patterns, and just overall healthcare costs. The algorithm basically looks at how much a patient interacted with the system in the past to predict future need.
SPEAKER_01: There's an analogy that really clicked for me while reading this. It's like, uh, trying to measure a person's hunger by counting only their restaurant receipts.
SPEAKER_00: Oh, that is a great way to put it.
SPEAKER_01: Right. Because if someone lacks the money to eat out, or lacks the transportation to get to a clinic, they just won't have any receipts.
SPEAKER_00: Exactly. The data just assumes they simply aren't hungry. Or, you know, in the case of healthcare, it assumes they aren't sick.
SPEAKER_01: Which is wild.
SPEAKER_00: It is. And you see that play out systematically. If two people have the exact same biological illness burden, but one has, say, less historical access to care, the model assigns a lower risk score to the underserved person.
SPEAKER_01: So it's not actually measuring pure biological need at all.
SPEAKER_00: It's really measuring system visibility and, uh, clinician coding behavior, basically.
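To make the receipts problem concrete, here is a minimal Python sketch of that mechanism. It is a toy simulation under our own assumptions — the distributions, the 0.5 access penalty, and every variable name are illustrative, not data from the episode: two groups carry identical simulated illness burdens, but one group's illness turns into recorded claims only half as often, and a model trained on claims inherits that gap.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
n = 20_000

# True biological illness burden -- invisible to the model.
illness = rng.gamma(shape=2.0, scale=1.0, size=n)

# Access to care: half the population faces barriers, so each unit of
# illness produces claims ("restaurant receipts") only half as often.
access = rng.choice([1.0, 0.5], size=n)

# Observed utilization is illness filtered through access.
past_claims = rng.poisson(3.0 * illness * access)
future_claims = rng.poisson(3.0 * illness * access)

# The standard setup: predict future utilization from past utilization.
model = LinearRegression().fit(past_claims.reshape(-1, 1), future_claims)
risk_score = model.predict(past_claims.reshape(-1, 1))

# Compare scores among people with identical, high illness burdens.
equally_sick = illness > np.quantile(illness, 0.9)
for a, label in [(1.0, "full access"), (0.5, "reduced access")]:
    grp = equally_sick & (access == a)
    print(f"{label}: mean risk score = {risk_score[grp].mean():.2f}")
```

In this toy world, the reduced-access group scores roughly half as high despite identical illness, because the model never sees illness itself, only its access-filtered traces.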
SPEAKER_01: But wait, let me push back on that for a second. If the algorithm is just accurately reflecting the historical data it was given, aren't we just blaming the mirror? Why fault the math if it's being completely accurate to the past?
SPEAKER_00: Well, because it's not just a mirror. It is a mirror that dictates the future.
SPEAKER_01: Oh, I see.
SPEAKER_00: Yeah. A model might be statistically accurate relative to history, you know, boasting strong calibration or a high AUC, and still deeply misalign with the actual goal.
SPEAKER_01: Which is equitable intervention, right?
SPEAKER_00: Exactly. By relying on that reflection, the model underestimates the need in historically ignored groups, which actively denies them future outreach. It basically quietly turns past inequity into future policy.
SPEAKER_01: Wow. Okay. So the standard alarms just don't ring for data scientists, because on paper the model looks perfectly optimized.
SPEAKER_00: Right. Algorithmic bias here doesn't look like malicious code. It just looks like optimization.
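That "perfectly optimized" look is easy to reproduce. Here is another toy Python sketch under the same made-up assumptions as before (none of the numbers come from the episode): the model is trained and graded against recorded utilization, so its headline AUC and calibration look healthy, and the failure only surfaces when scores are compared against true underlying need.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n = 20_000
illness = rng.gamma(2.0, 1.0, size=n)       # true need (unobserved)
access = rng.choice([1.0, 0.5], size=n)     # structural access barrier
past = rng.poisson(3.0 * illness * access)  # recorded claims history

# The model is trained and evaluated against *recorded* utilization.
observed = rng.poisson(3.0 * illness * access) >= 8
clf = LogisticRegression().fit(past.reshape(-1, 1), observed)
score = clf.predict_proba(past.reshape(-1, 1))[:, 1]

# Standard report card: discrimination and calibration vs its own label.
print("AUC vs recorded utilization:", round(roc_auc_score(observed, score), 3))
print("mean predicted:", round(score.mean(), 3),
      "vs observed rate:", round(observed.mean(), 3))

# The misalignment only appears against true biological need.
needy = illness > np.quantile(illness, 0.8)
for a, name in [(1.0, "full access"), (0.5, "reduced access")]:
    grp = needy & (access == a)
    print(f"mean score among the truly needy, {name}: {score[grp].mean():.2f}")
```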
SPEAKER_01: So how do we fix a tool that just blindly copies history like that?
SPEAKER_00: It requires a massive shift in how we build them, something the researchers call, um, epidemiologic thinking.
SPEAKER_01: Meaning what, exactly, in a data science context?
SPEAKER_00: It means heavily scrutinizing your target design. Like, you have to ask: are you optimizing for preventable harm, or are you just optimizing for cost?
SPEAKER_01: Because historically we spend less money on marginalized groups, right?
SPEAKER_00: Exactly. So if your algorithm uses cost as a proxy for health need, it mathematically assumes those groups don't need care.
SPEAKER_01: Simply because less money was spent on them in the past. That is so flawed.
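Target design can be demonstrated in the same toy world. In this hypothetical Python sketch (our assumptions throughout, including an idealized biomarker measured for everyone regardless of access), the only thing that changes between the two runs is the training target — cost versus need — and the composition of the outreach list changes with it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 20_000
illness = rng.gamma(2.0, 1.0, size=n)   # true chronic-illness burden
group_b = rng.random(n) < 0.5           # historically underserved group
spend = np.where(group_b, 0.5, 1.0)     # less has been spent on group B

# Features: access-filtered claims plus an idealized biomarker that is
# measured for everyone at enrollment, independent of access.
claims = rng.poisson(3.0 * illness * spend)
biomarker = illness + rng.normal(0.0, 0.5, n)
X = np.column_stack([claims, biomarker])

# Two candidate targets for "risk".
future_cost = 1000.0 * illness * spend + rng.normal(0.0, 200.0, n)
future_need = illness + rng.normal(0.0, 0.2, n)

top_k = n // 20  # outreach program capacity: top 5% of scores
for name, y in [("optimize for cost", future_cost),
                ("optimize for need", future_need)]:
    pred = LinearRegression().fit(X, y).predict(X)
    chosen = np.argsort(pred)[-top_k:]
    print(f"{name}: group B share of outreach = {group_b[chosen].mean():.0%}")
```

In this setup, the cost-trained model leans on claims, which encode the spending gap, so group B is pushed off the outreach list; the need-trained model leans on the biomarker and selects the two groups roughly in proportion to actual illness.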
SPEAKER_00: It really is. And you have to check if your variables are secretly acting as proxies for structural disadvantage.
SPEAKER_01: Like how using a zip code might quietly map to a systemic lack of access rather than to any biological reality.
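One quick, common audit for that — sketched here in Python, with a made-up region variable standing in for zip code — is to ask whether your candidate features can reconstruct the structural attribute at all: if a simple classifier can recover low-access status from the features, they are acting as proxies for it, even after the obvious one is dropped.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(11)
n = 20_000
illness = rng.gamma(2.0, 1.0, size=n)

# Hypothetical setup: regions differ in access to care, not in biology.
region = rng.integers(0, 50, size=n)    # stand-in for a zip code
low_access = region < 20                # region drives access, by design
claims = rng.poisson(3.0 * illness * np.where(low_access, 0.5, 1.0))

# Audit: how well do the features recover the structural attribute?
for name, feats in [("region + claims", np.column_stack([region, claims])),
                    ("claims only", claims.reshape(-1, 1))]:
    X_tr, X_te, y_tr, y_te = train_test_split(feats, low_access,
                                              random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC for recovering low-access status = {auc:.2f}")
```

With region included, the attribute is recovered almost perfectly; even claims alone stay well above chance. That is the warning sign: dropping the obvious proxy does not remove the leakage.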
SPEAKER_00: Yes, perfect example. And then you need to look specifically at who the model is missing. This is what we call the subgroup false negative burden.
SPEAKER_01: So, like, if the model is wrong 10% of the time, is that 10% disproportionately made up of one specific demographic?
SPEAKER_00: Precisely. You have to measure the errors across different groups, not just the overall population.
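Measuring that is mechanically simple. Here is a minimal helper — our own hypothetical function, not from the episode — that computes the false negative rate per subgroup: the share of truly needy people in each group whom the model failed to flag.

```python
import numpy as np

def subgroup_false_negative_rates(y_true, y_pred, groups):
    """False negative rate per subgroup.

    y_true : 1 = truly needs intervention, 0 = does not
    y_pred : 1 = flagged by the model, 0 = not flagged
    groups : subgroup label for each person
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    groups = np.asarray(groups)
    rates = {}
    for g in np.unique(groups):
        # Truly needy members of group g...
        needy = (groups == g) & (y_true == 1)
        # ...and the share of them the model failed to flag.
        rates[g] = float((y_pred[needy] == 0).mean())
    return rates

# Toy usage with made-up arrays:
y_true = [1, 1, 1, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B"]
print(subgroup_false_negative_rates(y_true, y_pred, groups))
# {'A': 0.25, 'B': 1.0} -- the overall error rate hides that group B's
# truly needy members are the ones being missed.
```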
SPEAKER_01: Which is such a good reminder for you listening. Next time you look at a risk score, you know that number is never neutral just because it's a number.
SPEAKER_00: Absolutely. Algorithms sit inside human workflows, and fairness depends heavily on what actions actually follow the score.
SPEAKER_01: Right. It forces us to interrogate our own systems and, honestly, have the humility to ask if we are measuring what actually matters.
SPEAKER_00: Instead of just what is easy to count.
SPEAKER_01: Which leaves you with this to chew on: if our current predictive algorithms are perfectly designed to mimic our flawed historical data, well, how can we mathematically encode the equitable future we actually want to build, rather than the past we are trying to escape?