EP114: Dr. Todd Horowitz on medical image perception, AI, and collaboration

There is so much that we don't know in medicine, plus there is human error, and it would be great if AI could help. Dr. Todd Horowitz is an expert in attention and research on medical image perception. We talk about what you can see in a quick glance, how computer algorithms can fail, and how best to figure out how AI can help us. Dr. Todd Horowitz, PhD, is a prominent cognitive psychologist with a keen interest in understanding how the human mind processes visual information and the complexities of perception and attention. He has made significant contributions to our understanding of visual memory, visual search, and attentional mechanisms with over 70 peer-reviewed papers. He is currently Program Director in the Behavioral Research Program’s Basic Biobehavioral and Psychological Sciences Branch, located in the Division of Cancer Control and Population Sciences at the National Cancer Institute.

[00:00:00] Christine Ko: Welcome back to SEE HEAR FEEL. Today I'm very happy to be with Dr. Todd Horowitz. Dr. Todd Horowitz, PhD is a prominent cognitive psychologist with a keen interest in understanding how the human mind processes visual information and the complexities of perception and attention. He has made significant contributions to our understanding of visual memory, visual search, and attentional mechanisms, with over 70 peer-reviewed papers. He is currently Program Director in the Behavioral Research Program's Basic Biobehavioral and Psychological Sciences Branch, located in the Division of Cancer Control and Population Sciences at the NCI, the National Cancer Institute. Welcome, Todd.

[00:00:42] Todd Horowitz: Thank you for having me.

[00:00:43] Christine Ko: I'm so glad to have you. First off, would you be able to share a personal anecdote?

[00:00:48] Todd Horowitz: Yeah. A few years ago, along with my fellow Melissa Trevino, we were collaborating with some people here in the NCI clinical center who do work on diagnosing prostate cancer via MRI. They were teaching radiologists from all over the country to read MRI for prostate. We were allowed to observe, basically. The people teaching this course were the top people in prostate MRI, right? And one of the preceptors was helping a student with a case and another one just walked by, glanced at the monitor, and said, Oh, that's a PI-RADS 5, which is the most severe rating and indicates that there's probably cancer there. She could just tell at a glance that there was something wrong there. And that really tells you a lot about the perceptual processes involved. As a diagnostician yourself, you know what a difficult perceptual task this is. A lot of what experts do just gets automated. Like, it's not, okay, I'm searching for this thing. I'm searching for that thing. There is that level, there's that conscious level, but I think a big part of it is just this kind of pattern recognition that you develop over years and years. And that pattern recognition is amazingly good.

[00:02:03] Coming from the basic vision science literature, there's a large literature on what's called gist perception. Basically, you flash an image for half a second, a quarter of a second, a 10th of a second, and you ask people to make certain judgments on it. Is this an open or closed scene? Is that natural, artificial? That kind of thing. And people are really good at that. Researchers started doing these kinds of experiments with diagnosticians. Parenthetically, we use the word diagnostician to cover radiologists, pathologists; a professional expert who's looking at a medical image.

[00:02:36] So you can do this with diagnosticians. You can flash a brief image of a mammogram, right? And experts can tell you if that is normal, abnormal within a couple hundred milliseconds. They're not perfect, but they're way better than chance. And so that tells you there's a lot of information in these images, and there's a lot of processing power in people's heads that can pull that information out.

[00:03:01] Christine Ko: Absolutely. I would love to get some of the references from you, if it's easy for you. I have found articles that facial recognition or visual recognition of an object is like 200 milliseconds really. Seeing a slide, it's like facial recognition. You see the slide and you know it, and it's just like the facial recognition of someone very well. They can cut their hair very short or dye it or be wearing funny glasses. You still know who they are. You're not tricked. So the more experience, similarly, we have with a given diagnosis, whether on the slide or on the patient, you're not tricked if there are like a couple things that are different or atypical, in medical speak. Yeah. So, can you talk about some key things you've learned about medical image perception?

[00:03:49] Todd Horowitz: Yeah. We've already brought up the importance of this sort of non-explicit information, right? Like you don't necessarily know what it is you've seen, but you've seen it, right? And then I think a lot of the interpretive process that diagnosticians go through is basically coming up with a justification. They might change their minds. There's more information that you get if you stare at a case for 10 minutes or an hour than in 200 milliseconds. But I think that first 200 milliseconds or 500 milliseconds is really providing a lot of the information that people use.

[00:04:22] Another thing is the role of technology. Medical image perception is very technologically mediated. A lot of what diagnosticians do, over the last century, is mediated by X-ray machines and MRI machines. And now we have, of course, the advent of AI assistance. When I explain to people my interest in medical image perception, they respond by saying, Oh, computers are going to do that. Computers are going to take that over. We don't have to worry about that. That comes up a disturbing amount. And of course, I don't think that's true or I wouldn't be doing this work.

[00:05:01] A lot of the developments in AI are very exciting and very promising, but I think a lot of them are almost gimmicky, where they're coming at it from a computer science point of view. They train their algorithm, and they get something that looks really cool in demos. But it's not going to have a real effect in the clinic unless it's designed in a way that humans can really take advantage of the information. And so I think there's a lot of science to be done in figuring out the best way that information can be conveyed to the human in a way that takes advantage of the amazing powers of the human and the computer rather than making them fight each other.

[00:05:40] Around the turn of the century, they developed these very sophisticated machine vision algorithms that were used to assist in detection of cancer. So it was called computer-aided detection. The radiologists are reading the case, and there's also a computer vision analysis of that. And it can highlight areas that it thinks are suspicious. And these did really well in the sort of trials that the FDA requires. These algorithms got approved, and then the Centers for Medicare & Medicaid Services says, okay, we will reimburse you for using these technologies. And then the technology just spread like wildfire. This was at the same time as they were transitioning to digital radiology between, I think, 2000 and 2012. In 2000, all mammograms were just read using film. And by 2012, 90 percent were digital using computer-aided detection, using CAD. And then a few years after that, people started studying whether the CAD was actually improving detection of cancer in the field, and there's some evidence that it actually hurt things. That is always going to be an issue. You could spend billions of dollars and end up either with no improvement or actively making things worse.

[00:06:59] Christine Ko: Yeah. Have you figured anything out yet?

[00:07:02] Todd Horowitz: I wouldn't say it's figuring things out, but you really have to pay attention to prevalence. A lot of the problem with the CAD case, for example, was that when you test these systems, you're testing them at fairly high prevalence. You want to see how well it works in picking up a cancer. So you have a lot of cases with cancer. But the problem is when you go into the field, if you're doing mammography in a screening population, you're gonna have to look through 300 cases before you find a single cancer. And so that means that your CAD system, even the most highly accurate CAD system, is giving you false alarms most of the time. You can learn to discount that. Prevalence changes the way the human processes information. A lot of this stuff isn't explicit; your brain learns, all right, so the CAD says there's a cancer here, but it's not very reliable, even though it is highly accurate.
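
[Illustration] The prevalence point above can be checked with a little arithmetic. The sketch below is a minimal example using Bayes' rule, not a model of any real CAD product: the 1-in-300 screening prevalence comes from the conversation, while the 90 percent sensitivity and specificity and the 50 percent test-set prevalence are hypothetical values chosen only to show the effect.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Fraction of flagged cases that truly contain a cancer (Bayes' rule)."""
    true_pos = sensitivity * prevalence                    # cancers correctly flagged
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # normals wrongly flagged
    return true_pos / (true_pos + false_pos)


# Hypothetical accuracy figures, assumed for illustration only.
SENSITIVITY = 0.90
SPECIFICITY = 0.90

# Enriched test set (assumed 50% cancers) versus a screening population
# (about 1 cancer per 300 cases, the figure mentioned in the conversation).
test_set_ppv = positive_predictive_value(0.5, SENSITIVITY, SPECIFICITY)
screening_ppv = positive_predictive_value(1 / 300, SENSITIVITY, SPECIFICITY)

print(f"PPV at 50% prevalence:      {test_set_ppv:.2f}")
print(f"PPV at 1-in-300 prevalence: {screening_ppv:.3f}")
```

Under these assumed numbers, nine out of ten flags are correct in the enriched test set, but only about 3 percent of flags are correct at screening prevalence, so the same system produces mostly false alarms in the field.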

[00:07:59] Conversely, these systems also can sometimes lead the human to over-rely on them. If the CAD is saying, all right, these are the important areas of the image where the algorithm thinks something's wrong, the human will look there, and that reduces their tendency to look elsewhere. There are people who are working on trying to figure out what's the best way to present information to get around these problems. You always have to consider these problems.

[00:08:23] Christine Ko: When you talked about prevalence, all that makes sense to me. And you said that a human being like me without the computer aided detection, say, or even with it, that I am taking into account prevalence. But there is this cognitive bias called base rate neglect, right? Human beings are prone to that. So isn't that the same thing? That the computer is struggling with in terms of prevalence? Or not?

[00:08:46] Todd Horowitz: Base rate neglect is definitely an important cognitive bias. It turns out that as prevalence gets lower, you need more evidence. The exact same image, right? You present it in a high prevalence context. You're going to be more likely to say, yes, there's an abnormality here. In a low prevalence context, you're going to be less likely. And what we found is that there are at least two different processes going on there. One is, even when your eye actually falls on the abnormality, it requires more evidence to say, yes, this is an abnormality when it's at low prevalence, just because of the base rate.

[00:09:26] Christine Ko: Yeah.

[00:09:26] Todd Horowitz: But the other thing is, at low prevalence, you're just more likely to give up early, right? If you've been looking through a hundred cases, and you have not seen a cancer, the hundred and first case, you're just not going to spend as much time searching every corner of that image for a cancer as if you'd been in an environment where half the cases you'd examined had cancer. So these are at least two processes that sort of push you into missing abnormalities at low prevalence.

[00:09:56] And conversely, it goes the other way. There are also high prevalence contexts in medicine. As a pathologist, you might be a little more familiar with that because nobody takes a biopsy unless somebody's seen something. In a lot of cases, the radiologist sees something suspicious, they take a biopsy, and they send it to the pathologist. So oftentimes the pathologists are getting stuff where there's probably something there. And that makes them more likely to over call.

[00:10:22] Christine Ko: Yes. Absolutely. What you said goes along with how we practice. For example, your high prevalence versus low prevalence, there's a really good, I think, example or analogy in dermatopathology with malignant melanoma. So pre pubertal melanoma is pretty vanishingly rare, especially at a non tertiary care center. So if you're a dermatopathologist in like a pretty routine general practice, you probably won't see melanoma in a two year old. At least you're definitely not going to see it every day. You're definitely not going to see it once a year. You may not see it even once in 10 years. Maybe you'll see it once, maybe twice in your lifetime. If the same exact slide is from a 90 year old... we will be much more likely to call it, oh, that's for sure melanoma. No problem. But if we are told, Oh, this is from a two year old, you're absolutely right. We often will require more evidence, okay, like really, how many cells are dividing, how atypical are these cells? Was it really symmetric? How large was the lesion? How concerned was the clinician? Was there anything weird about it clinically? We do require more evidence. But like you just said, when a biopsy is done, and it's sent to the pathologist, the pathologist is biased to think there's going to be something wrong here. In the U.S., small moles, small nevi, are often taken off. The dermatologist who sees the patient will be concerned and so they'll take it off, do a little shave. And so we do get a lot of those in dermatopathology, and one of the problems is over diagnosis as related to melanoma in adults. And I think we don't have, at least I don't, maybe, have a really good sense of what the base rate is. We have different senses. People have different sort of internalized prevalence ideas or something, and then we become maybe more like the computer that doesn't have a sense of the prevalence.

[00:12:19] Todd Horowitz: First I want to unpack this idea of prevalence and base rate. So, base rate being like the objective background rate, the objective prevalence, if there were data. I can tell you 90 percent of these are normal, right? That's explicit knowledge, and you can take that into account. But there's also the prevalence that you're experiencing, right? As you're looking through case after case, how many times do you find that this is malignant, right? You've read through a hundred of these cases and maybe you found malignancies in 20 percent. And your recent experience of prevalence is dominating your perceptual experience. You can shift people's behavior around, make them more or less sensitive, by hitting them with a burst of a bunch of cases.

[00:13:04] Your baggage screener at the airport. They face the exact same problems. You could go years without seeing a gun in somebody's bag. And you can go your entire career without seeing a bomb, right? But you still have to be able to detect those things. So with my colleague, Jeremy Wolfe, we did a study of airport screener trainees, where before they did their regular shift, we would give them a sort of quick burst, where they'd sit at a laptop and look through a bunch of fake bags that we had prepared that had a higher prevalence of guns and knives than what they would see. And that made them more likely to say yes, basically. I don't know if we could implement that kind of thing in a dermatology or radiology clinic. But it shows that your very recent experience of prevalence has a strong effect.

[00:13:50] Christine Ko: Yeah. Can you touch on the value of interdisciplinary collaboration?

[00:13:55] Todd Horowitz: Yeah. I was trained as a cognitive psychologist and I spent years delving into sort of like very narrow, fascinating, but very narrow, scientific questions where you could just manipulate everything. But then in order to understand medical image perception, you really have to understand not just the experimental factors and the cognitive factors, but also things about how the workflow goes in the clinic and what the diagnosticians are looking for. And it really highlighted for me the way you need broad interdisciplinary teams. I have all sorts of great, clever ideas about what might be going on when you are reading a microscope slide. But, without talking to you, those ideas are going to be useless. So you need the diagnostician to be part of the loop. I certainly think you need cognitive psychologists. I think we bring a lot of information about how the perceptual and cognitive system works in general, because your brain and my brain are relatively the same. We also need computer scientists because this is a highly technologically mediated area, and you need to understand what's going on at the level of the software. I complained about the way AI is being developed and the way to get around that is to bring people who are developing these AI systems into the loop, to make sure that they understand the cognitive issues and that they're involved in the research project as well. I think you need interdisciplinary collaboration.

[00:15:20] Christine Ko: Yeah. Do you have tips on how to collaborate across disciplines?

[00:15:25] Todd Horowitz: Yeah. I think one obvious tip is just to be open to understanding that you don't have all the answers. You have your blind spots, other people have their blind spots. And, maybe if you work together, you can cover some of these blind spots. Another aspect is translation. Every discipline has its own special jargon that people from outside the discipline aren't going to understand. Sometimes words have different or even opposite meanings across two different disciplines. Learning how to translate between disciplines is very important. And really, I think one of the most important things is to be in the same place. We really need to have some kind of intellectual mixer where you get people together of different strengths and different disciplines to just talk. And that's a thing, actually, that we're pretty good at, at NIH. That's a thing that we can do. But, just reading papers isn't gonna cut it. You need to be in the same space, talking about each other's problems, and we need to be bending over your microscope and you showing me things and stuff like that.

[00:16:22] Christine Ko: Yeah. Do you have any final thoughts?

[00:16:25] Todd Horowitz: I think my final thoughts are that we really need to think about medical image perception as fundamentally a human enterprise, right? You have a visual system. It's a very highly trained visual system. But it is essentially the same visual system as everybody else's. So we can use what we've learned about the visual system in general to figure out what is going on in your very specialized visual system and how to improve it. And I think we always just have to keep that sort of like human centered perspective in mind, no matter how fancy the gadgets get, no matter how advanced our AI, no matter what the display technologies are, we always have to center the diagnostician who is doing the actual work.

[00:17:12] Christine Ko: Yeah, that's cool. Thank you so much.

[00:17:15] Todd Horowitz: Thank you. This has been fun.
