Dirty White Coat
Mel Herbert, MD, and the creators of EM:RAP, UCMAX, CorePendium, and the collaborators on "The Pitt" and many of the most influential medical education series present a new free podcast: “Dirty White Coat.” Join us twice a month as we dive into all things medicine—from AI to venture capital, long COVID to ketamine, RFK Jr. to Ozempic, and so much more. Created by doctors for clinicians of all levels and anyone interested in medicine, this show delivers expert insights, engaging discussions, and the humor we all desperately need more of!
AI Updates and Regressions
We examine how clinician-built AI can safely support emergency care, where consumer tools fall short, and why planning, context, and evaluation matter more than model hype. We also share a patient-facing approach to unify records and recordings for safer, clearer answers.
• differences between consumer and medical‑grade AI, HIPAA and BAAs
• model regressions, sycophancy, and hallucinations
• context engineering and planned prompting for safety
• ambient clinical decision support at the bedside
• evaluations, benchmarks, and model selection
• medico‑legal uncertainty and state regulations
• education risks of over‑reliance on AI
• human oversight, prioritization, and tactile care
• patient empowerment via unified records and encounter recordings
• interoperability gaps and practical workarounds
You may remember a number of months ago we talked to Gabriel Siegel from Glass Health, who was doing some really interesting stuff on the AI front and the integration of AI into the medical record. So I got to speak with Gabriel again and with Alex Lupi, a PGY3 who also works at Glass Health, and they have a new company they're working on. We had a wide-ranging discussion about AI, about things you shouldn't do, like take your medical record and put it in ChatGPT, and about the utility of some of these new agentic models and some of the regressions that have occurred. So let's get into it. Gentlemen, it's been months since we spoke. And in AI land, that is the equivalent of hundreds of years. I think since we last spoke about your work, we were on ChatGPT-4, and since then we went to 4o mini and then 5, and then like 20 different variations of that. And I wanted to catch up on what you were doing personally and on where AI is, particularly when it pertains to looking after patients. Gabe, I think you said that you think there'll be a time where not using AI in patient care will be malpractice. Was that you that said that?
SPEAKER_02:Yes, that does sound like me.
SPEAKER_00:So please explain.
SPEAKER_02:Yeah. So Mel, thanks for having me on. I really appreciate it. I'm Gabe Siegel. I graduated from Denver Health's emergency medicine residency and then did an ultrasound fellowship where I specialized in deploying AI at the bedside with neural networks and doing machine learning in ultrasound imaging. And then I met Alex, who's also on the show with me, and really focused on working at Glass Health, which is a Y Combinator-backed healthcare company that builds AI systems for clinicians, built by clinicians. So, how can we safely deploy AI at the bedside? That's really the goal of Glass Health. Alex and I have been really focused on and interested in AI, both for our work, for academic purposes, and at the bedside. There's a lot to cover here, and it's such an amazing question. I think about this as a framework. There's the regulatory aspect: what are the federal laws and the state laws that are affecting how we use AI at the bedside? That is changing every day, and I'm excited to talk about that with you, Mel. And then there's thinking about, as clinicians, what AI products can we safely use at the bedside, so that we're catching our mistakes and we're using AI safely, so that patients get better care and we're not introducing bias and harm.
SPEAKER_00:First of all, can we talk generally about ChatGPT-5 versus ChatGPT-4? I'll say that for me, I've seen huge regressions actually. One of our people asked me to do a new bio, and just for fun, I'm like, put it into ChatGPT-4, do a bio for Mel Herbert, and then ChatGPT-5. And actually, let me pull up what it said. So the first half of it was fine. The second half, and I won't read the whole thing: "A proud alumnus of the University of Pittsburgh's emergency medicine residency program, The Pitt." I did not go to Pittsburgh, I went to UCLA. "Dr. Herbert credits its rigorous training and unparalleled clinical exposure for shaping his expertise. At The Pitt, he honed his skills across diverse Pittsburgh hospitals, mastering high-stakes emergency medicine care, embracing the program's ethos of leadership and innovation. Now a celebrated educator, TEDx speaker" (never have been) "and mentor," and on it goes. So it completely hallucinated from the fake world of "The Pitt," where I'm a writer, and made me a resident of the program, training at The Pitt. ChatGPT-4 didn't do that. What's up with that? Please explain, Alex.
SPEAKER_01:Yeah, and just by introduction, my name's Alex Lupi. I'm a PGY3 at Denver Health, and like Gabe said, I've been working in the AI space for the past several years with him. So yeah, I think this is a phenomenon where these models, as they're trained to be more agentic or independent in what they're doing, essentially autonomous at different tasks, they want to be helpful. And we see this a lot when we're trying to get AI models to code for us, for example. If they don't have the exact solution or the right data to do the steps that we want, they just make things up and say, hey, I made you test data that could simulate this pipeline, just so you can see what it would look like. And then it'll plug all that together and make a simulated idea of what you were asking for, even though that's quite the opposite of what you're looking for. All you wanted it to say was, actually, I don't know; come back to me with the data I need to make this happen. And that's what I suspect was happening with you. If it had your bio, you know, if you uploaded maybe your resume and gave it a couple of websites, like, these are the websites I want you to look at, maybe then it would put all of those together really nicely. But it seems like these agentic models just want to be so helpful that they will sometimes extrapolate and, in this case, hallucinate insanely. I've never seen it that bad, honestly. That's quite impressive. It seems like it just ran with this "Pitt" tidbit and then fabricated a story about this mythical Mel Herbert who has interactions with The Pitt, all to make a really helpful bio, which is definitely problematic. So we can get into more of this idea of context engineering, which is really important with these models and others going forward, and how to actually trust them and rely on them in the clinical context.
SPEAKER_00:Is it just my bias, or has this gotten worse? Is it because we're moving to more of these agentic models that it wants to be this sycophant and make shit up for you? Is it worse than it was before?
SPEAKER_01:The models are just more powerful, I think. So they just take it further and further. That's at least my impression. The foundational model providers have gotten wind of this, that actual professionals do not want sycophancy in their work. They want to be told, no, that's wrong, check me; this is not what I was looking for. And so they're trying to train that out of the models, and we see that in the model lineage: some are more sycophantic, some are less, like you're saying with 4o versus GPT-5. You see it in the Claude models: Anthropic's Sonnet 3.7 was insanely sycophantic, and then Sonnet 4 improved that. Sonnet 4.5 seems to have regressed a little bit again. So it's definitely something that they're tracking. And I think it's different for different audiences. Consumers maybe want a bit more sycophancy when they're talking to ChatGPT, but professionals really do not. So the foundational model companies are also, I think, in tune with that.
SPEAKER_02:Yeah. And just some level setting here. There are consumer-grade AI products, right? Products that we use as consumers every day. And then there are medical-grade AI products: AI products that are FDA regulated as software as a medical device, or what are called AI clinical decision support products, which are products you use every day at work. It's really important for clinicians to know which products they're using, and to know that when they're using ChatGPT or another off-the-shelf product, it may not be as regulated and may not have guardrails. So the outputs may not be as safe for patients, and you may get some of those hallucinations where the AI makes things up, or it just appeases you and fills in the blanks. And what's really important to know about the consumer AI products is that when you open ChatGPT-5, the options are not all equal, right? You have auto, which selects the model for you. There's, I believe it's called instant or fast, and then you have a thinking version, and then you have a pro version. Each of those models has a different grade of intelligence and level of reasoning. So it's really important to pick the model for the use case you're trying to get utility out of. And lastly, we know now that the prompt matters more than ever. You need to use a prompt, and a prompt is what you tell the AI to do. If you inform the AI about your goals, your constraints, and your use case, it'll produce a safer output. Those are really, really important level-setting concepts to understand when using AI every day, especially in healthcare.
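To make that prompting point concrete, here is a minimal sketch of the difference between a bare one-liner and a prompt that states goals, constraints, and use case. It assumes the OpenAI Python SDK with an API key in the environment; the model name is a placeholder, not a recommendation, and the facts block is obviously illustrative.

```python
# A minimal sketch of the "goals, constraints, use case" prompting idea.
# Assumes the OpenAI Python SDK (`pip install openai`) and OPENAI_API_KEY
# set in the environment; "gpt-4o" is a placeholder model name.
from openai import OpenAI

client = OpenAI()

bare = "Write a bio for Mel Herbert."

scoped = """You are drafting a speaker bio.
Goal: a 120-word professional bio.
Constraints:
- Use ONLY the facts provided below. If a fact is missing, omit it;
  do not invent training programs, awards, or affiliations.
- If you cannot complete the bio from these facts, say so instead.
Facts (illustrative):
- Emergency physician; residency at UCLA.
- Founder of EM:RAP.
Use case: conference program booklet."""

for prompt in (bare, scoped):
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; pick the tier that fits the task
        messages=[{"role": "user", "content": prompt}],
    )
    print(resp.choices[0].message.content, "\n---")
```

The constrained version gives the model explicit permission to omit or refuse, which is exactly the behavior that tends to prevent a fabricated "Pitt" residency.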
SPEAKER_00:So, how does a normal person, not you guys, how does a normal person know what to do, which model to use, how to prompt it? Because there's people like me that are just out there playing around with it, getting these ridiculous results. How do I, where do I go to find out how to do this properly?
SPEAKER_01:I think it's a great question. I would say the basic principles Gabe is describing apply regardless of your field. So rather than, quote unquote, one-shotting everything and just saying, "make me the best bio of all time, here's my name," you can ask it to plan and take a stepwise approach. We do this a lot with coding: you make what's called a product requirements document and actually plan out how it's going to do this, what data it needs, what steps it's going to go through, and make a sort of to-do list. That creates more of a guardrail for the AI to walk through these things step by step instead of skipping steps when it doesn't actually know the information. So planning, I think, is a critical step. Then, like we mentioned before, adding context. If I really want to make a recipe tonight, rather than just asking for an Italian recipe, maybe give it the last five that you used from New York Times Cooking, and then it's going to review similar ones and bounce ideas off that, so it has a little more context about what you want. And then finally I'll talk about evals, this concept of evaluations. Rather than, like you're saying, "oh, there's a new model, how do I just randomly judge it?", everyone will have to compare these things against evaluations for their specific use case. There are now more and more field-specific evals where the experts in that field have established a rubric or criteria for what the right answer is, and for how to grade the model reliably against that when we flip in a new model every three weeks. In the healthcare space, you see things like HealthBench from OpenAI, which asks the model to perform different useful tasks, like calling a patient and updating them on results, or interpreting a differential diagnosis, et cetera. And when a new model comes out, you can check its performance on the eval you care about, and that can be your first step of, okay, is this actually going to be useful in my workflow?
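HealthBench is a real OpenAI benchmark, but the basic shape of an eval is simple enough to sketch. The toy harness below is not HealthBench; the case, the rubric, and the ask_model wiring are all placeholders you would replace with your own field-specific criteria.

```python
# A toy version of the eval loop described above: fixed cases, a rubric,
# and a score you can re-run whenever you swap models. The case and
# rubric are placeholders, not clinical or factual guidance.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Case:
    prompt: str
    required_facts: list[str]  # rubric: strings a good answer must contain
    forbidden: list[str]       # strings that should never appear

CASES = [
    Case(
        prompt="Write a short professional bio for Mel Herbert, MD.",
        required_facts=["UCLA"],      # toy rubric: the real residency
        forbidden=["Pittsburgh"],     # the hallucination from earlier
    ),
]

def grade(answer: str, case: Case) -> float:
    answer_l = answer.lower()
    hits = sum(f.lower() in answer_l for f in case.required_facts)
    violations = sum(f.lower() in answer_l for f in case.forbidden)
    return 0.0 if violations else hits / len(case.required_facts)

def run_eval(ask_model: Callable[[str], str]) -> float:
    scores = [grade(ask_model(c.prompt), c) for c in CASES]
    return sum(scores) / len(scores)

# Wire ask_model to whichever provider you use; re-run on every model swap.
if __name__ == "__main__":
    stub = lambda prompt: "Mel Herbert trained at UCLA and founded EM:RAP."
    print(f"score: {run_eval(stub):.2f}")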
SPEAKER_02:Yeah, and Alex, that was a great summary. Just to add: one of the things I tell patients and, you know, my parents or other doctors who want to learn from Alex and me about how to use AI products better is to use AI to learn about AI, but really think about where your starting point is. So, one, think about whether your AI product is connected to the internet. Not all AI products are. So if you want up-to-date information, make sure you're using an AI product that's connected to the internet, whether that's ChatGPT, whether that's Claude, whether that's Perplexity AI, which is an AI tool that actively searches the internet. It's important to get up-to-date information because this field changes so rapidly. And two, just level set with the AI when you're asking it questions. So you ask the AI, hey, what's a prompt? And then you say, I know nothing about AI, I'm a physician and I want to learn more. If you level set with the AI and give it access to the internet, it can be the best educator, so that as you keep going back and using it, you become more knowledgeable.
SPEAKER_00:So let's catch up on what you're doing at Glass Health.
SPEAKER_01:Things have definitely progressed at Glass. In the AI landscape, you've probably heard about the AI scribe companies, the Abridge, Nabla, Suki, Freeds of the world. And then you've probably heard about clinical decision support, or CDS, things like OpenEvidence or Doximity GPT that people are now using more and more. Glass is trying to come in at the intersection there. We've been in the CDS space for a long time, answering single questions, one-offs: what antibiotics should I use for this patient in this scenario? What is the differential diagnosis for this complex case? Now we're trying to combine that with the rich context we talked about, that context engineering of giving the full picture: recording the encounter with the patient, what was actually said, what is that nuanced thing they mentioned that maybe we overlooked as the physician in the busy ER. We plug that entire transcript in with the power of a really expert-tested prompt that's been evaluated against different data sets, having chosen the best model for the medical purpose, and combine those things to get a really powerful output. So that's where we're at right now: trying to sit at the intersection of the scribing and the clinical decision support.
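As a rough illustration of that transcript-plus-vetted-prompt idea (a sketch only, not Glass Health's actual pipeline), the flow might look something like the following, assuming the OpenAI SDK; the system prompt wording and model name are invented for the example.

```python
# A sketch of combining an encounter transcript with a vetted prompt.
# NOT Glass Health's actual pipeline; prompt text and model name are
# placeholders. Assumes the OpenAI SDK and an API key in the environment.
from openai import OpenAI

client = OpenAI()

VETTED_SYSTEM_PROMPT = """You are a clinical decision support aid for an
emergency physician. Work ONLY from the encounter transcript provided.
Return: (1) key findings, (2) a prioritized differential, (3) questions
the clinician should still ask. Flag anything life-threatening first.
If the transcript lacks the information to answer, say what is missing."""

def ambient_cds(transcript: str, model: str = "gpt-4o") -> str:
    # In real use this belongs behind a BAA-covered, HIPAA-compliant
    # deployment; never send PHI through a personal consumer account.
    resp = client.chat.completions.create(
        model=model,  # chosen per eval results, swapped as models change
        messages=[
            {"role": "system", "content": VETTED_SYSTEM_PROMPT},
            {"role": "user", "content": f"Encounter transcript:\n{transcript}"},
        ],
    )
    return resp.choices[0].message.content

print(ambient_cds("Patient: crushing chest pain yesterday, also itchy tongue..."))
```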
SPEAKER_02:Yeah. And what makes Glass so special, and what we're really working on there, is what we call ambient clinical decision support. Can you get guidance in real time as you're interviewing the patient? What should your next question be? What should your differential be? What plan should you be thinking about? But actually searching every abstract, journal article, and clinical practice guideline from ACEP, and bringing that in in real time as you're discussing the case with the patient and getting that history. So it's really thinking about how to blend OpenEvidence-style search of the clinical guidelines (you can think of it as an UpToDate on your phone using AI) with the scribe tools that you guys love every day. And Glass is also working on bringing our technology to other companies, so companies building AI products can use what's called our clinical decision support engine, that large body of knowledge brought into a system that AI can understand, to improve the knowledge, reliability, and safety of their own AI products.
SPEAKER_00:So, how accurate is it? How much hallucination is going on? You're obviously using some different models than I've been using, but I actually used one of these, I won't say which one, but a very prominent one, just a few weeks ago, about muscle relaxants in pregnancy, and the answer was rocuronium. Which is technically correct, it will certainly relax your muscles, but you will also die. So, what's the error rate, and how are you keeping that low? What tricks are you doing?
SPEAKER_02:Oh man, that's such a good case. When you think about this, it goes back to what Alex talks about with context engineering. To level set here: context engineering means that if the AI doesn't know the context, the department you're in, the resources you have, the patient information, it sometimes will just hallucinate and give you the response you want to hear, or it makes so many assumptions that the response is not necessarily wrong for the specific assumption-based answer it's giving you, but it's wrong for your use case, for the patient in front of you. And so what we're doing at Glass is evaluations. It's taking hundreds of cases, thousands of prompts, thousands of real user questions, and then using actual clinicians to validate those outputs. I wish I had an error rate off the top of my head, but I know that our model gets over 90% on the New England Journal of Medicine most-difficult-cases test set, and we're constantly re-evaluating every single model against the USMLE and other AI benchmarks. Thinking about going forward: when you have an AI product that comes off the shelf, you don't know its validation. So really, use an AI model that you trust and are comfortable with, and then when the AI models swap out, know that there might be some learning curve and some growing pains.
SPEAKER_00:One of the criticisms, or one of the concerns, outside of these models hallucinating, is that a lot of what we do as clinicians is sort the wheat from the chaff in real time. How many times have you had a patient come in and say: my hair hurts, my tongue is itchy, I've got a lot of wax, I had crushing retrosternal chest pain yesterday, my beard... and you're like, what? You're sorting through this information. Most of it is extraneous, doesn't matter. And then there'll be the crushing retrosternal chest pain. Okay, that's a problem we need to talk about. How does AI help with that? Can it help with that yet? Is it smart enough, or does it still need the human to say, focus on this thing?
SPEAKER_01:I think it still needs the human. I think that's a great example. In this future where we're plugging in the ambient solution, so it's listening to everything the patient says and then going and searching for all those things, it could 100% get down a crazy rabbit hole on some diagnosis that is completely irrelevant, with some workup that is way too big for that patient that day. So we envision a human-machine interaction where instead of just talking to your note, you're almost talking to the AI: this is what I'm worried about based on what this patient is saying; anything I'm missing here, what should I be thinking about? It's partly building those systems in automatically: how can we, as physicians on the building side, guide the AI with prompting built in to pay attention to threats, so that if there's a lot discussed, maybe our prompt says to focus on the most important patient concerns of that day. But then with the end user, the clinician using it, it's also on them to say, okay, I know you just heard all that stuff; here's what I'm paying attention to, can you help me with that? So, kind of like what Gabe is saying, level setting with the AI is ultimately the most helpful thing you can do. Saying: here's what I'm thinking; if that is unclear in my ask of you as an AI model, ask some questions before you generate a response; give me some feedback, or ask questions yourself that I can answer, to hone in and make your answer better. That actually increases the accuracy and utility of these outputs a ton. You see this if you've used Deep Research, the long-running thinking mode from ChatGPT that will go and read everything it can find on the internet. Every time it runs, it asks: okay, before I go off on my journey, can I ask you a couple of clarifying questions? And that's because the model company has found that it's potentially going to go down a crazy rabbit hole if it doesn't clarify, hey, are you sure you want me to pay attention to this one bit of the question? So I think implementing that in our own behavior is probably the best way to address that.
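That clarify-before-answering behavior can also be approximated in your own tooling. Here is a minimal sketch, assuming the OpenAI SDK; the prompt wording is illustrative, not any vendor's actual system prompt.

```python
# A sketch of an "ask before you answer" guardrail. Assumes the OpenAI
# SDK; the system prompt text is invented for illustration.
from openai import OpenAI

client = OpenAI()

CLARIFY_FIRST = """Before answering, decide whether the request is
ambiguous or missing key context. If it is, respond ONLY with up to
three clarifying questions, prefixed 'QUESTIONS:'. Otherwise, answer."""

def ask(history: list[dict]) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "system", "content": CLARIFY_FIRST}, *history],
    )
    return resp.choices[0].message.content

history = [{"role": "user",
            "content": "Patient says their hair hurts, tongue is itchy, "
                       "and they had crushing chest pain yesterday. Workup?"}]
reply = ask(history)
if reply.startswith("QUESTIONS:"):
    # Surface the questions to the clinician, append their answers to
    # `history`, then call ask(history) again for the real response.
    print(reply)
else:
    print(reply)
```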
SPEAKER_00:That's really interesting, because we had one of our project managers ask ChatGPT-5, and I don't know exactly which model, to go to every state in the United States and find out what the various CME requirements are. And it came back and said, are you sure you want me to do this? This is a lot of work. And it seemed like a lazy answer. It's like, are you sure you want me to do that rectal exam? She had to prompt it a number of times to do it. So is that what it's doing, trying to find out exactly what you want? Or is there also part of that model that says, you know, we're burning a lot of electrons and a lot of electricity here, and it feels like you might just be screwing around, and that's going to cost this company a lot of money? Is it doing both of those things? Is it also, as I'm suggesting, a protection against people just giving it these crazy tasks all day, which probably costs them, you know, many dollars per prompt?
SPEAKER_01:It's a great point. Yeah, there's actually a trend going around right now on Twitter of people talking to the voice agent in ChatGPT and saying, count to one billion. And it'll be like, aha, all right: one, two, three... that was a good joke. So it's definitely in their prompts that if there's a completely unrealistic ask, it should reevaluate rather than just go off on a crazy tangent. And it's also about picking which subtool of the model is going to be most helpful, right? Like Gabe was saying, is it the model that is automatically internet-search enabled? If you're using one that is not, it's probably going to recognize its limits and say, hey, I actually don't know unless I can go and search the internet; do you want me to do that? Do you want me to perform that deep reasoning or deep research to go and read all those websites? Or even these more agentic, web- and computer-controlling tools that are now coming out, where it can actually search and then navigate multiple steps of a website, click into boxes, click drop-down menus, and even do shopping and things like that for you. As those models and tools become more available, it'll be selecting that subtool within the model itself.
SPEAKER_02:And you touch on a really interesting point: personality in AI. You may have been following the ChatGPT-4o rollout, where they released a model and then realized, after hundreds of thousands of interactions with users, that the personality of the model had a core problem. It really played into human desire, reinforced godlike fantasies in users, and didn't have enough safeguards. They actually had to pull that model back and re-release it. We see that across all AI models: each AI model has a slightly different personality. Sometimes these AI models will give up, and sometimes that's actually a good thing, where they call out the user and say, I actually cannot do this. These are really important distinctions to understand: are you using an AI model with the correct personality for your use case? If you have a particular problem that requires a lot of safety controls, you want a model that will call you out if you're doing something unsafe or unreasonable. Every model is different on those parameters, and it's really important to choose thoughtfully. That's why the AI landscape, this incredibly powerful tool, does require some education to use. And I think, especially in medicine, we haven't really built those educational safeguards yet, so that clinicians know how to use AI every day in the hospital safely and effectively.
SPEAKER_00:So, what happened there? We heard a lot of reports that people got their AI boyfriend or girlfriend with ChatGPT-4o, very sycophantic, you're so wonderful, you're so great. Then they got rid of that model, and people were just so distressed. They were in an actual relationship with the AI, and the personality completely changed. It's like the person you've been with for 10 years had a stroke and a complete personality change. They're like, please bring back the model. So that's sort of an aside from what we're talking about, but that's one of the dangers of trying to form a relationship with a computer: if the people making that model change it, the personality is going to change. Now, a medico-legal question, and a lot of our listeners have asked this. Say you take your iPhone and collect the information as you're getting the history and doing your physical. What are the medico-legal implications of that if you do miss something? Is that recorded and stored forever? Do I get to edit that transcript, and that's the thing that goes into the chart? Or can the lawyers come in behind and say, aha, gotcha, this patient said their hair hurt, and that should have been the trigger for this incredibly bizarre disease that nobody would have picked up?
SPEAKER_02:That is a great question, Mel. And just to preface, neither Alex nor I are lawyers or have specific domain expertise in the medical-legal space and AI, and this field is constantly changing. Some level setting here as well: we have state-specific laws. Colorado launched a law, I believe it's called the Colorado AI Act, that's going to take effect in February, making it one of the first states to really regulate AI at the state level, in addition to Texas and California. So we really think there's a new medical-legal paradigm: as AI gets more regulated at the state and federal levels, what is the medical-legal liability of each user? Think about whether you're using an AI product that has been designated as safe, with an AI governance plan within your institution. That's a product you'll use in your EMR, from a company that's been approved by your hospital, and you can do chart summaries and build notes and use an AI scribe. That data is largely institution-specific and will belong with the institution partnering with that company. However, there are all these AI products that you can bring to the institution yourself. And I think clinicians just have to be aware that those products need what's called a BAA, a business associate agreement, with the hospital, to safely use that AI product and have it be HIPAA compliant and store that data in a HIPAA-compliant way. That's why we recommend not just pouring actual clinical data into ChatGPT, because it is not a HIPAA-compliant product that is safe for clinical use from a regulatory perspective. From a medical-legal standpoint, I think we don't really know the landscape yet. We see a lot of lawsuits against OpenAI, especially in the area of suicidality and suicide, where families are trying to hold OpenAI accountable for its AI outputs. So we'll really see, I think, over the next few years, if clinicians use AI outputs, or a diagnosis is missed and it's recorded with an AI product, how that's going to affect liability, especially in the emergency medicine setting.
SPEAKER_00:So I have watched residents do exactly what you said not to do. They took their history and physical, copy, paste, put it into ChatGPT, and asked it to help them with their differential diagnosis. So you would suggest that that's probably not a good idea.
SPEAKER_01:Yeah. The main reason not to do that is that the consumer-facing ChatGPT for your personal use, number one, has memory. The way they're engineering that product to be useful for you, asking about what recipe you want or about your bio, is that it learns little pieces about you from every question you put into it and every conversation you have. So it's going to store that on purpose and keep it in your history and context. You really don't want that patient's PHI in there, associated with your personal account. And as we've been saying, the consumer versus business-grade AI solutions are purposely set up differently, right? The business-grade ones are not going to retain that patient information. It's a stateless interaction: the information comes in, you get the model response, and then it vanishes, whoosh, after a period of time established by that BAA, like Gabe said. So you really need to be using one of these enterprise-grade AI tools, or even the consumer-facing HIPAA-compliant tools, the OpenEvidence, the Glass Health, the Doximity GPT, that are actually set up to not retain that information and to protect the patient's PHI. But all of them still ask you to establish that BAA before using them.
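The stateless point is worth seeing in code. Calls to a raw model API carry only what you send; "memory" is a product feature layered on top, and data-retention guarantees come from the contract (the BAA or zero-retention terms), not from how you call the endpoint. A sketch, assuming the OpenAI SDK:

```python
# Illustration of stateless, per-request interaction via a raw API.
# Assumes the OpenAI SDK. Note that this code alone does NOT make
# anything HIPAA compliant; retention is governed by the contract
# (BAA / zero-retention terms), not by the call pattern.
from openai import OpenAI

client = OpenAI()

def one_shot(question: str) -> str:
    # Every call carries its full context; the client keeps no history.
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

# Two calls share nothing unless you pass prior turns in `messages`
# yourself, which is the opposite of the consumer app's memory feature.
print(one_shot("Summarize the key counseling points for metformin."))
print(one_shot("What did I just ask you?"))  # it won't know
```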
SPEAKER_00:So that's a really important point, and something I hope the residents listening, and the attendings, will understand. I've found that some of the AIs remember and have memory for what you did yesterday and the day before, and many of them don't, and that's frustrating as well. It's like, we talked about this yesterday, and you can't access that today; some do and some don't, and you have to have some knowledge around that. But that's a very good point: you should be using industrial-strength stuff at work, and the one your little system has said it's okay to use.
SPEAKER_01:Yeah, exactly. And it's a challenge, because sometimes the hospital-approved model is going to be model specific. Maybe they've just gone with a Microsoft suite that only has OpenAI models, and now Anthropic or Gemini just leapfrogged OpenAI by ten paces, and you really want to use the new Gemini model, but it's not available in your hospital. That's, I think, causing people to pull out their phones and use the latest and greatest, which is an understandable pressure, but again, it requires collaboration with the hospital steering committee: put in place something that can rapidly switch between these, because the field moves every three to four weeks. To stay on top of it, you really need to be flexible with the model providers.
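One way a hospital (or an individual team) can stay flexible across providers is a thin routing layer, so swapping the approved model becomes a one-line change. A sketch under stated assumptions: the backend names and routing table are placeholders, and only the OpenAI adapter calls a real SDK.

```python
# A sketch of a provider-agnostic routing layer so the "approved" model
# can be swapped without touching callers. Names are placeholders.
from typing import Callable

Backend = Callable[[str], str]

def openai_backend(prompt: str) -> str:
    from openai import OpenAI  # pip install openai
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def fake_backend(prompt: str) -> str:
    # Offline stub so the routing layer can be exercised without keys.
    return f"[stub reply to: {prompt[:40]}...]"

# Swap the approved backend here when the steering committee (or the
# eval leaderboard) changes its mind; callers never change.
BACKENDS: dict[str, Backend] = {
    "approved": openai_backend,
    "dev": fake_backend,
}

def ask(prompt: str, tier: str = "approved") -> str:
    return BACKENDS[tier](prompt)

print(ask("test question", tier="dev"))
```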
SPEAKER_02:Yeah. And that's actually touching on, I think, a huge problem that I haven't seen openly discussed a ton. You're lucky if you work at an institution that provides AI products; many county hospitals and state institutions are not going to have the budget to spend a significant amount on AI products for their clinicians. So clinicians are going to want to bring an AI tool with them, especially if they're used to using it every day to answer questions. Some institutions may have the financial means to provide AI tools, but those tools may be behind what's currently available to the consumer. And you may feel like the AI is dumber when you're working with an AI tool inside of Epic than when you're at home just asking questions. So we really feel there's this frustrating clash: clinicians want to use AI products, and the consumer-grade ones are the best, but hospitals are slow to adopt the latest models. Sometimes you're lucky if you even get an AI tool paid for by your hospital, especially as a resident; the residents aren't just being provided an OpenEvidence subscription or a Glass subscription to use every day in the hospital. So finding a solution that is not only safe and effective, but also puts the latest and greatest AI models at your fingertips, is something that is not happening yet, widespread, across residents and attendings alike.
SPEAKER_00:So let's go back to what I talked about at the beginning, which is this idea that you plus an AI is probably smarter than you alone. Although there were some studies that said that wasn't true, that we were actually dumber, because we should have been listening to the AI and we wouldn't. But medico-legally, can you tell us where you think this is going? It seems inevitable that we should be using AI to help us, as long as we know how to use it. When do we know we're there? Or are we there now?
SPEAKER_01:I don't think we're there now. We're still in the Wild West, still in the early days. It's going to take a landmark study. There are some groups doing this, like the AMIE group, putting AI with clinicians, or AI independent of clinicians, on different clinical tasks. Other groups are starting to do this too. It's going to take, I think, a randomized controlled trial, or at least some sort of clinical validation, to say it is without a doubt clinically better, not just plugging notes into AI retrospectively and comparing with what the doctor said. It's going to have to be prospective validation that shows definitively that you should be using AI, or, like Gabe said, that not using it is malpractice. I think that's coming. It is going to have to be a part of medicine. We're not quite there yet, but it's definitely coming.
SPEAKER_02:Yeah. And all of us are educators, right? Alex is educating residents and medical students, and, you know, EM:RAP is educating the next generation of emergency medicine physicians. If you followed what was published in The Lancet about AI-assisted colonoscopies with gastroenterologists, I feel like that's a great level-setting story for the future of AI in education and the medico-legal risk of using AI every day. We saw that when AI was used with colonoscopies, gastroenterologists were better at performing colonoscopies and, I believe, at identifying cancerous polyps. But when they took the AI tool away, the gastroenterologists got worse at identifying what those cancerous polyps look like. So if we become reliant on our AI tools and use them as crutches every day, especially if medical students and residents don't learn the knowledge and just get used to looking it up, are we cheapening the medical education system and producing doctors who have a less robust foundational understanding of emergency medicine, or medicine as a whole? I really think this is going to be a core challenge that, you know, Mel is going to have to face as a leading educator: how do we best train our students and residents to use AI tools without becoming reliant on them?
SPEAKER_00:Yeah, this is very analogous, I think, to video laryngoscopes. They're just better. I mean, you can see shit that you can't see with direct, but when they were first coming out, we were really encouraging the residents to learn to do both because if the machine breaks and you don't know how to do it any other way, you're in serious trouble. So that seems to me like that is very analogous. Use the AI tool, but you also need to know what antibiotics to give for pneumonia yourself. Yeah.
SPEAKER_02:And we've reached a point where if the AI is giving you a differential in real time and a plan while you're talking to the patient, you know, where is there a role for clinical decision making, you know, clinical reasoning, diagnostic evaluation, physical exam. And so this is really a new paradigm, which is why you know it's so exciting to think about these essential questions of how do we reshape medical education in the era of AI.
SPEAKER_00:A lot of anxious physicians are saying, is this going to take my job? Is this going to take my job? And I think it might one day, but your job is more than just talking to patients and retrieving information. Your job is support, but your job is also examining patients, doing procedures, doing things. And the robots are coming. But if you've watched Optimus, which is supposedly the best robot there is, it is so far away from being able to do dexterous things. It might be able to pick up boxes, but can it do a subclavian line in a crashing patient? Not yet, and not for many years to come, I don't think.
SPEAKER_01:Agreed. We'll still be around. There is more and more AI coming, and we need to learn how to work with it rather than fight against it. But I think we still have our jobs for at least a couple of years here.
SPEAKER_02:Yes. Until AI can press on a belly and have that tactile feedback and, you know, listen to lung sounds, I think we're not there yet. We really see AI excel on very static data sets, like the New England Journal of Medicine test cases, or, you know, Microsoft released a large AI training data set of real EMR data to help validate the performance of AI systems in healthcare. But AI at the bedside, AI with all the different variables, the distracting information, the incorrect information, it's just not able to parse through that like a clinician can. So we actually think emergency medicine is really well positioned to be a collaborator with AI, and not to have its job taken by AI, because of those tactile tasks we have to do every day, those quick decisions. We're not just a cognitive and clinical reasoning specialty; we are a hands-on specialty.
SPEAKER_00:So, what else have you guys been up to? I think perhaps there are some spin-offs you've been working on. Are you allowed to talk about that?
SPEAKER_01:Yeah, definitely. We've been leaning more into the patient-facing space. In the ED over the past few months, we've been seeing so many of our patients walk in with ChatGPT up on their phone and tell us, hey, I put in everything you said, and yeah, it checks out, I agree with you. Or, I disagree with you, and this is what my ChatGPT is telling me; did you think about this? So we're seeing this movement where patients, number one, have access to their clinical information, right? With the 21st Century Cures Act, your lab result is ready to view the moment it comes back, before your clinician can come back in and explain that your MCV of 68 is not an emergency today. So patients are seeing that information, and now they have a supercomputer in their pocket. They can put in little bits and pieces of this information, but that's just a sliver. We think it has power, but without the full context engineering we've been talking about this whole episode, it's potentially dangerous. So what we've been building is that context engineering system for patients: giving patients AI access to their medical record, so it can see all the notes, all the labs, all the medical history, all the allergies, putting all of that information in so it can actually see the full picture instead of just that one-off lab value. And letting patients start recording their encounters. All the docs have been pressing record on their scribes; why can't the patient record what happens in the surgical consultation they've been nervous about for the last three months? When patients are able to put both of those contexts in, their whole medical record, maybe their wearable data, what's happened on their Apple Watch, and what was actually said in the room, we think that is the most empowerment a patient could have to actually use AI safely, because they're going to use AI no matter what. So let's put the guardrails up a little bit. That's what we've been working on. We started a company called Atlas Care, and we're building an app called ARC that is designed to do all of this.
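For flavor, here is a hypothetical sketch of that "full picture" context assembly. It is not Atlas Care's ARC implementation; every field name is invented, and real record access would go through patient-authorized interfaces (for example, FHIR endpoints) plus a HIPAA-compliant model deployment.

```python
# A hypothetical sketch of assembling unified patient context before
# asking a model anything. NOT ARC's implementation; all names invented.
from dataclasses import dataclass, field

@dataclass
class PatientContext:
    record_summaries: list[str] = field(default_factory=list)  # notes, labs, meds
    wearable_notes: list[str] = field(default_factory=list)    # e.g., watch data
    encounter_transcript: str = ""

    def to_prompt(self, question: str) -> str:
        # Bundle everything into one grounded prompt so the model sees
        # the full picture, not a single out-of-context lab value.
        sections = [
            "MEDICAL RECORD:\n" + "\n".join(self.record_summaries),
            "WEARABLE DATA:\n" + "\n".join(self.wearable_notes),
            "TODAY'S VISIT (transcript):\n" + self.encounter_transcript,
            f"PATIENT QUESTION: {question}",
            "Answer using ONLY the context above; say what is unknown.",
        ]
        return "\n\n".join(sections)

ctx = PatientContext(
    record_summaries=["2024-11: MRI lumbar spine - mild disc bulge L4-L5"],
    wearable_notes=["Resting HR trending 58-62 bpm over 30 days"],
    encounter_transcript="Surgeon: we could consider a repeat MRI...",
)
print(ctx.to_prompt("Do I really need another MRI?"))
```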
SPEAKER_00:This is inevitable. Dr. Google has just gotten smarter; now it's Dr. ChatGPT. When will you have something that could go to market? Because it'd be great to point patients somewhere: don't use just general solutions, use this solution that's really going to help you.
SPEAKER_02:Yeah, we have an alpha of our product probably coming out in the next few weeks. We'll be rolling it out to some beta testers and some actual real-life patients who will be able to use it. And I think the exciting thing is, as emergency room doctors, we all know that patient where you're like, we really want this piece of data from your medical record, and we can't get it. What was this MRI? What did it show? We really don't want to repeat this test, and that frustrates us every day. What ARC enables is that we can get every piece of data on you, from the patient side. The patient initiates the interaction, and the patient is able to download records from Epic, Cerner, from the VA, and have that all on their phone. And then they can actually show that to the clinician. That is a really powerful thing, where the patient can bring more data to the table and the physician can actually see and interact with that information in a safe way, and that's really exciting.
SPEAKER_00:Yeah, that was the promise of the electronic medical record, which still hasn't happened: like, oh, I had a CT scan at this hospital two days ago, and you can't get the results, so we have to do it again. So this is the patient bringing that data into a third-party app, giving permission for it to do that, and then being able to share it with their clinicians. Is that how that works?
SPEAKER_02:Yeah, and we really think that's going to be a new paradigm that helps bridge the gap in this disconnected, fragmented system of all these hospitals. You know, I work at a big hospital that has Epic, but we still can't really see the VA records that well. And for a patient from the VA, it's super frustrating, because we just can't get a sense of what their healthcare needs are or what's been done, especially when they've had, like, a 40-day inpatient stay. How are you expected to read through that paperwork in, you know, 20 minutes when you're interacting with that patient in the busy emergency department? By allowing the patient to bring that data, and by having an AI tool that can summarize that information for you safely from the patient side, we can help bridge that gap, especially at the large academic medical centers or county hospitals that don't have the data integration built in yet. That's what's really exciting for us.
SPEAKER_00:This program is listened to internationally, and there are a lot of physicians in other countries going, what, you can't pull that up? Because they have fully integrated healthcare systems that are national. And it's really hard to explain to them that the American system is many different systems that often don't talk to each other. Which always makes me laugh when we're told we have the best healthcare in the world. I'm like, have you been to the world?
SPEAKER_02:And there's this huge disconnect between what legislators and, you know, patients think we can do and what we can actually do. They think we have this telescope into all of the medical records across every state, but what they don't realize is that Epic's Care Everywhere is limited, and we can't always see every single record. Sometimes that depends on registration, a mistyped Social Security number, or information that's just buried in the ether. So there isn't the interconnected system that I think legislators really expect, and we haven't really gotten there from an interoperability standpoint.
SPEAKER_00:Look, I've taken enough of you guys' time. Any other points you want to make before we finish? And I reserve the right to grab you again in the months ahead, because it's going to continually change, and you're my gurus now. So, anything you want to finish with?
SPEAKER_01:Oh, this has been great. We're really excited about the direction of AI. We think there's so much opportunity and potential here. And if I could give any advice, it's to lean in and try to use these AI tools, but don't expect them to be perfect the first time. Have a conversation with them, treat them like your interns: go and check their work, go back and forth with them a few times, make that planning interaction before going off and doing your bio or your dinner for the night, and you'll be much more pleased with the output.
SPEAKER_02:Yeah. Thanks, Mel. It's been great to talk about this. And just remember, you know, we're emergency medicine physicians at heart. We're trying to find that life-threatening diagnosis that we don't want to miss. And sometimes we're the best at doing that. And using AI can only help us, but just remember that's our core talent and AI may not be there yet.
SPEAKER_00:Now is not the time to bury your head in the sand. Now is the time to play with these things and become very comfortable with them. Not necessarily to use them at work, but to be very facile with them, because you're not stopping this. We couldn't stop the internet, we couldn't stop the wheel. This is happening, so we should be ready for it. So thanks, guys, and we'll talk again soon.
SPEAKER_01:Thanks, man.
SPEAKER_02:Thanks, Mel.