Digital Pathology Podcast
181: Can AI Read Clinical Text, Tissue, and Costs Better Than We Can?
What happens when artificial intelligence moves beyond images and begins interpreting clinical notes, kidney biopsies, multimodal cancer data, and even healthcare costs?
In this episode, I open the year by exploring four recent studies that show how AI is expanding across the full spectrum of medical data. From Large Language Models (LLMs) reading unstructured clinical text to computational pathology supporting rare kidney disease diagnosis, multimodal cancer prediction, and cost-effectiveness modeling in oncology, this session connects innovation with real-world clinical impact.
Across all discussions, one theme is clear: progress depends not just on performance, but on integration, validation, interpretability, and trust.
HIGHLIGHTS:
00:00–05:30 | Welcome & 2026 Outlook
New year reflections, global community check-in, and upcoming Digital Pathology Place initiatives.
05:30–16:00 | LLMs for Clinical Phenotyping
How GPT-4 and NLP automate phenotyping from free-text EHR notes in Crohn’s disease, reducing manual chart review while matching expert performance.
16:00–23:30 | AI Screening for Fabry Nephropathy
A computational pathology pipeline identifies foamy podocytes on renal biopsies and introduces a quantitative Zebra score to support nephropathologists.
23:30–29:30 | Is AI Cost-Effective in Oncology?
A Markov model evaluates AI-based response prediction in locally advanced rectal cancer, highlighting when AI delivers value—and when it does not.
29:30–38:30 | LLM-Guided Arbitration in Multimodal AI
A multi-expert deep learning framework uses large language models to resolve disagreement between AI models, improving transparency and robustness.
38:30–44:30 | Real-World AI & Cautionary Notes
Ambient clinical scribing in practice, AI hallucinated citations, and why guardrails remain essential.
KEY TAKEAWAYS
• LLMs can extract meaningful clinical phenotypes from narrative notes at scale
• AI can support rare disease diagnosis without replacing expert judgment
• Economic value matters as much as technical performance
• Explainability and arbitration are becoming critical in multimodal AI systems
• Human oversight remains central to responsible adoption
Resources & References
- Digital Pathology Place: https://www.digitalpathologyplace.com
- Digital Pathology 101 (free PDF, updates included)
- Automating clinical phenotyping using natural language processing
- Zebra bodies recognition by artificial intelligence (ZEBRA): a computational tool for Fabry nephropathy
- Cost-effectiveness analysis of artificial intelligence (AI) for response prediction of neoadjuvant radio(chemo)therapy in locally advanced rectal cancer (LARC) in the Netherlands
- A multi-expert deep learning framework with LLM-guided arbitration for multimodal histopathology prediction
TRANSCRIPT:
00:00:03 - 00:00:39
Welcome trailblazers. Happy 2026. Welcome to the first Digi Path Digest of 2026. This is number 34. Uh and I'm proud of us. Let's do this. I'm going to say hi in the chat because uh we have some computer trouble. Let's see if I can say hi. And if you can see me and hear me, and I see you joining already, let me know in the chat.
00:00:34 - 00:01:06
Just say hi in the chat and say where you're tuning in from. What time is it for you? It's 6:00 a.m. for me in Pennsylvania. A couple of updates while I wait for the rest of you. So, obviously, happy 2026. Um the last thing that we did last year, and there was a recap on the podcast and probably a video on the YouTube channel as well, uh was the recap of 2025.
00:00:59 - 00:02:00
That was December in London. You may see me walking through London, and you will see the double-decker buses in the background as I recount what happened in 2025 in digital pathology. And I'm super excited about 2026 and the new developments, and of course our uh new papers, right? Ah, thank you so much, greetings from London. And I see that um comments from LinkedIn are going through, so if you are there on YouTube or Facebook, let me know as well, so that we can have a discussion and we can have a conversation. What else happened since the beginning of the year? So, I am very... I know you've heard that several times, but I am working on the new version of the book. And everybody who has the book, and by the book I mean Digital Pathology 101,
00:01:58 - 00:02:28
all you need to know to start and continue your digital pathology journey if you're new to the field. Uh there is a free PDF on my website and I'm going to give you a QR code right now. Let me click the QR code. You should see it on the bottom of the screen now. Uh the QR code to get the free PDF of the book.
00:02:24 - 00:02:55
If you're connected on LinkedIn, you may be getting messages from me asking, "Hey, did you get the book already?" Uh cuz there is a new version coming. I'm already on chapter three out of five. So, it's actually happening. One and two are updated. Um, and obviously it's taking me longer than I wanted it to take me, even though I'm leveraging AI as hard as I can.
00:02:49 - 00:03:16
Uh, because if I wasn't, it would take me even more time. So, um, when you get this book, it's going to be the previous version, but don't worry because everybody who has signed up, who had, uh, subscribed for the free PDF of whichever version is going to automatically get new versions. So, um, and I see you guys are scanning the code. So, thank you so much.
00:03:11 - 00:03:42
Um, that's an update. Then, uh, I was mentioning a couple of times something I was planning to do, and actually I'm committing to doing this, uh, on Valentine's, as a Valentine's special thing: how to create expert presentations with AI. I have um gathered different tools.
00:03:38 - 00:04:07
I have described my process. So at this uh conference in London organized by Global Engage, digital pathology and AI, um I was speaking as well, and I used this process to prepare my presentation, to prepare the topic, the slides, and also to practice it. I did practice. I like dread practicing once my slides are ready.
00:04:03 - 00:04:39
I'm like, "Oh, let let's just be done with it. Let's wing it." And I know that winging it is okay. And I'm just going to say hi to people. Hi to India. It's 4:30 in India right now. And who else is joining? Verona. That's so nice. We have some publications um out of Italy today. Anyone else joining right now? Uh, oh, get them borg as well.
00:04:32 - 00:04:59
Nothing from Sweden. Maybe next time. Um, okay. So, what I was saying, the AI presentations, right? I have a process and I actually practice and it went better than without practicing. And in every presentation book, you're going to see like, oh, practice because then it's going to be so much better.
00:04:52 - 00:05:21
But I know that we all are busy people, um doctors, researchers, digital pathology trailblazers, who, when they are done with the slides often like uh 3 hours before we actually are presenting, if we're lucky, then we're good to go. Um and we have enough experience. But there is a process that lets you practice without actually investing too much time in it.
00:05:18 - 00:05:46
So that's going to be out there around Valentine's. If you're interested, uh, let me know in the chat. AI presentations. AI presentations. Um, obviously if you have any other questions, you let me know in the chat. But we need to start. We need to start discussing what happened in the digital pathology space. Let me share my screen.
00:05:42 - 00:06:21
Oh, actually, I'm already sharing, but I blew myself up. Come on. Okay. Um, is the code still visible? I don't see it anymore. I'll put it up later. Okay, now I see it. Now I don't see it. Okay, good. Because we need to show the paper. Apologies for like having trouble with the computer.
00:06:14 - 00:07:12
I'm getting a new one next week, so we're going to be good again. But I want to see myself as well because I make faces. I want you to see me making my faces. It's part of the process. Stay with me. Chat in the chat and I'll figure it out. Okay, let's do it one more time. Stop sharing. Start sharing share and please. No. Okay.
00:07:09 - 00:07:44
You're not going to see my faces today. I need to make the paper big, so you're going to be hearing my voice, and then I'm going to... And hello Scott, Atlanta, welcome in the morning, right? Atlanta, 6 a.m. as well. Okay, let's start. Let's start uh Automating Clinical Phenotyping Using Natural Language Processing.
00:07:37 - 00:08:21
Let's see if my pen works. It works. Okay, good. So, this one is super cool. I think it's this one. You're going to see. But basically, um, so clinical phenotyping, what is that even? This is a group from Germany and New York, so international collaboration. Um so often there are studies, real-world studies, based on electronic health records, uh and they require manual chart review.
00:08:17 - 00:09:19
And when we see the word manual, we know it's going to be slow and labor intensive and it's not going to be scalable, limited scalability, right? So they developed and uh compared computable phenotyping based on rules using the spaCy framework. This spaCy framework uh is a Python framework for natural language processing, and um they had this framework and a large language model, GPT-4, for sub-phenotyping of patients with Crohn's disease, uh considering age at diagnosis and disease behavior. So, we have GPT-4. They worked with GPT-4. Uh, when you go to ChatGPT, there's already ChatGPT 5.2. Um, and you guys are so sweet. I was having trouble and people
00:09:16 - 00:10:18
are saying, "No rush. Take your time. Thank you so much." And um, yes, this is the cool paper, because you can already see this little heart here. But okay. So, they had this framework, um, let's call it the spaCy framework. I don't know if I'm pronouncing this acronym correctly. Uh for the LLM-based approach they used the GPT-4 model, and they had data that included almost 50,000 clinical notes and 2,24 radiology reports from um 584 Crohn's disease patients, and then they had this test set of 280 clinical texts, uh and these texts were labeled at sentence level in addition to patient-level ground truth data. Um and then they evaluated the algorithms for
00:10:15 - 00:10:52
recall, precision, specificity, and F1 values. So all the parameters that we have for evaluating these models. And the results were similar or better performance using GPT-4 compared to the rules. Uh on a note level the F1 score is at least 0.9 for disease behavior and 0.82 for age at diagnosis; at patient level this is a little lower, 0.66 for disease behavior, 0.
00:10:44 - 00:11:22
71 for age at diagnosis. And the conclusion that they have here: this is the first study to explore computable phenotyping algorithms based on clinical narrative text, uh where prior inter-annotator agreements range from 0.54 to uh 0.98. And we know that um all these inter-rater, inter-human um comparisons, they always range from like 0.
00:11:18 - 00:11:53
54, not too high; sometimes they agree more, but in the pathology world I always say, hey, if it's 0.7 then we're so in agreement, and then there's still 30% that were not in agreement. So um this underlines the potential of LLMs for computable phenotyping and may support large-scale cohort analysis from electronic health records and streamline the chart review process in the future.
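To make the two approaches they compared a bit more concrete, here is a minimal sketch of rule-based versus LLM-based sentence-level phenotyping. This is not the paper's code: the phenotype labels, the tiny lexicon, the prompt wording, and the model name (gpt-4o as a stand-in for GPT-4) are all illustrative assumptions.

```python
# Minimal sketch, not the study's implementation: a rule-based spaCy matcher versus
# an LLM prompt for flagging Crohn's phenotypes in single sentences. Labels, lexicon,
# prompt wording, and the model name are illustrative assumptions.
import spacy
from spacy.matcher import PhraseMatcher
from openai import OpenAI  # assumes the openai>=1.0 client and an OPENAI_API_KEY

nlp = spacy.blank("en")
matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
# Tiny example lexicon for stricturing disease behavior (illustrative only)
matcher.add("BEHAVIOR_STRICTURING", [nlp.make_doc(t) for t in
                                     ("stricture", "bowel narrowing", "stenosis")])

def rule_based_flags(sentence: str) -> set[str]:
    """Return phenotype labels whose lexicon terms appear in the sentence."""
    doc = nlp(sentence)
    return {nlp.vocab.strings[match_id] for match_id, _, _ in matcher(doc)}

client = OpenAI()

def llm_flags(sentence: str) -> set[str]:
    """Ask a GPT model to label one sentence; parse its comma-separated answer."""
    prompt = (
        "Label this sentence from a Crohn's disease clinical note.\n"
        "Answer with a comma-separated subset of: BEHAVIOR_STRICTURING, "
        "BEHAVIOR_PENETRATING, AGE_AT_DIAGNOSIS, NONE.\n"
        f"Sentence: {sentence}"
    )
    reply = client.chat.completions.create(
        model="gpt-4o",  # stand-in; the study used GPT-4
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    labels = {x.strip() for x in reply.choices[0].message.content.split(",")}
    return labels - {"NONE"}

sentence = "Imaging shows bowel narrowing consistent with a stricture."
print(rule_based_flags(sentence), llm_flags(sentence))
```

In the study itself, sentence-level flags like these were compared against labeled ground truth to produce the recall, precision, specificity, and F1 values mentioned above.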
00:11:46 - 00:12:23
And now the heart is for the plain language summary, trailblazers. Let me show myself now. So, plain language summary. How cool is that? So, they included it here and I'm going to read it to you. I'm going to show it to you in a second. But basically, I read this one and I went... there are two other abstracts that are a little bit more complex, and I'm like, hm, how about I ask my LLM. And I use ChatGPT, I use Claude.
00:12:17 - 00:12:54
Sometimes I also ask Perplexity, which is like an LLM-powered search engine. But basically I did a screenshot of this and I'm like, can you explain it in plain language, and it did. So uh if you see an abstract that you would like to get explained in plain language, or you need to explain to somebody else in plain language, screenshot it and put it in ChatGPT. And we have people from Hamburg, at around 12. Yeah, 12:13 probably now. Right.
00:12:52 - 00:13:24
Perfect. So let's do the plain language summary. It's beautiful. Doctors and researchers often need to group patients by specific medical features. And these are the phenotypes. And this is the phenotyping, basically grouping patients by specific medical features. And much of this information is in free-text clinical notes rather than tabular or structured data.
00:13:19 - 00:13:49
And I'm going to give you a comment on um free text plus structured data efficiency. So uh here they had the patients with Crohn's disease, and uh the free-text notes can include important details describing the disease course over time, such as bowel narrowing (strictures), abnormal openings (fistulas), problems with the area around the anus, and age at diagnosis.
00:13:43 - 00:14:29
So how cool is that? They explain what a stricture is. They explain what a fistula is. Um, and this plain language summary, it's like um kind of smartly written, because it explains without uh like dumbing it down. Um, and it aligns with a principle that I once learned, called... never... I want to show myself again, and I can't... now I can.
00:14:22 - 00:15:24
So, never underestimate your audience's intelligence, but never overestimate their prior knowledge. Right? So if you're listening to this, you're a highly intelligent uh healthcare professional or somebody involved in healthcare in one way or another. Um so by default you're a super intelligent person, but you may not have the prior knowledge, right? So that's my goal, uh to explain things in a way that everybody understands them, and the same counts for me, right? I do need to go back and check the computer vision terms. I'm a pathologist. My expertise and main domain is pathology, but I do talk about uh things that uh are from computer vision, from AI, from you know whatever is happening in this space. So I do need to um update my knowledge base as well on a regular basis. So anyway, heart for plain language
00:15:20 - 00:15:51
explanation. I love that. Um so yes, because when we only use structured data we often miss these details, like uh whatever happened there, whatever was written in the text, uh and reading notes by hand is very slow and costly. Um so we use NLP, natural language processing, and LLMs, large language models.
00:15:46 - 00:16:09
Um, and they built two NLP approaches and created a new sentence-level data set to test them. Uh, and they say this method could save time in research and help clinicians flag people who may need extra care. I love this plain language summary. I don't know if that was a requirement of the journal.
00:16:05 - 00:16:41
What is the journal? Let's check it. Um, this is Communications Medicine, London. Um, I don't know. I love it. I like it. Sometimes they also ask for visual abstracts. That's cool. That's cool as well. But the plain language summary is the highlight of today. Okay. Let me know what you think about using LLMs.
00:16:36 - 00:17:09
Oh, I need to tell you a story. So, uh, using LLMs for clinical notes. That brings me to my latest visit. Um, let me maximize myself for a second. Sorry for like commenting on my clicking actions. Uh, so related to that paper, uh, using LLMs or using AI in, uh, healthcare. So, I live in Fairfield, Pennsylvania.
00:17:02 - 00:18:02
Fairfield is not a big town. It's, I think it has 500 uh inhabitants. So you can imagine what kind of infrastructure we have in a town of 500 inhabitants. And then I go to the doctor for just a regular checkup. I got a spot, like I think on the 31st of December, so the last day of the year. Uh and I go, and it's just a regular checkup, right? Because I needed a refill for my prescription, and they wouldn't give me the refill because I hadn't had a checkup in too long, right? So, I go and I learn that our little um practice, well, it's a part of a chain, but that they use an ambient scribe, which is this AI transcription thing where you as a healthcare provider can have a natural conversation with the patient. You don't have to like type and
00:18:00 - 00:18:27
look at your screen and then pretend that you're actually paying attention to like the facial expressions of your patient, which you are not, because you're typing what they're saying. So now they use this uh transcription service, ambient scribe, and I had a conversation with uh my provider and it filled in the clinical notes, everything, without her having to type it.
00:18:22 - 00:19:00
So that was super cool. I felt proud that uh Fairfield is so advanced in terms of uh AI. Uh one thing that I would improve is to actually like inform people that this is happening. Uh did I ask about it? I think I asked about it but but you know I'm obviously interested in the topic. So we had a lengthy conversation that was probably longer than my checkup itself uh about AI and that was cool.
00:18:51 - 00:19:38
So AI in healthcare is happening. LLMs for bringing back the patient-provider relationship and contact are happening as well. And now let's go back to the papers. Zebra bodies recognition by artificial intelligence (ZEBRA): a computational tool for Fabry nephropathy. So this is a... so, greetings to Italy.
00:19:32 - 00:20:17
This is a group from Italy, different universities in Italy. And Fabry disease is a rare lysosomal storage disorder caused by mutations in the uh GLA gene. And there is an accumulation of a lysosomal substance. It's called globotriaosylceramide. Accumulation of globotriaosylceramide. Yeah, it has a very characteristic appearance on electron microscopy, which I'm not going to show you right now cuz my computer is going to freak out and then we're not going to be able to um see anything.
00:20:12 - 00:21:01
But you know what? I'm just going to draw it. Electron microscopy, right? But that's like an advanced imaging technique. So these lysosomes look like zebras, like they have stripes. That's why they're called zebra bodies. But here we are talking... let me remove my thing. Um so the diagnosis, uh especially in females, is more difficult, and um we need a renal biopsy to uh assess it, right? That remains essential, and obviously interpretation requires expert pathologists.
00:20:55 - 00:21:23
So here they did not use electron microscopy, which is like pretty straightforward, because I thought, oh, zebra bodies with AI on EM images, that's going to be a no-brainer. No, no, no. They used whole slide images from renal biopsies uh of Fabry nephropathy patients to develop and validate a foamy podocyte screening AI tool.
00:21:21 - 00:22:04
Foamy podocytes are a lot less characteristic than these zebra bodies in electron microscopy, and you do need expert pathologists uh to recognize them. So they are basically like foamy cells. Um but they developed this AI tool that first classifies glomeruli and then segments uh podocytes, and they evaluated the performance using standard metrics.
00:21:58 - 00:22:33
Um and they designed a new Zebra score uh to quantify disease burden and correlation with histological scores and clinical parameters. Um and they had a classification accuracy of 79% in identifying foamy podocytes. So this is a new AI screening tool that is supposed to support the pathologist. So um let's highlight that um here.
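Just to picture the shape of such a two-stage pipeline (classify glomeruli, then segment podocytes and flag foamy ones, then roll that up into a slide-level score), here is a minimal sketch. The model components are passed in as placeholders and the scoring formula is an assumption, not the authors' Zebra score.

```python
# Minimal sketch of a two-stage screening pipeline of this kind; the classifiers,
# the segmenter, and the slide-level score below are placeholders, not the ZEBRA tool.
import numpy as np

def keep_glomeruli(tiles, glom_classifier):
    """Stage 1: keep only the tiles the first model labels as glomeruli."""
    return [t for t in tiles if glom_classifier(t) == "glomerulus"]

def foamy_fraction(glom_tile, podocyte_segmenter, foamy_classifier):
    """Stage 2: segment podocytes in one glomerulus and return the foamy fraction."""
    podocytes = podocyte_segmenter(glom_tile)  # e.g. a list of cell crops or masks
    if not podocytes:
        return 0.0
    return sum(foamy_classifier(p) == "foamy" for p in podocytes) / len(podocytes)

def slide_score(tiles, glom_classifier, podocyte_segmenter, foamy_classifier):
    """Illustrative slide-level burden score: mean foamy-podocyte fraction per glomerulus."""
    gloms = keep_glomeruli(tiles, glom_classifier)
    if not gloms:
        return 0.0
    return float(np.mean([foamy_fraction(g, podocyte_segmenter, foamy_classifier)
                          for g in gloms]))
```

A score like this is exactly the kind of screening output the authors emphasize: it flags high-risk biopsies for the nephropathologist; it does not make the diagnosis.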
00:22:27 - 00:23:28
The AI... sorry, wrong highlighter. The AI-assisted ZEBRA pipeline highlights high-risk Fabry nephropathy features to support the nephropathologist as a screening tool. Right? So this is important because it's pathologist support. Um and you know, if you've been here or at any digital pathology conference um more than once, uh you will still hear the questions: hey, is it going to take away uh the jobs? This is like part of the conversation every time there is a new uh AI tool. Is it okay? Are we going to be uh less proficient in what we know? It's a technology discussion, right? So here the important thing: it's a screening
00:23:25 - 00:24:10
tool that helps the pathologist, and it recognizes these foamy podocytes. Okay, any questions, any comments, let me know in the comments. I see our comments. We have two more to go. Let's see. Oh, yeah. This one is interesting as well. Let me make it bigger for you because it's about... oh, can we save money with AI? And um there is no discussion about healthcare without a discussion about cost effectiveness or money in healthcare.
00:24:07 - 00:24:32
Um so here we have cost-effectiveness analysis of uh artificial intelligence for response prediction of neoadjuvant radio(chemo)therapy in locally advanced rectal cancer, LARC, locally advanced rectal cancer, in the Netherlands. This one is from the Netherlands, and also we have authors from Poland.
00:24:26 - 00:25:05
So um greetings to Belgium, the Netherlands, Poland and Italy again. Italy is very represented. So if there is anybody from Italy, let's say hi in the comments. Okay. So um this study aims to provide insights into the potential cost effectiveness of an AI tool in the response prediction to neoadjuvant chemoradiotherapy of stage II-III rectal cancer, and this is in comparison to usual care.
00:24:59 - 00:26:00
So this is a hypothetical study, right? Uh it's not like... it's a model. It's modeling, right? So this study included a state-transition Markov model from a Dutch societal perspective. Um quality-adjusted life years and costs were simulated over a 10-year horizon. And you know the important uh word here is that they were simulated, right? We're hypothesizing about it, right? Um and then uh a sensitivity analysis and a threshold analysis were performed, and we're going to see in a second what that means. Um and the results: like in the best case scenario, when everything works well and the AI is uh performing well, there was an incremental cost saving of uh 2.5
00:25:56 - 00:26:36
million euro per uh quality-adjusted life year gained per thousand patients, right? So this is a thousand patients and 2.5 million euro cost savings. Um main drivers of cost effectiveness were the clinical complete response incidence and the specificity of the tool, uh and cost effectiveness was maintained if the cost of AI was €1,100 and €2,100.
00:26:27 - 00:27:17
So they simulated, okay, excuse me: what if deploying this tool uh would cost over a thousand euro? It was still cost effective to use it. Over 2,000? It was still cost effective, uh and uh performance at uh 0.85 and uh 0.90, which is very like good performance. Right. So then the question arises, okay, if the deployment uh costs more, then at some point you lose the cost effectiveness; if the performance goes down, then uh obviously this is not a viable case anymore.
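For anyone who has not built one of these, a cohort state-transition (Markov) model of this kind is mechanically quite simple. Below is a minimal sketch of the mechanics only; every transition probability, cost, utility, and the discount rate are invented placeholders, not the Dutch model's inputs.

```python
# Minimal sketch of a cohort state-transition (Markov) cost-effectiveness comparison.
# All probabilities, costs, utilities, and the discount rate are invented placeholders,
# NOT the parameters of the study discussed here.
import numpy as np

CYCLES = 10      # yearly cycles over a 10-year horizon
COHORT = 1000    # report results per 1,000 patients, as in the discussion above

def run_markov(transition, cycle_cost, cycle_utility,
               start=(1.0, 0.0, 0.0), discount=0.03):
    """Return discounted cost and QALYs per patient for one strategy.
    States are (disease_free, recurrence, dead); rows of `transition` sum to 1."""
    dist = np.array(start)
    cost = qalys = 0.0
    for t in range(CYCLES):
        d = 1.0 / (1.0 + discount) ** t
        cost += d * float(dist @ cycle_cost)
        qalys += d * float(dist @ cycle_utility)
        dist = dist @ transition
    return cost, qalys

usual = dict(
    transition=np.array([[0.90, 0.07, 0.03],
                         [0.00, 0.80, 0.20],
                         [0.00, 0.00, 1.00]]),
    cycle_cost=np.array([4000.0, 15000.0, 0.0]),
    cycle_utility=np.array([0.80, 0.60, 0.0]),
)
# Hypothetical AI-guided strategy: some surgery cost avoided for predicted complete
# responders, with a per-cycle AI tool cost already folded into the first state.
ai = dict(
    transition=usual["transition"],
    cycle_cost=np.array([3200.0, 15000.0, 0.0]),
    cycle_utility=np.array([0.83, 0.60, 0.0]),
)

c0, q0 = run_markov(**usual)
c1, q1 = run_markov(**ai)
print(f"Incremental cost per 1,000 patients: {COHORT * (c1 - c0):,.0f} EUR")
print(f"Incremental QALYs per 1,000 patients: {COHORT * (q1 - q0):,.1f}")
```

A threshold analysis like the one described in the paper essentially reruns a model like this with a higher AI cost or a lower tool performance until the incremental result flips sign.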
00:27:11 - 00:27:59
Uh so they say the findings of this study present the economic impact of a hypothetical AI-based approach to treatment response prediction uh in stage II-III locally advanced rectal cancer patients who received neoadjuvant chemoradiotherapy and are eligible for consecutive surgery, and the results of the study highlight the complexity of healthcare decision-making.
00:27:53 - 00:28:25
I think, you know, this is kind of a no-brainer, but I think in general, it's so much easier to think in black and white, and so much easier to, you know, make decisions when you only have like a good and a bad option, but there's often good and better, or bad and worse, and things like that.
00:28:21 - 00:28:52
So uh there are a lot of nuances specifically to healthcare and you know these models help and if we're lucky and we have a great case scenario where everything is performing over 0.85 whichever metrics they chose then we have a fantastic tool to save money. Um and but if not then we don't but great exercise I think. Okay let me know.
00:28:45 - 00:29:46
Oh, and Leeds, UK is saying hi as well. Hello. And we have some comments that AI will increase productivity and efficiency in automation. Um, and this has already impacted clinical labs. I think so. I think there are a lot of, um, how do I call them... I do call them low-hanging-fruit kind of AI tools, and I'm referring to workflow uh improvement tools, workflow redesign tools. Um maybe we're going to get some papers next week or soon, when um agentic AI approaches are going to uh get published. So basically where you're trying not to replace or help the diagnostician in the
00:29:43 - 00:30:14
diagnosis, but help them in their workflow, where they like don't have to click millions of windows and uh lose time and patience and things like that. So definitely AI entering these areas of clinical labs is super valuable without threatening the expertise of the physician, uh doctor, healthcare provider.
00:30:10 - 00:30:55
So let me make this one big. This is our last one, but stay till the end. I'm going to give the QR code for the book again so that you can scan it if you have not yet. Okay. And that's an interesting one as well. A multi-expert deep learning framework with LLM-guided arbitration for multimodal histopathology prediction.
00:30:45 - 00:31:32
Um, so and this is Cincinnati, Ohio, USA. So we all know that we have advanced a lot. Deep learning like changed the landscape of computer vision, of uh computer-aided healthcare. Let's call it like super broad: computer-aided healthcare. Deep learning. Um, one of my podcast guests, Andrew Janowczyk, called it uh as if we were given fire, like the invention of fire, deep learning in the broadest uh broadest sense.
00:31:26 - 00:31:56
And now transformers and uh transformer-based architectures like made this fire even bigger. So um this helped with improving the accuracy of computational pathology. But um there are a lot of models being created, and conventional model ensembling strategies often lack adaptability and interpretability.
00:31:52 - 00:32:25
What does that mean, these ensembling strategies? So let's say you are uh developing multiple models, and one model says one thing and the other model says a different thing. Uh and then you have several more, and then you um let them vote, or like you ensemble, you like uh collect all the results, and maybe you do, um, and they mention what this is, like majority voting, or aggregating this in a classical way.
00:32:26 - 00:32:54
So um, while these models, multiple AI models, can provide complementary perspectives, the aggregating of their outputs is often insufficient for handling inter-model disagreement. So not only do pathologists disagree, the models disagree as well.
00:32:50 - 00:33:20
Uh so what do we do with that? To address these challenges, the authors proposed a multi-expert framework that uh integrates diverse vision-based predictions and a clinical feature-based model with a large language model acting as an intelligent arbitrator. So we have like a third party. We have these models that go and assess something.
00:33:16 - 00:34:01
They have two data sets. And then we have the large language model that kind of looks at the reasoning of these models and picks the decision in a more intelligent, transparent way than just like some mathematical operations or um whatever was done so far. So um they leverage the contextual reasoning and um explanation capabilities of LLMs, uh and their architecture dynamically synthesizes insights from both imaging and clinical data, resolving model conflicts. This is cool.
00:33:54 - 00:34:56
And then it provides transparent, rational decisions. Uh and what they used were two cancer histopathology data sets. Uh there was one, HMU-GC-HE-30K, and this is for gastric cancer, uh that only had pathology images. So this is important, because um that got me confused: they have these two data sets, and one of them only has pathology images, whereas the second data set, uh BCNB, which is a breast cancer biopsy data set, is truly multimodal, contains pathology imaging and clinical information. So their proposed multi-expert LLM-arbitrated framework, and they call it MELMA, multi-expert LLM arbitration, right, MELMA, outperforms convolutional neural
00:34:54 - 00:35:33
networks and transformers, which are currently the de facto um state-of-the-art classification and ensemble models, and um their method has better overall results, and they tested different LLMs as arbitrators. So they took Llama, GPT variants and Mistral, and their proposed framework outperforms strong single-agent CNN and um vision transformer baselines on the data sets.
00:35:27 - 00:35:58
Uh and they show, and ablations show, that learned per-agent trust materially improves the arbitrator's decisions without altering the prompt or data. Um and LLM-guided arbitration consistently provides more robust and explainable performance than individual models and conventional ensembling with majority vote.
00:35:54 - 00:36:30
So here are these different conventional ensembling methods: the majority vote, uniform average and meta-learners. So their LLM arbitration outperforms these, and um so we have an option, they promise an option, of LLM-driven arbitration for building transparent and extensible AI systems in digital pathology.
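To give a flavor of what LLM-guided arbitration can look like in practice, here is a minimal sketch: each expert model reports a label, a confidence, and a historical accuracy (a trust signal), and an LLM is asked to pick the final label and justify it. The prompt, the trust fields, and the arbitrator model name are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of LLM-guided arbitration between disagreeing models. The prompt,
# the trust fields, and the arbitrator model are illustrative assumptions, not the
# published framework's implementation.
import json
from openai import OpenAI  # assumes the openai>=1.0 client and an OPENAI_API_KEY

client = OpenAI()

def arbitrate(expert_outputs, clinical_summary=None):
    """expert_outputs: list of dicts like
    {"expert": "cnn_a", "label": "malignant", "confidence": 0.62, "historical_accuracy": 0.84}
    Returns {"final_label": ..., "rationale": ...} chosen by the LLM arbitrator."""
    prompt = (
        "You are arbitrating between several classifiers for one histopathology case.\n"
        "Weigh each expert's confidence and historical accuracy, then answer as JSON "
        "with keys 'final_label' and 'rationale'.\n"
        f"Expert outputs: {json.dumps(expert_outputs)}\n"
        + (f"Clinical summary: {clinical_summary}\n" if clinical_summary else "")
    )
    reply = client.chat.completions.create(
        model="gpt-4o",  # stand-in arbitrator; the paper also tried Llama, GPT variants, and Mistral
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
        temperature=0,
    )
    return json.loads(reply.choices[0].message.content)

# Two image models disagree; the arbitrator sees their confidences and track records.
decision = arbitrate([
    {"expert": "cnn_a", "label": "malignant", "confidence": 0.62, "historical_accuracy": 0.84},
    {"expert": "vit_b", "label": "benign", "confidence": 0.58, "historical_accuracy": 0.91},
])
print(decision["final_label"], "-", decision["rationale"])
```

Note that even on an image-only data set this gives the arbitrator something to reason over, which is exactly the point made next about the gastric cancer set.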
00:36:24 - 00:37:25
Uh so for this first database, uh of gastric cancer, I was like, okay, so what was it arbitrating if there were just images? Uh so what was it looking at, uh for these different models that it was arbitrating between? So there were different computer vision models uh that were giving some answers, and what it was looking at was, okay, how often was um each of these models right, uh how consistent it was, what was the um like confidence with which this particular model predicted, and not just uh the majority voting or whatever the approaches were. So it can even like go into unimodal models and figure out, okay, which reasoning of which model was uh
00:37:21 - 00:38:22
better more times, and uh propose a kind of arbitration strategy, which I thought was pretty cool, because um we are limited in doing that, and by we I mean human observers. So how did I learn that? By taking part in a project where... these models have a lot of parameters, right, and we were only trying to um do combinations of um a couple of parameters and a couple of performance metrics, and very quickly I was like, is there a software to do it? Because I am not able to visually decide; like, they're kind of good enough, but I am not looking at every cell, I'm not looking at like every field of view. I'm then looking at the metrics. How do I know uh which one is better? Uh and now we have an LLM-powered framework that can help us arbitrate and provide some
00:38:20 - 00:39:21
kind of justification, and then um a human expert can go and figure out, okay, this justification, does it make sense or not? Uh and with that I want to add something to, okay, take an abstract and uh ask an LLM to explain it to you in plain language. Um, ask like clarifying questions, because that was one of my clarifying questions, like, oh, so how is this model arbitrating multimodally on a unimodal uh data set? So that's what it explained to me, but not in the first place. So if you have any um questions, anything is confusing, sometimes the LLM is going to tell you, oh no... most of the time it tells me, oh, great catch. Oh, and there was another thing I needed to tell you about. So obviously I have like some pop-up notifications on my phone. Let's see if I can find this one, cuz when I saw this, I'm like, I need to tell it to the trailblazers, because we
00:39:19 - 00:39:54
were talking about it. Uh, we were talking about it when I had this problem. I think I told you the story that I was submitting um like part of a publication, and uh like I read through my part, it was really solid, I knew it was based on my presentation, and then I like quickly used AI for references. So, uh, I don't feel that bad anymore. I felt very bad. So nobody really caught it, but they gave me...
00:39:49 - 00:40:17
Maybe they caught it, but uh maybe they like wanted to give me the benefit of the doubt because they told me, "Oh, can you format the references differently and I went in and formatted them differently?" And in the process of formatting them differently, I realized that they were fake. So, it wasn't only me.
00:40:08 - 00:40:50
Let me give you uh January 2026. So this month, um an analysis by the AI detection startup GPTZero revealed that more than 100 AI-hallucinated citations were included in research papers accepted for NeurIPS 2025. Um so without you know going into the details, I just screenshotted this yesterday because I thought I have to tell it to the trailblazers.
00:40:42 - 00:41:28
So yeah, over 100 AI-hallucinated citations. So for us this information says: yes, you can leverage AI for citations, but you need to scrutinize it because it hallucinates. Uh and I think it's getting... so at least since I wrote the book, the first edition was 2023, the second edition is going to be 2026, when I wrote it, it would hallucinate in this like normal text output a lot more than uh right now. Right now I feel it's uh it leveled up.
00:41:21 - 00:41:48
Uh but in these like very specific tasks, um like the reference finding, reference generation or stuff like that, it still hallucinates. If you find a tool that is actually optimized for that, let me know, because I was trying to find one a couple of months ago, last year. Obviously last year is a couple of months ago because we're in January.
00:41:44 - 00:42:19
Um but I didn't find anything specific. There are different ones for um writing different things. If you find anything, let me know. And thank you so much. Oh, and I have one more comment. Let's do this comment. Um: nowadays there's a lot of fog in how AI can be implemented in order to help or effectively boost pathology workflows.
00:42:14 - 00:42:50
I think improving the efficiency of workflow is key. Yes. And something I... a cool presentation I listened to at the DPAI conference in London, uh by Orly Ardon, was about QC, quality control. Uh because this quality control step, I mean it is an integral part of producing slides, but uh then there is this additional QC step of uh quality control of scans.
00:42:43 - 00:43:12
So, not only in digital pathology, unlike radiology, and maybe it's going to change with the direct-to-digital imaging technologies, but not only do we still do the analog glass slide, we then scan it, and then on top of that we quality control the scan, even though the glass was already um QC'd; but because different artifacts and different things are important in scanning, we need to do that.
00:43:08 - 00:43:48
So she was talking about uh implementing a software for that. Uh and I thought, hey, yes, this is something that is high leverage, low risk, and using AI in these situations um helps obviously with workflow, with uh probably cost savings or even um maybe generating revenue. But what it helps with most, and that was part of uh my conversation with my healthcare provider when I went to the doctor and uh we were talking AI, was like, hey, people are afraid.
00:43:46 - 00:44:22
And I think we uh talked about a paper where, when there was a mention of AI being used in a clinical trial, people would withdraw from the clinical trial. I'm like, wow, that is not good. So anyway, there was this discussion about uh making it understandable, uh introducing it uh to the patients, introducing this also to healthcare providers to be comfortable with the tools, and um implementing guardrails. This is an ongoing discussion.
00:44:19 - 00:44:45
You're going to be hearing about it. Uh, but for today, let's grab a coffee and start our working day. Thank you so much for joining me. Uh, next week I'm going to have a new computer. Uh, and no, I don't even know how to like switch this off. Oh, no. Well, reading takes you a long way.
00:44:40 - 00:44:54
It has a button that says "end stream." Thank you so much. Uh, thank you so much for trailblazing this trail with me. Leave me comments if you're uh watching or listening to the recording.