The Translational Mixer

Episode 2: Eric Topol, multimodal AI models in medicine, and a glass of Pinot Noir

Andy Marshall Season 1 Episode 2


Scripps' Eric Topol is a visionary in the application of artificial intelligence to medicine. He has a wide-ranging conversation with JC and Andy about:

02:52 Multimodal AI is coming
06:44 FDA-approved AI software
10:02 How to validate AI models?
12:50 Synthetic doctor’s notes and other early applications
15:38 Thinking about the model and its training
19:28 Dealing with hallucination and GPT5
24:13 Low-to-middle income countries 
27:40 Uptake by the medical community
29:15 Open or proprietary?
35:46 What to do with the data?
39:46 Eric’s elixir


Resources mentioned in this episode:

1) Eric's September 2023 Science editorial on multimodal AI

2) DermaSensor, a unimodal AI tool for refining skin cancer diagnosis.

3) The "wowser" package AMIE (Articulate Medical Intelligence Explorer), a descendant of Google's Med-PaLM 2.

4) Eric's recent Science essay on medical diagnoses.

Eric’s Elixir: 

Pinot Noir!

Marimar Mas Cavalls Pinot Noir 2018 from Russian River Valley, CA, USA.

Cristom Marjorie Vineyard Pinot Noir 2021 from Willamette Valley, OR, USA

Racines Pinot Noir Santa Rita Hills 2017 from Santa Rita Hills, CA, USA



The Mixer music “Pour Me Another” courtesy of Smooth Moves!


Andy Marshall: Welcome to The Mixer. I'm your host Andy Marshall here again with my partner in crime Juan Carlos Lopez.  

Juan Carlos Lopez: Hello Andy, good to be here. Tell me, who is our guest today?  

Andy: So today we have Eric Topol. Eric is a physician-scientist, a renowned cardiologist from his time at the Cleveland Clinic. He then became Executive Vice President at Scripps Research and is the Founder and Director of the Scripps Research Translational Institute in La Jolla. He's also, of course, a very talented writer and editor, and he's written several insightful popular books about digital medicine and artificial intelligence in medicine.

JC: Yes, I'm familiar with his work on multimodal AI and I'm sure it's going to be a very insightful conversation.

Andy: Great, so let's get stuck in.

JC: Let's go. (music)

Andy Marshall: Welcome, Eric. Welcome onto the show. Thank you very much for agreeing to come on board. So, you know, there's been a lot of chatter about AI. One thing that really struck me was that today, on my personal X (Twitter) stream, I saw a listing of the top AI innovations to expect in 2024. What it said was, with the election year coming up, I should essentially expect to find all of these deepfake images of people during the election. And then it told me that there's this pandemic of loneliness among people around the world and AI chatbots are gonna be the new kind of romantic partners. I just find it really interesting how this technology is permeating every aspect of society. In this podcast we're gonna focus on science and research and how AI is going to affect that, but it's just pervasive throughout society.

Eric: No question, but I do think the biomedical applications are unique because the benefit-to-risk ratio is so extraordinarily high, especially if we do this right. So yeah, I am concerned outside of the biomedical sphere, but since I try to keep tunnel vision on that, I don't lose my optimism.

02:52 Multimodal AI is coming

Andy: One thing that I really wanted to kind of explore with you was the editorial that you wrote in Science (381, 15 Sep 2023; DOI: 10.1126/science.adk6139). When was it, September last year?

Eric: Yes, on multimodal AI. 

Andy: Exactly. So these kinds of multimodal large language models, I think, are coming, yeah? You wrote this really thought-provoking editorial where you laid out your reasons for why you think this is going to be important. For the benefit of our audience, can you summarize why you think multimodal AI is such a big deal and why it's such a leap forward?

Eric: First you have to go back to what we got out of unimodal AI, and it was far more extraordinary than we had anticipated. Many of us thought, well, maybe we could have more accuracy in interpreting medical scans. And that was superseded by a remarkable discovery that machines could be trained with supervised learning to see things that humans will never see. It was unimaginable that in the retina, for example, a unimodal AI could be trained to detect risk for Parkinson's, Alzheimer's, years before symptoms would be manifest. Heart disease, kidney disease, hepatobiliary disease, and the list goes on and on. And that's just one example of what one can glean via machine learning.

That set up a foundation model that I was involved in, collaborating with Pearse Keane and his colleagues at Moorfields Eye Institute in the UK, one of the leading, if not the leading, eye institutes in the world. With over a million images, instead of having one study after another for all these possible use cases of machine eyes, it could do all of them with one foundation model. But still, what we're seeing now is the potential to take different types of data, not just scans, which was mainly the deep-learning era in biomedicine, but also text and speech and audio. When you start to do that, you multiply where this can all go. I mean, it truly is a multiplier. Let's just take the example of cancer screening today. Screening today is based on a dumbed-down metric: your age. Now, it can't be much dumber than that. It's just your age that determines whether you should get screened for breast cancer or colon cancer or whatever cancer.

Well, it turns out, you know, there are a lot more younger people, even people in their 20s, getting colon cancer. Women are getting breast cancer in their 30s. And so they're not even part of the screening. Is that the best we can do? I'm convinced it isn't. But when you start to put in all the other layers of data: genomic predisposition, family history, all the electronic records, scans that can pick up things that otherwise wouldn't be picked up, liquid biopsy assessment, chips, all sorts of tools that we haven't used yet, then I'm convinced we can pinpoint risk so much, much better. And that's just one use of multimodal AI going forward. There are also many others, including a virtual health coach, digital twin infrastructure, and remote monitoring that's coming, so you don't have to go to the hospital unless you need an intensive care unit, that sort of thing. This is the ability to take many layers of data, including real-time data from biosensors. That gives us a whole new look at what can be achieved in medicine.

06:44 FDA-approved AI software

Andy: Before this, I was doing a bit of research. I was looking at the FDA site and some of the software that's been approved there. And there are a lot of software packages — it's like more than 500 now that have been approved by the FDA — but these are still pretty simplistic, shallow-learning models. I was wondering, in terms of this type of multimodal model that you're talking about — large language models that incorporate many, many different types of data — where are we there? Have we had any multimodal AI applications approved by the FDA?

Eric: No, there has not been a single multimodal AI, as far as I know, not even submitted for 510(k), no less cleared. So that's a really important point you're making. There are now 650 FDA AI algorithms, primarily 510(k) clearances, but obviously some that were formally approved. They're all unimodal. The vast majority are scan-related radiology. And what's really disheartening is that the FDA let these go through with no publication, no availability of the data to the medical community for what it showed. Most of these were retrospective, rather than real-world prospective, studies.

There are fewer than 100 randomized trials out of all of AI, and that's not even connected with the FDA approval process. So what we have is a lack of compelling evidence, and the FDA isn't helping 'cause they're not holding these companies' feet to the fire: you've got to make the data transparent, it's got to be prospective, it's got to be really high quality. And now all of a sudden, we're in a new era of transformer models with multimodal data, all these layers. And, you know, it's been somewhat of a low bar, you could even call it a sieve, for getting these through. What's going to happen now? Well, on the one hand, the multimodal models will multiply the opportunities, but they could also multiply the FDA's promiscuity of giving them a green light when they don't deserve it, which is the theme so far.

That said, there are some really good ones. You know, just yesterday, there was a unimodal AI called DermaSensor that was approved for refining skin cancer diagnosis. I said, okay, I'm going to look this up. This has a paper associated with it (J Clin Aesthet Dermatol (4 Suppl), s16–s17, April 16, 2023; J Prim Care Community Health 14, 21501319231205979, Nov 7, 2023; DOI: 10.1177/21501319231205979). And I was so relieved to see that there actually was a manuscript with the data. Isn't that nice? Because most of the time when I look things up on these approvals, I can't find anything. So maybe the FDA is coming along, but the medical community is not going to get into any significant implementation phase unless the data are transparent and compelling.

10:02 How to validate AI models?

Andy: This is a really important point that resonates with something that I've seen, whether it's announcements about AI advances or even papers: this question of validation, Eric. I think it's really worth thinking about what we need in terms of validating the models and validating the data sets that are used in them. What do we have in terms of concerns about generalizability and biases in the training data? How do we ensure that the stuff that we're seeing in the AI literature is reproducible in the real world?

Eric: Right. You know, there's a real difference between life science, the language of life with omics, and medicine. When you're making an evaluation and assessment of a medical AI model, the things that you just highlighted are so critical.

So what are the inputs? Are they from diverse people, patients? Are they from many sites or just from one place? And how big is the data set? There are so many considerations. But even after you get to the model's putative performance, where does the expertise come from? What about validation? What has been done to interrogate the inputs, or unintended processing of the data, for biases, of which we've seen many, many flagrant examples, often only discovered after the AI model was published or even used at scale? So there can't be enough rigor to make the models, as they're used throughout biomedicine, as good as they can possibly be.

And that's part of the problem we have now: there are a lot of skeptics who are very reasonably concerned about errors and confabulations, which large language models certainly have a major issue with. And to override that, we have to prove that the benefits, the outputs, are overriding. There hasn't been enough attention to that overall. There are some great examples, but that's not the usual model we get to review as a publication, or even as a health system that's going to purchase the use of a newly approved model.

12:50 Synthetic doctor’s notes and other early applications

Eric: The other thing just to mention is that there are some things that don't require regulatory oversight. One of the biggest right now is the ability to have an automated synthetic note of a visit between the doctor and patient. And this is something that's extraordinary that doesn't have to go through any regulatory path, because its value is being seen by clinicians: saving them hours a day of not being data clerks, restoring actual eye-to-eye contact with patients, generating notes that are far better than anything we've ever had in our electronic records, and then doing all these downstream functions. So that is an example of how we can advance the field, in a way a sort of rescue mission for a very serious practical problem, which is the breakdown of the patient-doctor relationship and the sense of clinicians having to be data clerks, which they don't want to do. They want to care for patients, of course.

So the one thing we have to keep in mind is, whether we're talking about pure life science or medicine, there are lots of different compartments in which this can be used. Each one is different. For some, there's no issue about a regulatory path; if it works, it's great. So when people just think of AI, they tend to jump into one part of the story, and that's probably not an adequate way to view it.

JC: Returning to multimodal AI for a minute, where do you see the low-hanging fruit in that field? What application would be the first one that you think can set the standard for this emerging field? 

Eric: Well, I think the basic theme is, you know, we've seen how we can take the weather forecast and make it orders of magnitude better. And that's what we're talking about with people: defining their risk of various conditions long before they ever have any symptoms of those conditions. This, I think, is the integration of all the information on a given person, multi-layered, high-dimensional data, and it's a much more intelligent way, as in the cancer example I gave, than just saying, "Oh, you're 50, so you should be having a colonoscopy every five years." So I think one of the immediate goals we should have is defining risk, whether it's for cardiovascular, cancer, neurodegenerative, you know, whatever that is, so we can go into prevention mode, which has been a fantasy in the medical arena for decades, if not centuries. We can actually get into prevention if we only knew what the person was at risk for.

15:38 Thinking about the model and its training

Andy: So Eric, I think there are two aspects to validation that are really, really important. One is the models and the other is the data. Can we talk about some of the aspects of the models that we need to think about? Like, what needs to be disclosed about the model? That's really important in terms of reproducibility in the real world. And then how do we think about the data that's used in the model, and the data that's used to train it?

Eric: Yeah, well, there are a lot of holes here, right? So firstly, we're now in the self-supervised, unsupervised learning era. It's a big shift from having experts review the data inputs and ground truths; it was a whole different look. Now the data are basically learning from themselves, you know, self-assembly, if you will. And that is a problem with respect to how good the data are. And the problem that we have today is that some of these base models, like those for ChatGPT and Gemini, several of the leading ones, are non-transparent. We have no idea. We just think, well, maybe they have the entire scraping of the internet and they have Wikipedia and 100,000 books. Who knows what's in there, right? But whatever's in there, we know it has extensive cultural biases because it's human content. So that's one part.

But then, for example, an incredibly remarkable paper preprint from Google (https://arxiv.org/abs/2401.05654) came out last week on medical diagnoses. It was about what they called 'conversational AI for medical diagnosis'. It compared 20 primary-care doctors against the model, with 20 patient actors, across more than 120 complex cases. And it looked at 26 performance criteria, right? And on 24 of them the Google model was superior. They called it AMIE (Articulate Medical Intelligence Explorer). It's a derivative, a descendant, of their Med-PaLM 2. Anyway, we have no idea what's in that model. They don't disclose any details, but the data were like, wow. Now, this is somewhat contrived because they have these actors having conversations with either the AMIE model or the primary-care doctor. And they were rated both by the patient actors and by external doctors on how well they communicated, how much empathy they showed, how accurate the diagnosis was, how good the management plan was. And the doctors looked weak, I mean really weak. And it's very impressive.

However, what is in the model? We have no idea. We have no idea what the real data inputs were. We just know about the outputs and their, you know, wowser outcomes. So this is a consistent issue. We have to place basically implicit trust without having knowledge of what's really inside, what's going in, exactly how the model works, you know, the explainability of the model. We don't really know. So it's a very tricky thing.

And the other point is that it's not been replicated. Of course, there are other studies that suggest some similar aspects of this, so we have to always be skeptical, given these limitations, until we have a lot more to go on. And as for transparency, we're not there; we're just not there.

19:28 Dealing with hallucination and GPT5

Andy: To me this really is the challenge: the fact that this is so opaque. With these large language models, I'm trying to think through issues such as hallucination, yeah? Most famously, ChatGPT generates answers to new questions basically by reusing and reshaping its training data and comes up with fictional, unverified information like references that don't exist. People have talked about these models as 'stochastic parrots'. So to me this really is an issue.

I've been trying to think through how we apply these models in the clinical world. To me, even if most model output is of high quality, most training data are reliable, and you get results that are trustworthy, the problem is that the recombination, the way in which the models parse these data in a new context, may lead to prediction failure. In the research world, we're interested in how you use this to generate hypotheses that you can then test experimentally. But in the clinical world, you're really interested in what this is going to tell me about this patient. So even with high-quality training data that are reliable and trustworthy, when those data are recombined in a new context, you have no guarantee that the model isn't going to come up with a prediction failure or some kind of completely left-field answer. And going in and improving the accuracy of the models, feeding in more data or fiddling around with the algorithms, seems insufficient, because in the clinical realm the more accurate you make the model, the more the users are going to rely on it and the more they will be tempted not to bother verifying the answers. So how do you deal with that issue, Eric?

Eric: Right. Well, I think it's important to be cognizant of this, and it's something I think about all the time. But remember, GPT-5 will be out in the weeks ahead. We're still early. I mean, transformer models got their start in 2017, and it was only at the end of November 2022 that this really launched to the world, and we'll see better performance with respect to confabulations and overt errors.

But in the medical sphere, I'm not as worried about that because, for example, you should always be checking references. But if you get a diagnosis, are you ever going to take a diagnosis from a GPT model and have the patient undergo surgery without a doctor's oversight, a doctor who has wisdom and experience? No. I mean, there's a human-in-the-loop factor that I think is important. And that goes back to JC's question about improving the accuracy of diagnosis. I've got another Science essay coming out on January 25th on that topic. Because when you have all the layers of data, you're going to have a much better chance of getting an accurate diagnosis.

The AMIE model outputs were not completely accurate, but they were more accurate than the doctors'. That's for sure, significantly more accurate. So again, the fear of fakes and the promotion of disinformation and misinformation and all these other things, I'm not as worried about that in our world.

We need help. We need a rescue. Outside of medicine, it's much more of a problem. That doesn't mean that the things you brought up, Andy, are not serious matters. It's just that the ratio of benefit, I think, is more favorable.

24:13 Low-to-middle income countries 

JC: To me, the benefits clearly outweigh the risks, thinking particularly about low-income countries. They don't have access to specialists, but they can have access to this kind of diagnosis. So the number of times that the model will be right will far exceed the number of mistakes that it can make. To me, these models will only get better. And if there are mistakes, you know, think about people with ultra-rare diseases, right? They can spend six, seven years looking for a diagnosis. So we're not very good anyway, particularly in those cases that are very unusual. In the case of more common diseases, I think the models are going to supersede a lot of what can be done. And I can see that as very beneficial, particularly, as I was saying, for low-income countries. But to me, returning to some of the earlier points we were talking about, the question is this: performance is going up, but regulatory science isn't really catching up, and adoption is uncertain. So if you're a doctor, how confident are you that this is something you should adopt or not? What do you think is going to influence adoption by clinicians more: the increased performance or an overhaul of regulatory science? And related to that, how can we go about overhauling the regulatory science, which seems to be lagging behind?

Eric: Well, that's a nice summary, JC. I like how you process that. I agree with you. I'm worried that the FDA won't step up here and show they have some teeth and really demand, you know, a higher level of proof for whatever they're assessing. I think the 510(k) path is a problem here, frankly. Where are the prospective real-world studies, not retrospective data sets? Where are, if possible, even randomized trials? So the 510(k) path and the quality of the data presented to them, these are two problems if you're going to grant the ability for commercialization.

Now, the implementation is weak right now in health systems around the world. I do agree with you that the advantages in low- and middle-income countries are extraordinary. We could actually reduce inequities that exist because of the capabilities that you outlined. I mean, I've seen many examples of this. There was a randomized study of a diabetic retinopathy algorithm in low- and middle-income countries showing how inexpensively and rapidly you could screen, where, as you know, at least half, if not the majority, of diabetics are never screened for retinopathy (BMJ Open Diabetes Res Care 11, e003424, Aug 2023; DOI: 10.1136/bmjdrc-2023-003424). So here, all of a sudden, you've got a way to get on top of that, not to mention things like imaging's place in the world. So that is a very important point, and Saurabh Jha and I have written about that in the Lancet (401, 1920, June 10, 2023; DOI: 10.1016/S0140-6736(23)01136-4), about how low- and middle-income countries could really benefit, and in some ways they're adopting much quicker than we are in the US and other rich countries.

27:40 Uptake by the medical community

Eric: But the implementation issue, that's my biggest concern: the data are weak for the reasons that we've been discussing, and the medical community is never going to buy into this, except for the exceptions we've outlined. If AI gets rid of the data-clerk work, keyboard liberation, great. But beyond that kind of thing, the implementation will be very limited and slow unless there's overwhelming evidence that establishes trust and transparency. And doctors find all this threatening. They've ruled the roost and owned this space. There wasn't supposed to be any machine capable of any of their work, no less superseding their performance in some respect. So in order to deal with all this, we have a lack of evidence that has to be addressed.

And a lot of the companies in this space are startups that are not committed (to transparency). They don't have the funds. They'd like to do these studies, but they just can't. And then, you know, the big tech titans, Google, Microsoft, Meta, OpenAI, the whole lot of them, they're not used to high-quality randomized trials. So we have to get into that. And again, the studies don't have to be randomized, but they sure have to be prospective and real-world.

I wouldn't want to bet anything on some retrospective in-silico data set to try to change the future of medicine. To me, that's just a hypothesis-generating story, not real proof.

29:15 Open or proprietary?

Andy: I think this is so important. How do we incentivize the key actors here to do prospective clinical trials with relevant controls that are sufficiently powered, and then publish the data from those trials in the literature so that it gives us confidence? To me, this area is mired in the problem that there's such a lot of overlap with tech startups, where you don't really need to show the data, you don't need to show the methodology. It's really interesting to me because the AI research community is really open, yeah? They're very, very pro open science, open data, open methodology. And yet, where the money is being invested in the private sector, we have this contradiction where everything is proprietary, and it's really difficult to build the evidence base that gives us confidence.

Eric: That's the fundamental problem. That's the nuts of it. That's the fundamental issue today, and I'm not sure how we're going to get there. To JC's point, even though the FDA is not going to go through a reset, if they demanded high-quality data, that might help get us to what you're asking about, Andy.

But I don't see a solution in sight outside of regulatory demands, because for a lot of the companies working in this space, raising the capital to do a randomized trial, you know, many millions of dollars, doesn't seem like something that gets their investors excited. If only the investment community knew that the better the data are, the more likely that company is truly going to succeed quickly. If they only got that. But the investor community wants the minimum: get commercial approval, get some revenue, short term, grab some money. And that model isn't working so far.

We've had deep-learning AI algorithms approved for several years. How many of them are being truly adopted in practice at scale? Very few. Most of them are back-office operations things, not actual care for patients. So, whether it's the investors, the leaders of these companies, or the FDA, I don't know how it's going to get on track. I'm worried about it because it's going to slow everything down. And then there'll be a medical AI winter where all these people say, oh, there's nothing here, where is the great revolution after all, that kind of thing.

Andy: This is such an important point that you're making, Eric, because the way I think about this is that if you boil down this problem, there are two parts to it: there are the algorithms and then there's the data. And to me, there's a very low bar for market entry on the algorithms. I appreciate that every now and then somebody's going to come along with a really innovative way of thinking about parsing data, but really, that's very rare. So the models themselves are not particularly exciting; this is one of the problems we had when I was at Nature Biotechnology, when we were thinking about the models and how they're applied to the data. But access to the data becomes the most important thing. So I've tried to think about, you know, where is the value here? Where's the gold? Is it the models, or is it the data? Then the other thing that I just want to interject here is that the whole machine learning and AI field is very open. And yet, if you look at translation, essentially everybody's proprietary about the data. So you have pharma companies that just want to hold on to their data; they don't want to open it up and let everybody else work with it. So how do you think we go forward? Because to advance, it seems like you want the data to be open and the models to be open and then science to flourish. But in fact, the situation that we have is a very balkanized, siloed one in which healthcare systems and hospitals keep hold of the data and the models themselves exist outside of that.

Eric: Yeah, and I think it's inseparable, because it isn't just the data and it isn't just the models. I do think that much of the concern about bias isn't as much the model as the embedded biases of the inputs, but overall they're so entangled, interdependent. If we had a commitment to do this right from everyone involved, not just academics but the whole continuum of different types of companies in the space and our regulatory bodies, eventually we will get there. But instead of taking many years for some of the most aspirational goals that we've been talking about, it could be really accelerated. I don't see yet that overriding commitment that we've got to do this right, that we have to be transparent, and that we have to get to the real world, none of this in-silico stuff. I mean, there are so many shortcuts being taken now. And I hope that people will realize that that's not in our best interest, none of our best interests, no less patients', of course. So, you know, I remain optimistic because all these things are eventualities; it's just a matter of time. Is it gonna take 15 years or three years? That's up to us.

35:46 What to do with the data?

JC: Elaborating a little more on that, this issue of the data, there are two points that I think are important to discuss. The first is the issue of data standards. How are we going to go about developing standards? 'Cause who's gonna take the lead? Who should say, okay, this is going to be the standard for these data? Without standards, I find it hard to see how we're going to take full advantage of this. That's number one. And number two, the issue of data ownership, right? Andy was already alluding to the fact that pharma companies think about all the data as proprietary. And I'm thinking about some of the recent lawsuits that we've seen, with book authors saying, "Why did you use my work to train your ChatGPT model?" Are we going to see that in biomedicine? Are people going to sue firms because their data were used without their consent? Or large hospitals, are they going to say, oh no, you're going to have to pay us because you used our data without the required consent? How are we going to handle this issue of ownership and the issue of standards?

Eric: Well, the ownership one, I don't know a better word to describe it than a mess right now. It's unsettled, it's very rocky. Obviously the New York Times versus OpenAI case is going to be an important precedent. OpenAI could give a little basis point of their efforts to each of these major mainstream publications, but apparently that wasn't enough for the New York Times, and they're going to fight it out in court. How much is that going to cost? And how long is that going to take to be resolved?

So yes, we will see similar things in the medical arena. I'm sure of that, because medical data are collected through all sorts of brokers and entities. The health system thinks they own it, but they don't own it. And we're not talking about federated learning here, or anything that's being done to preserve the privacy, security, and integrity of the data. There have been partnerships established between tech titans and health systems, and of course we're not privy to what's in those contracts. There will be fights about the data. I don't know how that's gonna play out, JC. It's just gonna happen. You can guarantee it, just as you could guarantee we were gonna see OpenAI challenged by the media. I mean, I have two books that are in ChatGPT, and I said, "How come I don't get anything? How did they take my books that I've worked so hard on?" And of course, there are many people suing. This is not going to be settled for some time.

Now, on standards, there are parties working on that. Whether they're going to get traction and consensus, I don't know, but there are different groups that are definitely on it. We'd love to see standardization of data, yes. I'm more optimistic that will occur than that ownership will get settled in the short term.

AM: Who are those groups, Eric? 

Eric: Yeah, I mean, there are several. I'm trying to remember some of the groups that have been self-assembling recently and writing papers about it. There was one I came across from Singapore (https://aisingapore.org/innovation/ai-standards/). There are a few nonprofits in the US (e.g., https://dataandtrustalliance.org/) that are organized, and the WHO has called for this. I mean, it's coming from many different directions. We do realize that that's an important goal, but I don't know which one has the best likelihood of driving it. There is consensus that it would help, for sure. That's gonna be a very big challenge, and hopefully there'll be a solution that is satisfactory so that the field advances.

39:46 Eric’s elixir

JC: You know, we're coming to the end of the recording, and Eric, there's a question we always ask all of our guests. As you may have seen, the podcast is called The Mixer, and the reason is that we try to combine our interest in translational medicine with our interest in cocktails. So the final question is: what is your go-to drink when you finish a satisfying day of work?

Eric: Well, for me, it would be a glass of a good ol' Pinot. That's my thing, yeah.

Andy: Oh, fine choice. My wife's French, so she would very much appreciate that. 

Eric: Yes, Pinot Noir, I should say. Right, yeah, red's the only wine I would like to drink. 

JC: Very good. 

Eric: Yeah, I really enjoyed it. You two are really sharp. It was fun, and I'm sure we'll converge again, whether it's on The Mixer or through all sorts of other opportunities.

JC: We look forward to that. Thanks so much. Thank you very much for your time and for your insights.

Eric: Thank you. You guys take care.


JC: Well Andy, that was a very interesting conversation. I learnt a lot and I'm sure you learnt a lot too. Now, my one criticism would be that Eric doesn't drink cocktails. Fortunately, he expressed his interest in Pinot Noir, and as you said during the conversation, your wife Delphine knows a thing or two about those types of wines, so hopefully you won't mind sharing some of that knowledge with our listeners.

AM: That's right, JC. Delphine is from the Médoc, which is on the left bank of Bordeaux, but Bordeaux really isn't famous for its Pinot Noirs; it's more Merlot and Cabernet Sauvignon varieties. The place in France that's really the benchmark is Burgundy. But in recent years we've really seen West Coast US vintages coming to the fore, from Oregon and California, so we've given a description of some of those in the show notes.

 JC: Terrific. I hope I can afford at least one of them! If not, I’ll stick to cocktails and raise a glass to you until next time Andy.

AM: Cheers JC!
