The AI Fundamentalists

2025 AI review: Why LLMs stalled and the outlook for 2026

Dr. Andrew Clark & Dr. Sid Mangalik Season 1 Episode 40

Here it is! We review the year in which scaling large AI models hit a ceiling, Google reclaimed momentum through efficient vertical integration, and the market shifted from hype to viability.

Join us as we talk about why human-in-the-loop is failing in practice, why generative AI agents validating other agents compounds errors, and how small sets of expert data quietly beat the big models.

• Google’s resurgence with Gemini 3.0 and TPU-driven efficiency
• Monetization pressures and ads in co-pilot assistants
• Diminishing returns from LLM scaling
• Human-in-the-loop pitfalls and incentives
• Agents vs validation and compounding error
• Small, high-quality data outperforming synthetic
• Expert systems, causality, and interpretability
• Research trends return toward statistical rigor
• 2026 outlook for ROI, governance, and trust

We remain focused on the responsible use of AI. And while the market continues to adjust its expectations for return on investment from AI, we're excited to see companies exploring "return on purpose" as a new lens on transformative AI systems for their business.


What are you excited about for AI in 2026? 



Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

  • LinkedIn - Episode summaries, shares of cited articles, and more.
  • YouTube - Was it something that we said? Good. Share your favorite quotes.
  • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
SPEAKER_01:

The AI Fundamentalists, a podcast about the fundamentals of safe and resilient modeling systems behind the AI that impacts our lives and our businesses. Here are your hosts, Andrew Clark and Sid Mangalik. Hello everybody, welcome to today's episode of the AI Fundamentalists. We are here for our 2025 recap. Just when we thought we'd be wrapping up a normal end of the year, AI still feels like it's on fire and shows no signs of stopping.

SPEAKER_00:

Let's start off with a little bit of news. Very recently, I think in the last week or so, we've seen Google finally take its place back as top dog in the LLM space. This feels like it was a long time coming. I mean, they were the first ones doing this; they had it cooped up in a lab for three years, and then OpenAI just went public with it. But we've seen reports that OpenAI is currently in a code red because they're seeing a ton of their user base moving over to Gemini. The Gemini model is subjectively better to a lot of people; people are enjoying it. And now they're rushing out 5.2, which you're all going to hear about probably in the next 24 hours. So I think there's a lot of fire under these fast movers, who are now meeting the companies that have been doing the hard yards and building from principles for a long time.

SPEAKER_02:

Yes, I'm personally pretty happy to see this. Google got so much flak, and I know they were pressured to release as well, but what's great is that they're vertically integrated, so they've done it more efficiently. The math out there just isn't mathing for some of these companies. Google has the cash to do this, and they did it; they built their TPU systems and everything around them, and it's completely changing the game. It's also showing that these foundational systems are kind of commodities now. Really, what is that different between OpenAI versus Gemini versus Anthropic versus everybody else? I don't think there is much outside of niche things, and in any case, Google is doing it more efficiently, cheaper, and without spending money they don't have.

SPEAKER_00:

I think one concern with Google taking this lead, though, is that we're now going to see, very officially, the push to commercialize and monetize LLMs. And what does that mean? That means ads, right? Be ready to open up ChatGPT and see ads in the side panel, and I think it's somewhat inevitable that the content that's actually retrieved and returned from the model is going to be monetized content.

SPEAKER_02:

Oh, for sure. And this is what happened to Google too. It's a pattern we see in tech all the time, and it has nothing to do with AI: people really love a product, the company is hemorrhaging money, so you just grow the user base, then you start introducing the business aspects, people don't like it anymore, and they go to the next small startup that's doing it differently. So I think a lot of people saw this coming, but it's not a great business model, because users will go to whoever's the new free thing that doesn't have ads. And the fidelity of citations in LLMs is already kind of dodgy to begin with; now it's just going to be ad-sponsored content on top of that.

SPEAKER_01:

Yeah, and I saw that we were going to be talking about this today, and from an advertising perspective, it's exactly what happens. We find a network, we love it, we start advertising on it, it dies in popularity, so then someone spins up a network that doesn't have ads, people really enjoy using it, and then all of a sudden advertising takes it over. I am interested in how that will work, because we've been seeing a couple of movements. One, executive changes over at OpenAI to focus more on the consumer business, recruiting from Salesforce and some other places at the commercial level. And at the same time, OpenAI is still under a lot of IP and copyright infringement lawsuits. A lot of what happens with advertising comes down to attribution: where an ad was placed and what it ran against. With OpenAI under such scrutiny for where they were getting their data and whether it was legal, I'm interested to see how that's going to work out.

SPEAKER_00:

And I think we'll have plenty more to discuss about this when we get to our outlook for 2026, but let's get into the meat of today's episode, which is a recap of 2025: what are some of the things we talked about, what are our thoughts on them now, and then we'll do some reflections. I think the first thing I really want to talk about from 2025 is the state of scaling. We saw these major LLM creators effectively run out of human content. This was the year when GPT-5 hit and people said, oh, this is not the 20-30% improvement I expected. And maybe we're seeing a point of diminishing returns from the LLM paradigm.

SPEAKER_02:

I think we definitely are, and it's back to one of the first episodes of this podcast in, what was it, 2023? We were basically saying this was going to happen, that we were going to run into the limits of these paradigms. It's essentially a 2017 architecture, with 2019-style unsupervised pretraining and a little bit of fine-tuning, predicting the next word and sounding human, picking up semantic similarities, not thinking. And somehow we had Anthropic and OpenAI and lots of other researchers thinking that if you throw enough GPUs at the problem, you're magically going to make systems that can think and reason like a human and you'll have AGI. It was just more scaling, more compute. Now you're building trillion-dollar data centers, data centers in space, all this nonsense that's happening, and it's somehow magically going to get better. We know from basically every discipline that this doesn't work. I like to use running analogies: there are old-school coaches who do what's called throwing eggs at the wall. You train your runners into a hole, most of them burn out, get injured, or quit the sport, and you keep the one or two who can magically handle the load. That's not thinking, and it's not a smart way to train people or to make progress. Moore's law has already slowed, and we have all these historical precedents that this isn't going to work, but we decided to sideline all the other types of research and development around AI systems, focus everything on this one paradigm, ignore any warning signs, and just assume that compute is the magical answer. And yes, we're going to run out of data. There are all the big synthetic data startups, but synthetic data is not as good, and even before 2025 we had the papers showing that systems trained on generated data get worse over time. Instead of it being us and Gary Marcus and niche people on the fringes saying this is happening, we're actually starting to see articles in the Wall Street Journal about it, to Sid's point. We're really seeing the state of scaling being talked about. The co-founder of OpenAI has recently come out and said it, and we have the IBM CEO saying it on a podcast. It's definitely becoming more acceptable to acknowledge this now, which is a major change from where we started in January 2025 to now in December. Light years of difference in the public narrative.
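One way to make the diminishing-returns point concrete is a power-law scaling curve of the kind published in the scaling-law literature: each doubling of parameters and data buys a smaller drop in loss than the last. The sketch below uses made-up coefficients purely for illustration; only the shape of the curve, not the numbers, is the point.

```python
# Illustrative sketch of diminishing returns under a power-law scaling curve.
# The functional form follows the published scaling-law literature; the
# coefficients below are made up for illustration, not fitted values.

def loss(params_b: float, tokens_b: float,
         e: float = 1.7, a: float = 2.5, alpha: float = 0.34,
         b: float = 2.8, beta: float = 0.28) -> float:
    """Toy loss: an irreducible term plus power-law terms in model size and data."""
    return e + a / (params_b ** alpha) + b / (tokens_b ** beta)

# Each step doubles both parameters (in billions) and training tokens (in billions).
prev = None
for params, tokens in [(8, 200), (16, 400), (32, 800), (64, 1600), (128, 3200)]:
    current = loss(params, tokens)
    note = "" if prev is None else f"  (gain over previous doubling: {prev - current:.3f})"
    print(f"{params:>4}B params, {tokens:>5}B tokens -> loss {current:.3f}{note}")
    prev = current
```

Each doubling of the (hypothetical) budget buys a smaller improvement than the one before it, which is the shape of the curve the industry ran into this year.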

SPEAKER_00:

And to that end, we also speculated that this treadmill can't go on forever, and that there's going to be appetite for the next big thing, the next big paradigm that does things meaningfully differently. We even speculated that Yann LeCun is going to be the guy to run out there, do his work, and see what he can accomplish. And he's doing that. He left Meta, and I think he's pursuing exactly this type of problem, because I think he sees what we're seeing, which is that we're running out of treadmill to run on. If it can't go on forever, what's the next big thing going to be? Another topic we talked about this year, which I think we had some hopes for, was human-in-the-loop: this idea that we have all these large language models out in production, we're not totally clear that they're aligned based on the reinforcement learning applied to them, but we'll just have humans go in, monitor the outputs, and see if they actually map onto reality. What do you both think we've actually seen?

SPEAKER_02:

So human-in-the-loop is a great concept. The problem is that in practice it's often a band-aid: I don't have to actually do anything, and I can get compliance or risk management to just give me a pass. If you actually have humans helping to annotate the data, overrule the systems, and intelligently determine whether the right decision was made, that's great. The problem is it's often more like: we're going to push this out to call centers, we want everybody to use AI, AI, AI. And then when you overrule the AI, it's, why did you overrule the AI? That's not okay. So you get the cultural pushback where you have to just agree with it, and people think, well, why don't I just agree, because otherwise I have to fight uphill to overrule the system. That's very different from actively participating versus my boss wants to automate me away with this thing and I get in trouble if I overrule it. So I think it's a great concept, but we have to have very strong enforcement of, and thought around, how we actually implement it, and not just say, hey, I have a human who looks at the results before they're submitted, so I'm good. Because that's not governance, that's just a band-aid.

SPEAKER_01:

Yeah, and I think we saw this even in our most recent episode at the end of the year. At the beginning of the year, we had people treating human-in-the-loop as, oh, my current committee structure has a human in the loop, so we'll be able to do this. Fast forward to the discussion we just had on the last episode, and it's: human-in-the-loop is a great concept, but where exactly are the humans in the loop? That's what stood out to me from the last episode as a trend that changed over the course of the year.

SPEAKER_00:

And it highlights this misalignment of incentives. We're not seeing AIs used as these wonderful expert systems that contemplate a problem for a week and then come back and say, what do you think of this solution? What we're seeing people use AI for is very quick, immediate, fast responses, which is inherently not governable and inherently hard to put a human in the loop on, because then you would just double the time of the task: now a human and an agent both have to do it. That's just not scalable with how we're trying to use these models, which is to automate lightning-fast decisions rather than large-scale business decisions.
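To put rough numbers on the "doubling the time of the task" point, here is a back-of-the-envelope sketch. All of the task and review durations are hypothetical, chosen only to show how the review overhead dominates for fast tasks and fades for slow, deliberative ones.

```python
# Back-of-the-envelope sketch of the human-review overhead described above.
# Every number here is hypothetical; only the ratios matter.

def seconds_per_task(agent_s: float, review_s: float, review_rate: float) -> float:
    """Expected wall-clock seconds per task when a fraction of outputs gets human review."""
    return agent_s + review_rate * review_s

# Lightning decision: the agent answers in 2 seconds, a careful human check takes 60.
fast = seconds_per_task(agent_s=2, review_s=60, review_rate=1.0)

# Deliberative task: the agent works for an hour, review adds ten minutes.
slow = seconds_per_task(agent_s=3600, review_s=600, review_rate=1.0)

print(f"fast task:  {fast:>6.0f}s  ({fast / 2:.0f}x the agent-only time)")
print(f"slow task:  {slow:>6.0f}s  ({slow / 3600:.2f}x the agent-only time)")
```

Under these illustrative numbers, full review makes the fast task roughly thirty times slower while adding only a modest fraction to the slow one, which is why human-in-the-loop clashes with the lightning-decision use case.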

SPEAKER_02:

A hundred percent. And that's where it comes back to going to the dentist or the doctor, the thing nobody wants to do: you have to validate your systems, thoroughly stress test them, and ask whether they're fit for purpose before they go out into the wild. That is tractable, and it's more governable. It's the thing we should be doing, but it's hard work and nobody really wants to do it. They want to go as quickly as possible from a happy-path POC to production, skip steps, and then say, oh, but I have a human looking at the results periodically, so I'm good. Versus the hard yards of validating it, which, to Sid's point, is what human-in-the-loop is standing in for without actually accomplishing it. The validations come back to what we're trying to accomplish: a system we can rely on, that might be more performant than a human and cheaper. If we want that outcome, we can strive for it, but we have to align on how we get there: building governance by default, testing the systems, monitoring, all that non-fun stuff. Just having a human look at the output periodically and saying that stands in for all of that is the problem we're having.

SPEAKER_00:

And we positioned ourselves on this, saying this situation absolutely exists, and the worst way to solve it is to have an AI do your validation for you. We were stressing that this absolutely cannot be how it's done, because now you're just doubling error. Which segues nicely into our next topic, agents, where we saw exactly this happen in 2025. Companies would roll out agents, and when pressed to answer, well, how good is this, because your accuracy is around 60%, the response was: don't worry, we have another agent that validates the agents. And we've put ourselves in a bit of an ouroboros situation where modern agentic modeling is based around AIs validating AIs.
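Here is a minimal sketch of the compounding-error arithmetic behind that concern. The accuracies are hypothetical, and the model treats each LLM in the chain as an independent, error-prone step; the point is the multiplication, not the specific numbers.

```python
# Minimal sketch of how error compounds when error-prone steps are chained.
# The accuracies are hypothetical; the point is the multiplication, not the numbers.

def chained_accuracy(step_accuracies: list[float]) -> float:
    """Probability that every step in a pipeline is correct, assuming independent errors."""
    p = 1.0
    for acc in step_accuracies:
        p *= acc
    return p

worker = 0.60      # the agent doing the task (the "around 60%" figure above)
validator = 0.80   # a second LLM checking the first, itself imperfect

# If the validator is just another step that can corrupt a correct answer, rather
# than a reliable independent check, the pipeline only succeeds when every step
# happens to be right.
print(f"worker alone:         {worker:.0%}")
print(f"worker + validator:   {chained_accuracy([worker, validator]):.0%}")
print(f"five chained steps:   {chained_accuracy([0.9] * 5):.0%}")
```

Even five steps that are each 90% reliable leave the whole chain right only about 59% of the time, which is why stacking validators does not substitute for validating the system itself.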

SPEAKER_02:

Yeah, which is crazy, because even the research papers that validate AI agents, from Apple, Salesforce, and others, are using state-based systems to do it. And this is a complete side tangent, but I really want the industry to stop talking about AI as if it equals LLMs; that's not what AI is. So even state-based dynamical systems, which we've talked about a lot on this podcast, are what the papers use to validate the accuracy of agents. I think we're still far away from agents talking to other agents over the MCP protocol in an enterprise setting, securely and fit for purpose, but maybe we can get there. The issue is that we then have to build something that validates it, and instead we have an LLM validating an LLM that validates another LLM. You're not actually validating anything; you're just stacking. We've talked a lot on this podcast about the hard yards: humans will do anything, much more work, at much more expense, to avoid using their brains on the harder tasks. So we'll spend a lot of money having LLMs validate LLMs, but nobody actually has to manually look at and validate the output on a sample basis. That's what's difficult with agents: we're very much skipping steps, saying, hey, I can string these things together, it's going to be great, without actually testing whether it's going to be great. And it can be great. That's why we did the whole agentic series for listeners: we are firm believers in agentic, multi-step systems, and LLMs do have a place in that paradigm. It's just that multi-step LLMs validating multi-step LLMs, with nobody ever actually validating the LLMs, is the problem we have.

SPEAKER_00:

That very nicely encapsulates my thoughts on this agentic world we live in now: we do have optimism that good agents can be built, but we're seeing a conflation between LLMs and agents that are actually effective in the way we expect, contrary to how we've described agents on this podcast, as very clear-cut mathematical operations and optimizations. The next topic I want to dive into from this year, which is very close to my heart, is high-quality, small data. We just got done talking about this on the last episode with Dave, but here we really saw an emphasis on the fact that you can get great, high-quality models using a small amount of high-quality human expert data. I published a paper this year which showed that in a task where we were trying to find out what people thought about themselves, a niche but interesting psychological task, we were able to outperform a hundred thousand LLM annotations with just 2,000 human annotations. Vastly smaller data, but high quality, high alignment, and high fit to purpose.
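This is not the experiment from that paper, just a toy sketch of the underlying intuition: biased label noise in a large, automatically annotated set can hurt more than the small size of a carefully labeled one. The dataset, the 2,000/100,000 split, and the noise rate are all invented for illustration, and the exact numbers will vary.

```python
# Toy illustration (not the study described above) of why a small set of accurate
# labels can beat a much larger set of noisy ones.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=120_000, n_features=40, n_informative=10,
                           random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(X, y, test_size=10_000,
                                                  random_state=0)

# "Expert" set: 2,000 examples with correct labels.
X_small, y_small = X_pool[:2_000], y_pool[:2_000]

# "Bulk annotation" set: 100,000 examples with class-dependent label noise
# (40% of the positives mislabeled), standing in for sloppy automated labels.
X_big, y_big = X_pool[:100_000], y_pool[:100_000].copy()
flip = (y_big == 1) & (rng.random(len(y_big)) < 0.40)
y_big[flip] = 0

for name, X_tr, y_tr in [("2k clean labels  ", X_small, y_small),
                         ("100k noisy labels", X_big, y_big)]:
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print(f"{name}: test accuracy {accuracy_score(y_test, model.predict(X_test)):.3f}")
```

Because the noise here is class-dependent, the big noisy set pulls the learned decision boundary away from the true one, while the small clean set does not, which is the general shape of the finding described above.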

SPEAKER_02:

Love it. And I don't want to jump ahead too much to what we'll see more of in 2026, but I've been hearing a lot of this as well. Sid has done some top-flight research proving this out. It's exactly what Dave was saying, but also what we've been saying since day one about bringing stats back: statisticians got marginalized a little during the data science and big data era, when the attitude was the more data the better and we'll just look at correlations. But as Dave walked us through last week with the actuarial profession, statisticians also care about the causality you can find and about quality. You don't need huge samples; classically, statistically valid samples start around 30, and a thousand is great. These smaller, hand-labeled data sets give you much more bang for the buck. It also ties back to the scaling paradigm we talked about at the start: throw every word of human language, plus a bunch of fake stuff, at a model and somehow magically make a system that can reason and understand things. Versus the high-quality small data approach: we've essentially put AI research on the back burner for years while focusing on scaling, and we could be building very expert, small-data systems that are super performant. If we're really trying to work on, say, claims processing, we can build expert small-data systems end to end that we can stress test and validate. We might not even use an LLM; maybe it's a garden-variety deep neural network, maybe it's a new paradigm that comes out, maybe it's an XGBoost system, maybe it's even old-school time series. But we're doing it with a clear purpose, and we can get high accuracy. Back when data science was the thing, when Sid and I were coming up in that era, you'd sometimes see 98% accuracy with these XGBoost systems. Now it's, yay, my agent's 50% accurate. If we retool this and stop trying to be one-size-fits-all (you know the joke in clothing: one size fits no one) and go to small, high-quality data sets with models built for one specific purpose, I think we're going to see that trend come back, versus the one-model-for-everything, foundational-system siren song we're on right now. There have been a lot of green shoots, and Sid's been a major contributor on the research side, showing that this quality of small data with expertly tailored systems can really do what we want these systems to do anyway.

SPEAKER_00:

And I think this gets into a topic I almost regret saving for the middle of the episode: it's not all doom and gloom. Experts and scientists are really thinking hard about how to do this the right way and how to build excellent systems, even LLMs, using high-quality data that's meant for exactly what we want to build. We saw this in some of the episodes we released this year. We had Adi come on and talk about his research on AI systems and psychology, showing that language models do have a place in high-quality research, but it comes from understanding what the LLM fundamentally gives you that we are not capable of doing with current methods. Rather than finding a use case now that you have the tool, really think about what new problem is possible that was not possible before. We also spoke with Christoph and Timo, who are very much in this mathematical space, where they see AI and machine learning as tools in the scientific method: you can build them to develop causal understandings of the world, and you can build them to give us better predictions and better modeling of reality in ways that are statistical and interpretable.

SPEAKER_02:

For sure. And I'm so excited; those are all great examples, and there's Sid's research as well. There have definitely been researchers pushing the ball forward this whole time, and I'm just very excited to see the general market, the broader tech industry, even broader society, starting to come back that way. Those excellent examples have been there all along, but I think we're going to see more emphasis from research groups; instead of people quietly doing awesome things, it's going to become a bit more mainstream, with more funding and more focus on building these expert systems. Christoph and Timo were really ahead of the curve, as is Sid with his research, and I'm just happy to see those excellent researchers getting more spotlight. I think that type of methodology will come back into vogue a little more. Because what are we trying to accomplish with AI? I wouldn't want a Terminator-style AGI, but even setting that aside: we're trying to hit business objectives, be more efficient with capital, be more productive, be fairer to our consumers. AI is definitely a great tool for that. It allows us to accomplish those things better than a human doing them manually: fairer, more performant, in basically every category, and less expensive. But as we've been talking about, it's about how we approach it. What's the right tool for the job, the right tool in your toolbox? Realizing that all the funding and research shouldn't just go toward one generalizable, one-size-fits-everything system, and should instead go toward building the specialized ones, is what I'm really excited about. And thank you, Sid. I am very pro-AI, and I know you are as well, and I don't ever want us to come off as if we're not. We're talking about using all the tools in your toolbox, not one hammer for every nail.

SPEAKER_00:

And it really is this fundamental understanding of what these tools are, how they work, how they benefit us, and what their downsides are that allows us to use them in really meaningful ways, and in some sense in a more correct way. We don't bring a jackhammer to hang paintings on the wall; we understand the tool for the job. And to the end of understanding fundamentals more, this year we actually got to do a mini-series, which we're still finishing up, on metaphysics, where we're really talking about the fundamentals of fundamentals: what are the questions of reality, what are the fundamentals of thinking? These kinds of fundamentals can ground us even further, not just in which AI tools to use, but in when science is applicable to a situation, in how we understand how objects move together and co-occur, and in how we do this type of work in general.

SPEAKER_02:

Fully agree. I'm loving this series, and I think it's hugely important for this reframing of AI. Really, since ChatGPT came out, outside of the exceptions we've been talking about, the industry and the culture have been pushing in one direction. The metaphysics series helps us reframe the conversation and really think through the objectives, as well as understand what makes a human human. The biggest productivity enhancements in the history of the world, the wheel, airplanes, cars, computers, are all about making humans more productive and efficient. The things that make you most human, the creativity, the reasoning, the thinking, are what we should better understand and leverage, using machines as extensions of ourselves to help us reach our goals more productively. And I think that's what this metaphysics series, going way back to Aristotle and Plato, is doing: allowing us to reset a little, recalibrate our objectives, and then build AI systems that help us hit all of them.

SPEAKER_00:

And to be a little more philosophical about it: as we enter a world where everything is becoming digital, and to some extent people feel even thinking is becoming digital and belongs to the cyber world, it becomes even more important to ground ourselves in the natural world, in the natural principles of the world, and in how we exist in it. I definitely invite the listeners of this podcast to slow down, reconnect with the real world, and really think about why we do the things we're doing.

SPEAKER_01:

Before we transition into our reflections on 2025, many of these discussions were possible because of our guests. So shout out to the new guests on the podcast this year: Timo Freiesleben, Dr. Michael Zargham, Anthony, Nick Broadway, David Thario, and Dave Sandberg. Plus, we were excited to see return guests like Christoph Molnar, who introduced us to Timo, and Michael Herman and Rachel Osako, who joined us for our series on metaphysics and modern AI. And with that, let's move into our 2025 reflections. First question for you both: Is there something that you see happening now that you didn't see coming when we started the year?

SPEAKER_02:

Oh, there definitely are. I'm not surprised by the macro trends we're seeing, or that people are talking about a bubble; there's actually a Wikipedia page about it now, and it's becoming mainstream. I think the biggest thing I'm actually surprised about is Google. Google's resurrection, and we've already talked about what that is. I don't want to fanboy too much, but the fact that they're vertically integrated, and how efficient they are. NVIDIA chips were never actually built for AI; it's a happy coincidence that they work well because of parallel linear algebra. They were built for graphics, a graphics processing unit, a GPU, not an AI training module. Google's TPUs are built from the ground up for AI, in their own data centers, super efficiently. They went from everybody writing them off, with people asking at the beginning of the year why Google failed at AI, to literally ending the year at the top of the mountain, and doing it without hemorrhaging cash, with efficient margins, using their own money, and finding a way to actually lead. I'm just so impressed with that, and I was not expecting Google to do this this year, with the TPUs and the holistic product launch around it. I'm very impressed.

SPEAKER_00:

I feel like the biggest shock for me was seeing the end of scaling happen this year. I really thought we had maybe two or three more years, but the speed and the aggressiveness we saw from this industry to get there as fast as possible, resources and money be damned, were really incredible. I thought we had a lot more time, and I'm really shocked that we kind of hit the end of large-scale scaling with data in 2025.

SPEAKER_01:

Now let's think about this from the other side: Is there anything that you would take back or recharacterize? In other words, what was a bad take that you had from the past year?

SPEAKER_00:

I'll say one thing I definitely got wrong: I was under the impression that because agents aren't governable, aren't verifiable, and are qualitatively pretty bad at their jobs, companies definitely would not use them. That was a false assumption. I think people just jumped at it, because if it works, it's almost infinite money, right? You basically take a full human employee out of the loop. And I thought there must be a certain quality bar for companies to use it. There isn't. We don't put out vending machines that don't work, but somehow this is acceptable. I definitely did not anticipate how happy every organization out there was to deploy an LLM agent.

SPEAKER_02:

Yeah, I fully agree with that, and based on your earlier comment, I also thought we had more time with scaling. My opinion has been very consistent since 2023 on where it's going to end up, but I did not start the year thinking the end of scaling would be the topic of 2025. I also very much agree on the agents: the Kool-Aid and the FOMO. I didn't think some of the deployments would be that unsubstantiated, that level of willingness to go from a happy-path POC straight to production and then just put a human in the loop. I don't think it's widespread, but it has been a little surprising at times. I'd actually need to spend more time reflecting on this one. I'm sure I've gotten many calls wrong, but on the macro trends, outside of timing, I don't think there are any major misses. Maybe on timing, and I'm sure there are other things; if I went back and listened to the episodes, I'd probably find a couple per episode.

SPEAKER_01:

And let's take this to our final topic. We've talked a lot about 2025. What's coming up in 2026? What has you excited?

SPEAKER_02:

I'm very excited about 2026. A lot of the things we've talked about so far: there's a bit of a trend back toward people realizing the limitations of these systems, realizing we need smaller expert systems, and rebooting what research looks like. The co-founder of OpenAI going public and saying maybe we need to go back to the drawing board a little on some of this. I'm really happy about that. In a lot of conversations I've been having, I think there's also a trend back toward realizing, hey, we can't just replace all of our entry-level people, because at some point, five or ten years down the road, we're going to need people who understand how the business works. It's an existential business risk. You can't replace all the humans everywhere and then expect someone to understand how things actually work. I know we talked about this a little last week, but I keep thinking about "The Feeling of Power" by Isaac Asimov. Companies are figuring out a bit earlier this time that we still need to understand how the process works, how the sausage is made, and that's critical knowledge we can't lose. So there's that juxtaposition: making our folks more productive with AI systems and co-pilots while keeping that knowledge, and, back to the earlier conversation, the fact that a lot of companies saying they're deploying agents are really just deploying generative AI. I'm still not sure how many are actually deploying agents versus deploying LLMs and calling them agents. What's that balance? I'm excited to see the whole industry grapple with it, and I'm very happy people are recognizing the disconnect. I don't know where the right line is on how much to lean on a co-pilot versus how much to keep doing the hard yards yourself, but I'm excited that it's now something we're working toward figuring out as an industry and a society. So I'm very excited about the research potential and the reframing of the conversation. It feels like the ship is finally turning in the right direction, along the metaphysical routes too, and there will be a lot of innovation around how we settle on those answers.

SPEAKER_00:

I'm definitely feeling a lot of excitement, apart from the industry side, about the research side. I think there are two big things keyed in for research in 2026 which I hope we're all paying attention to. One is that traditional scientists, chemists and physicists, are finally getting the kick in the pants to say, okay, we should do this machine learning thing. We can look at statistical ways of doing this modeling, using classical models on a couple of GPUs and doing proper statistical learning. And I think we're seeing a lot of great research coming out of these fields; look no further than AlphaFold, which has really accelerated a field that was based on slow rule-based systems, accurate and correct, but way too slow to be scalable. So I'm very excited for that. Keep your eyes open as more universities feel the pressure to start doing AI, and AI can be flexible in what that means.

SPEAKER_02:

I love that. That's the shift from earlier in the series: instead of replacing the researchers with LLMs, it's what Timo and Christoph were talking about too, how do we make the researchers more productive? I love that they're getting the kick in the pants and actually doing it. It's the trend we've been talking about of going back to fundamental research, in materials science and elsewhere, actually moving the ball forward from a research perspective rather than a let's-throw-more-GPUs-at-the-problem perspective. What I'm really excited about is knowledge generation, not knowledge replication, as a theme. Maybe that's the takeaway: we're trending back from knowledge replication toward knowledge creation, and I'm very excited to see how that unfolds.

SPEAKER_00:

And the second piece I'm really excited about, and I don't want to make any promises that this is going to happen in 2026, is that there's a lot of work going on in interpreting LLMs. The biggest golden egg we could see, maybe in the next few years rather than next year, is a model that, when it gives a response, gives you the traceability to say, here are the training samples this answer comes from. Basically direct citation from data and direct interpretability: you said this because someone on Reddit said this.
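Nothing like this ships in today's frontier models, and real training-data attribution (influence functions and the like) is far harder than retrieval. As a rough proxy for the idea, though, here is a sketch that ranks a made-up "training corpus" by similarity to a model's answer and surfaces the closest snippets as candidate citations; the corpus and the answer are invented for the example.

```python
# A naive retrieval-style proxy (an assumed approach, not an existing LLM feature)
# for the traceability described above: given a model's answer, surface the
# training snippets it most resembles as candidate "citations".
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical training snippets standing in for a real corpus.
training_corpus = [
    "Reddit post: you can fix a stuck pixel by gently massaging the screen.",
    "Encyclopedia entry: the mitochondria is the powerhouse of the cell.",
    "Forum answer: restart the router before calling your ISP.",
    "News article: the company reported record quarterly earnings.",
]

model_answer = "Try gently massaging the screen to fix the stuck pixel."

# Vectorize the corpus and the answer in the same TF-IDF space.
vectorizer = TfidfVectorizer().fit(training_corpus + [model_answer])
corpus_vecs = vectorizer.transform(training_corpus)
answer_vec = vectorizer.transform([model_answer])

# Rank training snippets by similarity to the answer and show the top candidates.
scores = cosine_similarity(answer_vec, corpus_vecs)[0]
for idx in scores.argsort()[::-1][:2]:
    print(f"{scores[idx]:.2f}  {training_corpus[idx]}")
```

In this toy case the Reddit snippet surfaces first, which is exactly the "you said this because someone on Reddit said this" experience described above, just implemented at a vastly simpler level than real model interpretability would require.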

SPEAKER_02:

Wow. Well, yes; whether it happens in 2026 or not is the bigger question, but that is going to be what matures. And this is part of our feedback about agents: they're not ready for game time until you have these sorts of capabilities on the generative systems. I do think generative AI is very helpful, it can help humans, but the biggest gap in being able to use it well is exactly that. If we focus not on scaling it to be smarter but on understanding it, that can change the game on its business utility, as a co-pilot for example. If you can actually get that kind of understandability, it definitely changes the game. So in the LLM world, I love it. If that's where we get more LLM-specific research, it's really going to change the game. I know it's a very hard problem. I actually think it's realistic to have it in 2026, but that's the question mark, along with how many people take on a problem that hard.

SPEAKER_01:

Now that we know what we're excited about, our next question for you two is: what are you cautiously optimistic about? These are things we think are happening, or trends that seem to be pointing that way. What are your thoughts?

SPEAKER_02:

Yeah. Well, I've heard recently, and I like it, that there's an industry trend where you can still have a KPI of AI use, but companies' objectives are shifting from "use AI" as a company objective to "what am I trying to accomplish with this system?" Don't just use AI for the sake of AI; use AI systems where they make you more productive. Sid gave a great example: researchers should be using more of these systems, and the causal modeling that Timo and Christoph are working on is an example of how to really use it for science, a use case where you're using it to be more productive. So I've been seeing green shoots of companies starting to ask, what's my objective, how am I actually using this? I'm cautiously optimistic that we're going to keep reframing the conversation, and again, I am not anti-LLM, into what business objective we're trying to solve and whether this is the right tool for the job, and being a little smarter about that. I think that dovetails into the smaller expert systems on smaller data sets. But there's still a lot of money and investment in the turn-your-brain-off, just-keep-scaling approach. So I'm cautiously optimistic, but I'm not sure we get fully there in 2026.

SPEAKER_01:

Then our final question for 2026: what are we completely skeptical about? What is the promise we hear being made that we're just not buying into?

SPEAKER_02:

Well, there are a lot of those. "We're going to get to AGI by just scaling GPUs" is obviously one. "Agentic multi-step systems are ready to replace your workforce" is another. There are a lot of those I'm very skeptical about. And for posterity, all of my comments are based on the current state of the technology as we're having these conversations, so things can definitely change. But given how LLMs are structured, the current state of these GPT systems and the paradigm they're on, I'm skeptical those promises hold for what's currently being built. As we've walked through in the deep dives on this podcast, we can make multi-step agentic systems that could be reliable. But right now I'm a little skeptical of the current narrative, even though we're very excited that the narrative is shifting a bit.

SPEAKER_00:

To me, what I've heard a lot is that late 2025 and 2026 is when AI needs to become profitable. AI needs to start making money for organizations that have been spending a lot of borrowed money, and it feels like 2026 is the year that debt needs to be paid. I have doubts about that being possible, and I think a lot of market speculators have doubts about it too.

SPEAKER_02:

Yes, we've already started to see those rumblings in the stock market; it's changing a little bit. A couple of days ago the IBM CEO more or less said it out loud. And everybody was lambasting Warren Buffett, asking why he wasn't investing super heavily in AI. Because of the ROI. It's all about the ROI, not the FOMO over a technology. So companies with endless checkbooks, just pouring money into scaling AI and assuming magical things will happen with no plan for how, are going to have to start showing some trend. Maybe not profitability or ROI yet, but at least a path toward it. That's what we have to get more clarity on.

SPEAKER_00:

Absolutely. And I think this digs into the problem that while we can see OpenAI bringing in a ton of revenue, we have to remember that they're spending insane, historically unprecedented amounts of money building hardware infrastructure, which does not last forever. It's not made of diamonds; it wears out every four to five years, and rebuilding that infrastructure easily overwhelms whatever earnings come from future ads, subscription plans, and specific corporate deals.

SPEAKER_02:

Couldn't agree more. And even though we've invested more here than anywhere else, I think the market just needs a path toward actually making money. Even if some of this initial five-year investment is wasted, is there a path toward it? I think that's the bigger thing, and it's probably going to be the big topic of 2026: how does that work, and is there a path? But I think we probably got a little carried away from an investment perspective around these systems.

SPEAKER_00:

And to some extent, how could you not? If you're basically sold something that has, in financial terms, infinite potential value, the thing that accelerates humanity, the thing that means you don't have to hire employees anymore, then the win condition is so high that even if the probability is low, you buy the lottery ticket. Because you say, well, even if I lose, the expected value is still positive.

SPEAKER_02:

Yeah, I see that, but the economist in me says: no one seems to have realized that if you're laying off all your employees, who's going to buy your things? All your profitability still comes down to demand at the end of the day. Somebody has to say, cool, you can run a billion-dollar company with 20 employees, good for you; who's buying your product? And it's the same folks who say they don't want taxes and things like that. You can't have everybody unemployed and still have the market working and a higher ROI on everything. So the dynamics are interesting. It does make sense as a thesis for some companies that definitely don't want to miss out. But I think what the market really needs is the path to where the money starts making sense: not that it makes sense today, the first investment doesn't have to, but where do we start seeing a path toward it making sense money-wise?

SPEAKER_01:

Well said, Andrew. I am really excited to see what's going to happen in 2026 and for some of the topics we have coming up. We have two more episodes in our metaphysics series that we're going to finish for everybody, some exciting guests, including, in the near future, a return of Sebastian Benthall, and plenty more topics coming in the new year. So from everyone here at the AI Fundamentalists, me, Sid, Andrew, and all of our guests who joined us this year, we hope you have a wonderful holiday season, and we're looking forward to talking to you in the new year. Until next time.
