Everyday AI Podcast – An AI and ChatGPT Podcast

Ep 804: Open Source Surge? Does GLM-5.2 Make Open Source an Enterprise Priority? (Start Here Series Vol 29)

Is the open model GLM-5.2 really Opus 4.8 level? 🤯

You mighta missed this, but over the past few weeks, three distinct forces have all converged at once:

↳ Chinese open models are near frontier SOTA
↳ Microsoft is reportedly considering open models to run Copilot
↳ Enterprises everywhere are talking token efficiency as AI costs soar

So while many are watching GLM-5.2 as an isolated model, it's important we dive deeper on its wider implications.


Open Source Surge? Does GLM-5.2 Make Open Source an Enterprise Priority? -- An Everyday AI Chat with Jordan Wilson


Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Today's Episode on LinkedIn: Thoughts on this? Join the convo on LinkedIn and connect with other AI leaders.

Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:

  1. Open Source AI's "ChatGPT Moment"
  2. GLM 5.2 Model Benchmarks & Performance
  3. Enterprise Adoption Drivers for Open Source AI
  4. Microsoft Evaluating DeepSeek for Copilot
  5. Token Maxing to Token Efficiency Shift
  6. GLM 5.2 Infrastructure vs. Consumer Use
  7. Autonomous Workflow Overshoot Explained
  8. Capability Gap and Workflow Challenges
  9. Enterprise Scenarios for Open Source Models
  10. Future of Task-Specific SOTA AI Models




Timestamps:

00:00 Open source AI catching up

04:52 Enterprise shift to DeepSeek models

08:57 Comparing AI model performances

12:46 Running AI models locally

14:17 Open source model cost efficiency

17:37 Cost challenges with AI models

21:05 Agentic task token consumption

25:05 Introducing the Start Here series

27:58 Impact of AI on Job Roles

32:29 Evaluating Open Source AI Models

36:00 Considering open source models

37:09 Future of open source AI






Keywords: 

open source AI, open source AI models, GLM 5.2, z AI, Zhipu AI, Chinese open source models, DeepSeek, Microsoft, enterprise AI, token maxing, token efficiency, AI spend, AI deployment, open weight models, proprietary AI models, AI benchmarks, Artificial Analysis Intelligence Index, enterprise infrastructure, agentic workflows, coding tool use, autonomous agents, long context window, coding capabilities, API costs, AI privacy considerations, model distillation, data privacy, compute requirements, GPU infrastructure, AI hardware, API hosting, Hugging Face, AWS, AI cost reduction, Copilot Cowork, Azure security, Anthropic, OpenAI, Claude Opus, multimodal models, task-specific AI models, model capability gap, autonomous workflow overshoot, agentic tasks, non-agentic tasks, state of the art open models, model fine-tuning, small language models, AI adoption barriers, frontier models, AI job automation, workflow transformation, AI subsidies, token billing, Stanford AI study, AI industry trends

Start Here ▶️

Not sure where to start when it comes to AI? Start with our Start Here Series. You can listen to the first drop -- Episode 691 -- or get free access to our Inner Circle community and all episodes: StartHereSeries.com

Also, here's a link to the entire series on a Spotify playlist

SPEAKER_00

There's three important things happening right now that make me think open source AI might be having its ChatGPT moment. Number one, the models are actually pretty good, with the recent splash from ZAI's GLM 5.2 leading the way. Number two, the era of token maxing is over as companies cut AI spend. And number three, one of the biggest and most influential companies in the world is looking at open source as a viable option. Granted, this doesn't mean that you'll have a frontier level model operating 24-7 on your computer. That's not how any of this works. But large enterprises will and do have that option today. But even if you're not a Fortune 100 company with GPUs to spare, you too are going to have to start paying very close attention to open source models in 2026. Yes, the Chinese companies are distilling from American labs and there's privacy considerations, but that doesn't change the fact that US tech companies are using these models in production as AI costs are starting to skyrocket. So will models like GLM 5.2 thrust open models onto the streets of mainstream AI America? Or will this just be another drop in the bucket until the next wave of US lab models makes the current open contenders look archaic in comparison? Well, let's find out on today's edition of Everyday AI as part of our Start Here series. All right, if you're new here, welcome, but let's talk about the big picture here. Open source AI has nearly caught the top proprietary models. So I think most people, even people who are bullish on open models, would have admitted that for the most part, open source or open weight models are about six months behind. And I'd say now that gap is maybe only two or three months, which is pretty incredible to see. And in a lot of benchmarks, which we're going to look at, open source has kind of caught the closed proprietary models. So they are now finally credible enough and powerful enough for serious enterprise evaluation.
Also, Microsoft is reportedly looking at DeepSeek as it looks to lower its costs in Copilot Cowork. So that's huge. And the model pushing all of this, I think, right now, is ZAI's GLM 5.2. I know that's a mouthful, but we're gonna look at some of the charts that show that this is now a big picture model. This is a big shakeup, and you have to be paying attention to GLM 5.2 and what comes after this. And as companies now shift from token maxing to token efficiency, open models may finally be having their ChatGPT moment. So on today's show, here's what you're gonna learn. You're gonna learn why Microsoft reportedly looked at DeepSeek for lower cost Copilot agents. You're gonna see why GLM5. You're gonna know why autonomous workflow overshoot blocks adoption more than model quality. And I'm gonna tell you about that secret issue that I think people aren't paying attention to when it comes to closed proprietary and open source AI, that autonomous workflow overshoot. All right, let's get into it. Welcome to the Start Here Series. This is the Everyday AI essential podcast series to both learn the AI basics and to double down on your knowledge. So if that's what you're trying to do, sweet, me too. So this is an ongoing series. We're actually on volume 29 now. So make sure you go to StartHereSeries.com. That's going to give you free access to our exclusive Inner Circle community. And there, there's a playlist that has all of the Start Here series all in one spot on a Spotify playlist, all of the newsletters, everything all in one spot. And you can go connect with other business leaders that are trying to do the same. If you missed our last Start Here series episode, I think it was actually a really important one. We talked about AI super apps and why every company is racing to create one and what they are. That was volume 28 or episode 799. And today let's talk about the open source surge.
So let's quickly recap these three different things happening all at once. So, number one, Chinese open source models have kind of closed the gap. They haven't closed the gap completely, but they've closed the gap to where it's, like, teeny. All right. So obviously, I'm not gonna talk a whole lot on model distillation in this episode, but if you don't know what that is, essentially Chinese companies steal more or less from American companies, right? We've seen the US government is working on this with the top labs, but Google, OpenAI, and Anthropic have all but said yes, that Chinese companies are stealing all of our work and making open source models. So we're gonna look past that. And the reason why we're actually looking past that is, well, Microsoft, right? Microsoft, being one of Anthropic and OpenAI's biggest investors, is reportedly looking at DeepSeek as a viable alternative to using the Anthropic and OpenAI models, and that's one of the reasons why I think it's now finally time for enterprises to take a serious look. So, yes, there's obviously a lot of privacy and data considerations when it comes to using open source models, all right, because you don't always know the weights. So, you know, you might be getting an output and blindly copying and pasting that, not knowing there might be a geopolitical reason you're getting a certain answer. So there's obviously a lot of considerations to take into account. But I think the fact that we're seeing, number one, the models are good enough. Number two, token maxing is going away. It's no longer about, oh, you know, everyone go burn five billion tokens so you can go climb up the internal company token leaderboard. That's over. Companies are cutting AI spend. And number three, Microsoft looking at open source models like DeepSeek as a viable alternative. So let's talk a little bit more about this new model that is catching everyone's attention. This is from ZAI.
I believe they were formerly called Zhipu AI, but they are called ZAI, based in China. And GLM 5.2 is a 744 billion parameter, MIT licensed, open weight mixture of experts model. So it was developed by ZAI, and it has a 1 million token context window, and it is designed specifically for complex long horizon coding and autonomous agent workflows. And it targets coding, tool use, long context, and agentic work. And here's the reason why we're talking about this, y'all. It is incredibly good. Okay. I've used it a little bit, and I've been impressed. I haven't had as much time as other people. And I'm gonna be reading some thoughts from other leaders in the AI space. But when you look at the Artificial Analysis Intelligence Index, which was actually just updated to 4.1, a little detail there. But essentially, this takes about a dozen or so widely used and widely respected benchmarks, puts them all together and gives all of these models a score. So this is a good way to look at all the different kinds of benchmarks and to know about how smart or good a model is. So right now, obviously, we don't have Claude Fable, right? That could change at any minute, but the best model right now, it is 1A and 1B, at least of what's generally commercially available: Claude Opus 4.8 Max and GPT-5.5 X High from OpenAI. And then not too far behind, now you have GLM 5.2, which surprisingly scores higher than any Google model. Again, that could also change because, you know, Google did say in June they're coming out with their new model. So, you know, that could change later today or later this week or next week. But regardless, this is the first time, in recent memory at least, you know, since the AI race became more than just OpenAI, this is the first time that an open source or open weights company has cracked the top three, right? So it's now Anthropic, OpenAI, and ZAI. Yeah.
So coming in ahead of Google, coming in ahead of Grok, coming in ahead of, you know, DeepSeek, Meta, the Qwen models, all of these other, you know, even the other Chinese open source models. Incredibly well benchmarked. So there's a lot of accusations out there that maybe they're a little bench maxed or overfitting, you know, to make sure that they score really well on these benchmarks. But, I mean, let me just read for you all some reaction from some respected people in the AI community. So this is from Chris Somm, who's a partner at Active Capital, who said, "I'm starting to feel like GLM 5.2 might be better than Opus. I was spending $300 a day on Claude, switched to GLM, spent $3.82 a day, and it found and fixed a bug, a Claude bug from yesterday. I honestly can't tell which is better anymore." All right, Jeremy Howard, very well-known name. He was the founding president at Kaggle. So he said, "Wow, ZAI's GLM 5.2 is a marvel. It is at least as good as Opus 4.8 and GPT-5.5. It is super fast, inexpensive, and not too verbose. It responds with nuance and judgment and handles long context very well. I've never experienced an open weight model like this before." And then Guillermo Rauch, who is the Vercel CEO. So, yeah, these are big, big names, right? You know, Vercel, they're one of the leaders in the AI space. So he tweeted out, "Genuinely impressed, almost shocked at how good GLM 5.2 by ZAI is at coding. This changes things." And then last but definitely not least, another prominent name in the AI space. This is Mat Velloso, who's a former VP at Google DeepMind, also a VP at Meta, and worked at Microsoft as well. So he said, "All day using GLM 5.2, didn't miss much." All right, so saying compared to using other models. So, "All day using GLM 5.2, didn't miss much. First open model that passes the bar as a daily driver. Things are not going to be the same."
And then talking about how we need to get some serious hardware. All right, so now let's talk about this, because when people think open source models, what they default to is, well, oh, that means I can download this thing on my computer and I can run it 24-7 and I can plug it into OpenClaw or Hermes Agent, and all of a sudden I have a model that's the level of GPT-5.5 and Opus 4.8 running for me 24-7, and I don't have to pay any API bills, I don't have any subscriptions, etc. No, not at all, not at all. All right, no matter what anyone tells you, because you're gonna need at least $15,000, $20,000 in hardware to get anywhere near that type of performance, right? And I'm just saying the outputs. So if you're just comparing outputs to outputs, yeah, you're gonna need a machine that's at least $15,000, probably a little bit more, and it is going to be slow as a snail. So it is not economically feasible or economically reasonable, right? To run a model like GLM 5.2 locally, you will also be running a quant version, so you'll be running like a two-bit quant. So it's gonna be a dumbed-down, very slow version. So yeah, good luck doing that. But, you know, there's people out there that argue, like, oh yeah, I'm doing this. So unless you have some very rare special reason that you have to be able to run things locally, right? Because that's the other thing, obviously. Aside from cost, once models do become more powerful and smaller and efficient enough to run on consumer hardware, there's obviously the privacy, right? Being able to run things locally and not having to send things to the cloud. But that's not where you're at with GLM 5.2, right? There's maybe, I don't know, a couple people in the world that have a powerful enough machine to actually do that. But for most people, this is not going to replace your $200 subscription to Claude or ChatGPT.
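Editor's sketch: the two-bit quant point above comes down to simple arithmetic. The Python below is a rough back-of-the-envelope estimate, assuming weight memory is roughly parameter count times bits per weight. It deliberately ignores the KV cache, activations, and runtime overhead, so real requirements would be higher. The function name `weight_memory_gb` is made up for illustration; only the 744 billion parameter figure comes from the episode.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width. Real
    quantization formats add some overhead, so treat this as a floor.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


PARAMS_B = 744  # parameter count quoted in the episode

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4), ("2-bit", 2)]:
    print(f"{label:>6}: ~{weight_memory_gb(PARAMS_B, bits):,.0f} GB of weights")
```

Under these assumptions the weights alone come to roughly 1,488 GB at 16-bit and still about 186 GB at 2-bit, which is why a model this size does not fit on an ordinary consumer machine.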
But this is a legitimate option for large enterprises who have access to compute. All right. But small teams, skip this, all right. I'm not saying skip GLM 5.2, because obviously you can still use it on the API side. You know, there's plenty of hosts, from AWS to Hugging Face and everyone in between, where you can actually go in and run this model using their servers, their GPUs, right? But this is not something you should overlook. All right. And what do I mean by that? GLM 5.2 might not be a model that you're going to run locally. It may not even be something that you run on the API, even though it's a fraction of the cost, right? That's the other big thing. It depends on what level you're using, right? But if you're comparing the highest output of GLM 5.2 to the highest, you know, GPT-5.5 or Opus 4.8 or Fable 5 whenever it comes back, it is a fraction of the cost to run it on the API side. So yes, if you're building modularly, which is smart to do, and can swap out a set of API keys, or if you're using something like OpenRouter, that makes it even easier to do that. Sure, you know, maybe you're running this model in the cloud. But I think most people think, oh, open weights, I can download it, fine-tune it. You're not going to be doing this on a consumer piece of hardware. So that's factor one. Factor one is, well, the model is actually good enough. That's why this might mean open source is having its ChatGPT moment. Number two, the Microsoft move. So according to Axios, and this is not even a week old, we covered this on our Monday show. So Axios reported that Microsoft is actually looking at lower cost open source alternatives to Anthropic and GPT models. So right now, both in Copilot Cowork, but specifically Copilot, you know, they mainly have always relied on GPT models from OpenAI.
Over the past year, they've started to integrate and offer other models, specifically Anthropic's Claude models, as well. But Copilot Cowork is obviously an agentic offering. This is based on Anthropic's very popular Cowork technology; it's kind of Microsoft's version. Microsoft is obviously a big financial backer of both Anthropic and OpenAI, and they also, well, serve those models, so they make money on the cloud when their customers use these models. So the fact that, and I do have to break this down, and I won't go too much into it because we just talked about it on yesterday's show. But the fact that Microsoft is potentially looking at DeepSeek of all companies, right? Because all the US companies have called out DeepSeek by name for distilling their models. So, I don't know, if I'm in leadership at OpenAI and Anthropic, I'm not super happy with my biggest financial backer. Even though this is just a report, the report could be all wrong. Maybe it's true, maybe it's not. But the fact that this report is out there from a reliable source in Axios, that Microsoft, one of the biggest companies in the world and one of the most trusted and respected companies when it comes to AI deployments, right? The fact that they're looking at a company like DeepSeek to potentially be an option under the hood for running Copilot Cowork, to make it more affordable to customers, is big, right? Because now Copilot Cowork is generally available. That's new as well; before it was just in beta. But now they're charging for usage. So instead of subsidizing this like so many companies are, right, they're charging end users for usage.
And, well, what they're gonna see very fast is that no one's gonna use Copilot Cowork if they have to pay the API rates for, you know, Opus 4.8. Aside from Fable from Anthropic, Opus 4.8 is one of the most expensive models out there. Depending on the task, it's actually more than twice as expensive as GPT-5.5. So, pretty big here. But the reported goal was just cheaper agents inside of Azure security protections, and that pressure exists because of how agentic work behaves: it's nonstop, right? I think when we go back to the chatbot era, we didn't have to worry as much about token usage. But now these agents run in loops, right? And it's easier to set off loops now in Codex and Claude Code. It's easier to put it into this goal mode and have an agent work for not just hours but days, right? So now all of a sudden we have to start thinking about things like token efficiency and the cost of compute. All right, and that leads us to, well, factor number three, and that's the shift that we've gone through. Well, now companies are not just saying let's burn tokens, they're saying let's save tokens. So there's been a lot of recent reporting over the last week or two. There's a New York Times article from this week that read, "Tech workers max out their AI use. Now they're trying to minimize it." A Business Insider story titled "Silicon Valley's AI Token Craze is facing a reality check." And then a TechCrunch article that said, "The token bill comes due: inside the industry scramble to manage AI's runaway costs." And I did cover this in depth on episode 789, also in the Start Here series, when we talked about the shift from token maxing to token efficiency. But in short, you had Meta and other companies that had these internal leaderboards where they were reportedly just rewarding employees for using tokens.
And I think it started maybe with good intentions, right? Because AI leaders thought, well, hey, if our people are using AI a lot, that means that the business is going to grow and we're going to be saving time. But that backfired because people were just burning tokens. You know, a reported example is one engineer used 281 billion tokens in a month just to climb the rankings, right? Which is a pretty, pretty crazy number. You had reports that employees were intentionally keeping agentic loops running, even if they were just doing personal projects, right? But they were just burning tokens just to burn tokens because, well, there was a thought for a brief period of time, I'll say from the end of 2025 to early 2026, right? They saw token usage as like a KPI, as a key thing to evaluate employees on, right? Like, hey, if you're burning tokens, thumbs up, you're doing a great job in our book. And then there was the report that a company reportedly spent $500 million on Claude on accident in one month because they forgot to set limits. And that shows you just the amount of tokens, right? So a Stanford study found that an agentic task can use up to a thousand times more tokens than a single chat, all right? Because chatbots answer once. Agents can call tools endlessly in a loop, right? I actually, for fun, have my Claude Code ultra code going on a simple task just to see how much usage it's gonna burn when it shouldn't burn anything, right? But I'm looking at the chain of thought and I'm seeing Opus 4.8 going in some silly loops, just burning tokens, you know, like, I don't know, burning marshmallows on accident. But this is going to a broader capability gap that companies are already struggling to close. And that capability gap is half the problem.
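Editor's sketch: the runaway-loop and forgot-to-set-limits stories above come down to an agent loop with no spending ceiling. The Python below is a minimal illustration of a hard token and step budget wrapped around a loop. Everything in it is hypothetical: `TokenBudget`, `BudgetExceeded`, `run_agent`, and the fake `looping_step` are made-up names, and no real agent framework or provider API is being modeled.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


class BudgetExceeded(RuntimeError):
    """Raised when the loop goes past its token or step ceiling."""


@dataclass
class TokenBudget:
    max_tokens: int
    max_steps: int
    used_tokens: int = 0
    steps: int = 0

    def charge(self, tokens: int) -> None:
        # Record the spend first, then fail loudly if a ceiling was crossed.
        self.used_tokens += tokens
        self.steps += 1
        if self.used_tokens > self.max_tokens:
            raise BudgetExceeded(
                f"{self.used_tokens} tokens used, limit is {self.max_tokens}"
            )
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"{self.steps} steps taken, limit is {self.max_steps}")


def run_agent(step_fn: Callable[[], Tuple[bool, int]], budget: TokenBudget) -> TokenBudget:
    """Call step_fn until it reports done; each call returns (done, tokens_used)."""
    while True:
        done, tokens = step_fn()
        budget.charge(tokens)
        if done:
            return budget


def looping_step() -> Tuple[bool, int]:
    # Hypothetical agent step that never finishes and costs 1,500 tokens a turn.
    return False, 1_500


budget = TokenBudget(max_tokens=10_000, max_steps=50)
try:
    run_agent(looping_step, budget)
except BudgetExceeded as err:
    print(f"Stopped after {budget.steps} steps: {err}")
```

In this toy run the loop is cut off on the seventh step, once the running total passes the 10,000 token ceiling. A real deployment would more likely lean on the provider's own spend limits, which this sketch does not attempt to represent.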
And I'm gonna go through this one quickly because I have covered this one in depth before on episodes 735 and 755. So if you want to know more about this, I did cover it in both of those episodes. But there's a great study from Anthropic that talked a little bit about the capabilities gap. And this does get to the kind of closed source versus open source and GLM 5.2. Stick with me here. But in that study, essentially Anthropic looked at anonymized chats and they had a ceiling for these different categories of work, and they said, here is this ceiling of a model's capabilities. And according to all these anonymized chats, here's what people are actually using it for, right? So one of the best categories, the one that had the best usage, was only 33%, and that was computer and math tasks. But many tasks were not even 10%. So, you know, you could have the best, as an example, business strategist, and people were only using 10%, right? And that's the baseline kind of model capability gap. And that's the first bottleneck. But there's kind of a new one. This is kind of my secret term I teased in the beginning. I've been trying to put a name and a face on this. So I'm trying this out. Maybe I'll rename it down the line, right? There's always these common concepts that come up over and over. And it takes me, you know, sometimes a month or two to put a label on it. So you're not gonna see this anywhere. This is something I'm trying on for size here. But I think aside from the model capability gap, I think the bigger problem, maybe, is autonomous workflow overshoot. All right. And I think that's the next gap that you need to prepare for. And you might see, stick with me here, how that might actually make a model like GLM 5.2 a feasible option for some of your day-to-day tasks in the future, even though it's text only, it's not multimodal.
So let me talk about this concept of autonomous workflow overshoot. All right. So models right now, right? Models by default, GPT-5. Gemini 3535 Flash, they carry your company's context, they can plan, they can act, they can call tools, they can spin up sub-agents, right? So essentially I ask companies, what would you do if every single employee had at least one or many 24-7 agents? And it's not a rhetorical question because those capabilities are here now, right? I've had both Claude agents and Codex agents run for more than 24 hours. That's the autonomous workflow overshoot. The capability gap is humans not fully understanding the models' capabilities or using the models to their fullest extent, right? That is a human not taking advantage of what's there. But this is different. This is overshoot, right? So essentially, I'll argue that 99% of bleeding edge capabilities go unused. All right. And that's from a combination of the capability gap, but also workflow overshoot. So, what is autonomous workflow overshoot? That's essentially that the capabilities of the agentic models are uncharted in your standard enterprise workflows. Because I will say that most workflows still need that human handoff, right? If I told you at your company, hey, everyone has a 24-7 agent that can carry your company's context, it can plan, it can act, it can research, it can call tools, it can create PowerPoints, Excel, anything, websites, right? Hardly anyone, right? Hardly any business leader would be prepared to implement that. That is the overshoot, right? Today's models are more than most companies can handle, not just more than most humans can take advantage of. Those are two different things, right? They can't handle this. That is the workflow overshoot.
These models are capable of just too much, because I think that even AI forward companies right now need multiple quarters, a year, or even more to completely rebuild job descriptions, their inputs and outputs, approvals, and even the type of work that you do, right? These things all have to change. This is also this concept of, yes, there's gonna be all these new jobs that AI is gonna create. I ultimately think AI will take away more traditional full-time roles than it will create. But AI is obviously going to create millions of jobs that we just don't know what they look like, because it's gonna take businesses a while to understand where all these autonomous agents' capabilities are headed. So we can start creating both the context that the agents need, the expert-driven loops that they need to do these workflows right, but also the lines of business, the streams of revenue that go along with those things. Those things are all going to change. And that is the autonomous workflow overshoot. So thank you for sticking with me, because now I can tie this together and answer the question: well, should open source be an enterprise priority? All right. And I'll say yes in three different scenarios. So let me lay those out for you. Number one scenario: if your company has access to compute and they also have a high API bill, right? They're shifting away from token maxing to efficiency. I think a general purpose model like GLM 5.2 makes sense. Okay. Again, this is a very small sliver of companies, right? This is essentially your Fortune 100 companies. Not everyone has, you know, a rack of GPUs sitting on the shelf and can, if they want to, roll out the type of compute that their enterprise needs. Like I'm saying, GLM 5.2, you can't download this on every employee's laptop. It's not feasible, doesn't make sense, not gonna work.
But if you have the compute, you can, right? And what's crazy is I've talked to and I've worked with plenty of companies that fit into this category. I've seen their server racks and all the NVIDIA chips blazing and all the cooling, you know, water going underneath it. It's all above my head, but there's companies out there, yeah, they can download GLM 5.2, it's open weights, they can fine-tune it, and they can make a portal or a way that their employees can access this model. And in theory, they can use it 24-7 around the clock, right? Obviously, there's still bandwidth and other issues that you have. It's not as easy as click download and click deploy. But there are companies, number one, well, maybe because of what I said, the autonomous workflow overshoot. I'm literally thinking of one company in particular that's spending millions of dollars a year on Anthropic as an example. They could probably do this, because I would say less than one percent of people using their current AI system, that they're paying multiple seven figures for, are using it to its full capabilities. So in most cases, for the 99% of people, GLM 5.2 would probably be enough. Aside from the fact that it's not multimodal; that is a huge downside, right? But for the most part, number one, the companies that should be using it are those that have access to compute, and for the most part, they're not needing, or can't take advantage of, 24-7 autonomous coding agents. Number two, those with non-agentic tasks. All right, and this is just a different chunk, a different segment of work. So this is the non-agentic tasks that can and should be chunked for future open models. All right. So, like I said, so few people right now have workflows that require something like a Fable 5.
Yeah, it's fun to go in there and, you know, oh, let me make this three.js world view game and, oh, let me go code up this website, right? Yes, there's obviously people in software development and dev roles that need that 24-7. But most people even using this, you know, you don't need Fable to write better emails, right? So I think when you start chunking, you large enterprise companies, start chunking your non-agentic tasks, your non-frontier tasks. I think that that's a big group of people that can start looking at an open source model like this, maybe using it via the API. And then last but not least, it's preparing for the future. Because I hope we truly do get the intelligence too cheap to meter promise at some point soon. But there's a reality that as the capabilities increase, right, at least for now, for the most part, well, that Stanford study showed that agents just can burn a thousand times more tokens than a simple chatbot query, right? And that is one downside of GLM 5.2: it is token inefficient. It burns through a lot of tokens to get that level of intelligence, although the level of intelligence is extremely high, the highest we've literally ever seen for an open model. But I do think that we're gonna see smaller task SOTA models in early 2027. Let me tell you what I mean by that, task SOTA. So, you know, task-specific state-of-the-art models, right? So, right now, when we talk about models, they're just general large language models, right? They're one model and people use it for everything. I've been a huge advocate and believer that in the future, right, there's gonna be a mixture of models technology. Thank you, certain companies that made my crazy 2023 prediction a reality in 2026. I was only a couple of years too early, right?
But we've seen that, you know, OpenRouter just had their fusion technology, Perplexity with their Computer, right? You put a prompt out and it will route it to whichever model it thinks is best. It might put it through multiple models. But I think we're gonna see that on a small language model platform here in 2027. Probably not this year. Essentially, here's what's gonna happen, right? When most people think about model distillation, they think, oh, Chinese models distilling a big frontier trillion-parameter model, right? Which is what we're seeing here with a lot of the DeepSeeks and the Qwens, right? All these accusations flying around. But what about when frontier companies distill their own models? The legal, teacher-student model scenario. Or, you know, we're obviously gonna see a lot of these Chinese companies come out with small language models or task-specific models, but I think those are gonna be state of the art, right? So as an example, I think that whether it's through, quote unquote, not exactly legal distillation or intentional legal distillation, we're gonna see state-of-the-art open models for specific tasks: front-end coding, web search, text summarization, PDF parsing, copywriting. I think we're eventually gonna see dozens, and after that probably hundreds, of open models that are state of the art at one task. Because when you can create a model around one certain task, it can be smaller, it can be better, it can be more token efficient, which in turn makes it cost efficient, right? Which is what people are wanting. So those are, I think, the three scenarios where enterprises should be considering open source models, whether it's GLM-5.2 or something else.
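To make the chunking idea concrete, here is a minimal sketch of what splitting work into agentic and non-agentic tasks, then routing each to a model tier, might look like. Everything named in it is an assumption for illustration only: the tier names, the `Task` fields, the `route` and `cost_usd` helpers, the list of task-specific models, and especially the prices, which are made-up placeholders and not real vendor pricing. The 1,000x token multiplier for the agent simply mirrors the agent-versus-chatbot gap mentioned earlier.

```python
from dataclasses import dataclass

# Hypothetical placeholder prices in USD per million tokens; not real vendor pricing.
PRICE_PER_MILLION = {
    "frontier-closed": 15.00,
    "open-general": 2.00,
    "open-task-specific": 0.25,
}

# Hypothetical set of tasks assumed to have a task-specific open model available.
TASK_SPECIFIC = {"summarization", "pdf_parsing", "copywriting"}


@dataclass
class Task:
    name: str
    agentic: bool          # long-running, multi-step autonomous work
    tokens_per_run: int    # rough estimate of tokens consumed per run


def route(task: Task) -> str:
    """Pick a model tier for a task using the agentic / non-agentic split."""
    if task.agentic:
        return "frontier-closed"
    if task.name in TASK_SPECIFIC:
        return "open-task-specific"
    return "open-general"


def cost_usd(task: Task, tier: str) -> float:
    """Estimated cost of one run of the task on the given tier."""
    return task.tokens_per_run / 1_000_000 * PRICE_PER_MILLION[tier]


if __name__ == "__main__":
    tasks = [
        Task("email_draft", agentic=False, tokens_per_run=2_000),
        Task("summarization", agentic=False, tokens_per_run=2_000),
        # Assumed 1,000x more tokens than a simple chat-style query.
        Task("coding_agent", agentic=True, tokens_per_run=2_000 * 1_000),
    ]
    for t in tasks:
        tier = route(t)
        print(f"{t.name}: {tier}, ~${cost_usd(t, tier):.4f} per run")
```

The routing rule here is deliberately crude. In practice, the split between agentic and non-agentic work would come from auditing how employees actually use the system, not from a hard-coded flag.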
So to quickly recap: number one, if your company has access to compute and a high API bill. Number two, if you're an enterprise company that can start chunking all of these different AI tasks into agentic versus non-agentic, maybe you keep your current agentic options that are a little bit more expensive, and then you take your non-agentic options to, maybe on the API side, something like a GLM-5.2. And then last but not least, it's more, I think, for everyone else, preparing for the future. And that is starting to categorize those tasks, because not every task needs a Fable 5, right? Not every task is gonna need a GPT-5.6 Pro, although I'll still use it for every task, right? That's how you have to start thinking, because eventually these subsidies are gonna go away. We are gonna have to adopt a token efficiency mindset. That is the future. And is this the ChatGPT moment for open source? Is GLM-5.2 that thing? I don't know if it is; it's still too early to tell. However, I do think this is, if nothing else, the foundation for open source to have its ChatGPT moment.
All right, I hope this one was helpful. If so, please let me know about it. Go to startherseries.com. There you can sign up for free access to our Start Here series space inside of our Inner Circle community, where every single episode is there, ready for you to gobble up. You can listen to it on 2x. I'm not gonna be mad at you. All right. I hope this was helpful. Thanks for tuning in. Hope to see you back tomorrow and every day for more Everyday AI. Thanks, y'all.