The Tech Strategy Podcast
A podcast by TechMoat Consulting on the strategies and best practices of leading digital companies. Especially in China / Asia.
Tech Strategy offers:
-Deep dives into the strategies and business models of leading tech companies.
-Best practices and lessons in important digital concepts.
Lots more information available at Jefftowson.com and techmoatconsulting.com
To marketers, I do not have podcast guests.
This podcast is not investment advice. Me and any guests may get the numbers or information wrong. The views expressed may no longer be relevant. Investing is risky. Do your own research.
What Matters in China Tech Right Now (282)
This week’s podcast is about 5 big, recent events in China tech.
You can listen to this podcast here, which has the slides and graphics mentioned. Also available at iTunes and Google Podcasts.
Here is the link to TechMoat Consulting.
Here is the link to our Tech Tours.
The five topics are:
- New operating model: “Humans agents robots”
- China semiconductor biz
- Foundation models
- It’s all about agents
- Robots and embodied AI
---------------
I am a consultant and keynote speaker on how to increase digital growth and strengthen digital AI moats.
I am the founder of TechMoat Consulting, a consulting firm specialized in how to increase digital growth and strengthen digital AI moats. Get in touch here.
I write about digital growth and digital AI strategy. With 3 best selling books and +2.9M followers on LinkedIn. You can read my writing at the free email below.
Or read my Moats and Marathons book series, a framework for building and measuring competitive advantages in digital businesses.
This content (articles, podcasts, website info) is not investment, legal or tax advice. The information and opinions from me and any guests may be incorrect. The numbers and information may be wrong. The views expressed may no longer be relevant or accurate. This is not investment advice. Investing is risky. Do your own research.
00:05
Welcome, welcome everybody. My name is Jeff Towson and this is the Tech Strategy Podcast from TechMoat Consulting. And the topic for today: what matters in China tech right now? Although really it's kind of tech in general, but definitely more focused on what's happening in China, or at least the China players. And honestly, it's a lot. I feel like I've been underwater for the last two months. I'm reading like crazy.
00:33
IPO prospectuses, lots of those. I'll go through some of those. These media announcements, which happen every week now, big ones. There's just an enormous amount going on. So I'm going to kind of bubble that up to the surface: look, this is what I think the four, really five, most important things happening are, mostly with a focus on people running businesses. So, is the tech something you use or are exposed to? Now, a little bit outside of that as well, because there's some big stuff, but that's going to be the topic for today. So it should be pretty straightforward.
01:02
I'm just going to hit on a couple of topics real quick. Hopefully not too long, but I keep saying that and I've yet to create a short podcast, so I don't know. Anyways, that'll be the topic. In terms of housekeeping stuff, we're launching a new project I was hoping to get people's feedback on. At the end of July, like the 27th to 29th probably, they have the big China AI conference, which is in Shanghai. The dates aren't released yet, but it'll probably start on the 27th or 28th of July.
01:33
That's the one that Elon Musk has gone to, Jack Ma. It's the big one. I think we're going to go out there and spend a couple days at the conference, and we're going to put a couple days in front of it, probably just a small group of people. And I wouldn't call it a tour. I would call it more like a conference before a conference. So go out with a small group of people, probably not more than 15, I think. And go through some content related to e-commerce, related to AI, obviously. Then a
02:03
couple days in the conference, where we'll walk around and I'm going to basically moderate, so we'll just go company by company. I don't know all of them, there's a lot, but I kind of know what the big players are all doing, so I'll sort of moderate the conference as we go through. Anyways, that's what we're working on. I don't think it's going to be an expensive trip at all. If that's something you're interested in, let us know. We're going to try and sort of get the idea crystallized this week, because obviously that's a little bit fuzzy.
02:29
But if you have any thoughts on that, send me a note on WeChat, Line, WhatsApp, or LinkedIn. I'm always easy to reach. Anyways, that's coming up, and that'll be in like three months, so not too far out in the future. Okay, and let's see, standard disclaimer: nothing in this podcast, my writing, or website is investment advice. The numbers and information from me and any guests may be incorrect. The views and opinions expressed may no longer be relevant or accurate. Overall, investing is risky. This is not investment, legal, or tax advice. Do your own research.
02:59
And with that, let's get into the content. Okay, no real concepts for today. Well, there are concepts, but I'm bouncing between a lot of concepts or topics, so it's kind of all scattered. I don't think it's worth pointing that out. I'm going to write all this. Actually, I've got three or four articles basically done. For those of you who are subscribers, I owe you four articles, I think.
03:18
I'm kind of behind and now I'm working on this. Whenever I get into a new topic where I kind of get buried in it, I need some time to sort of get through it and let it gel before I write it. Because I hate when I write it up and then I look back and I'm like, oh, that was wrong. So I think I'm through it now. But man, what's been going on is kind of, I think it's the greatest time ever. I almost feel like the stars have aligned for me a little bit. Because I've been working on China, Asia stuff for 17 years now.
03:46
And I've been doing the digital tech stuff for 12 years. And that's all sort of coming to bear now because now everything is transforming to this whole new thing. But I feel like I'm really well set up to take it apart. Like I feel like I got the toolkit that I spent a decade plus building, the China toolkit, the digital toolkit. And now we've got these major changes. And I actually feel like I'm in good shape. Maybe I'm not, but it kind of feels like I am.
04:13
Anyways, for those of you who've been following what's going on, there's press releases all the time. We've had a bunch of IPO prospectuses coming out. Unitree has come out. UBTech, another robot company. There's two China robot companies, and they filed. A couple of LLM foundation model companies. Minimax, they're mostly known for video generation, although they also do text to speech, kind of a lot of that actually.
04:40
That's kind of a big one that's come out. Zhipu AI, which everyone calls Z.ai, because Zhipu is hard to pronounce. They've filed, so we got their numbers. The semiconductor companies like Biren, which is one of the China semiconductor companies, we got their numbers. So there's all these IPO filings. I've been burning through them every morning at like 6 a.m. Finally got through all of them. I'll give you my summary of the key stuff there. But yeah, there's tons of information. And then if you've been following the news,
05:08
DeepSeek just released their newest model and that kind of sent shockwaves. One, because it's amazingly cheap compared to, you know, Anthropic and OpenAI, but also because it was trained, or at least it's designed, to run on Huawei Ascend chips. And that's literally the exact thing Jensen Huang at Nvidia was saying was going to be the big moment: when DeepSeek, actually he mentioned DeepSeek in particular, would design its models
05:36
not for Nvidia and CUDA, but for Chinese semiconductors. And that's basically what they have released. Now, they trained it on Nvidia to a large degree as far as I can tell, but in terms of running inference, the new DeepSeek V4 runs on Huawei's Ascend chips just as well as on Nvidia, in theory. So that's kind of a seismic moment in terms of the China tech stack. We'll talk about that. There's a whole lot going on. It's crazy. All right, let me just jump in. I'm going to go through
06:05
five things. I'm going to go through four in detail, and number five, which is about robots, is going to be another podcast, because that's just crazy. All right, number one. This one sort of freaks me out. I talked about this in the last podcast, that it's almost like there's a new operating model being built. If you look at any business, every way we talk about business, it's always, we count how many humans are involved. Or maybe we talk about how many factories or stores, but that's kind of,
06:34
the key operating units of a typical business. And that's why when one business is larger than another, it will have certain advantages. Well, it follows from the economics of those units. Machines and retail and certain things scale up quite nicely. Humans don't scale up nicely, which is why most local service businesses are quite small. They're human focused. Okay, but as I kind of said before, look, once you start getting agents in there,
07:04
which is different than AI. Humans using AI is one thing. We're talking about agents, these sort of semi or totally autonomous actors. Well, that's kind of a new economic unit, a new capability that you can build businesses on. So we kind of know what a business is. If we were talking about a business that was mostly humans, let's say barbershops, we kind of know what that means. It's mostly human, it's service-based.
07:32
It has certain economics. Certain things happen when it scales. There's certain advantages you can get, but not many. That would be sort of a human-centric operation. And we know what an equipment and machinery business, or let's call it a tangible fixed asset centric business, might look like. If we look at a big factory, we look at someone who makes something basic like tables, we're probably not counting the headcount of employees,
08:00
we're probably talking about the factories and the equipment as the main aspect of that business. And everyone knows what the economics of manufacturing-based businesses are. And they scale up pretty nicely, actually. Very different type of model. Well, then you have agents in there as well. And we don't quite know the cost structure of agents yet, but I've been digging into it. I think I'm getting closer. I'll talk about that in a second. Anyways, I read something not too long ago, and it was by McKinsey or BCG. It was some consulting firm.
08:30
And they basically said, look, the operating basics of the future, an operating entity, is going to be based on humans, agents, and robots. That's how we need to start thinking about businesses that do anything. Instead of a group of humans in a building, like a hospital, instead of, let's say, a factory with machinery, no, we're going to have enterprises. They're going to be made up of basically humans, agents, and robots. And each one of those types
08:59
has very different economics. And from that, they will have different capabilities, different things will happen when they scale, and different competitive advantages will fall out of that. I think that's probably right. I think that's what the future is going to look like. Now, robots are just a type of manufactured product. But they can be moving around the streets by themselves and doing things on their own.
09:24
So we could call that embodied intelligence: robots, things that are largely autonomous, not just humans using machines. So that's kind of my working thing. You know, I wrote about sort of the digital operating basics in my book. Then I expanded it to the digital plus AI agent operating basics. I'm thinking the future operating basics are going to be humans, agents, and robots. And certain businesses are going to be mostly humans. Other businesses are going to be mostly robots.
09:52
Some are going to be agents. We're going to see different emphasis. And each of those is going to have very different economics. That's my little working model. Now, when you start looking at robots, OK, I'll talk about that in point five. That's a year or so out. But this is kind of the year of agents. This is when agents are starting to be deployed. They're starting to scale up. People are struggling with it. But we're starting to see it. And as you start to
10:21
deploy these things at scale within businesses, predicting their capabilities is quite difficult. We keep getting surprised. Predicting the scalability is also fairly difficult. As I talked about last week, when you scale these things up, it turns out the problem is not the foundation model that underlies the agent. The problem is the data set and the context layer. The scaling becomes problematic. The data quality drops.
10:51
The context explodes, which is very difficult. And then the cost structure. Now it turns out the cost structure is actually probably the most predictable of those three things: capabilities, scaling, and costs. We can start to get our brain around that. All right, so that's kind of what I want to talk about. Let me talk about that cost structure a bit. And this is kind of the point for point one. Okay, if the operating basics very soon are going to be humans and agents,
11:20
I think that's 2026, 2027. And after that is going to be humans, agents, and robots. What does that look like in terms of cost structure? And this is where you get back to DeepSeek. Because once you go from AI to agents, things change dramatically. Even though agents use AI tools, they use foundation models to draw on, just like humans do, typing in their questions. But they behave very differently.
11:50
So if it's just a human using an AI, let's say we want to deploy ChatGPT to all the employees of a bank. So they're going to start using it as their daily tool. And maybe we're going to build in some workflows, but it's still mostly humans using this as a tool, which is no different than a human driving a tractor. You know, we wouldn't call that an equipment-first business. We'd say it's a human driving a tractor. So it's like that. So people call that the copilot.
12:20
I mean, literally Microsoft called it co-pilot. As I've said before, I don't like co-pilot, the idea that you're flying the plane, but you have someone next to you, your AI. I like the idea of the Jaeger from that movie Pacific Rim, where the people climb into these massive metal robots and you strap yourself in and then you run through the ocean. That's how I think about co-pilots. I don't want a co-pilot, I want a Jaeger. I want to have AI tools that give me crazy superpowers.
12:49
Okay, but the economics don't change that much. Once you move to agents, and suddenly the agent is doing tasks on its own with some governance and oversight, but a human is not in the loop, we start to see the activity level just go through the roof. The agent doesn't sleep, right? It goes 24/7. The agent doesn't operate episodically. Humans operate episodically. We ask a question, maybe we ask one or two
13:16
clarifying questions to get the response from the AI we want and then we're done. No, no, agents never stop. They are continually sending in prompts, gathering the information. They are continually using their sensors and other things to gather perception information, context information, feeding that into the LLM. And it's just this sea of information flowing back and forth that dwarfs human activities.
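To put rough numbers on that gap, here is a minimal Python sketch comparing episodic human use with an always-on agent. Every figure in it (sessions per day, calls per session, tokens per call) is a made-up assumption for illustration, not data from any provider or from the podcast, and the `UsagePattern` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UsagePattern:
    """Hypothetical description of how one user or agent hits a model."""
    sessions_per_day: int
    calls_per_session: int
    tokens_per_call: int

    def tokens_per_day(self) -> int:
        # Total daily tokens is just the product of the three rates.
        return self.sessions_per_day * self.calls_per_session * self.tokens_per_call


# Assumed numbers: a human asks a few questions in a handful of sessions.
human = UsagePattern(sessions_per_day=10, calls_per_session=3, tokens_per_call=1_500)

# Assumed numbers: an agent runs around the clock and sends large context each call.
agent = UsagePattern(sessions_per_day=24, calls_per_session=200, tokens_per_call=6_000)

ratio = agent.tokens_per_day() / human.tokens_per_day()
print(f"human: {human.tokens_per_day():,} tokens/day")
print(f"agent: {agent.tokens_per_day():,} tokens/day")
print(f"agent uses about {ratio:.0f}x the tokens of a human")
```

The exact multiple depends entirely on the assumed inputs. The point of the sketch is only that call frequency and context size multiply together.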
13:46
So the cost structure of the LLM, the cost structure of the foundation model, like a DeepSeek, is very different when agents are using it versus when humans are using it. That's the point. And this gets you back to sort of what I talked about last week, which is this idea of a context layer, which is kind of a fuzzy concept. The way I explained it last week was: if you have an LLM, a foundation model, it has certain data it can draw on.
14:16
Either it's RAG or you've uploaded it. You can give it lots and lots of data. And it can be unstructured data, video, images, or it can be structured data. But if you don't give it some degree of meaning, it doesn't know what to do with that data, because it's a machine. So that's kind of how I describe context: it's something that has to go hand in hand with the data set. You want a high quality data set, and you need the right amount of context. And then you put that into a model, and you're going to get a high quality answer.
14:45
And in fact, if you want to improve the quality of your answer, you can go to a more powerful model, you know, a billion parameters instead of 10 million. But if you improve the quality of the data set, that probably works just as well as increasing the scale. So that's the way I described it. Now, the truth is people describe this in different ways. In some sense, you could consider context anything that feeds into the LLM. So that could be the instruction set,
15:15
which usually sits on the GPU or close to it, and that just kind of gives you your basic rules. It can be the input you put in. It can be the KV cache, which maybe I'll go into. You can consider all of that context, because all of that feeds as input into the LLM. The LLM turns its wheels and it gives you an output. I don't really think about it that way, but that's often how people think about context. I sort of think about it like data sets versus context. Anyways, the point here is,
15:46
what DeepSeek and some of these others have been doing, as far as I can understand, and I think my understanding is good, but I'm obviously still not an expert at this, is this idea that as soon as you go from a human using AI to an agent using the foundation model, the context explodes in volume. Why? Because it will put in a prompt. Let's say it uploads
16:16
some content like a PDF. Here's a PDF, upload this. Here's my inquiry. It will immediately then ask another question and get a response, and ask another question and get a response. Well, usually, to save compute, instead of reloading everything as a new input, where you'd have to reload the PDF at every step, no, the PDF stays in the KV cache. That KV cache
16:43
explodes in volume as the agent keeps hitting it with questions and clarifications. That whole thing explodes. That doesn't really happen with humans as much. We might ask two, three, or four questions and then we start over. You know, this sort of standing memory that it's using for your line of inquiry, which could be KV cache. There's also some retrieval, you know, some RAG in there, but consider it mostly KV cache, which is either going to sit on the GPU or it's going to sit on some ancillary memory or
17:12
maybe a little further than that, but probably on the GPU. Okay, that all costs money. And that's a problem because these agents are continually sending in follow-up questions, question after question after question. And when you charge for these things, let's say you're a Claude model or a DeepSeek or something, if you are charging,
17:37
let's say, on a subscription basis, where people can just add things in and you get a set fee per month. Well, that works quite well for humans, because humans don't hit it with thousands and thousands of API calls. But when you start moving in this direction, you kind of have to shift to a usage-based subscription or usage-based model, because the agents keep peppering the foundation model with question after question after question. So how much does that cost?
18:07
It depends. If you are just doing things like a human, where every time you do another iteration you have to reload the PDF file and reload the images and then put in your next question, if you do all that on the front end as an input, that actually costs a lot. If you can push as much of that into the KV cache as possible,
18:30
you're not really going to get charged for that in the same way, because it's still sitting in the memory of the GPU. Now, you are going to get charged for it sitting in the memory of the GPU, but that's not the same as putting it into your input again. Okay, so that's kind of what DeepSeek has been doing. As far as I know, let me qualify that. My understanding is medium. There are a lot of people who will probably take apart what I'm saying, saying, nah, you're wrong here, but I think I'm directionally correct. So what these companies like DeepSeek have been doing is...
18:58
trying to drop the cost significantly for these sort of AI calls, especially as you move into agent-based inquiries, because the number of calls goes through the roof. And the main way they're doing that is trying to sort of expand the cache and drop the cost there, so that every time you ask a subsequent question, you may have to put in a little extra input like a clarifying question, but you don't have to reload everything every time. And I'm going to write more about this to take it apart, but
19:28
this sort of transition in, you know, one, how do businesses look when they're mostly agents versus mostly humans? Well, instead of a labor cost, you're going to have a compute cost as your main cost. Okay, what happens to that compute cost? That's the question I've just been talking about. Does it keep exploding in volume or are they going to find a way to sort of create longer term memory so that the model doesn't have to run?
19:58
on everything. Now, I guess I didn't explain that very well. The difference is between information you're putting in as a new input, you know, tell me what happened last week in the news, versus the uploaded document you've already put in. The uploaded document goes into the KV cache, which means it's already been tokenized and processed into the model's internal key and value vectors. It doesn't have to be processed again the same way a new input has to be tokenized and run through the model. So that's a little bit of the difference, but you're going to see the cost structure play out between those two numbers.
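One way to see how those two numbers could diverge is a small sketch of input-side cost over a multi-turn exchange about one uploaded document. It assumes a simplified pricing scheme in which tokens the model has already seen are billed at a discounted cached rate and only new tokens are billed at the full rate. The function name, the prices, and the token counts are all hypothetical, and real providers differ in how they bill and how long a cache lives.

```python
def conversation_input_cost(doc_tokens, turns, question_tokens, answer_tokens,
                            price_per_mtok, cached_price_per_mtok=None):
    """Input-side cost (in dollars) of a multi-turn exchange over one document.

    Prices are per million tokens. If no cached price is given, every turn
    re-sends the whole history at full price (the "reload everything" case).
    """
    if cached_price_per_mtok is None:
        cached_price_per_mtok = price_per_mtok  # no caching discount at all

    total = 0.0
    prefix = 0  # tokens the model has already processed in earlier turns
    for turn in range(turns):
        # The document is only new input on the first turn.
        new_tokens = question_tokens + (doc_tokens if turn == 0 else 0)
        total += prefix * cached_price_per_mtok / 1e6
        total += new_tokens * price_per_mtok / 1e6
        # The question and the model's answer both become part of the history.
        prefix += new_tokens + answer_tokens
    return total


# Assumed scenario: a 50,000-token PDF and 100 follow-up turns by an agent.
scenario = dict(doc_tokens=50_000, turns=100, question_tokens=100, answer_tokens=400)

no_cache = conversation_input_cost(**scenario, price_per_mtok=1.00)
with_cache = conversation_input_cost(**scenario, price_per_mtok=1.00,
                                     cached_price_per_mtok=0.10)

print(f"re-sending everything: ${no_cache:.4f}")
print(f"with cached prefix:    ${with_cache:.4f}")
```

With a human asking three or four questions the difference is small. With an agent running a hundred turns, almost all of the billed input is repeated history, so the cached rate dominates the total.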
20:27
And that's what I'm trying to get my brain around, like what happens. Anyways, that was not a very good explanation, but I think you can kind of see the direction. From humans to humans and agents and robots. Once we get into agents, the main cost structure is going to be based on token use. There's actually a new business unit that was just launched at, is it Alibaba? I think it's Alibaba. A token, literally the name of the division is Token Business Unit. Like this is a business unit based on the usage of tokens.
20:56
So your cost structure is going to follow from tokens. And that's just for basic agents. And then you can take it to the next level. What happens when the agents start working as teams? Multi-agent collaboration. It's going to surge again. So you can kind of see the cost structure is going to play out in that. And that's what I'm trying to figure out. Okay, that was point one. That was not really a great explanation on my part. I'll write that up and try and do a better job of it. All right, let's get to something simpler. Point number two.
21:25
Semiconductors coming out of China. This has been all over the news. You know, everything in China was based on Nvidia. GPUs, high-end GPUs, that was Nvidia. It was one of their biggest markets. They were unstoppable. And then the White House basically broke their monopoly by turning access to high-end Western GPUs into a political weapon to cut off China's supply chain, which worked for a couple of years, kind of.
21:55
But it convinced everyone in China they needed their own chips, which they've been working on very aggressively for five years. And Huawei has pretty much landed at this point. If we had gone back a year and a half ago, most of the high-end GPUs in China would have been Nvidia, 80, 90 plus percent of the market. Last year, it was 40%, 45% Huawei. Now, there's chips moving in sort of black markets in various ways coming from the West,
22:25
which you can read about. That's kind of transitional stuff. I don't really care about that. You know, people say this year, Huawei's going to go up to 60%. And then, you know, I've been doing these articles on Huawei's CloudMatrix and how they're sort of matching Nvidia's performance, somewhat, by using a lot more chips and then just stitching them together with a super fast interconnect. Okay, fine. But basically...
22:52
This competition plays out at two levels. Number one is what I just talked about. Can you make the chips? Okay, China's mostly at seven nanometers, the US three to four nanometers, but now that those chips aren't really allowed in China, and even if the US reverses itself, which it did a little bit, the government of China has basically said no. We're not buying any Nvidia chips anymore. Everyone's using domestic chips. So Huawei is going from
23:20
a small market share to 50 plus percent. Give it a couple years, they might be 60, 70% of the China market. Now, outside of that, this was a big deal. That's the chip level. The other level to think about is ecosystems. It's not just about the chips, it's: are the developers writing for your architecture? And that's really been one of Nvidia's biggest moats, the CUDA software system. Well,
23:48
Huawei's got their own and DeepSeek, this is why this was a big deal this last week. DeepSeek's V4 runs on the Huawei Ascend chips, the 950, I think it's the RT. They have the DT and the 950 RT, the one that's more focused on inference. So that was kind of the moment that Jensen Huang was talking about in the news a couple weeks ago. Watch out for this, when these major Chinese LLMs start building their systems around Chinese semiconductors, watch out.
24:18
And I think he even named DeepSeek specifically. Well, that kind of happened last week. So anyways, that's where we are with that. If you want to know more about that, Biren, B-I-R-E-N, they've gone public. They're about 4 to 5% market share for China chips. They're a standalone. There's a pretty good breakdown in there of the state of play in China. They're actually quite profitable, which is impressive, but their market share is quite small.
24:48
So anyways, if you want to know more about China's semiconductor business, you can now sort of... Huawei I've written about. I suspect I've written more about that than probably anybody. So for Huawei, I'm probably the source, but Biren, you can look at that. You'll get a decent read on the semiconductor business. Now, if you're running a business, that's probably not relevant, but if you're an investor, yeah, maybe take a look. Okay, number three. Foundation models. Yeah. This is kind of...
25:18
All over the Twittersphere over the last week, there's been a big fight about open source versus proprietary, because China is kind of the king of low-cost, open source, downloadable foundation models. I mean, they're kind of the king. There's a couple in the US, but that's really been how these companies in China have positioned themselves.
25:45
There's a rumor going around, which I tried to verify, and I'm not sure if it's true or not, that 80% of Silicon Valley startups are using Chinese foundation models because they cost nothing. You can download them. They're open source, open weight. That's not always what people think it means. It's a bit of a spectrum. You can be more open source or less open source. But generally speaking, that's what they're doing. Now, that's been freaking people out because they feel like...
26:14
It's almost like a year ago, when there was this sort of political push out of the US that China is engaging in overcapacity. This was a made-up economics term. I'd never heard this before, but all the US politicians were like, China's doing overcapacity, which I guess means producing more than you need and dumping it on the market. And then it kind of disappeared from the language, but it was a big deal. Well, this adversarial open source, this low-cost open source,
26:42
this is being characterized the same way, which is, you know, this is some sort of political maneuver to crush Silicon Valley profits. I don't think that's true at all, but that's what people say. If you listen to news in the US, that's what you hear. I actually think it was just a matter of that was how they decided to differentiate. And actually, they weren't even doing this. If you had talked to Baidu, Alibaba, a year and a half ago,
27:11
they were not doing open source. They were doing proprietary, some open source, but mostly proprietary, especially Baidu. It was DeepSeek about a year ago when they came out with their super low cost open source model. They kind of forced everyone else to do the same. It was only after that moment that Alibaba and then Tencent, they started to open source some of their models and even Baidu eventually.
27:37
So it wasn't some grand political move. It was kind of a response to a hyper competitive market. And DeepSeek to its credit has a sort of noble approach as far as I can tell to this. Like Elon Musk doesn't patent things. He just invents them and then he releases them and he doesn't do this patent protection stuff. Too much, a little bit, but mostly no. DeepSeek kind of seems the same as far as I can tell. They create these new models.
28:06
and then they write very detailed papers about how they did it and they release them. I think that's probably a big part of the reason they made their stuff open source. One, if you make it open source, people will use it and trust it, because if you're from China, maybe they're not going to trust you as much. And you want developers all over the world. So you make it open source and you make it downloadable. Huawei makes a lot of its stuff open source too, for the same reason probably.
28:34
So that's half of it. The other half, I think they are just leaning into low cost, which is the standard go-to China strategy since 1992. Manufactured products, services, Shein products, they always lead with low price. That's how they get adoption. So I think that's just kind of a standard strategy for China.
28:59
Now, if you look at some of these other companies that are public. DeepSeek, we don't know, I'm kind of guessing. But if you look at, like, Zhipu AI, Z.ai, yeah, it's pretty great. If you compare Zhipu to, say, DeepSeek: DeepSeek is the king of low-cost, open source, downloadable AI in this world. They're the king. Zhipu is more sort of not as cheap, but it's still open source.
29:27
It's all good. It has three to four different types of AI foundation models it can use. If you go through the financials, which is useful because you can see the cost structure for the first time in a lot of these things, it's actually better than I expected. I thought the gross profit was going to be much lower, but it's about 50 percent. Don't hold me to that. But gross profits were about 50 percent. But
29:56
that 50% gross profit has a lot to do with models that are being used on-prem that are secure and customizable, as opposed to things that are accessed through the cloud. That's more where they're focused. When you look at their cost structure underneath that, I thought the compute was going to be higher, but it was about 37% of revenue. Then the other was labor and things like that.
30:26
That's interesting. Now, their revenue is only about 50, 60 million dollars. So, this is quite small. But yeah, that's an interesting look at the economics of these things. Now, that's the operating cost. If you look at the R&D spend, well, then it just blows it out of the water. R&D spend is like 800% of revenue. It's crazy. So, I'm not paying too much attention to that, because everyone's building like crazy. Fine. But I'm trying to figure out the cost structure for these things. So that was kind of interesting.
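The percentages quoted above can be laid out as simple arithmetic. This is a sketch using the rough figures as spoken (revenue of about 50 to 60 million dollars, about 50% gross profit, compute at about 37% of revenue, R&D at about 800% of revenue), not numbers taken from the filing. It also assumes compute sits entirely inside cost of revenue, which is one reading of the remark, and uses 55 million as a hypothetical midpoint.

```python
# All inputs are the approximate spoken figures, not audited numbers.
revenue_musd = 55.0      # assumed midpoint of "50, 60 million dollars"
gross_margin = 0.50      # "about 50 percent" gross profit
compute_share = 0.37     # compute at "about 37% of revenue"
rnd_multiple = 8.0       # R&D at "like 800% of revenue"

# Assumption: compute is part of cost of revenue, so the rest of that
# cost is "labor and things like that".
cost_of_revenue = revenue_musd * (1 - gross_margin)
compute_cost = revenue_musd * compute_share
other_cost_of_revenue = cost_of_revenue - compute_cost
gross_profit = revenue_musd - cost_of_revenue
rnd_spend = revenue_musd * rnd_multiple

print(f"cost of revenue:        ${cost_of_revenue:.2f}M")
print(f"  of which compute:     ${compute_cost:.2f}M")
print(f"  of which other:       ${other_cost_of_revenue:.2f}M")
print(f"gross profit:           ${gross_profit:.2f}M")
print(f"R&D spend:              ${rnd_spend:.2f}M")
print(f"gross profit minus R&D: ${gross_profit - rnd_spend:.2f}M")
```

Under these assumptions compute is roughly three quarters of cost of revenue, and the R&D line is many times larger than the gross profit it would have to be funded from.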
30:55
You can compare Zhipu to, say, Kimi. People talk about Kimi a lot. That's kind of the long context king. If DeepSeek is the low-cost king, Kimi is sort of the long context king. So, if you want to upload lots of documents and have it go through them and do lots of reasoning chains, that's probably Kimi. You can look at the other majors: Qwen, Hunyuan, Baidu's Ernie.
31:25
The other sector I think is worth keeping an eye on in China is the video generators. Because, you know, four to five of the top video generators are Chinese, which is really interesting. So Kling, which is part of Kuaishou; I use Kling every day. It's really cool. Seedance, which everyone's using. Well, that's ByteDance. And you can look at MiniMax, whose filing you can now read. Very interesting stuff, actually.
31:53
What I like about MiniMax, what I like about video generation, is there's a really clear business model. Everyone who's making content, every group creating advertisements, which is a very well-known market, you can sell them subscriptions on day one. So, you've got a pretty good subscription and API revenue stream almost immediately. And MiniMax's video generator, for those who aren't familiar, is Hailuo. They also have text to speech,
32:23
which is interesting, called Talkie. But they've basically got three models. They have the M series, which is their general model, which is what they're going to use for agents. They have Hailuo, which is video, and then they have their speech and music one, which is Talkie. 27 million monthly active users, about 1.7 million of them paying. Okay, that's about what you'd expect: five, six percent, something like that. What I like about MiniMax, and I would encourage you to read this one.
32:50
They have a very good summary of what they're doing and it really matches, I think, the China strategy, which is: we're going to focus globally from day one. This is not going to be a China company. It's going to be a global company, which is what you see from these companies. It's going to be open source. Now, in the case of MiniMax, they're open source in 100-plus countries now, like DeepSeek and like Qwen. They're going to focus on having a very long context window.
33:19
This is the idea that we've got to shift things into the KV cache. And this is the part I think is not talked about. Okay, we're going to be very, very low cost. Fine. How are they doing it? Most of the ones I look into immediately talk about doing a mixture-of-experts model. And this is exactly what DeepSeek V4, which was just released, talked about doing. You build a huge model with a mixture of experts.
33:48
There are lots of different types of things in there. And then you allocate part of the model for your specific question, but you don't ramp up a billion parameters for every question. You allocate. The other thing they talk about is their hybrid attention mechanism. And these are kind of the two things I'm trying to understand: how do DeepSeek and MiniMax come in so cheap?
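The "allocate part of the model" idea can be sketched as top-k routing, which is the usual way mixture-of-experts layers are described. This is a toy illustration in plain Python, not DeepSeek's or MiniMax's actual implementation: the expert count, the value of k, and the random router scores are all made up, and in a real model the router scores come from a learned layer applied to each token.

```python
import math
import random


def softmax(scores):
    """Turn raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    weight_sum = sum(probs[i] for i in chosen)
    return {i: probs[i] / weight_sum for i in chosen}


def moe_forward(x, experts, router_scores, k):
    """Run only the selected experts and mix their outputs."""
    gates = route_top_k(router_scores, k)
    return sum(weight * experts[i](x) for i, weight in gates.items())


# Toy setup (all sizes are hypothetical): 8 experts, 2 active per token.
random.seed(0)
NUM_EXPERTS, TOP_K = 8, 2
experts = [
    (lambda x, scale=random.uniform(0.5, 1.5): scale * x)
    for _ in range(NUM_EXPERTS)
]
router_scores = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

output = moe_forward(1.0, experts, router_scores, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS
print(f"output={output:.3f}, experts used per token: {active_fraction:.0%}")
```

The point for cost is that only the chosen experts run for a given token, so compute per token scales with k rather than with the total number of experts, even though the full model is much larger.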
34:13
And it's the mixture of experts; it's the hybrid attention mechanism. That's what I'm trying to figure out. If you have any good sources on that, please send them my way, because I'm trying to figure out how they're doing this cost-wise. And I don't think I really have it yet. OK, let me get to number four. And this is really, I think, the most important one. This is the year of agents, basically.
34:35
It's playing out in operations; it's transformative. You hear these stories of people building companies with 10 agents and no employees. Yeah, I think this stuff's all real. But it's early stage, it's problematic, it kind of works, it kind of doesn't. But yeah, now within the idea of agents, okay, the Chinese government just canceled the acquisition of Manus by Meta. Well, that was interesting. That's in the agent category, I guess.
35:04
Also in the news is OpenClaw, which has just gone crazy in China. I mean, it is just OpenClaw crazy. Tencent in particular, a lot of OpenClaw stuff happening there. I'm going to be writing about that. Now, getting into agents, I mean, it kind of blows my mind, to tell you the truth. Like, I think it's so transformative, it's hard to get your brain around. But, you know, there's a good report, Huawei's Intelligent World 2035,
35:33
where they kind of predict what's going to happen in the future. It's a pretty good one. It's downloadable if you want. Just look up Huawei 2035. They argue basically that this is a whole new computing era: that it was the PC era, then it went to the mobile smartphone era. Now they call it the agentic era, where it's just going to change. I mean, it's hard to predict. I mean, we've heard Elon Musk say that, like, if you're just in a... If this is your primary user interface where you're just talking with your agents,
36:04
Why do I need an operating system? Why do I need to open apps at all? Why do I need any of that? And maybe you don't. Maybe you do. Maybe we embed agents within WeChat, and that's where we access them. So, there's this idea of AI and really agents becoming the primary user interface. That's a big deal. It's this idea that maybe we're going from an app-centric ecosystem to an agent-centric ecosystem.
36:34
Instead of having your 10 favorite apps or your 20 favorite apps that you use all the time, you just have eight different agents you use all the time. And that's mostly how the ecosystem works.
36:47
You can think about workflows, you know, this idea of agent workflows, which I've talked about before. I broke it down into four levels and four stages. You can look that up on the website if you're curious. But okay, if you have agents doing entire workflows or parts of workflows, that's a very different way of working. And once you get one agent, you start to talk about multi-agent collaboration, multi-agent teams.
37:14
You start to get into ideas like swarm intelligence. If I want to sell a product and I've basically built it around agents, could I have a thousand agents just out there selling all the time? Why not? Is intelligence one really smart LLM, or is it a thousand agents all using smaller DeepSeek-like models? But I've got a thousand of them. Swarm intelligence. Ants are very smart, but each ant is kind of dumb. Interesting.
37:44
And then you get to this idea of human agent collaborations and what that's going to mean. So, the whole thing, it just turns everything upside down. I've taken it apart a bit within e-commerce. I talked about agents, brokers and concierges and how marketplaces are going to change, which I think is totally true. I think basic commerce is going to change. I think the attention economy is going to be front and center.
38:14
Because if your mobile network and the global networks of the world carry 8 billion humans and 800 billion agents, trying to get your product or service in front of a human being's eyeballs is going to be very difficult. You're going to have to get through their sea of agents that represent them. That's really interesting. So, let's say you've got a cool website and you're getting a lot of traffic on SEO.
38:40
and people are coming to your website and you're monetizing by advertising. Does that even work in a world where 80 to 90% of the traffic is not humans but agents? Advertising to agents, is that even a thing?
38:57
Now, I would argue no. I think BCG said it pretty well: look, advertising as a business model may not work very well, because reaching human eyeballs is going to be very difficult for a lot of people. So maybe, if you're dealing with agents most of the time as a business, what you need is a value exchange. You need to be thinking not eyeballs on a video, but API and token calls going in one direction and payments going back in the other direction.
39:27
Maybe that's what a lot of this is going to be. And then, of course, as you start to move into agents and you have billions of them, you get to this idea of having long context windows and lifelong memory, which is kind of what I was talking about earlier. That becomes very important because that's how these agents interact with each other and with other things. So that whole world is just crazy. And I don't think we know enough about what it looks like, but I've been focusing on it in e-commerce.
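One way to see why long context windows tie back to cost is the size of the KV cache, which in standard attention grows linearly with the number of tokens held in context. The sketch below uses the common back-of-envelope formula (keys plus values, per layer, per KV head). The model shape is hypothetical and not any specific vendor's model, and architectures that compress or share the cache would come in lower than this.

```python
def kv_cache_bytes(seq_len, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Memory for the cached keys and values of one sequence in standard attention."""
    # Factor of 2: one tensor for keys and one for values at every layer.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, chosen only for illustration (16-bit values).
config = dict(num_layers=32, num_kv_heads=8, head_dim=128)

for tokens in (8_000, 128_000, 1_000_000):
    gib = kv_cache_bytes(tokens, **config) / 2**30
    print(f"{tokens:>9,} tokens -> {gib:6.2f} GiB of KV cache per sequence")
```

Because the cache is per sequence, a world with very many long-lived agents multiplies this figure by the number of concurrent sessions, which is one reason context length and cost structure are hard to separate.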
39:58
Yeah, and it freaks me out. I think it's a sea change. I think omnichannel e-commerce strategy is going to be in big trouble. So that's kind of where I am on that. So that's number four on the list. Number five, and I'm not going to go through this one, but I'll write an article about it, is basically robots. I want to talk about UBTECH and Unitree and what's happening in physical AI, which I don't think is 2026, but let's say 2027.
40:26
This is the year of agents. Next year is probably the year of physical AI, and robots with embodied intelligence pretty soon after that. So that's where we are. So that's kind of my five: what matters in tech right now. Number one, this entirely new operating model that's going to be humans, agents, and robots. And the cost structure for agents is a big question. That's why I'm kind of trying to take apart that
40:53
middle ground of KV cache and lifelong memory, and how big of a context window you need, and how that cost structure changes when agents take over instead of humans. I don't know, but I'm trying to figure it out. Number two: semiconductors out of China. Kind of a political question. But if you're interested in seeing the numbers, take a look at Biren's IPO. Number three: foundation models, LLMs. Very cool what's happening. I think the China model is
41:22
low cost, open source, globally focused, with a big focus on long context windows. I think that's very interesting. If you're interested, look at the video generation models like Kling and MiniMax. If you want to see the others, look at DeepSeek or Zhipu AI. All very interesting. Number four, then, is just agents. And get ready for an agent-centric world. That's kind of where I'm going to focus most of my attention.
41:52
Anyways, that's it for the content for today. I know that it was kind of pretty scattered. You can tell I'm sort of struggling with a lot of this stuff, trying to get it into something digestible and usable by businesses. And I'm not there yet, but I think I've got my eye on the right target. Like, I think I'm looking at the right thing. I just don't think I've sort of figured it out and got it in a nice, usable, digestible form yet. But I don't know. I'm getting there. Give me a couple of weeks and maybe I'll get closer. So that's where I am. But yeah, it's...
42:22
It's exciting. Really, this might be the most exciting time professionally of my life, probably. I can't believe how fast things are changing and how sweeping the changes are. Like, even Silicon Valley people, the big Marc Andreessens and all of them, you know, they've seen everything. They use the same type of language. They say, this is the biggest thing we've ever seen. Like, this might be it. And we've all got sort of front row seats, and that part's amazing,
42:51
but we're all getting kind of disrupted and replaced as well. You know, I wear a consulting hat, I wear a professor hat, I'm a knowledge worker. Well, that's nice. I'm a knowledge worker and a service worker. You know, the knowledge part is definitely ground zero. The service side is actually doing quite well. But yeah, we're all sort of getting disrupted, but at the same time, it's also kind of really exciting. So I don't know what you call that. There should be a word for that. Anyways, that is it for me.
43:21
I hope you're all doing well. If you have any feedback on the Shanghai AI conference, let me know. I'd really appreciate it. Oh, last news: one of the reasons I've been kind of quiet is I was finishing up my book. Book number one is done, and I'm just going to do it as a free download, as a PDF. I was really kind of mulling this over. I used to sell these things on Amazon. But book number one of seven, that has all the main thinking, that has all the main frameworks. That's the summary of everything.
43:50
I think I'm just going to put it up as a free download. Yeah, so anyways, I'll try and send that out as soon as I can, in a week or two, something like that. But yeah, I think that's right. I guess now that I've said it, I guess I can't go back. So OK, I'm doing that then. Yeah, I think it's good. I feel good about it. Like usually, now that's not actually awesome because usually the books I like don't do well and the books I don't like, they sell much better. I don't know what that's about, but I actually feel like this is my best work ever.
44:20
So I feel good about it. We'll see. Anyways, who knows? That is it for me. I hope everyone is doing well. Oh, I have a... Maybe I gave this. I have a TV recommendation. Maybe I did this one already. If you haven't watched Dune Prophecy on HBO Max, that is amazing. It's my favorite show in the last year, because I love the Dune books, Frank Herbert, and I've read them all a bunch of times, and I love the two recent movies, except for Zendaya. I don't like her.
44:48
But then HBO did a series called Dune Prophecy and it was amazing. Like it was as good as the movies. If you haven't seen that and you like that sort of science fiction, go check it out. I'm definitely going to watch it again. Anyways, that's it for me and I will talk to you next week. Bye bye.