AIAW Podcast

E132 - The AI Update: The Latest News and Trends Uncovered

Hyperight Season 9 Episode 1

In our opening Season 9 episode, we're diving deep into "The AI Update: The Latest News and Trends." Join us as we sit down with three industry leaders who are at the forefront of AI innovation:
Luka Crnkovic-Friis, Head of AI/ML at King; Jesper Fredriksson, AI Engineer Lead at Volvo Cars; and Salla Franzén, Investment Manager at Navigare Ventures AB. In this insightful discussion, we explore the latest developments and trends shaping the future of artificial intelligence. From the hype cycle of AI to the battle between open and closed source AI, and the future outlook of AI (utopian or dystopian?), this episode is packed with valuable insights.

Follow us on YouTube: https://www.youtube.com/@aiawpodcast

Speaker 1:

And now he's coding. I didn't do much. What was the application?

Speaker 2:

I didn't really understand what it was.

Speaker 1:

Now we're going to steal the idea. So he has an idea of making a marketplace for kids who want to teach elderly people things, because that's what he did for his summer job. So there's some kind of marketplace, which started out with him teaching elderly people about tech, and then it evolved into something more, like maybe gardening, maybe something else. When you say elderly, do you mean people like us, or everyone beyond 30?

Speaker 1:

Probably, yeah, probably a little bit older. Okay, okay. So they need to have some front end, and you fix the back end with some database or something and some APIs. So now he's with his mom this week, and I'm calling in sometimes and saying, well, do you need help with anything? No, I got this; me and Claude, we can do this. And does he use Cursor as well to develop? That's the next step.

Speaker 1:

I'm evaluating Cursor myself. I think it takes a little bit of getting to know it. I think he could be ready for it soon. But you have to think a little bit differently.

Speaker 5:

So the angle here is that traditional programming, without any support, is not really interesting.

Speaker 1:

Having a good idea of an app, and then being AI-oriented. He knows what he wants, and he's never been interested in coding, but it's a means to an end, and this is his way to get to what he wants. Could you elaborate a little bit around the toolbox you gave him and what is actually his?

Speaker 5:

How's he working, right?

Speaker 1:

So I set up VS Code for him, and GitHub, and Flutter is the technology he's using, and then he's just prompting Claude to implement this feature. Sometimes he maybe gives an image of what he wants in the front end, and then Claude... And what is Claude? Claude is a competitor to OpenAI, or rather to ChatGPT, by the company Anthropic. So it's Anthropic's Claude, yes, and this is currently the best coding LLM out there.
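For listeners who want to try the same setup, here is a minimal sketch of prompting Claude for code through Anthropic's Python SDK. The model name and the Flutter prompt are illustrative assumptions; use whichever current Claude model you have access to.

```python
# Minimal sketch: asking Claude to implement a feature, as described above.
# Assumes `pip install anthropic` and ANTHROPIC_API_KEY set in the environment.
import anthropic

client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY automatically

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # illustrative; pick a current Claude model
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": "Implement a Flutter widget that lists marketplace offers "
                   "with a title, a price, and a 'book a lesson' button.",
    }],
)

# The generated code comes back as text content blocks.
print(response.content[0].text)
```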

Speaker 5:

Oh, this is interesting. As a matter of fact, why did you give him Claude?

Speaker 1:

Because it's better. I tried it myself. I moved from OpenAI to Anthropic.

Speaker 5:

Define better, or define on what criteria you found Claude better.

Speaker 1:

For me it was... I started to code the project and then I needed help with how to structure it, because I evolved recently from a data scientist to more of an engineer, and I realized that my code became spaghetti. And then I asked Claude, and very quickly it taught me about separation of concerns, something I never got from OpenAI. And then it's like, wow, this is really one step beyond.

Speaker 2:

I agree. I think Claude is much better for coding as well. Would you agree, Luka?

Speaker 5:

No, oh, that is interesting. Let's start here.

Speaker 3:

Claude is more pleasant to talk to. GPT-4o can be a bit verbose, but when it comes to sort of raw reasoning... But for coding? For coding, yeah.

Speaker 2:

For coding. Claude is better for coding.

Speaker 5:

Yes. No, no, GPT-4o, definitely. And what are you basing that on? I use both, essentially. You use both and you throw tasks at them, or you have had problems and you find one more to the point.

Speaker 3:

And we also did some quantitative experiments, where we measured the productivity increase, how quickly, and the percentage of problems people were able to solve from start to finish.

Speaker 5:

And if any one of our listeners would like to, you know, form their own opinion on this, what would you recommend as the short list of four or five to try out now? We've talked about Anthropic with Claude. We've talked about ChatGPT-4o, yes, and what else?

Speaker 3:

There are those coding... I mean, if you want state of the art, Gemini is behind a bit, right? Nicely behind. But I will say Grok 2 is actually strong for coding.

Speaker 5:

It's not really there.

Speaker 1:

It's not a frontier model. My go-to for evaluating coding is... there's an agent framework called Aider, and they have a leaderboard for the best LLMs to use with Aider, and there Claude is at the top. GPT-4o is not far behind, but they also have DeepSeek Coder, which is interesting. It's open source. I guess it's based on Llama, but I'm not sure. But that seems to be also very good for coding.

Speaker 4:

I think, in general, it's a really interesting question: how do you benchmark

Speaker 5:

the models.

Speaker 4:

And I mean, if I think about the evolution of deep learning in the beginning, when we had MNIST and the different benchmarks that were put out there: yes, it increased creativity in some sense. But then, I mean, I've spoken to a lot of researchers who got really frustrated that it was just about trying to get some parameter a little bit better, because then you could get published in ICML. So in some sense I'm kind of happy that we don't have the benchmarks, but at the same time it gives rise to these sort of academic, emotional discussions about which one is better and why you prefer one, maybe. But also, the standard benchmarks that are used are not the least bit representative of real-world coding.

Speaker 5:

No, because now we get to the next topic. You know, define the cycle of coding. Where does it start and where does it end, and what do you really want help with? Because, if I take the example that you want to structure code, this is part of the design step, okay, not the actual coding step, right? So I think, horses for courses: it could be that some are good at reasoning or giving you hints on how you should think, whilst others are hardcore coding. So I think the whole discussion becomes, you need to go one step deeper, I guess.

Speaker 3:

And I would say, for discussions about something, I definitely prefer Claude to the things that have been released publicly from OpenAI.

Speaker 5:

Because it's too verbose.

Speaker 3:

It's too verbose. But one typical problem with these systems is that they're getting this increased context but a shorter and shorter memory. The way that these models work, which are attention-based transformers, is essentially that they focus on different parts of that context, and the longer it is, the more it spreads its attention. And there have been benchmarks for this, like needle in the haystack: can you find something specific? But that's like a very wrong test. The problem comes when you have nearly identical things covering the entire context, like code you're repeating, sort of iterating. Then having it really focus on the right thing, and not go off on a tangent on something that you discussed before, or something like that.
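To make the needle-in-a-haystack idea concrete, here is a toy sketch of such a probe. `ask_llm(prompt)` is a hypothetical stand-in for whatever chat-completion call you use, and the filler text is deliberately near-identical to illustrate the hard case being described.

```python
# Toy needle-in-a-haystack probe. `ask_llm(prompt) -> str` is a hypothetical
# stand-in for your chat-completion call; everything else is illustrative.
import random

def build_haystack(n_lines: int, needle: str, needle_pos: int) -> str:
    # Near-identical filler lines: the hard case, since the model's
    # attention gets spread across lookalikes in a long context.
    lines = [f"Record {i}: nothing notable happened today." for i in range(n_lines)]
    lines[needle_pos] = needle
    return "\n".join(lines)

needle = "Record 4711: the secret code word is PINEAPPLE."
context = build_haystack(2000, needle, random.randrange(2000))

prompt = context + "\n\nQuestion: what is the secret code word? Answer with one word."
answer = ask_llm(prompt)  # hypothetical call
print("found" if "pineapple" in answer.lower() else "missed")
```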

Speaker 1:

That's, I think, interesting. Another interesting benchmark, just to dwell on that, is SWE-bench, where they have GitHub issues, and it's meant for agents. So you're supposed to take a GitHub issue, produce a PR, merge it and see if it works.

Speaker 2:

But isn't that a good direction for benchmarks, you know, going from really, really small prediction-based benchmarks to something where you have a much more complicated task to do, like solving an issue in a GitHub repo?

Speaker 1:

Yeah, that's pretty cool. So state of the art is now at 30% on this, which says something about where we are in AI for coding.

Speaker 2:

How much do you think you personally would score?

Speaker 1:

I don't know, I wouldn't dare guess. This is all different programming languages, so I wouldn't score anywhere near that.

Speaker 2:

Just getting back and closing your story, perhaps, about your kid getting into coding without having coded anything before, now getting super excited and engaged about building his idea together with you, but still doing his own thing with AI. I think one interesting note on this is what Sam Altman and others have been speaking about: there will be a point when we'll have the first single-person unicorn company.

Speaker 2:

A single person that builds the whole system and actually becomes a unicorn company. I think when Instagram was bought, there were like 16 people, and it was a unicorn company or something. That was a very small number of people. WhatsApp the same way, yeah, right.

Speaker 4:

And I think, I mean, in general, I'm always fascinated by the argument that you can just throw more people at a problem and it'll be solved better.

Speaker 4:

Yeah, I think, I mean... I worked for IKEA for two years, and they really wanted to rebuild the web completely, and it actually ended up being a team of six people that built the foundation that everything is built on now, and it's fantastic quality. And when I spoke to some of the fantastic people in that team, they said that one of the keys was that they weren't too many people.

Speaker 2:

Right. Throwing people at a problem may not always be the right solution. That's super interesting.

Speaker 3:

Speaking of Anthropic, there was an article, I think yesterday, that they published, where they detailed their process of building a core feature of Claude, their web app, called Artifacts, which generates a document on the side rather than having it in the chat. It does it on the side, can run JavaScript and so on. And that was done by a tiny team, like three, four people, over two months or something, two, three months, using AI, of course. It's just a prompt, right? Yeah, yeah, it's just prompting to get the artifacts up.

Speaker 3:

There is more to it, and I don't have a great answer to this, because I've wasted a lot of time when coding just doing what the AI tells me, and then, no, this doesn't work, try again, and sort of iterating. At some point, when it gets stuck on something, you can end up wasting a lot of time if you don't know where to pull out.

Speaker 2:

Awesome. Should we do the introductions? So, very welcome everyone. We have the first podcast of this season and it's going to be very improvised, as usual, like a summer recap. So with us today we have three experts in AI. We're just going to have a panel, basically discussing what we think are the most interesting highlights of the summer, and perhaps the year, in AI. But before we do that, let's go around the table and do a quick introduction. And, Luka, do you want to start? Can you give a very brief introduction to who you are?

Speaker 3:

Sure, my name is Luka Crnkovic-Friis and I'm the head of AI and ML at King, the makers of Candy Crush. King is also now part of Microsoft. Right.

Speaker 2:

You're previously a founder of Peltarion and, for disclosure, I worked with you as well before. It was a great pleasure, and I consider you one of the top AI experts that I know.

Speaker 3:

That's a great honor, thank you.

Speaker 5:

And I need to give a shout-out: when the whole craziness of generative AI started, you were the one who, in the whole media buzz globally almost, tried this, tried that, and made very, very concrete comments and reflections on why things work. You're sharing quite a bit around that experience on, for instance, LinkedIn, which I found really refreshing: someone who is not just talking about it but hardcore experimenting with it and finding its limitations, and you were doing that really early on. So thank you for that. And actually building applications on neural networks before they were popular and everything.

Speaker 5:

So it's impressive. Yeah, because you came into the whole topic knowing, from Peltarion and everything, how we used to do it, so to speak, and how you can now push it. So I found that really good.

Speaker 3:

I got lucky in sort of choosing the right area. The right area, yeah you could say that.

Speaker 4:

Salla, please introduce yourself. I'm not sure it was luck. Yeah, my name is Salla Franzén. I'm currently an investment manager at Navigare Ventures. I have a background in theoretical math and have worked in finance and furniture, building AI and machine learning solutions.

Speaker 2:

Finance and furniture, IKEA, I guess.

Speaker 5:

Yeah, the first time you were here, I think it was almost the first season, you were chief data scientist at SEB. Yeah, yeah. So you worked a lot with these topics in the financial sector, in many different things, on many angles, from research to very, very practical building of teams, everything like this. And then you decided furniture over funding, right, and then you moved to Amsterdam for a while, right?

Speaker 5:

Yeah, so you had that whole experience.

Speaker 4:

Yeah, and I learned how to build data and insights and AI systems for e-commerce, which is very new for me.

Speaker 5:

I never got this: when you were at IKEA, how did you frame your domain or your scope in terms of analytics and AI and data?

Speaker 4:

Yeah, so, well, in the beginning I was supposed to be just sort of leading a team building price recommendations, but quite quickly I became responsible for data analytics on the web and the app and kind of everything customer-facing in the stores.

Speaker 5:

Okay, so the customer-facing domains. Yeah, the fun part, the one you can show to people: this is what my team built.

Speaker 4:

Isn't it cool.

Speaker 2:

You're a super senior data scientist, or AI expert I would say, you've been part of so many national initiatives, and we met each other so many times during the AI agenda and AI Sweden and so many other things, and you're a public speaker and everything. So it's a pleasure to have you here as well.

Speaker 5:

And the domains you're looking after now, what are they?

Speaker 4:

Yeah, so now, Navigare Ventures invests in science-based innovation, which is super fun. I get to meet really excellent researchers around Sweden and try to identify if what they're doing is relevant for risk capital, and whether we like each other and how we can support them, really. And so I focus on everything related to AI, quantum (not just quantum technologies, but quantum physics and mechanics and chemistry and everything, because I think that's a super fascinating area), and then cybersecurity and some sort of hardware stuff as well. Awesome.

Speaker 5:

So you picked the cherries right. I'm lucky.

Speaker 4:

It's pure luck.

Speaker 5:

Once again, I don't think it's so much luck here.

Speaker 2:

Jesper, welcome here and please introduce yourself as well.

Speaker 1:

Yes, Jesper Fredriksson. I have a team, a data science and AI team, at Volvo Cars, and we're focusing mainly on generative AI. Before that, I was a data scientist for a long time, and I have a background in medical imaging and brain research.

Speaker 5:

Right. So medical imaging and brain research, to mobility for a while. So you were at one part of Volvo, at arm's length, yes, but now that's not arm's length anymore. No, no, we got pulled into the main company. The mother ship took you up. Yeah, you're so good and so sharp, so they want you back, closer.

Speaker 1:

It was a change of mind, going from having fewer cars on the streets to selling more cars. Electric, though. But it's a super interesting journey, because it's really delving into, you know, thinking about transport, and then, can we servitize this?

Speaker 5:

And now this story is still of course there, but it's the full automotive story now as well.

Speaker 1:

Yeah. So I got the chance to work with generative AI, and that couldn't be only in Volvo On Demand, so it was natural to move.

Speaker 2:

Yeah, it's bigger than that. I mean, it's nice to have a person like you that worked with a real application of generative AI, actually putting it in production as well, at a company of the size of Volvo. But I also appreciate, you know, your previous experience at TV4, and also some AI conferences in the past, in Vienna and whatnot, that I have vague memories of.

Speaker 1:

I'm sure we had fun.

Speaker 2:

Yes, I'm sure. Awesome, it's an honor to have all of you here. We're going to try to keep the episodes a bit shorter this season, but I'm sure we're going to fail.

Speaker 5:

We're going to go out of the marathon league and into the more exciting, shorter format, you know. Try to be, you know, maybe easier to digest, I don't know.

Speaker 2:

Yeah, we'll see. We will fail.

Speaker 5:

We're mainly here to have fun and speak about what we love. It's going to be too much fun to shut this down in an hour.

Speaker 1:

I promise you we can aim for 100 meters and end up at 1,000 or 5,000.

Speaker 2:

A marathon for sure. Okay, should we start with trying to see if we can identify some kind of general trends? Something that we've seen during 2024, perhaps something that has been especially clear during the summer, of where the field of AI is going. I'm sure we can all think of some trends that we've seen, but does anyone want to start? Some clear trend that we can see of how the AI field is moving?

Speaker 1:

We touched briefly on the AI hype cycle.

Speaker 2:

That's something to start with, maybe.

Speaker 1:

Yeah, please elaborate. So there's been a massive spend on GPUs, and NVIDIA stock is higher than ever; they're the second most valuable company now, right? I'm not sure where they are, but they're up there. And basically the rest of the companies are still struggling to see any gains from AI.

Speaker 2:

Are you sure about that?

Speaker 1:

I'm not sure, but that's the official story at least. That is why people think that we're now in, what is it called, the trough. Going towards the trough.

Speaker 5:

And I saw this argument, and someone said we were already past the peak, or not past the peak. So are we at the peak or past it? A lot of money has been poured in, but the show-me-the-money call has been made. So that means somewhere here it needs to get real.

Speaker 2:

But if I understand you correctly, and please correct me if I'm wrong, we can certainly see a lot of progress being made in AI. One kind of progress is in the field of science, another can be in engineering, and a third can be in business. And I guess, if I understand it correctly, you're thinking more of the business side of things.

Speaker 1:

Yes, definitely, definitely. But it's also maybe the realization that we all have that things aren't moving as quickly as in 2023, in the early days of the ChatGPT boom. But then again, things can change quickly.

Speaker 5:

If rumors about OpenAI releasing stuff are true, then this could change quickly. But I have a theory about what is happening, why we're getting disillusioned and how we are confusing ourselves.

Speaker 5:

And you can lean on Joseph Schumpeter's definition of disruption and innovation from, I guess, the 20s and 30s. He makes the distinction between the technical invention; then flipping a technical invention, a technique, into something that can be commercially monetized, he calls that innovation; and then ultimately, making that go broad into the masses so it really takes off, for which he used the word diffusion. So: invention, innovation, diffusion. And I think that there is a huge speed of invention of techniques, in research, and we can talk about those topics. But just because you have the technique, that is not the same as having usage, or having understood how you bring value from it. So we have all these companies that are fantastic at the technique, but now we need to figure out where to place that technique, to innovate, and then ultimately, when we've done that, we're going to change all the people's ways of working to adopt it. Then we have diffusion.

Speaker 4:

Sorry, no, but this really rings a bell with me. I mean, for me the whole launch of ChatGPT was incredibly frustrating, because it just reminded me of when deep learning came and it was impossible to hire really good data scientists who were willing to do anything but deep learning. And I mean, I hear from so many of my friends who work in companies that what happened last year was that they centralized the data science teams again. And for me, the most successful implementation of a data science team is that it's out in the organization, working naturally as parts of teams, co-creating things that are needed for the business and for the customers.

Speaker 2:

You're touching on an interesting topic. Should it only be embedded, or should it be a hybrid kind of organization?

Speaker 4:

I think hybrid, because I think there are things where you really benefit from having like a hive mind central team that keeps in touch and can keep sort of the cutting edge awareness and then infuse the other teams with it.

Speaker 3:

That's how we built it at King.

Speaker 2:

So we have a central AI... I think we should add this as a topic, you know: organization.

Speaker 3:

I would like to comment on the previous thing, where I would like to offer a contrary opinion. Salla, do you remember those Ericsson roundtables? Yeah. So this was where you had the various top major Swedish companies meeting once quarterly, discussing it, and a few years ago there were two very distinct categories. You had the old-school established companies, and then you had a few like Spotify, and you had us from Peltarion, and generally it was the older companies asking the newer ones, like, oh, how are you doing it? There was a very clear gap.

Speaker 3:

Well, I attended one of these like a year ago, well, six months ago, and the difference is massive. The field has been leveled. So you have these companies, like PIAB, who are essentially, when it comes to generative AI, on the same level; we're investigating sort of the same things and so on, because suddenly you don't need to put in all of that energy to get the data and build up all the infrastructure. It's an API, you call it. So the democratization of AI, which was very, very important, has happened.

Speaker 3:

And it has, at least on the practical day-to-day level and in capability.

Speaker 1:

It's fantastic. I was talking to the CEO of ATG, I think, and he was like, well, we have an AI strategy and we're doing all these things, and I was mind-blown. A company like ATG, they're gambling on horses, that's their business, and they have a super good

Speaker 2:

AI strategy. Perhaps that's a good side effect of ChatGPT: at least it brought AI to the mindset of every company, and they had to develop an AI strategy, and they had to get some kind of AI team in place, and they had to let people start using AI. Are you skeptical, Salla?

Speaker 4:

Maybe. Because I think, I mean, for me, AI is so much more than large language models, and I think the challenge is that it has just become the whole thing, you know.

Speaker 2:

I agree with what you said before. Of course it's just a small fraction of AI, but still, the media attention it got had some positive side effects.

Speaker 4:

Yeah, absolutely, yes. And I think it's a fantastic example of pedagogically making AI systems available for the many, and for them to be able to try out stuff and learn how a machine would speak to you, and things like that. But I'm always interested in what people aren't talking about, so I'm really trying to keep that part alive: interesting information or insights coming from neuroscience, where those kinds of algorithms might actually be hugely beneficial in unexpected places. And large language models, we can see, are useful in some cases, like learning how to grasp something, or being able to implement a fantastic internal search function, and things like that. But I think there are parts where we still need to do the heavy lifting.

Speaker 5:

And I think in some ways it's a little bit like, oh, now the new hype is a hammer, so we need to use a hammer for every single mathematical or data problem. And of course, you know, use a screwdriver when you need a screwdriver. The AI toolset is more techniques than the generative or large language model approaches. Maybe that gets lost. Is that your point here, maybe?

Speaker 4:

Yeah, I mean, it's an interesting dynamic where you have incredibly talented people that have built a fantastic machine learning algorithm that's in production, and the customers are loving it, and the manager says, oh, it's not a large language model? But then it's not AI.

Speaker 2:

It's a bit silly. I think we all agree that large language models are not the solution to all of AI.

Speaker 5:

Yeah, but then is the point, going back to the hype cycle: is that referring to AI, or is it referring to generative AI, and particularly large language model type AI?

Speaker 1:

I think it's about generative AI. That's what people mean when they say AI today.

Speaker 5:

Yeah, it's interesting, right? So the definition of AI: we were a little bit confused about it, and we were discussing it in a different way two years ago.

Speaker 4:

It's fantastic.

Speaker 5:

And now it's sharp. Now generative AI is AI.

Speaker 3:

Exactly. Machine learning and AI. What was complex to separate before is now super simple. At least when I say AI, I mean generative AI, and machine learning is machine learning.

Speaker 5:

Isn't it interesting that the definition has almost cleaned itself up?

Speaker 2:

But just to also build on, I think, what you said a year ago, Luka, about the adoption, if you call it that, of generative AI. One thing we could see is the big tech giants taking these kinds of large language models and integrating them into all their products: into the office suite, into the operating system of Windows, into the coding and development environment, and so forth. And I think, please correct me if I'm wrong, but you said something like: it is super simple to integrate them, because you simply put text into the model, and it's being integrated as a co-pilot or some kind of assistant, and it's just text, so it's super simple to integrate. Would you agree?

Speaker 3:

Is that what you said? Yeah, absolutely. And then, it's always easy to underestimate the engineering complexities and how people can screw up the building of products.

Speaker 5:

So then let's float the contrary argument: is it easy or not? I just saw that Lars Albertsson made a nice presentation. Lars Albertsson is a good friend of all of us; he's a super engineer, a data engineer, and he's clearly arguing that if you want to do enterprise-grade machine learning or AI that is robust and in critical systems, it's way more complex engineering around this in order to build a robust system. So we need to go beyond how we can utilize generative AI tools to increase the individual's productivity.

Speaker 5:

Like, I give you a laptop, I can give you Claude, and you can be more productive. But if I want to build enterprise-grade systems with this, with APIs and managing the data, and it cannot hallucinate, blah, blah, blah. So I saw some slides where he highlighted: yes, to do this is easy, but then you need this and this and this and this. And he was referencing the technical debt of machine learning, you know, that Google wrote about in 2015, and it applies even more here. So it all depends on how you frame this, whether it's easy or not.

Speaker 2:

But there is a difference, I would say, compared to traditional machine learning. Generative AI has a big difference.

Speaker 1:

The way I think about it is: generative AI is more like software development, because you're generally calling an API, but there is a difference in the non-predictability, or the word is non-determinism, of the results.

Speaker 1:

You have stochasticity and you have to control various factors, and then you end up in something like LLMOps or MLOps, where you have to control things over time and, when you evolve the model, make sure that the examples that you could handle before are still handled when you make an update, et cetera. That's the main difference, to me at least.
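A minimal sketch of that regression idea: keep the examples that worked, and re-run them whenever the model or prompt changes. Here `ask_llm(prompt, model)` is a hypothetical stand-in for your API call, and the checks are illustrative.

```python
# Sketch of an LLMOps-style regression gate: examples the old model handled
# must still pass after an update. `ask_llm(prompt, model) -> str` is a
# hypothetical stand-in for your API call.

REGRESSION_SET = [
    # (prompt, check applied to the answer)
    ("Extract the order id from: 'order #4711 was delayed'",
     lambda a: "4711" in a),
    ("Translate to English: 'bilen är röd'",
     lambda a: "red" in a.lower()),
]

def pass_rate(model: str) -> float:
    passed = sum(1 for prompt, check in REGRESSION_SET
                 if check(ask_llm(prompt, model)))
    return passed / len(REGRESSION_SET)

# Gate the update on not regressing against known-good examples.
if pass_rate("candidate-model") < pass_rate("current-model"):
    raise SystemExit("Candidate regresses on known-good examples; do not ship.")
```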

Speaker 5:

But have you in any way simplified the core fundamentals of feeding the right data? If you want to do something in an enterprise setting, don't you need to feed the right data? No, not in the same way, I would say. Not in the same way is correct.

Speaker 2:

And the reason is basically that you don't have to train models as much as you did before. You can use models that are pre-trained and just prompt them. That works, not for every use case, and it's not going to be super great, but you can get started doing that, and that significantly simplifies the adoption of AI.

Speaker 1:

I still think Henrik has a point in that it's only as good as your data. You don't have the same requirements on structured data, but it still needs to be good quality data. If I'm querying some database or accessing some structured or unstructured data, and the data is not what I think it is, then the result is going to be wrong.

Speaker 3:

Large language models are not good when you have massive amounts of data.

Speaker 4:

Just like humans, are not.

Speaker 3:

You have the same type of limitation. But also, I think, a bit of perspective here: the state of the art LLMs now are Claude and GPT-4.

Speaker 3:

GPT-4 is a two-and-a-half-year-old model. It is not the state of the art of what actually exists. There is a big process for GPT-Next: red teaming, going through a number of things to make it more safe. So a lot of the assumptions about the limitations and so on that we are baking in are not going to really be valid with the next generation of models, which are coming soon.

Speaker 4:

And I'm sitting here thinking about how to frame this, because it's going to sound like I'm name-dropping. So I'm going to name-drop. I was in Silicon Valley in February and I met with Marc Andreessen, and it was like a really nice fireside chat where we were, I don't know, 25 people in the room and we could ask questions. And he said that he sees it as: computers used to be deterministic, and then you have massive amounts of uses for them. But now we're moving into the probabilistic era, and so people need to learn statistics, which makes me incredibly worried, because we all know how good people tend to be at statistics. But you actually have to start to question the computer, and that isn't something that, like, my generation necessarily would naturally do. Hopefully they would. Not everybody is a statistician and interested in actually verifying the answers that you get.

Speaker 5:

But this is a huge problem, because ultimately the way we understand how to run business, organize business, plan business, is fundamentally rooted in deterministic thought, and we now need to switch to a probabilistic, emergent view.

Speaker 4:

And I would like to pitch my idea that I'm pitching everywhere. I think that when you use large language models, you should have an avatar, and it should be like an uncle, and then, the more unsure the uncle is, the more drunk he is, because I think that's a very human approach. So I think we need to find the human approach to explaining the probabilistic part. The drunk avatar, yeah. But I would say that the human processes are deterministic?

Speaker 5:

They're not really, but we like to pretend that they are. No, I fully agree. Humans, and what we should really be doing, are not deterministic. But what has become management dogma, and how we have understood and tried to model the world: value chains, division of labor, all these things, is very deterministic, built to solve the problem in a very steady state, and we are super bad at dynamic adaptability because of this. So the whole management school, the whole idea of how we organize, how we think about this, flips us into agency topics again and all this.

Speaker 3:

I think this is all related, and it's interesting when you start looking at human behavior, because you come to insights. There's this famous linguist, Emily Bender, who coined the term stochastic parrots, super critical of LLMs, and I've always seen that criticism as really pointing in the wrong direction. It's more like, wow, humans are really this shallow; we operate kind of close to this very primitive thing that we automated. And I'm getting more and more of that impression now, especially with diffusion models, now with music. It's remarkable what has happened.

Speaker 3:

Once upon a time I studied classical composition in parallel to electrical engineering, and what these models can do... I mean, people go years and years to school to learn things that you'd consider complex and sublime. It just creates. And right now, actually, I'm listening more to AI-generated stuff than I'm listening to Spotify.

Speaker 2:

Really.

Speaker 3:

Not because it's AI-generated, but because I actually like it.

Speaker 1:

What do you think is going to happen with the lawsuits against Udio and Suno? Oh, this is almost like changing topics.

Speaker 5:

Soon, right? Popcorn. Why don't we... Okay, the first topic started off with understanding, or thinking about, where we are on the AI hype cycle. So can we now just wrap up: do we have an opinion? Have we hit the peak? Are we seeing the trough, or how do we

Speaker 3:

understand it? We're at the early beginning of the exponential. What do you

Speaker 4:

mean, going down first? No, the curve from Gartner, right? Right.

Speaker 5:

Is it true? Is it there or is it still on the peak?

Speaker 3:

So this is going to sort of... The next wave is going to hit.

Speaker 2:

And then, if I understand it correctly, there are going to be big advances very soon as well.

Speaker 4:

I mean, I would agree. I think we're still in the innovation phase, because I don't think we've found the right usages for the large language models. The whole productivity...

Speaker 5:

The whole thing here is the difference between techniques versus innovation and diffusion. So when you have the techniques, everybody's excited; we can potentially sense that something's in the air, but then we need to figure all this out in order to get value out of it.

Speaker 4:

And that's where I think we are right now. I'm very skeptical of the hype cycle in general.

Speaker 5:

We could even have that as its own topic. Should we kill the hype cycle and the magic quadrants

Speaker 2:

while we're at it? I think so. Henrik, what do you think? Oh sorry, Jesper, should we?

Speaker 1:

Yeah, I just got a thought about the hype cycle.

Speaker 2:

I think it's true of many things, but I'm thinking that AI is maybe not one of them. So people should not stop investing in AI, right? Definitely not.

Speaker 3:

It might represent some type of public-market view of things.

Speaker 5:

We have a very cynical discussion on this that I can explain, and we can discuss after. I have extremely cynical views on some of this stuff. Should we change topics?

Speaker 2:

Yes, should we try to find another topic? I have one, if I can steal it, or otherwise...

Speaker 5:

I want to give it to our guests first. Should I go? Can I go? Yes. Where do we stand now, after summer, in the war, or in the arguments, of open source AI versus proprietary AI? Is someone moving ahead in the race? Is it more proprietary, or do we think that open source is catching up, or has it already caught up? Who will win the race? Are we in the race?

Speaker 2:

To start with, I think we all agree that the frontier models of today are very much in closed-source land.

Speaker 1:

But there have been some huge improvements. We have Llama.

Speaker 2:

Wouldn't Llama 3.1 be the first example of a frontier open-source

Speaker 5:

model. Yeah, and when was that released? A?

Speaker 2:

couple of months ago.

Speaker 3:

Summer news, right. And it's still in training, right?

Speaker 1:

It continues to improve. But there is a released version now, at least, though it's hard to get to use. I haven't been able to use it.

Speaker 4:

I was just reading an article in the Economist about the kind of future of open source. Apparently, the institute of standards in the US is starting to define what open source is, and of course Meta has a lot of opinions there, because, I mean, I've always been a great fan of open source, which I shouldn't say since I'm now a risk capitalist, but I still love the community aspect of it, and I love the fact that people come together and help each other and can build incredibly advanced software for others to use. But I think it's a really interesting point we're at. Will there be a third kind of standard: you have open source, and you have proprietary, and then something in between, where you do what Llama does, release parts, but not everything?

Speaker 5:

I think part of understanding where we stand in this race is also to acknowledge how the different vendors are butchering open source. Is it open weights? Is it fully published code? What is open source today?

Speaker 3:

Meta's Llama is not fully open. And there won't be this kind of open thing without a Meta or somebody like that behind it. Isn't that a good conclusion?

Speaker 3:

I think that's really clear: it will be a small set of companies that produce the frontier models going forward, because no one else will have the money to do so. I mean, on Azure now they're building, and this is public information, a $10 billion computer. $10 billion. Which also raises, I think, an interesting side theme in this discussion: where is, for instance, Sweden in this? Yes, or the EU, if you want to be mean.

Speaker 4:

Yeah, and I think, I mean, when I was in Silicon Valley on that same trip, I met with some really, really talented data scientists who'd been working for the big tech companies, building LLMs and things like that. And then they got bored with administration, they got bored with all the meetings that you have to go to, so they decided to start their own things. And I think, considering the amount of money that still circulates in Silicon Valley, they might actually be able to do something interesting when they choose a niche where they know that they can actually contribute something incredibly interesting. So that's kind of a space that I think is worth watching.

Speaker 5:

We have seen a couple of spin-offs lately this year. Yeah, exactly. Which ones can we think of?

Speaker 4:

Well, I'm thinking about Augment.

Speaker 5:

Augment. Yeah, that's a cool one. So back to the question open source. Where do we stand.

Speaker 2:

I mean, you can think about the pros and cons, I guess that's one way to approach it, and you can think, from a security point of view, what are the implications of actually releasing something open source? Yann LeCun would argue that it's actually good for security.

Speaker 2:

A lot of people would argue the opposite. But then you can also think from a more business-model point of view. Some open source projects actually do result in a very good business model that you can build on top of. And then it comes to what you said, Salla: some kind of hybrid approach, where you have a business model that builds on open source solutions in some way, and that can actually be very beneficial as well. But it still seems, wouldn't you agree, that the trend, though I think it's hard to say, is in some way to move more into closed and proprietary, at least for the tech giants.

Speaker 5:

I don't agree, I don't think it's that simple. My view of the trend is that two types of players are emerging with the money and the muscle to build open source models, or large language models, I mean. On the one hand, we have, I would argue, OpenAI, who has the core idea, you know, to make money and monetize on the model itself, or Anthropic. And then I think we have another style, where you could argue Databricks or Meta, where they have another agenda for how they make money, and where they want to feed and lock people into their infrastructure and their environment. If I take the Databricks example, I think it's the best one.

Speaker 5:

Here they really want us to come into the Databricks community and environment and use their compute and use their storage. That's their main game, right? As long as we get into Databricks, they win. So that's why they shoot a lot of money at trying to do their Mosaic-based LLM. So I think you can really see the difference between where the core business is the LLM, and where, like in Meta, it's sort of apart.

Speaker 3:

I wouldn't agree with that. I would say that there are many examples of more niched applications of LLMs, like Databricks. And then you have those that are shooting for AGI, and I would put Meta into that category. I don't think they will stop and say, like, oh, this is... Yeah, that's what they claim.

Speaker 1:

Mark Zuckerberg says he wants to go for it.

Speaker 5:

Okay, but then we can really categorize them: who's trying to make money here and now with LLMs only, who's trying to support their core business model, and who's hardcore shooting for AGI? Can we categorize it like that?

Speaker 3:

Possibly. And then of course you have Microsoft, which essentially does both. Amazon also, to some extent, and Google of course.

Speaker 2:

Perhaps that's the trend, to actually do both. We know OpenAI started off as open and then went more and more closed, but they still have something. Not much, but some models are actually still open source.

Speaker 2:

They do release things like Whisper and stuff like that, so they went hybrid in some way. Google, of course, has been rather closed, but they do release some models, and now they have Gemma alongside Gemini. I think Microsoft does the same. As you say, Meta is leading a bit on open source, but they haven't done that in the past, and I think they will stop doing it when it becomes too powerful as well.

Speaker 5:

But I sense here a little bit, how should I frame this? What we are seeing now, open source in relation to LLMs, doesn't really follow the traditional open source logic, like how we've grown Linux and how TensorFlow grew up and everything like that. It's like no way in hell

Speaker 5:

anyone else could compete with hundreds of millions of developers doing that, because now you kind of need such a huge investment here somewhere. So the dynamics of discussing open source in LLMs cannot be the same as the traditional understanding of open source from Linux and stuff like that. Do you agree with that? There's a different dynamic in this, with very big compute.

Speaker 3:

Yeah, but not only compute. There's also the concentration of talent. For me, in the past two, three years, that's become very clear.

Speaker 3:

So I don't work with AI people in Sweden or Europe, because the people are in the US. They are at OpenAI, they are at Microsoft. We have a close collaboration with Sébastien Bubeck on the Phi-3 model. Working with the American parts, I get to meet Sam Altman and talk stuff through with him. In Europe... Sweden doesn't exist.

Speaker 5:

So you're now highlighting that there's a completely different dynamic in this open source, because you have a super strong concentration of talent around this.

Speaker 3:

Talent is a big investment. That's another dimension too.

Speaker 2:

There was an article released just a few days ago, where Daniel Ek and Mark Zuckerberg actually wrote an article in some Swedish newspaper, Dagens Nyheter or something. Yeah, it was American, it was for the New York Times or something. Oh yeah, okay. But that was really about regulations, and we probably shouldn't move into that topic right now. But it is connected to open source. So the article was about open source and how that influences companies. So if we have the new AI Act, and how it potentially regulates open source; they do have some clauses about not trying to regulate open source, but they still do, and that will have big implications for Europe and Sweden, we will see. Meta, for example, is not releasing stuff to Europe just because of their fear of, or insecurity about, the legal consequences. So this could be a big influence on open source, because people are too afraid of the legal consequences if you have it open source.

Speaker 3:

But at the same time, it also diffuses the regulation. Once it's open and you get like 50 different clones, then... Yeah, it will be fun to see what happens with the first court cases here.

Speaker 4:

Okay, so I have a moral question. Have you all signed up to be in the working groups around the AI Act, the working group for Gen AI? Have you taken your responsibility? You have, I hear. I have.

Speaker 4:

Well, I don't know if I'm going to be included, but I just felt, when I heard that there were going to be working groups around how to actually implement regulations around Gen AI, that I should contact the people I had top of mind and tell them: please sign up, because I think this is really important. Yeah, because, I mean, that's one of the other things that I'm raving about. One is the drunk uncle, and the other is that we need to take responsibility as citizens of the EU and also participate in those discussions and try to drive it towards more of a model that we think is reasonable.

Speaker 5:

So, instead of bitching about it, really get on board and try to make a difference and help out, to have the real experts really figuring this out.

Speaker 4:

Yeah, I think it's the difference of sitting in the backseat and asking are we there yet?

Speaker 5:

But who, you know, to reach? This is a good one. Shout out:

Speaker 4:

Who is driving the group, and how does it work, who to reach out to? Well, there was a website where you had to fill in a very complicated template and then click submit. You lost me already. Exactly. But so, can we conclude something on the open source? Just go on.

Speaker 2:

I think you put up the article there. It's just basically saying why Europe should embrace open source AI, and the risk of falling behind because of the AI Act. But yeah, perhaps we should go into that. Yeah, it's paywalled. Paywall, yeah, okay.

Speaker 5:

But it's an interesting trend to watch out for. Can we elaborate anything on guidance to enterprises?

Speaker 2:

You know, I think it's fun if we move to one of the big stories of the summer. I think Llama 3.1 was one of them. So let's go there, because it has a connection to open source, and I think it's fun because, you know, it was a super big one, a 405 billion parameter model, that has performance similar to the frontier closed models. And I think, in some way, Meta is thinking that it is a good thing from a competition point of view, because it completely removes, or not completely, but at least partly removes, the business model from OpenAI and others.

Speaker 2:

Now a company can take this open source model and build something without having to pay OpenAI to do it, and they can build their own models and use it as a teacher, for so many things they can do. So it's smart from a competition point of view, I think, to push out super-performing models like this.

Speaker 3:

Right. Yeah, although enterprises, I mean, have other parameters than cost.

Speaker 1:

Yeah, yeah, exactly. For me to run Llama 405B models is much more expensive compared to, like, OpenAI. I would have to have a supercomputer cluster and be able to manage the infra and everything, and it's so much easier to just call an API.

Speaker 1:

So I would say that for enterprises, even from a cost point of view, it's still better to use OpenAI. I mean, if you go to the full Volvo scale of things, then of course it could be worth it, but for me, having one team, it's just not worth it. Interesting.

Speaker 4:

And I think when we started building machine learning models at SEB, I completely underestimated the amount of time it takes to maintain both the infrastructure and your proprietary models. And I think when you're doing the things that are core for your business, it makes sense to build it yourself, because you want to be ahead of your competitors. So, like a recommendation algorithm selling furniture: you want it to be the best. But when it comes to other things, you definitely want to buy off the shelf, because you don't want to end up maintaining some sort of script somewhere that everybody's going to forget, and then it crashes and then somebody's angry, and that's never fun. Or you can have a managed deep learning platform.

Speaker 3:

So in five years, what do you think the standard use?

Speaker 2:

case will be for, like, a company? Will it still be to use an API, or to even train the model or adapt it yourself, or will you start...

Speaker 1:

How long did you say? Five years? Two years, one year. In two years, probably still closed source, but I think Luka already hinted that there are things going on.

Speaker 5:

So when we are setting our assumptions now, don't base them on ChatGPT 3.5 or 4, because there's stuff happening now where token window limitations, what we think of as limitations, will maybe look different.

Speaker 3:

I can say one thing that Sam Altman said, and he said it publicly also, so it's fine to share. GPT-3 was capable of doing five-second tasks, basically something that takes a human five seconds. GPT-4 is good for five-minute tasks; anything beyond that, it's not good enough. And GPT-Next, or 5, whatever it will be called, is capable of doing five-hour tasks. Basically, that's a big shift, that's

Speaker 3:

a big shift. And yeah, for OpenAI, for everybody, the theme of the year is agentic, agents. So maybe that's the segue here. So we talked about open source and all this.

Speaker 5:

We closed that off, and you opened up the new topic. The theme is agentic, you said. Yes. What the hell is agentic?

Speaker 3:

So, roughly a year ago, a bit after GPT-4 was released, there came this program called AutoGPT, which essentially connected the GPT-4 API to be able to do searches and execute commands and so on. So essentially it can do actions; not just output text, but do actions. And what's more is that you could connect multiple agents together, working together and building their own chains.

Speaker 3:

The thing is that it didn't work. You could get it started, but it very quickly died out, because the models essentially aren't good enough, smart enough. And when you start chaining them together, if each step has an 80% chance of being right, then at the next level it's 80% of 80%, and so on.
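The compounding arithmetic is worth seeing: with an 80% per-step success rate, the chance of a whole chain succeeding decays geometrically with its length.

```python
# Why naive agent chains "die out": per-step success compounds multiplicatively.
for steps in (1, 2, 5, 10, 20):
    print(f"{steps:>2} steps: {0.8 ** steps:.3f}")
#  1 steps: 0.800
#  2 steps: 0.640
#  5 steps: 0.328
# 10 steps: 0.107
# 20 steps: 0.012
```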

Speaker 3:

So it dies out very, very quickly. And if you look at real-world complex processes, like enterprises or something like that, you have many, many, many of these steps. So it wasn't really there; the technology wasn't really there. And a lot of what's coming now is an improvement on that, sort of getting to reliability. And also, I think, there will be a push generally.

Speaker 3:

Right now, people's experience with ChatGPT is that you type and it answers, so you take the initiative. Here we'll go more to a system that autonomously works in the background. Not a big technological thing, but I think in terms of application, it changes the way things work. Can we do a practical definition of agentic?

Speaker 5:

I can do my layman's attempt, and please correct me, that's the whole point. The way I understand it is that you go from single-task machines to wanting to do something on a more complex problem, or steps of a problem, and then, to build an agentic workflow, meaning you have several algorithms that are good at different things; you maybe have an ensemble, or they're helping sort things out. So you have a combination of models that you want to work together in an agentic workflow, and in that sense you can then come to a higher abstraction level in terms of... It's even weirder than that.

Speaker 3:

It's not different models, it's actually the same model, just doing different things. So other instances of the model that you've prompted. Yeah, okay.

Speaker 5:

So it's a little bit like, in the simplest terms, when you prompt something, you ask it to be three different persons doing three different tasks and working together as a team.

Speaker 1:

Is that simply?

Speaker 5:

summarized.

Speaker 2:

Yeah. I mean, a normal workflow is simply: prompt some model to ask a question or to do something, and you get an answer, and then it's done. But by having that system, by itself, take actions, which could be asking other prompts of itself, or of other systems or tools that it has, it can take actions, basically, and then, while or after having taken those actions, it replies something. So it actually becomes able to take actions, which it couldn't before. And I think agency is a key word here to explain or discuss agentic.
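A minimal sketch of that loop, assuming a hypothetical `ask_llm(messages)` chat-completion call that we instruct, via the system prompt, to answer in JSON; the two toy tools are likewise illustrative.

```python
# Minimal agent loop, sketching the "take actions, then reply" pattern above.
# `ask_llm(messages) -> str` is a hypothetical chat-completion call.
import json

TOOLS = {
    "search": lambda q: f"(pretend search results for {q!r})",
    "calculate": lambda expr: str(eval(expr)),  # demo only; never eval untrusted input
}

def run_agent(task: str, max_steps: int = 5) -> str:
    messages = [
        {"role": "system", "content":
         'Reply with JSON: {"action": "search"|"calculate", "input": "..."} '
         'to use a tool, or {"final": "..."} when you are done.'},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = json.loads(ask_llm(messages))  # hypothetical call
        if "final" in reply:
            return reply["final"]  # the agent chose to answer
        result = TOOLS[reply["action"]](reply["input"])  # take the action
        messages.append({"role": "user", "content": f"Tool result: {result}"})
    return "Step budget exhausted."
```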

Speaker 5:

You know, we're giving a clear frame for different agents to be good at, or to perform, something with autonomy. Within this, they have agency for a certain thing, and then we are stringing them together. So I think this is really profound thinking around agency, but where this leaves us, I don't know.

Speaker 2:

Actually, it's an abused word. I remember teaching agent-oriented programming in 2003 or something.

Speaker 3:

And have you heard the term agentic before this year?

Speaker 2:

No, not agentic as a phrase, but agent-based.

Speaker 3:

Yeah, agent-based was the usual, but agentic, that's a new word.

Speaker 2:

Yes. I know you have worked a lot with this. What's your thinking about agentic systems?

Speaker 1:

And can you?

Speaker 2:

perhaps elaborate on something you have experimented with yourself.

Speaker 1:

I can start from where you started. I would agree that the current models usually aren't smart enough to handle agentic systems. But what has happened during this year is that there's been a lot of scaffolding being built. So people use Monte Carlo tree search, and they fine-tune on agentic traces and make the models a little bit better, and then we can get to something that's working relatively well.

Speaker 1:

I saw a paper from MultiOn recently, an American company that does rather simple things, like booking a table. That was an example from the paper, where they did this scaffolding after the LLM has said something. They started at 19% with GPT-3.5, for some reason they took that example, on completing a booking task on the web, and then with the scaffolding they could get it to something like 80%. So we're getting there, and with the help of better models it can definitely go there.
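A toy model of why that kind of scaffolding lifts success rates so dramatically: sample several attempts and keep one that passes a verifier instead of trusting a single shot. The perfect-verifier assumption and the retry count are ours; only the 19% baseline comes from the discussion above:

```python
import random

def single_attempt() -> bool:
    return random.random() < 0.19          # the ~19% single-shot baseline

def scaffolded(k: int = 8) -> bool:
    # Keep retrying until one attempt passes verification (assumed perfect).
    return any(single_attempt() for _ in range(k))

trials = 100_000
print(sum(scaffolded() for _ in range(trials)) / trials)
# Expected: 1 - 0.81**8 ~ 0.81, the same ballpark as the ~80% quoted above.
```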

Speaker 1:

So what I'm working with in my daily work is taking the first level of GPT use and breaking the task down into subcomponents, and that works relatively well, and then you can stitch it together with some scaffolding and get a decent result. But what I'm seeing right now is that this is more a proof of concept, because everything will change with a better model. But it's a good time now, if you're thinking about agentic, to start playing around with this and see the limitations. For example, if you're working with these multi-agent systems, it's super hard to see what's happening. They will end up talking to each other and deciding what is the best way to deal with a problem, and then there's a lot of chatter going on, and just following what happened, why did this task derail, means going through a list of prompts, and it's quite complicated. So it's good to get your feet wet in this, see the basic limitations, and be ready for 2025. That's my guess.
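One common way to tame that chatter, sketched here under our own assumptions rather than any particular framework's API, is to append every inter-agent message to a replayable trace file:

```python
import json, time

def log_message(trace_path: str, sender: str, receiver: str, content: str) -> None:
    """Append one inter-agent message to a JSONL trace for later replay."""
    record = {"t": time.time(), "from": sender, "to": receiver, "content": content}
    with open(trace_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Inspecting trace.jsonl afterwards shows exactly where a task derailed.
log_message("trace.jsonl", "planner", "worker", "Subtask 1: load the sales data")
```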

Speaker 4:

And are you already thinking about potential applications, or is it more about getting to know it? Can you talk about that?

Speaker 1:

Yeah, I think I can talk about it. It's mostly playing around with it, but the obvious step from asking a single question and getting the agent to perform one action is going from doing tasks to doing roles. So you're basically taking on a role; in my case I'm trying to do a data analyst role. So you take a high-level problem, try to break it down into multiple steps, and have something like a PDF report delivered at the end. That's sort of doable today, but it's more mimicking the style of some kind of worker, and that's the obvious first thing to do, at least. And, as you said in the example, by having these kinds of multi-step, agentic if you call it that, workflows, you can take a much less advanced and smaller model that will perform at a much higher level if you do it in a multi-step way, so you can actually make models more efficient.

Speaker 2:

You can get higher performance out of smaller models by putting them into an agentic workflow. Is that what you're saying as well?

Speaker 1:

I think so.

Speaker 3:

Yes, if I followed you correctly, it's an expansion of chain of thought. The models here, LLMs, don't have the type of internal narrative that we do in our heads; they think in what they write. That's the process, and this is just a generalized version of chain of thought.
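For readers who haven't seen it, chain of thought in its simplest form is just a prompt that asks the model to write its intermediate steps out loud; the agentic version externalizes the same idea across several calls. The prompt and completion below are illustrative only:

```python
prompt = (
    "Q: A train travels 60 km in 40 minutes. What is its speed in km/h?\n"
    "Let's think step by step."
)
# A typical completion writes the reasoning before the answer, e.g.:
# "40 minutes is 2/3 of an hour. 60 km / (2/3 h) = 90 km/h. Answer: 90 km/h."
```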

Speaker 5:

You know, there's a certain thing when we say agentic, because I think that's what sort of blew up this year. You said agent-based programming has been around, blah, blah, blah. Is this different? Do we have a deeper meaning, or do we have a clear definition of agentic?

Speaker 3:

No, it's just that now we actually have something that's capable of being an agent.

Speaker 5:

Yeah.

Speaker 2:

I think it's a nice term. It's a buzzword kind of thing. I wouldn't spend too much time on the word itself, but it is a powerful technique. And I know you're not going to answer properly here, Luka, but I'm going to read your face the best

Speaker 4:

I can.

Speaker 2:

And of course, we have ChatGPT and GPT-4o today, and we have the Strawberry models and the rumors, and we have the Orion model coming up soon, perhaps. And one of the ideas, and I'm looking at your face right now, and the rumors, is that the star in Q-star, which is the Strawberry model, stands for the self-taught reasoner kind of approach. Ah, poker face. Poker face.

Speaker 1:

He blinked twice.

Speaker 3:

No, it's just like, I mean, you don't have to read into it. You know what Q-learning is, you know what A-star is. Q-star.

Speaker 2:

But Q-star, I think, has a double meaning.

Speaker 2:

One is the Q-learning approach, with the Q-star kind of optimal-policy meaning of Q-star.

Speaker 2:

But some people claim that the star could also mean something else, which is the STaR of self-taught reasoner. There was a paper about that where they tried to train a model by putting it into a loop, saying: okay, I have a question, I know the answer, but I want you to provide the reasoning for why this answer should be produced.

Speaker 2:

And when it first tries to give some kind of motivation, in a chain-of-thought kind of way, it may be wrong, and then it throws it away, but it continues to loop until it gets the best kind of reasoning for why this answer should be produced, and then that actually provides its own training data. So it is teaching itself, a self-taught reasoner, STaR, and some people claim that that's what Strawberry is really all about. And if that is the case, it means that it can have a multi-step, similar to an agentic kind of way, to train itself, to build up its own training data. So it doesn't have to fetch all the data, and manual labeling is not necessary; it can produce its own training data and become significantly more intelligent, even more intelligent than humans, because you're not even dependent on human feedback anymore.
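A sketch of that loop as described, kept deliberately minimal; `generate_rationale` and `fine_tune` are hypothetical stand-ins for model operations, not any published API, and the real STaR paper adds refinements such as hint-based rationalization:

```python
def star_iteration(model, dataset, samples_per_question: int = 4):
    """One self-taught-reasoner pass: keep only rationales that reach the known answer."""
    kept = []
    for question, answer in dataset:
        for _ in range(samples_per_question):
            rationale, predicted = model.generate_rationale(question)
            if predicted == answer:          # wrong reasoning is thrown away
                kept.append((question, rationale, answer))
                break
    return model.fine_tune(kept)             # the model trains on its own output
```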

Speaker 3:

The Phi-3 models are created that way; the training data is done by GPT-4 for them, specifically prompted to create reasoning-type problems. And with them the holy grail is to have a relatively small model that doesn't have all of the knowledge of the big models but has the reasoning capability.

Speaker 5:

But can we say with this.

Speaker 2:

What's your thought about Strawberry? I like strawberries. Okay, sorry, we're not going to get any answers.

Speaker 5:

Eric, please continue. No, but I think me and Luka were at a conference in Göteborg that Chalmers organized around, I guess, product management. You were the star of the show; we were both friends of Jan Bosch, and I remember it quite vividly.

Speaker 5:

You were there presenting, talking about AI in product management and how King was doing it, of course, but the last part of that section was where you got into the core dynamics of agentic. And what I took out of that, and I think I've seen it now from multiple sources, this is sort of my triangulating, what did you say, what is Andrew Ng talking about, is that I feel very strongly there is a move in our systems, most likely overall in the way we think about building systems, and eventually all the way to the enterprise in a few years, towards this agentic thinking, this, you know, how you

Speaker 2:

work.

Speaker 3:

I think it's going to be huge. I think the fundamental point is, and I think people get it now, but it wasn't obvious at first, that the point of what we're seeing with LLMs is not that they can write a lot of text or code; it's that they are scalable reasoning engines. You have something that can reason, and you can essentially infinitely scale it out. You can make as many copies as you want, and that's huge. Then comes the reality of how good they are at reasoning. Let's not go there.

Speaker 2:

Jesper.

Speaker 4:

I know you have been thinking a lot about strawberries.

Speaker 2:

What's your thinking?

Speaker 1:

I think it's obvious that it's some kind of inference-time compute that's going to happen. That seems like a given, that's what I think. Just inference? Do you think training as well will have some kind of... Yeah, I agree with what you're saying, that it's probably some kind of self-taught reasoning or something like that, where you can create synthetic data that can overcome the limitations of human-labeled data.

Speaker 5:

So let me play the ignorant. What the hell are we talking about? The strawberry thingy, and these are sort of the ideas, and there are rumors, whispers, and we are trying to extrapolate what this is all about.

Speaker 1:

Can you just?

Speaker 5:

give us some small context.

Speaker 1:

I can give you, since I'm obsessed with the hype around this, the historical background of it. It started out with Q-star, which was the first name for this, because that was something that was leaked from OpenAI. And then Altman more or less publicly confirmed it by saying that was an unfortunate leak. Then that died out, and after a while it came out that the new code name for Q-star is Strawberry. I'm guessing it's because it's very hard to get a large language model to answer how many R's there are in strawberry. That's my thinking. And then the next thing that happened, somebody on Twitter called iruletheworldmo said something about Strawberry, and Sam Altman commented on that, and that was seen as confirmation that he's right about it. So he got to be the strawberry man.

Speaker 5:

So he got to be the strawberry man.

Speaker 1:

It's a fun evening pastime if you want to follow the rumor mill.

Speaker 5:

And now we are trying to then extrapolate and guess what is in here. What can it do?

Speaker 1:

And the next thing that happened was that something called Orion came up, and the hype around that is that there's some kind of Strawberry model that will produce the output that is going to be used for training the Orion model. That's the rumor mill.

Speaker 4:

I really hope that the communications and PR experts that are working around this project are getting well paid.

Speaker 2:

This is awesome. What do you call it? Like a dramatic kind of buildup for some big releases?

Speaker 3:

All of the OpenAI, or more or less all of the OpenAI comms people, are from Apple.

Speaker 4:

Nice. That's why.

Speaker 5:

I mean, they are brilliant. It's like Elon Musk, who doesn't pay for advertising; you don't need to, it keeps itself going. Salla.

Speaker 2:

Do you have any thoughts on this? You know, do you think Strawberry and the next release, the GPT-4.5 or GPT-5 or whatever it will be?

Speaker 4:

Well, I'm such a bore in these circumstances, because I turn into the hardcore theoretical mathematician that I was 20 years ago. My supervisor had a favorite saying when I would come and use a very complicated theorem to prove a very simple thing; she'd say, now you're shooting doves with cannons. And I somehow took that to heart, because I enjoy elegance, and I really love solutions that are as small as possible and as high quality as possible. I agree that today we don't have to spend loads of time correcting old reporting, for instance, but there are still certain situations where you really need to know. Like at IKEA, you need to know exactly the weight of the furniture that you're going to send, and you need to know the size so you can do the cardboard. So I think it's super interesting that we're coming to models that can start to reason, and I'm really excited about that.

Speaker 4:

I can see loads of really interesting applications, and then I think there's still so much more to be found. I mean, apparently, when I was still in risk control as a quantitative analyst, they were talking about big data and stuff, and apparently I put my hand up during this big meeting where they were talking about data quality and how we're going to have to hire 100 people, and I'm the kind of person who can't stay quiet if I think they're going the wrong way. So I put my hand up and I said, I think we should just use machine learning.

Speaker 4:

You can systematically find the errors in the data, and then you can systematically do something about the problems. And it was very quiet. I think there are still a lot of applications, whether we call it AI or machine learning or something, where we could be doing good things with a bit of elegance and not just taking out the cannon to shoot the doves.

Speaker 2:

But is that really where we could be moving now? Because, as we said, we have these kinds of multi-step, more iterative reasoning mechanisms; some people, I don't want to name names, are even calling these symbolic approaches.

Speaker 2:

Like Monte Carlo tree search. Some people, I know you know who I'm thinking about, are calling that a symbolic system. But you could even say that this kind of added reasoning, iterative in both training and inference steps, adds more, what should you call it, I'm going to say symbolic because I can't come up with a better name, some kind of prior to it, which allows models to become smaller and more elegant, and that can make them much more efficient. So it could be in your direction.

Speaker 4:

Yeah, and I think, I mean, at IKEA there was this amazing team of like three people, and they built a translation engine to be able to translate IKEA's product information from one language to another. Fantastic quality. They really struggled even to get access to GPUs. They really struggled to get budget, and I was trying to do marketing for them, like, we need to invest in this team, they're amazing.

Speaker 4:

And then ChatGPT comes with a fantastic prompt, and then there's a central team, in essence, trying to do very similar things, and I'm like, but these people are already doing it, and it's very cost-efficient, and they've fine-tuned the model so it takes incredibly low energy consumption. So I think two things. One is that agentic is great, but I also think, as data scientists and practitioners, there's a lot we can learn about marketing, about storytelling, about explaining what is possible in a way that is appealing to a human who's not as excited about AI as we are. And I think that, for me, was the main takeaway when the whole large language model thing exploded. That shit, that's what we should have been doing.

Speaker 5:

They were so much better at selling this, yeah.

Speaker 4:

It's really cool.

Speaker 2:

Well, if anything is clear, it's that there are very interesting times coming ahead, I think. And I mean short term, like the coming months, right?

Speaker 3:

I'm not in charge of the timeline there. You need to drink more beer.

Speaker 5:

I noticed already that he's not drinking beer. He's clearly on a very strict diet today. Should we try to find another topic?

Speaker 2:

We spoke a lot about agentic reasoning models becoming more efficient, so I have another.

Speaker 3:

Just something that I thought about today, which is related: diffusion models. DeepMind released yesterday a paper and demo of real-time generation of Doom, essentially playing Doom, so you could play in real time and it would generate the game.

Speaker 2:

It's like the infinite Seinfeld episode.

Speaker 3:

Yeah, exactly, but this is interactive. Oh, so it reacts to your input.

Speaker 2:

And just for information, there was a YouTube live stream where they used AI to just continuously generate Seinfeld episodes for months. Okay, but this was interactive. This was interactive, yeah, playing Doom, and it looks exactly the same way.

Speaker 3:

It could play at 20 frames per second running on a single TPU, so this was sort of an efficiency bit. But it got me thinking: we've seen in the visual arts, in image generation, that there's a big implicit understanding of things in these models. And then we have audio, where it can compose music. And in this way I thought of games: when you develop a game, there's the complexity of not just the design but the AI of the bots, all of the aspects that you have teams thinking about, and this thing just generates it image by image, which implicitly has all of that. What do you think will be the future?

Speaker 1:

will it be that we're just filming and then building a diffusion model from that?

Speaker 3:

That's the question. Okay, I'm thinking out loud here. I can see it for things like movies and games and so on, where the output is something that you just watch and consume. I'm wondering if this could apply to more general problems.

Speaker 1:

Yes, I got the same thought. I was talking to Robert Luciani yesterday and he mentioned this paper, so I'm happy that he tipped me off, and it really got me thinking as well: what if all software is AI? I mean, you're not going to like this, because now we're really shooting with cannons.

Speaker 1:

But we already moved from zeros and ones to machine code and to assembly, to C++, and now we have all these layers and we're doing very advanced stuff. We have a very advanced machinery to do relatively simple things in many cases. But what if in the future we don't have software, we just have a model, an AI model that can generate anything you want?

Speaker 3:

That's really mind-blowing when you start to think about it. In the gaming instance it's relatively simple to see from this example. Okay, now they trained it on Doom, but imagine that they had trained it on thousands of different games, and you give it a description: I would like to play a Candy Crush shooter that blah, blah, blah, and then you play it in real time. And then say, okay, but let's not do Candy Crush, let's do accounting software.

Speaker 4:

Oh God.

Speaker 5:

There are some nerds who love to play that game, but okay. What you open up now is a little bit of another way of looking at what happened, by looking at slightly different industries. So you talked about gaming. What were the biggest steps we saw in music, for instance? I think there were a couple of big releases of different types of products that are starting to be mind-blowing for how you produce and develop music.

Speaker 3:

I would say music is the bit I've been most impressed by, by the progress, because we went from, ooh, after 10 hours of inference it can generate a 60-second clip that maybe sounds like a piano or something, to, boom, complete compositions with music and lyrics. It also demonstrated the power of multimodal models.

Speaker 5:

Was it in one of the bigger models, or was it also some products built on this? Because I saw a couple of different clips. So it's Udio and Suno. Suno is what I was thinking about.

Speaker 2:

I'm using Suno, like when having parties at home or something. Do you want me to play a song for you?

Speaker 4:

Yeah, search for this type. No, no, no.

Speaker 2:

Tell me a song that you want me to create, and it actually works surprisingly well, even for a dance band or these kinds of pop music genres.

Speaker 5:

So now you're with us. We talked about this already three or four years ago. You've been waiting for this so you can be a musician.

Speaker 2:

It works surprisingly well.

Speaker 5:

You just write your prompts.

Speaker 2:

And it can generate the instruments, the voice, the vocals, the lyrics, everything.

Speaker 5:

So you use your solo privately for fun and everyone loves it.

Speaker 2:

This is a song for you.

Speaker 3:

And then it has a name for it, right? And what's the window like?

Speaker 2:

Uh, I don't know, 30 seconds? No, it can generate like, Suno can generate four minutes, four or five minutes, a full song, and in a few seconds. It's surprisingly fast.

Speaker 5:

So it sounds good. So this is the tip, for fun.

Speaker 3:

For fun.

Speaker 5:

Instead of having a music quiz night, have a music Suno night with your friends.

Speaker 3:

Right, and come up with the best song. And for kids, I mean, I've spent so much time generating music with them.

Speaker 5:

And is it expensive? I haven't used it yet, I'm so ignorant.

Speaker 3:

I used it for free; I have a lot of free credits there. And you have these two: you have Suno, which generates quickly, and which I would say is better at the catchy, dance-band type. I can see Suno really

Speaker 2:

excel. Hard to understand, but really great music.

Speaker 3:

And then you have Udio, which is very high quality. Yeah, the vocals sound really realistic.

Speaker 5:

So which model is it that has gone more into the professional studios?

Speaker 3:

Udio. Udio, right. You have stems and you can separate them out.

Speaker 2:

Yeah, but should we go back to the question about diffusion models? Because I think that's an interesting topic, and perhaps people should try to understand what diffusion models are.

Speaker 3:

Do you want to give like a short intro to it? Yeah, sure. Basically, it's a different type of neural network. Rather than LLMs, where you chain things up by trying to predict the next token, here, if you're doing, say, images, you start with noise, and then, during inference, that noise gets more and more shaped into the actual end result, based on the conditioning that the neural network provides.

Speaker 2:

I think it's interesting because it has a number of steps. It can't do things in a single prediction step; it has to be split up into like 50 or 100 steps. And what you do is you add noise and learn to predict how to remove the noise, a surprisingly simple task. And that is actually, I would say, a way to reason in some sense. Right now it works for images and audio, but it could be for other things in the future as well. So there's some kind of multi-step reasoning happening here, which is interesting.
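A minimal sketch of that inference loop, with a placeholder network and a deliberately crude schedule rather than a real sampler such as DDPM or DDIM:

```python
import torch

def sample(denoiser, shape=(1, 3, 64, 64), steps=50):
    x = torch.randn(shape)                    # start from pure noise
    for t in reversed(range(steps)):
        predicted_noise = denoiser(x, t)      # the "surprisingly simple task"
        x = x - predicted_noise / steps       # peel a little noise off each step
    return x                                  # noise shaped into an image
```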

Speaker 2:

And then there was this other big one, of course: the FLUX model, FLUX.1, that Elon is now using on X together with Grok-2.

Speaker 2:

So they are using this model called FLUX, and they use diffusion models as well. It's been a while since I read it, but I think, in short, they're using these latent diffusion transformers. So what they're doing, and this is a pet peeve that Henrik and I have spoken about before, is latent reasoning. They're not working on each pixel, or on each audio sample in a 16 kHz kind of space. They first move into latent space, using an autoencoder to map the data into a much more compressed space, and in that compressed latent space they do the diffusion process. By doing that they can create pictures that are, I would say, state of the art right now with the FLUX.1 thing, and it works surprisingly well. The interesting part here is also that there are no guardrails on this.
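Sketching the latent-diffusion pipeline just described, with all components as placeholders; the point is only the shape of the computation, denoise in the small space, decode once at the end:

```python
import torch

def generate(decoder, denoiser, prompt_embedding, steps=50):
    # A typical latent for a 512x512 RGB image is 4x64x64: 48x fewer values,
    # so every denoising step is far cheaper than working in pixel space.
    z = torch.randn(1, 4, 64, 64)
    for t in reversed(range(steps)):
        z = z - denoiser(z, t, prompt_embedding) / steps   # diffuse in latent space
    return decoder(z)                                      # decode back to pixels
```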

Speaker 5:

Yeah, I was just waiting for this. We're going to talk about the Kamala Harris Trump pictures.

Speaker 2:

But, and this is something I'm very interested in myself, I think it's wrong to do the reasoning, the multi-step kind of reasoning we're now seeing LLMs take, in token space, right? It works well for text, because text has a rather high level of abstraction. It's not really sensory input, it's not audio directly, it's not pixel input; tokens are at a high abstraction level already. So reasoning in token space works for text, but for images it doesn't work at all, and for audio it doesn't either. So they have to compress first and move into latent space, and this is what Stability AI did for many years, and that's why they got really good at this. Flux is actually built by the founders of Stability.

Speaker 5:

Yes, that was the argument right.

Speaker 2:

So FLUX, they jumped ship. But even Sora from OpenAI and others are actually moving in and doing the generation process that way. Also GPT-4o, the model, the DALL-E 3

Speaker 3:

thing? No, not the DALL-E 3 thing, but the audio is a native modality. Audio and vision.

Speaker 4:

I didn't know that. So, can I, just so I understand: you have your input data, and then you do like a tokenization or something, an abstract tokenization. So then you have an X-dimensional space, and then you look at the geometry of that space, or you...

Speaker 1:

What is it?

Speaker 4:

I mean, and then you apply some sort of machine learning model. I mean, the shape of that space in itself, the geometry, the topology, that should actually matter.

Speaker 2:

Machine learning is really about compression in some ways. So if you take FLUX.1, it moves from these highly dimensional two-dimensional images into a much lower-dimensional two-dimensional representation. And that means it no longer has the syntax of green, red and blue; it's some kind of more semantic pixel in some way, and then it operates on this kind of latent representation. Could you explain why latent space is better for reasoning?

Speaker 1:

I think that's a good point.

Speaker 2:

Yeah, I mean, it's obviously so much more efficient. To try to do reasoning where you have to generate every pixel every time, for multiple steps, would be extremely inefficient and very costly. So you want to avoid all the syntax, remove the need to encode things, and just work on the semantics.

Speaker 2:

So by removing the need to encode things into pixels or into audio space, and just working in semantic space, where actually the semantics of text, images and audio are the same, it's much more efficient. But you do also have that in the traditional LLMs, in the tokenization; yet the reasoning is usually happening in token space, because you have to generate one token at a time. Yes, right, so it's a little bit like...

Speaker 4:

It's a little bit like doing PCA, but you have more complicated functions to collapse your dimensions.

Speaker 3:

Remember Word2Vec? Yes, I do, thank you.

Speaker 4:

I think you know what Yann LeCun would say.

Speaker 1:

Yeah, he doesn't like the autoregressive nature of LLMs, and I don't either. If you can move generating the next token out of token space and into latent space, I think that would be awesome. What I was looking for in your answer, and I may be wrong here, this is not my speciality, is that latent space is easier because there's some kind of continuity there. A neighboring point in the embedding space; it's a manifold, that's more easily...

Speaker 1:

Yes, it seems like a better place to do the reasoning in, because things that are close together in embedding space have similar meanings, not just similar token distance.
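A tiny numerical illustration of that continuity point; the three vectors are made up, standing in for learned embeddings:

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

king, queen, banana = (np.array(v) for v in
                       ([0.9, 0.8, 0.1], [0.85, 0.82, 0.15], [0.1, 0.2, 0.9]))
print(cosine(king, queen))   # ~1.0: neighbours on the manifold, similar meaning
print(cosine(king, banana))  # ~0.3: far apart in meaning
# Token IDs have no such structure: id 4211 says nothing about id 4212.
```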

Speaker 3:

Yeah, and I don't know if I would agree that the reasoning happens in token space.

Speaker 4:

I mean there's a deep neural net.

Speaker 3:

You have a lot of different layers.

Speaker 2:

Of course there is a part that happens in latent space for the LLMs of today, for sure. So there is reasoning, and a lot of stuff happening, in the latent space. But I think it could be significantly improved in efficiency if you don't have to generate a specific token every time. You should do the decoding from the latent semantic representation to tokens only when you're complete, when you're done with the whole answer.

Speaker 3:

So the loop is inside rather than going out, shouldn't?

Speaker 2:

that be obvious, more or less?

Speaker 1:

So we talked about this before, and the thing that hit home for me was when you said that the training is done in token space. I mean, the loss function is in the token space. That was what did it.

Speaker 2:

And this is exactly what Yann LeCun is speaking about in his JEPA models. The loss function is applied after you encode into this energy landscape, and then the loss is just on the energy errors. It's not in the sensory space.

Speaker 3:

That would sort of kill the real-time aspects if you have to think fast about it before you speak.

Speaker 2:

I think for some tests it's going to be faster.

Speaker 4:

And I think, I mean, this is a pet peeve that I have as well about real time. I think there's this expectation that everything should be real-time, and yes, when humans are interacting, but quite often it's enough to have a bit of latency if it really saves you loads of money and a lot of effort. Yeah, I like real time, but I don't think it's necessary in more than 1% of applications, maximum.

Speaker 5:

Let me try. I couldn't follow half of what you're talking about because I'm not a scientist here, but what I'm picking up, if I try to zoom out what Luka is bringing in here, is that in this quest to be smarter and more efficient, there are magical things happening in different parts of the whole game. We are learning more about diffusion models, and can we move into latent space, and this is one type of technique that will solve some parts. And then over here we are talking about the whole fundamentals of the LLM. So what I'm getting out of this conversation is: beware of the way the space is moving now. A long time ago we were spread out, working on machine learning in many different ways, and then, oh, everybody's now working on transformers, blah, blah, blah; we converged very much on one type of approach. But now, once again, in this neighborhood there are different things that together will bring a better story. Have I understood sort of what

Speaker 5:

you meant when you highlighted the diffusion part here. It's a little bit like

Speaker 3:

I mean, the big thing is that, say, in all of these cases there's a big deep neural network in the middle.

Speaker 2:

And generally, I think, at least for me, what's?

Speaker 3:

always been appealing with that is that it's not a special case, not something that you hand-engineer; the capability comes from the structure. And whether you have diffusion at the ends, or BERT-style masking, or whatever it is, those are different variations of interfaces to it. But the thing is the thing in the middle.

Speaker 5:

The thing in the middle now centers the community, but in order to go further in generalization, in precision...

Speaker 3:

We need to now work on the different things. Every modern model, like GPT-4 or Claude or whichever you choose, essentially throws everything at it, so you have a combination of everything possible in there. So it's not like Claude is purely a simple token-based transformer.

Speaker 5:

There are many things happening here.

Speaker 3:

Especially all the vision-capable models.

Speaker 5:

But I think this is a misconception, right? In the general media now it's transformers, transformers, transformers, and this whole nuance that we are bringing to the table here is missing.

Speaker 3:

I think it's quite important when we try to understand where this is going. And I guess it's also interesting in the discussion of closed source versus open source, whether we know what's going on.

Speaker 2:

Okay, so we said we were going to try to keep this episode short; we're at one and a half hours now, and perhaps we can try to come to some kind of closing discussion. Could we perhaps have a short discussion just about abuses of AI? You know, it's election time in the US, and we've seen so many fun stories.

Speaker 5:

The Kamala Harris and Donald Trump pictures, was that Grok? You know, Elon Musk having no limitations, right? There are videos as well, of course. I've seen them.

Speaker 2:

I can start a bit, and let's try to keep this discussion short, but still, it would be fun to hear what you think the future will end up with. Will there be more abuses of AI, or will it actually be controlled, by having AI doing the controlling in some way? Or what will the future be here? So one very interesting recent release was the Grok-2 model from xAI, launched on X, and the different thing there was that it's basically uncensored; you can ask it to generate whatever text, but also images and videos. So they use this FLUX

Speaker 2:

.1 latent diffusion model to generate really awesome-looking pictures. And what were the examples? There was one with, you know, Trump walking on the beach with Kamala Harris, kissing her, touching her pregnant belly, hanging out, and with a Trump-looking baby.

Speaker 5:

There were many versions of this craziness; people having fun and testing the model, of course.

Speaker 2:

What do you think about this, the Grok-2 or xAI approach of having generative AI more or less without any kind of guardrails? Good or bad, Salla? What do you think?

Speaker 4:

I was listening, so I need to quickly think now. No, but I think it's the drunk uncle, right? It's like your drunk uncle comes to the Christmas party as an incredibly racist, horrible person, and then you know that. And I hope that in three years' time we'll have come to that realization.

Speaker 4:

I mean, I've found this fantastic podcast. It has nothing to do with AI, but it's called The Rest is Politics, and they have these very well-thought-through discussions about the political situation in Britain. And then there's one for the US, and it's what I grew up with. They have a discussion about history and how things connect and everything. So I hope that we will be able to keep that part, because I think it's very important for us as humans to be able to have discussions where we don't always agree. And then I hope that we will be able to learn: when is it the drunk uncle, and when is it actually something that I should be listening to? And maybe I should be listening to the drunk uncle sometimes, because he might be right sometimes, but really understanding the probabilities.

Speaker 2:

So if you label a piece of content, like an image or video, saying this is parody, this is not fact, would that be okay then?

Speaker 4:

That would be so much better than what we have today. No, I think so. I mean, Ozzy Man, the Australian Instagram guy, he did an apology from Australia for this breakdancer, Raygun, and people were like, I'm unfollowing you, so he had to say, this is a parody. No, but I think it's important.

Speaker 5:

But this is so tricky, because some of the best parody, that's the whole thing. You know, we grew up with very ironic humor in Sweden, with Schilling and everything like that, and the whole idea is that the kids love it because we understand irony, we understand this is fun. And there's always someone that gets itched by it, and if you get itched by it, you clearly are equally dumb, right? But the tricky point here is that there will be no telling.

Speaker 1:

You know, what I liked about Grok-2, with the FLUX.1 images, is that it really puts the finger on the fact that this is now something you have to think about: anything you see can be false, it can be generated. So you will need to start questioning everything you see. And that's what you're saying, basically, that there's a drunk uncle, and now everybody knows that there is a drunk uncle out there putting out images, and you will need to learn how to question the sources. Is this really true? Could this really be Kamala Harris and Donald Trump? You don't take images as proof of anything happening anymore. That's good, because it's been happening for a long time, but it was a little bit harder to do before.

Speaker 5:

You're making the point I was sort of joking about, I think. In reality, it shifts our abstraction level of how to understand, and you know what?

Speaker 4:

You know that we can take nothing for granted. Yeah, so I have a friend, a really good friend, who lives in New York, and he's of course bombarded by all the different fake media news, and for some reason he trusts me. So he sent me a link and he was like, is it true that Alabama is reintroducing child labor? And I was like, that's a really good question. So I went to the official governmental webpage for all the legal decisions that have been taken in Alabama during the last five years. There's incredibly bad search; it's almost impossible to find anything, even just understanding the housing regulations. So what I'm hoping for is that this big news situation and all the generated information will mean that our governmental agencies, and the people who actually should be guarding the truth, have to invest in making their information easier to access and easier to find. I think that's really important for democracy in general. Yes, I think so.

Speaker 3:

In general, you won't assume, because there is an image, that it's true; you already don't. On the contrary, it sits in a neutral, undefined space unless you have proof that it is true. And there I think we'll see new ways of authenticating, of signing things. But the sooner we leave this false security the better. And who has that false security, seriously? Do you believe a picture that you see, like, ooh, there's a photo, so it must be true?

Speaker 4:

I think a lot of people do. I think so. I think so too.

Speaker 5:

I mean, I think not here. Come on, not if you're an AI expert in the AI space.

Speaker 3:

But I think it depends on the context and obviously if Fox News shows an image, you will believe that it's true because it's a news agency.

Speaker 5:

Fox News, right. I picked the example on purpose. But I really like that this becomes super exposed. I actually saw a documentary about a place, I think it was Ukraine or somewhere, where there were extreme amounts of propaganda, I can't remember which banana republic it was, and the core strategy was: the way we're going to educate the masses is to expose them to how it works. We're going to expose them to something that is provocative, that they get mad about, to make them build thick skins, so to speak, so that when everything else gets bombarded at them, they can start judging it. And I think there is a spectrum now where, of course, the initiated have already passed this test. But if you look at the bell-shaped curve, where are we? Have we even reached the majority?

Speaker 3:

But do you need to fake images for that? Look at the Palestine-Israel conflict now. One of the typical things there was on social media, and of course I read this on the BBC so it might be propaganda, but they had these examples where the same images, like of a kid, were used on both sides with a different story. So the photo itself wasn't really the relevant part.

Speaker 5:

No, I agree with that. What we're talking about here is fake news, and not necessarily fake images, but it becomes very vivid when you see the Grok-2 examples.

Speaker 2:

I think actually I'm positive on this. I think humans will adapt quickly and start to have the default thinking: is this really true or not? So the AI generation will move the default opinion in some way, and I think that's a good thing, actually. But then I get another question that I was thinking about: should we do it the Google way or the xAI way? Should it be uncensored, or should it have a lot of guardrails and biases built in?

Speaker 3:

Before we get to that, I want to return the favor, look at your poker face, and ask this.

Speaker 2:

Good luck with that.

Speaker 3:

Are deepfake images? Are they a threat to Swedish democracy?

Speaker 2:

Let's move on to the next question. That's a very good question.

Speaker 5:

Thank you for that.

Speaker 2:

But I'm sure that we are all aiming for a positive development of our society and democracy, and we should find the best way to do that. And to figure out how to get to a proper democracy and society, we need to figure out how we build AI models. Should they have built-in biases and guardrails, or should they be, as Elon Musk calls it, maximally truth-seeking and uncensored?

Speaker 1:

But how is this truth-seeking?

Speaker 2:

Okay, let's go in the other direction. You know the Google.

Speaker 4:

Gemini, you know, the horrible example about, you know...

Speaker 2:

Show me a picture of the founding fathers of the US, and they had, like, black women being portrayed. And if you enforce that kind of strong bias into a model, so that it becomes fake and certainly not true, that's not good either. So too much bias, too much forced diversity or whatever bias you put in there, or too many guardrails, actually makes it censor stuff that should be put out. That's not good, for sure. But then the question is: is completely uncensored also bad? I mean, if it were to produce child pornography or something like that, that can't be good either. So it's bad in both extremes, I would argue.

Speaker 5:

I think this is a very important question. Where do we stand?

Speaker 2:

Yeah, Jesper, where do you place yourself?

Speaker 1:

I think I would probably place myself in the non-censored category. Yeah.

Speaker 4:

More towards that.

Speaker 1:

Anything that tries to bias in any direction will always have artifacts. I don't think that's a good idea.

Speaker 5:

I kind of second that.

Speaker 4:

But I mean the inherent bias in the model comes from your choice of data.

Speaker 1:

There will always be some, but putting in some artificial bias seems like a bad idea.

Speaker 4:

But it's also, I mean, depending on what part of the internet you scraped, you're definitely introducing a lot of bias. And I think that's one of the discussions that goes on sometimes about large language models: whether the culture of a country is inherently in the model. I mean, we know that there are examples of countries with very different cultures than the Nordic one, where they actually require the models to be trained on their culture, and I would not like to go to that extreme either. So I think a bit of both. The foundation models are probably great to build using all the information, and then really use guardrails fit for purpose.

Speaker 5:

But you're really opening up a really good, nuanced discussion here, because we come all the way down to languages. English is a commercial language; it has depth around those areas, and maybe the Swedish language stems from different values. And we had a guest here, what's her name, at RISE, Johanna, and she had this obnoxious opinion that of course we want biased models, the values, in effect, of this value discussion and all that. And I think it's really tricky here. What's your thinking?

Speaker 2:

Luka? Guardrails or uncensored?

Speaker 3:

It's very hard. First, I don't think there's such a thing as a neutral model, because it depends on the data that you train on, and especially when you start gathering all the data that you can, all historical data, it's not going to reflect a really great value system for the AI to follow. So I do think that a minimum of curation of the data is needed. The second thing, and I know this for a fact, is that companies building these models are deeply uncomfortable with having to take the responsibility of essentially enforcing values, and think there should be a broader discussion about it. But it's not like... I know OpenAI, for instance, has, among other things, said that they do see individual cultural adaptations of the models as possible. So then, I don't know, you have a Taliban model, and then, I mean, where do you draw the line?

Speaker 5:

I don't think there's any good answer there. But do we need a Swedish language model in order to have the core, fundamental Swedish values and feelings? I mean, we had this great discussion with Erik Hugo. In Greek you have so many different words for blue, right? Because it's part of their culture.

Speaker 3:

Right. To understand all the nuances of blue, shove it in the training data of the biggest model.

Speaker 5:

But maybe that's the way to go, right? So instead of having a Swedish model, maybe we need to have a fine-tuned one, I don't know. Do we need fine-tuning for Swedish?

Speaker 4:

Yes, yeah, I think we would need fine-tuning for every specific purpose we want to use it for. That's a good answer.

Speaker 4:

Yeah, but I mean, then it's the purpose that you're using it for that you need to train the model for. I just remember when Google Assistant was released for the first time, and we got some sort of pre-test at SEB, and I was invited to be in the test group, and I was like, yeah, sure, that sounds fun. And then I'm standing outside the room where we were going to test it, and the other Finland-Swede comes in, and we're looking at each other like, ah, it did not understand what I was saying. So in that sense, if we're going to use it in governmental agencies, if we're going to make our public-sector systems more accessible, then yes, we need to really invest in seeing to it that they can understand people with speech impediments and people with different dialects. And then I think it's super important to spend the time to find out.

Speaker 3:

Won't that be included in the more general models?

Speaker 4:

So far, no. If you look at the future, maybe.

Speaker 3:

GPT-3.5 was really bad at Swedish. GPT-4 was better, but still, if you asked it to write a good bedtime story for the kids, it wouldn't really nail the language. GPT-4o, a bit better. Claude, a significant improvement.

Speaker 4:

So yeah, they're getting better, definitely.

Speaker 2:

Cool. I'm not sure what the answer here will be, but some kind of lagom, as they say in Swedish, could perhaps be a good one. Anyway, it will be an interesting future, as we've said many times before. I'm thinking about how to end it. Should we do it the same way as usual, just asking about the future and your thinking there, in a short way?

Speaker 5:

Okay, let me have a go at it, based on this discussion and our general feeling, a summary or recap of the year so far. How have we changed our ideas, or how have we fine-tuned our view of the future, of the vision? Are we going darker on the consequences of AI and AGI, or are we more positive, or has something this year sort of...

Speaker 2:

I think that's a good way to phrase it. I mean, if we think about the scale of a utopian versus dystopian future, when AI will potentially be AGI or superintelligent, have you, in the last year or over the summer, moved more towards a utopian, positive future, or more towards a dystopian, negative kind of future?

Speaker 5:

Is it piling up on the plus side or the minus side?

Speaker 3:

Look, if you're looking at AGI, then the utopian and dystopian are a mirror image, like the potential for both. But I would say, on the, ooh, we're going to get run over by deepfakes and fake articles and that's going to destroy democracy, I'm more on the positive side now.

Speaker 5:

It's already here and we're already managing it.

Speaker 3:

Yeah, are we?

Speaker 5:

No.

Speaker 3:

I'm just looking at the reference points of my kids, who think that DALL-E and generating images automatically, or music, is just perfectly normal. There's nothing strange about it. We adapt fairly quickly to technology, and this, yeah, it shifts things a bit, but we'll adapt to that.

Speaker 5:

But to nuance it: we haven't solved it at all, but we are adapting to it day by day. That's what I was trying to say.

Speaker 3:

I would go more on the utopian side. Still, on the large existential bits, unchanged.

Speaker 5:

But is it unchanged, the way you view it? I believe I asked you this question way back. But it hasn't really shifted? No, no. What about you,

Speaker 4:

Salla? I think, when it comes to the expectations on LLMs in general, I feel like it's starting to sober up. People are starting to see some applications that can be interesting, but it's not as insanely, AGI will be here next year, which I think is good, because let's be balanced humans, yeah.

Speaker 2:

And yeah, whether AGI will be here in 3, 10 or 30 years, have you moved more towards a positive or a negative view of that future?

Speaker 4:

I think I'm still equally skeptical about why we would do that, because I don't really see the value of it. As a business person, I don't think we, at least in industry, do things that don't create business value. We might try for a couple of years, and then it gets shut down for budget reasons.

Speaker 2:

I wish I could continue speaking about this. We will continue after the camera shuts off.

Speaker 1:

Yes, I think, to be frank, the reason that not much has changed is because the field has not evolved. The models are still the same, so there's not so much that has changed. Maybe the feeling now is that since things are moving more slowly, it could take longer; that's sort of the general feeling, and we have some other thinking that might indicate that it's not going slow. But I think the main problems are still unsolved.

Speaker 1:

Fake news, fake images, that's a problem, but we've lived with that for some time already, and I think we will adapt to it. I think the main problem is, if we really get to AGI, what will we do as a society? And I see no thinking around how we will solve that. We were talking about UBI before, universal basic income. What happens when the AI can do 90% of all the jobs? Some will be quick to adapt, will find their feet quickly and know what to do. But for the people that don't go into AI, don't learn these things and aren't on the bleeding edge of technology, I don't see what we will do as a society to solve those things, and that is my... What do you

Speaker 2:

do on a weekend? Yes, what do you do on a weekend? Can't you simply do more of that?

Speaker 1:

Yes, definitely. The thing is, I don't see any solutions; I don't see anybody working on solutions in society. I'm talking about people getting unemployed and finding no value in life. If we can solve that through basic income... I don't think that's the solution; it could be part of it in some form, but I hear no thinking around it. That is my biggest fear.

Speaker 2:

I'm biting my tongue here.

Speaker 1:

Generally I'm positive, but that's just because I'm positive about technology and the future. But I see great challenges that I see no solutions to.

Speaker 5:

Can I, as a way to give my answer to this: I'm trying to figure out if there is a new burning, pressing topic, because sometimes we get into this philosophical AGI discussion, and it almost feels like it's blinding us to some of the real stuff we should be talking about right now. So I've been trying to answer that question: is there anything that has moved up on the agenda that I think is way more important, that we should be discussing and arguing or debating about, or trying to find solutions around, based on this year? And I think that's why I've been circling back to this whole agentic thing. You should have a t-shirt, yeah.

Speaker 5:

I will have an agentic Henrik t-shirt. No, I will have a t-shirt called agentic reform, because I think the way we have structured the public sector, the way we have structured enterprise, has not really dealt with agency and how we understand ownership and mandates and accountability and autonomy.

Speaker 2:

We simply use the same way of working as we do for humans, but for AI systems as well.

Speaker 5:

Exactly. But there needs to be an alignment here around agency, between the agentic AI systems and the teams who have agency, and I don't think the way we've organized division of labor fits with this agentic view anymore. We are moving into a world where we have systems making decisions, so automation at one layer has moved up to augmentation and a higher abstraction layer. We are automating stuff and we can have decision-making in our systems. But who is making the decision, and how we have distributed decision-making in organizations, is not really agentic; it's division of labor.

Speaker 4:

I think that's very much depending on the organization and the culture in the organization.

Speaker 5:

Not really.

Speaker 2:

It is really deeply rooted. And I would like to cite Yann LeCun one more time. He basically says: we have been working with intelligent, agentic people for thousands of years and we have developed ways to do that, so why should those rules and ways of working not work for intelligent agents, or agentic agents, as well? I think that's an interesting question, so I'm hoping we're not developing special rules for AI too much. It's not about special rules for AI, it's about cohesion.

Speaker 3:

Let me close this on a much darker note. One of the reflections I made this year, again with the whole Israel-Palestine thing, where I was frankly embarrassed to be human, was: okay, we're talking about AI ethics here, and we humans haven't got the basics in place. We're killing each other like crazy. Are we really ready for this? How can we possibly teach AI what good ethics is when we don't

Speaker 5:

know? How can we teach AI systems about decision-making when I don't think we are doing it so well ourselves?

Speaker 2:

That's an awesome way to end the podcast.

Speaker 1:

I think it's a very profound way at least.

Speaker 2:

Thank you very much, Luka, for that great ending and the rest of the awesome discussions, as usual. Salla, thank you very much, as usual, great to have you here. And Jesper, awesome to have you, and I'm really looking forward to the after-work now to continue the discussions. Thank you as well, Henrik, great to see you, and Goran.

Speaker 5:

And I concur.

Speaker 3:

Thank you so much for having us.

Speaker 5:

Thank you.
