AIAW Podcast

E116 - RAG models to autonomous AI agents - Jesper Fredriksson

February 02, 2024 Hyperight Season 8 Episode 2

Tune in to Episode 116 of the AIAW Podcast for an insightful exploration into the world of AI with Jesper Fredriksson, AI Engineer at Volvo Cars. This episode delves into the complexities of RAG (Retrieval Augmented Generation) models and their transformation into autonomous AI agents, spotlighting their impact on various industries. Highlights include Jesper's role at Volvo, the application of RAG models in the automotive industry, the evolution of language models, and a thought-provoking discussion on GPT-5 and the implications of Artificial General Intelligence (AGI). This episode is a treasure trove for anyone keen on the latest in AI technology.

Follow us on youtube: https://www.youtube.com/@aiawpodcast

Henrik Göthberg:

Your ChatGPT experience with your father, what was that all about?

Jesper Fredriksson:

Super, super funny. It was the best recent experience I've had with ChatGPT, maybe the best ever. So my dad is 76 years old and he's going blind because his retina is detaching. It's kind of a condition that can happen. By now he's almost completely blind, and he doesn't have a phone because it's hard for him to operate one. But I tried to get him to use ChatGPT in the voice mode, which was mind-blowing in a way. So let's see. I was going to answer a call from him and I accidentally answered it on the computer.

Anders Arpteg:

But you can still make a call on a cell phone, or?

Jesper Fredriksson:

He doesn't call from a cell phone. He has an old, old-style phone. I mean, on the back end it's sort of a mobile phone, but it looks like the old phone with a cord and stuff. A Bakelite phone, you just pick up the receiver.

Jesper Fredriksson:

Exactly, it's a Bakelite phone. And I picked it up on my computer instead of on the phone, because it was closest. I just answered it, which was a good thing, because then I could use ChatGPT on the computer. And I realized that he wants to talk to the AI, because he's obviously heard about ChatGPT, and now we could try it. So I said: do you have any questions for the AI? And he was like: yes, of course I do. And then he asked: what would happen if I was in the middle of the earth, would I be weightless? I was like, that's a pretty good question to ask.

Anders Arpteg:

So what was the answer then? OK, please continue.

Jesper Fredriksson:

And he doesn't speak English, but ChatGPT is good at Swedish as well. So I talked to the phone and asked the question, and it responded back in Swedish, saying that yes, you would theoretically be weightless if you could stand in the middle of the earth, but it's very hot, et cetera.

Anders Arpteg:

You would be squashed as well, probably.

Jesper Fredriksson:

It didn't say that, but yeah, it gave a pretty good answer.

Anders Arpteg:

So your father, who is going blind, could suddenly have a conversation through ChatGPT's voice mode in some way. Yeah, he can't really read or type, but he can still speak properly. And then you connected him through the computer to it.

Jesper Fredriksson:

Yes, yeah. So now, whenever I talk to him, it's always like: can you get me to talk to it again?

Henrik Göthberg:

And then that of course evolves into the next project now: how can I help my father, what should he have? So what are you thinking?

Jesper Fredriksson:

Yeah. Actually, I think the first time I got the idea was with the launch of the Rabbit R1 that we mentioned previously.

Anders Arpteg:

The new device that is, like, just an AI phone without the phone features? Exactly.

Henrik Göthberg:

We joked about it. You called it the AI walkie-talkie or something like that. I mean, it's a really slimmed-down feature set.

Jesper Fredriksson:

Yeah it's pretty neat.

Henrik Göthberg:

Like two hundred dollars.

Jesper Fredriksson:

Yeah, two hundred dollars to preorder. The design is made by the Swedish company Teenage Engineering. They're super famous in the world of synthesizers and stuff like that, so it's an interesting collaboration.

Henrik Göthberg:

So you were starting to scan: OK, what is the simplest approach to figure this out? And then this came out. This was the start.

Jesper Fredriksson:

This was the start of the whole thing. Then I realized, OK, I missed the first batch and I'm going to have to wait for that phone, but maybe I can just buy him a cheap Android phone and customize it so he can start the chat interface. I think that's one of the problems: if you start ChatGPT, you also need to press a small button at the lower end of the screen, which would be difficult for him.

Henrik Göthberg:

So I think maybe it's possible to generate a macro that would open that up at the press of a button, and with some fiddling, then you can do it.

Jesper Fredriksson:

Some fiddling yeah, but it would be worth it, I think, just because he's so happy when he's talking to it.

Henrik Göthberg:

I think this is a nice little project right. Yeah, it's super nice.

Anders Arpteg:

It must be amazing for a person that is otherwise, you know, going blind. And having I mean speaking to people, of course, is the best thing, but if you don't have that, just being able to speak to someone that you can ask anything and get so much knowledge from would be an awesome thing.

Jesper Fredriksson:

And somebody that doesn't get tired of you.

Henrik Göthberg:

But how far away are we from doing that, via Alexa or via one of the other ones? Like, when do you get a good assistant?

Jesper Fredriksson:

Who knows? I don't know if anybody knows when they will catch up.

Henrik Göthberg:

It's super weird. Because you could imagine that you could, you know, thinking out of the box.

Anders Arpteg:

You have Alexa, and you could have a certain chat app where you press it and it goes into ChatGPT mode, so you literally can take away all of the... So instead, I guess now, if you saw the Samsung Galaxy S24 release, they have Gemini Nano, the latest kind of Google model that is actually built for embedded devices and phones. So they can have live translation now in the Galaxy S24. And I guess with that they could put it in the Google Home assistant as well.

Henrik Göthberg:

Yeah, I mean, they cannot be far away now. They're already working on it, of course.

Jesper Fredriksson:

I wonder if there's another reason. I mean, why haven't they done it already? It should be so simple, as you're saying. If it's already in the Samsung phone, why isn't it in my Alexa?

Anders Arpteg:

It's weird. But Alexa, that's not Google.

Jesper Fredriksson:

No, that's the Google one.

Anders Arpteg:

Do you have an Alexa at home or?

Jesper Fredriksson:

I have one Alexa and one Google Home.

Henrik Göthberg:

And which one is better, in your opinion? In terms of understanding you, giving the expected answers that you really want?

Jesper Fredriksson:

Excuse me. I'm using Alexa more, but that's just because of its placement. I think Google Home has always been better, because I had it in the kids' room and it's easier for them to talk to, because they can speak Swedish. I don't think Alexa speaks Swedish.

Henrik Göthberg:

I remember this vividly, you know, driving cars and doing small tasks with Apple Siri versus with the Google one. To me, I completely stopped using Siri quite fast, because it never got the address right, it never got the... So I think for me the Google incarnation has been superior in terms of understanding.

Anders Arpteg:

That must be a perfect product for your dad as well, to have just a good home assistant and just speak to it all the time?

Jesper Fredriksson:

Yeah for sure.

Anders Arpteg:

Then you don't need the rabbit or the Android phone.

Jesper Fredriksson:

Yeah, and I forgot to mention maybe the best part of that interaction that I had with my dad: I realized that ChatGPT can also read photos. So I tried it when he was still on the line. I took a picture of my window, where I still had my Christmas decorations and this light, whatever it's called. I know what you mean. Yeah, so I took a picture and asked Google, sorry, asked ChatGPT to describe what was in the picture.

Jesper Fredriksson:

And it said: there's a black star, that was the light I was talking about. It could see the building on the other side of the street, it could see the potted plants in the window, and it said it's probably a living room, which it was. That's amazing.

Anders Arpteg:

That must be awesome for your dad, to just ask it to explain any image.

Jesper Fredriksson:

Yes, I mean, he goes out for a walk sometimes and that's a real journey for him, because it's not certain that he'll find his way home again. It's super hard for him. So just for him to be able to take a picture and ask, is there a motorway in front of me? This is really important.

Henrik Göthberg:

But it's an interesting anecdote, because it opens up how little we have scratched the surface of the applications and how we can tweak it for different situations. So I think voice is quite a nice addition, right? Have you started using it?

Anders Arpteg:

Not that much now.

Jesper Fredriksson:

I'm trying to use it now. I think it's interesting: it becomes more of a discussion, in that way it feels more natural. It's always difficult to get it to not start talking again, because once you make a little bit of a pause, it's hard to get it not to speak, even when you're not done with the prompt.

Anders Arpteg:

I usually ask questions with weird technical words in them, and that's really hard to do through voice. Exactly, yeah. So, interesting, and I wish you the best with your father, and hope that Google actually does their job and puts Gemini into their home devices.

Henrik Göthberg:

I'm a little bit curious about the Rabbit, whether it's a gimmick, gimmicky stuff, or whether it's useful.

Jesper Fredriksson:

I think it's probably useful. It seems legit from what I can say.

Henrik Göthberg:

The bottom line is that there is this Japanese term, chindōgu. Have you heard about that expression? You can Google chindōgu; it's the Japanese word for useless inventions.

Henrik Göthberg:

And you can find so many fun pictures of the most useless things that people have made. I saw chopsticks with a fan on them, to cool the food, and it's like, well, it's a good idea, but it's kind of stupid, right? Et cetera, et cetera. So chindōgu is the cool question, right? And now time will tell, in a way: is this an invention that has a place? Potentially it has, I think.

Anders Arpteg:

Well, with that, we'd love to welcome you here, Jesper Fredriksson. You're a dear friend of mine and we've known each other for a long time. I'm trying to think: when we met, it was at RecSys, the recommender systems conference, in Vienna. Maybe 2015, maybe 2016, somewhere around 15, 16 at some point.

Anders Arpteg:

Yeah, that was an awesome time, I remember. Yeah, and the following conferences in the coming years as well. I've enjoyed, at least, a lot of very deep discussions with you and your friends, and it's a pleasure to actually have you here, and hopefully we can have some really deep discussions now as well, here in this podcast. Pleasure to be here. Awesome. You're an AI engineer at Volvo, and you've had some other positions previously that are very interesting, but let me start simply by asking: who is Jesper Fredriksson?

Jesper Fredriksson:

So, as you said, today I'm an AI engineer. That's what I call myself, at least.

Anders Arpteg:

What do you think about these roles? I mean, there are so many different roles now. You can be a data engineer, you can be a frontend engineer, you can be an ML engineer. What's the difference between an ML engineer and an AI engineer? Is there one?

Jesper Fredriksson:

I think the way I think about it is we're moving into generative AI and then it becomes more of a focus on the coding skills, I think, to understand the technology behind it. I think there's been a long progression where the science part of this becomes more and more commoditized, whereas the engineering becomes more and more important.

Anders Arpteg:

Because you've been a data scientist, of course, previously as well.

Jesper Fredriksson:

I mean, formally my title is still data scientist, because I just moved over from my previous company, a subsidiary that Volvo brought in. But AI engineer describes better what I'm actually doing, which is to make really useful products out of AI. In many cases it's just using an API. It could be using a local model as well, it doesn't really matter, but using engineering to solve interesting problems.

Henrik Göthberg:

But this is a deep question here because I mean, like, what is more important, the science part or the engineering part? And we can put that on the list because I think they are in some ways equally important, but they are very different actually. And the next question is do we need more scientists or more engineers? And that is maybe the bottom line question and I have my view on that.

Anders Arpteg:

I call myself also primarily an engineer before a scientist. I also have been a scientist as well, but let's go down the rabbit hole just like this.

Henrik Göthberg:

That's a good one isn't it?

Anders Arpteg:

We can easily get stuck here.

Jesper Fredriksson:

I've heard this phrase before.

Anders Arpteg:

But just very quickly at least, if I give my view a bit and then please let me know what you think about it.

Anders Arpteg:

For me, science is for the purpose of building knowledge, and engineering is for the purpose of building a product or finding value. You need engineering to build knowledge, and you also need science to build products and value, so they overlap. But there is a clear distinction, I would say, in the purpose of what you are doing. If you're in the data scientist role, the purpose is really to do a proof of value, to build the knowledge that there is some kind of value you can find, but you don't produce the value. If you are an engineer, the purpose is not necessarily to understand if you can do something, it's really how you should do something.

Jesper Fredriksson:

I totally agree. I think the movement I've been seeing is that this thing about having the original idea, which is sort of the science part of it, the proving of the concept, that's always going to be important. But that's only so far along the way; it's only the start of the journey. And if you look at what companies are doing, there's room for maybe a few data scientists in a company, but there's room for a lot of engineers. That's so much more needed, because making the proof of concept, that's easy, you can do it in a notebook. But then you have to do all the hard work of making it into production, which is also very interesting.

Henrik Göthberg:

This is a huge dilemma, because I think we set ourselves up for a problem in 2012, 13, 14, when this was evolving, because we hired people for the science part when we really needed a shitload of engineers. And I think we know for a fact how many notebooks are out there in production, right, which is a testament that the wrong person got the engineering job. I think this is a major shift that started maybe 2022, where we are talking more and more product. I've been involved in large enterprises, and we have AI initiatives and AI projects, but when I listen to you and how the real hyperscalers are talking, it's product teams. And there is a huge difference in understanding of funding, governance, steering and setup, and we go all the way back to the technical debt of machine learning.

Jesper Fredriksson:

I would say the movement started earlier, but yes, you're definitely right. I think when I started at my previous company, Volvo Car Mobility, which is the company behind the service Volvo On Demand, everybody in the data team was a data scientist, and we quickly realized that we're not doing data science work, we're doing everything but data science, basically, because that's what you do in a startup, which it was.

Anders Arpteg:

I think also Elon Musk has really shown the status of engineering.

Goran Cvetanovski:

Before you know.

Anders Arpteg:

I remember when I was at Spotify, we called everyone data scientists, but in reality they were building like a QlikView dashboard, and that is valuable as well, but don't use the wrong term for it. There is still value, and people value different things and have different passions and interests and skills. Some people should be more science people, and other people are more interested in engineering, which I think is super fun, even more fun. They should work with that. And for a company, I think it's important also, as you say, to understand the proper balance here, because you need both. What is the balance?

Jesper Fredriksson:

It's not 50-50. It's a question of expectations also. If you come into a company and think you're going to do data science, and then you do QlikView dashboards, then you're not going to be happy.

Henrik Göthberg:

And if you hire data scientists and they do something else, then maybe you're not happy either. But to some degree we are ourselves, as an industry, at fault, because we are trying to find words and figure this out, and some words stick, right, and that takes the majority of the population, who are not really into the details, down the wrong path.

Anders Arpteg:

But I like your title, that's basically what I'm trying to say. I remember speaking to Jan Bosch, a professor in Gothenburg at Chalmers; we had him on the show as well. He spoke about AI engineering.

Henrik Göthberg:

This is from a couple of years ago. I would concur that the mature companies that actually work hardcore, they understood the product angle and the engineering angle way earlier. But the majority is only getting there now. This is my observation.

Anders Arpteg:

Well, okay, we've established you have the coolest title, I think, in AI engineering.

Henrik Göthberg:

Yes, so data scientist is not the sexiest job; AI engineer is the sexiest job on Earth.

Anders Arpteg:

Cool, perhaps you can just elaborate a bit more. Before you came to Volvo what did you do then? What did you work with?

Jesper Fredriksson:

Right, so I can start from the beginning. I'm an engineer from the start, and then, sort of by chance, I ended up in brain imaging. I was in a project to generate databases about brain imaging and then do analysis on those as a PhD project. I did my half-time dissertation and then I ran out of time. Then I moved into doing the same thing, more or less, for a company. I moved from working with the healthy human brain, understanding its function, to looking at aging in Icelandic people, of all things. They have an interesting population for genomics and stuff, so there's a lot of gene data on people.

Anders Arpteg:

Are we going to be able to reverse aging?

Jesper Fredriksson:

That is a very good question. Now we're going into rabbit hole. Let's skip it, ignore it.

Jesper Fredriksson:

I think so, but let's skip it. Okay, so that was the start, moving into aging and brain imaging. I was working for a consultancy company that did all sorts of things; this was one of them. We moved into doing quality assurance of the scanners that produced those images, which was also fascinating, even if quality assurance maybe sounds boring, because you have to really understand the dynamics that create the images. It feels like I spent a year or two in Fourier space doing that, which was super fun.

Jesper Fredriksson:

Eventually it felt like I had hit a dead end on this brain imaging path, which was maybe premature. Maybe we'll get back to that later. I moved into data science instead, and now we're back to our discussion about data science and engineering. I worked at different companies in Stockholm with data in some respect. Then I ended up in this startup owned by Volvo, doing this car sharing product. After that, in October of last year, we moved into the mother company as a strategic shift from the Volvo organization. That's where I'm at now, and I'm currently working half and half: 50% as a data scientist at the car sharing service and 50% with generative AI at Volvo Cars.

Henrik Göthberg:

So you're actually reporting to two different streams, in a sense.

Jesper Fredriksson:

It's not ideal. I'm trying to make that work.

Henrik Göthberg:

One is around the algorithms and optimization of the car sharing service. When it comes to generative AI topics, are you looking at use cases across Volvo Cars more broadly?

Jesper Fredriksson:

Exactly. Volvo Cars is, of course, a huge company, and I don't even understand how huge it is yet; I started in October. I'm working in a small part of Volvo, maybe 1,500 people, which is called something like Commercial Digital. That's the sphere that I'm working in now. That's the part of Volvo that's in Stockholm.

Henrik Göthberg:

Volvo made a strategic choice to set up this Commercial Digital setup a couple of years back, I guess, and M was part of that.

Jesper Fredriksson:

M was actually a precursor. The company Volvo Car Mobility started a little bit earlier, and then Volvo Cars followed and started a tech hub in the same offices that we had.

Henrik Göthberg:

That's the story.

Anders Arpteg:

Perhaps we should try to move into the main theme of the discussion. We were planning to speak a bit about RAG models and then perhaps moving also into autonomous agents.

Henrik Göthberg:

From RAGs to agents. How did we frame it? It sounds like from rags to riches. So rags to riches becomes RAGs to agents, in the space and time of AI. Perhaps we can start a bit.

Anders Arpteg:

Moving into the RAG part, what is really retrieval augmented generation? Perhaps you can put it in the context of Volvo as well, if you have some concrete example, and start to explain a bit. What can you do with RAG at Volvo?

Jesper Fredriksson:

I want to start from chat. Everybody knows what ChatGPT is: you ask a question to the AI and it answers, in the best of worlds. There are some problems with that. If you want to work with company data, if you have some proprietary data that you want to ask about, then ChatGPT doesn't know about that data. Also, it may produce hallucinations: it may say things that are not true. If it doesn't know, it may make things up. The solution to that, or rather a way to alleviate those problems at least, is retrieval augmented generation, which really took shape last year, I would say.

Henrik Göthberg:

It became a hyped word, one of the words of 2023, RAG. I don't think we talked about RAG in 2022. We couldn't have. Not in 2022, but in 2023.

Goran Cvetanovski:

Relax and boom.

Jesper Fredriksson:

RAG. The solution is that we try to bring in our proprietary data, or some kind of data we have, into the prompt. That can both ground the model and give it knowledge it doesn't have.

Anders Arpteg:

Just to elaborate on what you said before. One problem is hallucination: the model makes up stuff unless it actually has some related data in its parameters, the billions and trillions of parameters that ChatGPT has. But I guess another problem is it can't really find new data. So if you have your own data and you want ChatGPT to be able to reason about it or find it, there is no way to do that unless you use something like RAG.

Jesper Fredriksson:

Yeah, and it's basically prompting. It's basically taking part of what you already know and bringing it into the prompt, to say: this is what I know about this problem, can you help me solve it?
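For readers who want to see the idea in code, here is a minimal sketch of RAG as "just prompting": retrieved documents are pasted into the prompt so the model can ground its answer in them. The function name, prompt wording and example documents are all hypothetical illustrations, not the system discussed in the episode.

```python
def build_rag_prompt(question: str, retrieved_docs: list[str]) -> str:
    """Assemble a grounded prompt from retrieved context plus the question."""
    # Number each snippet so the model can refer to its sources.
    context = "\n\n".join(
        f"[Document {i + 1}]\n{doc}" for i, doc in enumerate(retrieved_docs)
    )
    return (
        "Answer the question using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

docs = ["Volvo On Demand is a car sharing service operated by Volvo Car Mobility."]
prompt = build_rag_prompt("What is Volvo On Demand?", docs)
```

The resulting string would then be sent as the user message to a chat model; nothing about the model itself changes, which is why Jesper calls it a hack.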

Anders Arpteg:

I would say even it's like a hack, Since you can't really change the parameters of the model. You just put some data you wanted to have access to into the prompt instead.

Jesper Fredriksson:

It's a very pragmatic solution and of course, the alternative is to do something like fine-tuning, where you would continue the training of the model and use your data to make the model actually understand what you're doing. But this is sort of the hacky way to do it.

Henrik Göthberg:

Yeah, good. And elaborate a little bit: what have been the limitations? We know and we have talked about tokens and all that kind of stuff. So in the space of thinking about doing these kinds of hacks, the RAG hack, what have been the constraints on that approach that you needed to think about from the beginning? How to get around tokens, the size of the context window? Yeah, exactly.

Anders Arpteg:

So 32K, right, in the biggest one.

Jesper Fredriksson:

So if we're talking about the context window, I don't remember what it started with. I think maybe 4K from the beginning, and then GPT-4 came out with a version that had 32K, but now we're far ahead of that. Let's take a look: is it 100,000 tokens in the newest GPT-4?

Anders Arpteg:

I don't know actually, is it really that much? We can google that.

Henrik Göthberg:

I think I remember a number like 128 or something.

Jesper Fredriksson:

Yeah, I think it's 128k.

Anders Arpteg:

I think Claude has that, but I'm not sure if.

Jesper Fredriksson:

Yeah, the newest GPT-4 has that, the GPT-4 Turbo.

Henrik Göthberg:

This is the Turbo. I mean, this is part of the incremental opportunity, because I think the big deal is like, wow, if we have 128K of context, which we know is quite large, you can actually...

Jesper Fredriksson:

It's super large.

Henrik Göthberg:

You can take a lot of data now.

Jesper Fredriksson:

It is super large, but there are a lot of buts to that, because you can't effectively use all of those 128K.

Anders Arpteg:

So the Turbo goes to 128k, yeah.

Jesper Fredriksson:

The problem is you can't use all of those, because then you will end up with problems. There's this famous paper called, what is it called, Lost in the Middle, something like that. It's better to put things at the beginning or at the end of the prompt, because it somehow forgets the middle. A little bit like people as well.

Henrik Göthberg:

Like people right how you do a keynote speech. It's funny, it's really interesting isn't it?

Jesper Fredriksson:

Yeah, so anthropomorphizing already now, but it is inevitable.

Anders Arpteg:

It's quite a lot of tokens, or words, that you can still put in the prompt, so you can use RAG to put in quite a lot of words. And is that...?

Henrik Göthberg:

Is that the primary, one key constraint, when you're thinking about RAGs? Can you circumvent this constraint?

Jesper Fredriksson:

It definitely is a constraint, because you can't put all your data in there. If you have a software project, it can easily go beyond that. And even then, I mean, you don't have the full 128K to work with. Let's say I wouldn't use more than 40,000 or 50,000 at max.

Henrik Göthberg:

And this is learned by experience, sort of, where it starts deteriorating?

Jesper Fredriksson:

Yeah, I'm still finding out where is the optimum.

Goran Cvetanovski:

Where are the limits?

Jesper Fredriksson:

The hard limit is 128K, but the optimum, that's the question.

Jesper Fredriksson:

That is really a question. So, the day when this was released, this 128K, I was implementing this RAG system for one of the things that we're working on. And the solution that I was going to talk about is to use semantic search to know what to put in the context, what to put in the prompt. So you have all this big body of text that you want to put in, but you select only a few relevant documents, based on how similar the documents are to the question that you're asking.

Henrik Göthberg:

So you're doing other things now to build a system to be smarter, efficient, more to the point.

Jesper Fredriksson:

Yes. So this hack that we're explaining now is: OK, we can't put everything in, but we're using embeddings to find the most similar things in your document store to the question that you're asking, and then you use that as context. But when this model was released, during the OpenAI developer day or whatever it's called, I realized: this is so much context, I don't need to do the semantic search. So I tried to just stuff everything in there, and it worked, but I could see that there was a limit to it, and it was far smaller than 128K.
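The semantic-search step Jesper describes can be sketched as follows: embed the question, embed each document, and keep only the top-k most similar documents as prompt context. The `embed` function here is a stand-in for a real embedding model (in practice an API or model call); it is a trivial bag-of-words vector only so the sketch runs, and all example documents are hypothetical.

```python
import math

def embed(text: str) -> dict[str, float]:
    # Toy stand-in for a real embedding model: a bag-of-words count vector.
    vec: dict[str, float] = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0.0) + 1.0
    return vec

def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_documents(question: str, documents: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the question, keep the k best.
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "How many journeys did we have yesterday",
    "The office coffee machine manual",
    "Daily journeys and signups per city",
]
print(top_k_documents("journeys yesterday", docs, k=2))
```

The selected documents would then be pasted into the prompt as context, which is how the approach stays under the usable context limit.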

Henrik Göthberg:

So you were sort of coming out of the developer day like: oh, I can skip two features on my backlog and do this instead. And then: fuck, it's not that simple.

Jesper Fredriksson:

Yeah, but I could get to the proof of concept at least. So this example, I think I showed it to Anders at some point, is a BI tool where you can ask questions of your BigQuery database by sending the natural-language question to OpenAI, getting a SQL query back, and then sending that SQL query to BigQuery and getting a result back.
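The loop described here, question in, SQL out, rows back, can be sketched in a few lines. The two callables are stand-ins for the real OpenAI and BigQuery clients; the fake implementations, function names, schema and row values below are all hypothetical and only illustrate the data flow.

```python
from typing import Callable

def answer_question(
    question: str,
    schema: str,
    llm: Callable[[str], str],         # prompt -> SQL text (e.g. a chat API)
    run_query: Callable[[str], list],  # SQL text -> result rows (e.g. BigQuery)
) -> list:
    # Build a prompt that gives the model the schema plus the question.
    prompt = (
        "You translate questions into BigQuery SQL.\n"
        f"Schema:\n{schema}\n\n"
        f"Question: {question}\nSQL:"
    )
    return run_query(llm(prompt))

def fake_llm(prompt: str) -> str:
    # A real system would call the chat completions API here.
    return "SELECT COUNT(*) FROM journeys WHERE day = CURRENT_DATE() - 1"

def fake_bigquery(sql: str) -> list:
    # A real system would execute the SQL via the BigQuery client.
    return [(1234,)]

rows = answer_question(
    "How many journeys did we have yesterday?",
    "journeys(day DATE, journey_id STRING)",
    fake_llm,
    fake_bigquery,
)
```

In the Slack bot discussed later, the question arrives as a Slack message and the resulting table is posted back to the channel; the core chain is the same.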

Henrik Göthberg:

So we are now talking about another key topic in itself: where is the evolution of BI going?

Jesper Fredriksson:

But that's a super interesting.

Henrik Göthberg:

Because that's exactly what you're working on.

Jesper Fredriksson:

That's one of the things.

Anders Arpteg:

Yes, I think we need to just elaborate or explain a bit more. So we had a natural-language question that you asked ChatGPT. BigQuery is basically a SQL database that Google Cloud provides, and it has a SQL interface. ChatGPT can actually output SQL in BigQuery's dialect that you can send to it. Then it gets some structured data back, but I guess it also gets a lot of text. Is it mainly text that you retrieve, or is it more structured data?

Jesper Fredriksson:

I mean, I get the results of the SQL query. It's just the BigQuery API, calling it through Pandas, and then you get back numbers, or it can be...

Henrik Göthberg:

Whatever is in BigQuery.

Jesper Fredriksson:

So you get structured data back from the database. It could be unstructured as well, but it's mainly structured.

Anders Arpteg:

Text fields, I guess.

Henrik Göthberg:

But without going into details, could we elaborate a little bit on the chain here? What's the use case? Someone wants a report on something, I guess? So without going into specifics, what is the concrete flow here?

Jesper Fredriksson:

Yeah, so this is sort of a Somewhat of a hobby project of mine, but it is built on my sort of experience, and sometimes frustrations, of being in a data team. In this case it's about In a data team you typically have a lot of people from the organization asking you questions about In the car sharing business. It's like how many journeys did we have yesterday, or how many users signed up, or something, and they usually post that to Slack and somebody from the data team picks it up. It's in a team channel and it provides a lot of value to somebody of the People in the organization. But it's often not the most fun thing to drop everything that you have and answer that question. So it seemed like maybe with chativity I was like maybe we can automate this. So then me and a backend engineer decided to have this as a hobby project to see can we put in a Slack bot that can answer these questions? So that's now in production in this Volvo Car Mobility Car.

Henrik Göthberg:

Sharing company. But I think the key point here is that this is also where it becomes engineering, because it's not only about making the actual query work, it's about contextualizing it and putting it in a way that makes it adoptable in the larger context of the enterprise. And in this sense, people use Slack and they go there, so we don't change their behavior; we create a Slack bot.

Jesper Fredriksson:

Yes, I think Slack was a really good vehicle for this. Everybody's in there, and it doesn't matter if you're technical or not. Everybody's in there all day.

Anders Arpteg:

So a non-technical person can ask a natural language question about the number of cars per month or something, and that gets translated by ChatGPT into a BigQuery SQL query, and it actually replies back with the results in some way.

Jesper Fredriksson:

Yeah, we currently don't have any graphs or anything, but that could be an extra. It's basically the table that is directly returned.
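Since the bot's reply is basically the returned table itself, a minimal way to render an SQL result in a Slack message is a fixed-width code block. This helper is purely illustrative, invented for this sketch rather than taken from the actual bot:

```python
def format_table(columns, rows):
    """Render query results as a fixed-width table inside a Slack code block."""
    table = [list(columns)] + [[str(v) for v in row] for row in rows]
    # Width of each column = widest cell in that column.
    widths = [max(len(r[i]) for r in table) for i in range(len(columns))]
    lines = ["  ".join(cell.ljust(w) for cell, w in zip(r, widths)) for r in table]
    # Triple backticks make Slack display the text in monospace.
    return "```" + "\n".join(lines) + "\n```"
```

Something like `format_table(["month", "journeys"], [("2024-01", 1200)])` then produces a small monospace table ready to post back into the Slack thread.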

Anders Arpteg:

But how does ChatGPT know about the schema in BigQuery? Do you provide that in a prompt?

Jesper Fredriksson:

So that's the context, that's what I was talking about. When the context window went up from 8K or 32K to 128K, all of a sudden I could just stuff in the whole schema. And it's a little bit more than the schema. We're using something called dbt, the data build tool, which is a standard transformation tool. Yeah, exactly. And as the last step there's a YAML file which has the schema, in a way, but it's more than the schema. It also has some information.

Henrik Göthberg:

It's metadata. Let him answer; I'm trying to see if I get this.

Jesper Fredriksson:

Yeah, you're on the right track. So it's metadata in the form of, for example, enums. If there's a column with values like B2B or B2C, it will know that, because that's otherwise a hard thing for ChatGPT to know about my database: what the different values in a table, in a column, are. But if you have these enums, enumerations, where you have a limited number of values, and they're in this file, that's a big help. There are also some descriptions in this file, if somebody has written documentation.
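For concreteness, a dbt schema file of the kind described, with column descriptions and enums expressed as `accepted_values` tests, might look roughly like this. The table and column names here are invented for illustration, not the actual schema:

```yaml
# Illustrative dbt schema file (e.g. models/schema.yml); names are made up.
version: 2

models:
  - name: journeys
    description: "One row per completed car sharing journey."
    columns:
      - name: journey_id
        description: "Primary key for the journey."
      - name: customer_type
        description: "Commercial segment of the booking customer."
        tests:
          - accepted_values:
              values: ["B2B", "B2C"]
```

The `accepted_values` list is exactly the kind of enum that tells the model a column only ever contains B2B or B2C, something it could never guess from the column name alone.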

Anders Arpteg:

Cool. So in that way, ChatGPT understands how to formulate the BigQuery query. Yes, and it simply works.

Jesper Fredriksson:

Yeah, it's really like magic when you see it the first time. It can answer more or less any question. It's like, within...

Henrik Göthberg:

But let's dissect the tool chain now, for me, who doesn't understand shit. We talked about pandas, you talked about YAML, so okay: we're going from Slack, the query comes out, and then basically that whole chain. What are the key components here? Pandas, the YAML file, et cetera.

Jesper Fredriksson:

Yeah, I think we've been through all of it, really. The Slack bot has its own API towards Slack. And then there's the context, which is stored on file, not in a vector database, because we didn't go that far; it's still too small to need a vector database. So there are YAML files, and at query time, when a user question comes in, if we're doing semantic search, which we're not doing at the moment, but we started with that, then we try to find which parts of the schema, of the YAML files, are most relevant to the question, and we provide that, basically as if you were using ChatGPT and cutting and pasting into the window. That's what we're doing. It's a hack.
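As a rough sketch of the chain just described, with all names hypothetical (this is not the actual bot's code): read the dbt YAML files from disk, paste them into the prompt together with the user's question, and ask the model for a single BigQuery SQL statement. The LLM call, query execution and Slack reply need external services, so they are only outlined in comments.

```python
from pathlib import Path

def load_schema_context(schema_dir):
    """Concatenate the dbt schema YAML files stored on disk (no vector DB)."""
    paths = sorted(Path(schema_dir).glob("*.yml"))
    return "\n\n".join(p.read_text() for p in paths)

def build_prompt(question, schema_context):
    """Stuff the whole schema into the prompt, as a 128K context window allows."""
    return (
        "You translate questions into BigQuery SQL.\n"
        "Schema (dbt YAML with descriptions and enums):\n"
        f"{schema_context}\n\n"
        f"Question: {question}\n"
        "Reply with a single SQL query."
    )

# Remaining steps, outlined only, since they need external services:
#   sql = call_llm(build_prompt(question, context))   # model generates the SQL
#   df = bigquery_client.query(sql).to_dataframe()    # run it in BigQuery
#   post_to_slack(channel, df.to_string())            # reply in the thread
```

Without the semantic-search step the whole schema goes into every prompt; with it, only the YAML fragments most relevant to the question would be selected first.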

Anders Arpteg:

I'm really happy with that. It's cutting out the job of the data engineer in this case, right? Or allowing correct access to data for non-technical people.

Jesper Fredriksson:

I'm thinking we're cutting out the data analyst.

Henrik Göthberg:

The Slack prompt would be picked up by the data engineering team, who had to go in and do an SQL query.

Anders Arpteg:

And a data analyst it just.

Jesper Fredriksson:

It's the data team who wouldn't need to do it. At least in my team, it could be a data engineer, a data analyst or a data scientist, and it's a fairly junior task in terms of SQL, or whatever it is. But I mean, you can have juniors among your data analysts and data scientists as well.

Anders Arpteg:

Right, and for more complicated queries you need a senior analyst as well.

Henrik Göthberg:

And how important was it, in the old way of working, to really know the schema well? I mean, it's really deep domain knowledge.

Jesper Fredriksson:

Yeah, this is actually one of the most profound things about this for me, as a data scientist working on it. It's hard to keep track of all the tables, because it's evolving all the time: somebody builds a new table, somebody changes a definition, somebody changes the name of a column. It's really hard, and I need to remember which table this information was in. I would need something like ChatGPT just to remember all the different tables and columns. But that's super easy for ChatGPT, so I don't need to think anymore about where this information is, because that's just automatic. But this is profound, because one of the key challenges has always been that we need data stewards...

Henrik Göthberg:

Who actually knows what the content is?

Goran Cvetanovski:

in here.

Henrik Göthberg:

But then the data stewards don't get what the schema is, you know, so it's useless, right? And this is a major headache for most companies who are not living in the coding world. And here, all of a sudden, you kind of circumvent the problem.

Jesper Fredriksson:

Yeah, yeah, and it's a super interesting topic.

Anders Arpteg:

So we had an introduction to RAG models, and you also gave a use case example that sounds really powerful, having an automated data analyst in some sense. Really cool. We started on this end with the limitation that ChatGPT has in not being able to access your own data, but now it can, with RAG. But if we go to the other end and speak more about autonomous agents: what is your preferred way of describing what an autonomous agent really is?

Jesper Fredriksson:

It's a good question. It's also much harder to talk about, because we're not really there yet; there are very few agents. I think the first thing that comes after RAG is actions. We usually don't want just the information, we want to take an action. You could say in this case, with the data analyst we talked about, that it performs an action; it actually gets the job done. You're asking a question and you get a number or something back, and that's getting things done. But in many cases, I mean, we talked about the Rabbit R1, for example, this small device. Maybe you want to book an Uber or something, and then you need to take an action to do that.

Jesper Fredriksson:

You need to do the actual booking. You're not just happy with it saying: go to the Uber app and select. That's not what you want; you want it to do something. So that's one part of it, being able to perform actions. But there's also this thing of autonomous. What does that mean? The way I think about it is that it's something that can do things on its own, obviously, but it usually pertains to some higher goal. You can specify: I want to perform this task, but it really needs to perform a lot of subtasks to be able to do that. So you need to divide the problem into smaller tasks, find a strategy for how to solve the smaller tasks, tie them together and solve something bigger.

Anders Arpteg:

We have to quote Elon Musk at least a couple of times. If we take self-driving cars, a proper level five self-driving car would take a lot of actions, and it would be fully autonomous as well, if it's level five; you don't even have a wheel to take control anymore, right? That would be the ultimate kind of autonomous agent, in some sense, at least in the self-driving world. It could be even more general, of course.

Jesper Fredriksson:

Exactly. It's definitely an agent in a constrained but still very general setting. It's a super hard problem, and it's so interesting to contrast that with ChatGPT, for example, where you just feed it a lot of textual data and all of a sudden it knows how to answer everything, whereas the task of self-driving is much more constrained, and we've worked on it for years and years and we're still not quite there.

Anders Arpteg:

I think it's kind of interesting. It's like a paradox in some sense, because you would normally say that ChatGPT is general, or not super general, but more general than a lot of other AI models at least. And then if you take self-driving cars, it's rather a specialized kind of task. In reality it's just brake, accelerate, turn left or right. That's the only thing you do. But to do that you need so much general knowledge, much more than ChatGPT has. So even ChatGPT would never be able to drive a car today.

Henrik Göthberg:

But isn't this conversation about generality in at least two different dimensions?

Anders Arpteg:

Yeah, I think so.

Henrik Göthberg:

Because there's one generality in all the stuff that you can ask about, but it's very narrow in terms of what you can do with it. And here you have the opposite.

Anders Arpteg:

I think if we just use the terms that Elon used, it will be a bit clearer. I mean, ChatGPT is really good at perception, I think. But if you have a self-driving car, you have at least three different parts. Perception is one: just taking the sensor data, trying to understand what it means and moving it into vector space. Then you have the planning problem, and I think that's what you started to speak about, breaking things down into smaller tasks and being able to plan what to do and when to do it. And then you have a third problem about taking action, the control, as he calls it. Control is basically deciding if you should brake, accelerate, turn left or right. All of these are really hard to do, but I think you need all of them to be a properly autonomous agent. Yes, that's definitely the case.

Henrik Göthberg:

But if you were to summarize the definition of autonomous agents, what are you thinking?

Jesper Fredriksson:

Yes, but I don't have a good definition of an autonomous agent. I think, the way I explained it, it has to do with actions and planning.

Anders Arpteg:

And that's basically it? You also need to have some input, the perception, right?

Jesper Fredriksson:

Yeah, you need to have some input.

Anders Arpteg:

Perception, planning and control.

Jesper Fredriksson:

Yeah, I'm still stuck a little bit on what you said about ChatGPT, that it has good perception. In a way, at least the first model just had reading, so it's very limited if you compare it to a human. We have many more senses that we can use.

Anders Arpteg:

ChatGPT can have vision today, and audio as well, so at least it's a bit more.

Jesper Fredriksson:

Yeah, and it's really.

Henrik Göthberg:

But that has something to do with perception. Yes, but what about the world model, or something like that? Because if you think about it, they have the cameras, which is perception. They have the vector space that creates the world model. Then they have planning and action. Isn't the vector space, with the world model, a fourth part, in some sense? I mean, if we say perception, planning and control, are we done?

Henrik Göthberg:

And I think the perception moves into vector space, where you do the planning, and what I mean is that the vector space itself is what creates the world model where you can do the planning and the control. So actually there are four problems then.

Anders Arpteg:

No, no, I don't think so, it's still three. Why not? Perception moves into vector space; that is the world model. That's perception.

Henrik Göthberg:

But I have a world model, and I still haven't said where I'm going to go, so I can't start the planning.

Anders Arpteg:

But all of them happen in the vector space, right? Yeah, the world model.

Henrik Göthberg:

Yeah, clearly. But isn't it a separate problem to have an accurate world model?

Anders Arpteg:

I would say perception moves from the sensory input to the world model vector space. Planning operates in the world model. Control moves from the world model back to the actuators, the actions.

Henrik Göthberg:

I get it, but I'm going to push you here. If I'm going to put engineers on the different things to make this work, I need engineers to work on the perception part, I need engineers to think about what the vector space is all about, and I need people to engineer how the planning and control take place. So I get to four problems. Why four? Because the vector space itself needs to be created.

Anders Arpteg:

But that's not a problem in itself, is it?

Anders Arpteg:

I mean that's just the representation. Yeah, all right.

Jesper Fredriksson:

I'm thinking a little bit about what we talked about ether, like there's something, there's some medium around us Exactly.

Goran Cvetanovski:

It's a little bit like that.

Jesper Fredriksson:

But I think we don't have engineers that are concerned with the ether. I think that's...

Anders Arpteg:

That's a point to it.

Jesper Fredriksson:

Interesting.

Anders Arpteg:

Okay, so autonomous agents. If you compare to ChatGPT, it's clearly not an autonomous agent; I think we can all agree on that. RAG gets a bit closer to an autonomous agent, at least if you do it like you did, because you added more: you added the ability to take action and ask questions of BigQuery, and then it took some action. It's limited, but at least it could take some action, and that added a bit more into the vector space, which is basically the results from BigQuery.

Anders Arpteg:

But then, to have a properly autonomous agent, I guess it also needs to have much more control abilities, or actions to take, right?

Jesper Fredriksson:

Yeah. You can have very limited agents, and maybe you could classify this product as a very limited agent, but I think where both you and I are going with this is towards AGI. That's why this is interesting, or?

Anders Arpteg:

Maybe not. Can I just go into a small rabbit hole here? Let's take a chess-playing machine. Is that an autonomous agent? What do you think?

Jesper Fredriksson:

Yeah, I don't know. I'm going to take the cheap way out here: if it moves the actual pieces, then it performs an action.

Anders Arpteg:

Just in a virtual world, so to speak.

Jesper Fredriksson:

I don't know. I would think of it as an agent. Yeah me too.

Henrik Göthberg:

So a digital chess game that actually can play is in a sense, an autonomous agent.

Anders Arpteg:

A very limited one. It's very limited in what it can plan. The vector space for it is just the chessboard, so to speak.

Jesper Fredriksson:

Did you see this research paper on teaching an agent to play Minecraft? I thought that was quite...

Anders Arpteg:

I talked to somebody who said that they were cheating a little bit, because Nvidia came up with the paper on this very recently.

Jesper Fredriksson:

I think this was maybe fall of 2023 that I read this.

Anders Arpteg:

This was just one week ago; Nvidia came up with another agent paper for playing Minecraft.

Jesper Fredriksson:

Okay, I did not see that, but it makes sense. I think Minecraft is maybe an optimal way to study agents: it's an open world where you can do a lot of things. I haven't played much Minecraft myself, but I have kids who play it, so I know how difficult it is to learn all of these skills. You have to combine different materials together and then you get another material; it's like in real life, where you take coal and make diamonds out of it, or whatever. So you have to learn to do that, and in this framework the agent learned how to acquire these skills in Minecraft just by playing.

Henrik Göthberg:

Can we talk about some sort of spectrum here, from super simple, super narrow autonomous agents and then further and further? Because I think one way this gets really hard to think about is when you go too far and try to figure out a very generalist autonomous agent.

Anders Arpteg:

Well, I guess the really far end could be the Tesla bot, basically a humanoid in a physical world that is supposed to take any action it can.

Henrik Göthberg:

Yeah, that would be when it works in production and is doing certain tasks at the manufacturing plant.

Anders Arpteg:

I guess at the home of some elderly person, or with your dad, trying to help him. Then it would be an autonomous agent for a specific task, or a specific set of tasks. Or if it can do anything, then it's very general.

Jesper Fredriksson:

Is that the Optimus? Is that what we're talking about?

Anders Arpteg:

Yeah.

Jesper Fredriksson:

Yeah, I think in a way the human is the perfect agent, and this is the perfect formulation.

Anders Arpteg:

I think humans are very limited as well, right.

Jesper Fredriksson:

I think as agents, we're pretty good.

Anders Arpteg:

We're still leading, I would say, in generality, but it's easy to imagine an agent that is much more general than humans are.

Henrik Göthberg:

But it's interesting, right? Let's take the human example. We are not autonomous agents for everything from birth. You need to learn a craft, you need to learn a skill. So if I want an autonomous agent playing football, it needs to acquire the learning, the skills and the rules of the game, right? Then of course it's generalist, because like a computer, I can program another sport into it, and now we can have an autonomous agent playing ice hockey. So in a sense, building something which is an autonomous agent means, yes, we have the container for it now, but then you need to set it up; the learning comes in for each and every fundamental frame, I guess.

Jesper Fredriksson:

I would say humans are super plastic; in that way we can change a lot. And of course, machines are getting much more plastic too.

Henrik Göthberg:

It feels weird to say, but I think they're getting a plastic brain.

Jesper Fredriksson:

We're talking about plasticity in the brain, how actual physical change happens in the brain when you do things. They talk about the brain changing shape as plasticity, and you have these things where, if you lose part of your brain, other parts of the brain can actually take over and make up for it. Can you imagine what that would be in a computer system? Like if part of the computer dies, and it just moves over to another part of the computer?

Anders Arpteg:

Very far from that.

Jesper Fredriksson:

Yeah, so humans are super plastic in many ways, and I think we're far, far away from that when it comes to computers. But computers are good at other things. If they do a calculation, as long as it's not ChatGPT doing it, it's always dead on, of course.

Henrik Göthberg:

So that's the discussion here, then: generalist autonomous agents that we can frame for different purposes, versus having one single purpose.

Jesper Fredriksson:

Let's go back to your question about a spectrum of very small tasks and bigger tasks.

Jesper Fredriksson:

I think that's a good dichotomy, or a way to think about agents. I think we're now at the baby steps of this. We can do things like, again, the Rabbit R1, this small phone-like thing that can maybe order an Uber for you, if we ever get that device, but it seems likely. That's the level we're at now: we can perform relatively small things. But you can easily see how this gets more and more complex. I have my favorite problem that I think about, which is coding. Let's say we have an agent that could code. Right now we're at the level where you can get relatively good results from ChatGPT or similar tools to create one function.

Jesper Fredriksson:

That's a very limited task, but I think already with slightly better language models, say GPT-5, we could be at the next level. I had this task of reorganizing a Jupyter notebook this week, to make it into a standalone Python program. That is a very tough job, because there are a lot of moving parts to it. But imagine a slightly better language model with a more reliable context that you can work with, where you know that when you put context in, it's going to stay there. Then I think you could easily handle, let's say, one small notebook, say 500 lines of code or something like that.

Jesper Fredriksson:

It's not a small notebook, but it's relatively limited. I think that's the level we could be at by the end of this year: we can reorganize stuff without having to interfere too much. That's what I think is really fascinating about the agentic movement we're going to see now, with the models becoming better, but also with scaffolding around them that makes it easier to do these things, like LangChain, which we talked about before.

Anders Arpteg:

I think we're going to see that with GPT Engineer, which is from a Swedish friend here.

Jesper Fredriksson:

I didn't know.

Anders Arpteg:

He created GPT Engineer, and it's a perfect example of scaffolding.

Henrik Göthberg:

Yeah, definitely, but maybe this is for the last bits of this podcast and not now. When we start talking about agents, we can have quite deep philosophical questions about the utility function and what types of agents we want to produce. You can take this really deep into principal-agent theory and the agency problem, which goes all the way down into how we organize companies. The agent is living in a principal-agent relationship to something; someone wants something done, right? So it's super important that the principal is fair, that the principal is right, and that the principal is able to put the task to the agent in the right way. Otherwise you will have agent deviations. So how can you minimize deviations?

Jesper Fredriksson:

So there's a huge rabbit hole here: if you put agents everywhere, what happens if they're not designed with the right goals? It feels like we're going towards a paperclip-style problem, where you design an agent to produce as many paperclips as possible, and it takes the whole world's resources to produce all the paperclips.

Henrik Göthberg:

Right, if you put this at scale, what happens? When we go into autonomous agents, there are some fundamental big questions that open up. But that's the philosophical angle on this.

Jesper Fredriksson:

Yeah, and we have all the ethical problems of what happens if the agents are made for bad purposes. Let's say someone wants to overthrow a government somewhere.

Henrik Göthberg:

Yeah, and you can have good intentions, but there are blind spots in the way you set up the agents that have unintended consequences.

Jesper Fredriksson:

Yeah, we're already going towards AGI when we're talking about this. Once we have this super powerful thing, whether we call it an agent or AGI, we're going to have to think about how we use it.

Anders Arpteg:

Awesome.

Goran Cvetanovski:

It's time for AI News, brought to you by the AIAW Podcast.

Anders Arpteg:

Awesome. Time for a small middle break before we continue the discussion with Jesper Fredriksson about going from RAG to autonomous agents. But before that, let's pick up some favorite news from the last week, and each of us can choose a couple of topics. Who wants to go first? Jesper, do you have a favorite news article you want to take? I can start.

Jesper Fredriksson:

So I hinted at the beginning that I maybe stopped the brain imaging work prematurely. I've been eagerly following the mind-reading AI story for many years now. Already in 2011 the first paper was released, from a research group that tried to reconstruct an image, or rather a film, of what participants were seeing while fMRI scans were recorded. They showed something to the participants and then observed the activity in the voxels of the brain scan.

Anders Arpteg:

Can you do that with fMRI in real time? Really, how did they do that?

Jesper Fredriksson:

No, they probably recorded it and then used it afterwards. You can do fMRI in real time, but the question is what you do with it; I don't think you can get the results and work with them immediately. It takes some time.

Jesper Fredriksson:

Anyway, that paper was long ago, but what caught my attention this week is something that looks very weird. It's called Morpheus, a transformer model for a specific purpose that has to do with brain imaging. It's made by a startup called Prophetic, and they have a very overloaded video where they describe it. It's in a way very American and very over the top, and they talk about lucid dreaming. It's a device made for getting you into a state of lucid dreaming.

Jesper Fredriksson:

I have my doubts that it's going to work, but underpinning it is a trend that's been going on for a long time, this mind-reading thing. What they're adding in this product, something that would have been super fun to work on if I was still in the brain imaging game, is that while you go to sleep, you're supposed to wear this headband, and it registers your brain activity. It uses EEG, and already that is questionable; fMRI performs well, but EEG is a little bit shaky. Then it reads off this activity and tries to steer it towards activity that looks like lucid dreaming.

Anders Arpteg:

How do they steer it? Do they apply electrical impulses somehow?

Jesper Fredriksson:

They use a new technology that I hadn't heard about before. I think it's called transcranial ultrasound something. I don't think they will succeed with this, but I still think it's going to be an interesting story to follow. Is it ultrasonic sound that they actually use to modify the brain waves somehow? Yes, supposedly.

Jesper Fredriksson:

There are other experiments with this where they're trying to do things like curing addictions using this technology; I'm not sure if they really work. For me it's a new thing. I've heard about transcranial magnetic stimulation before, where you induce currents, but this I hadn't heard about before. It's going to be interesting to follow. I don't believe in it, but I believe in a lot of the...

Anders Arpteg:

You think it's fake?

Jesper Fredriksson:

No, I don't think it's fake. I think they're really trying to do this, and it seems like it follows some kind of trajectory. I buy into what they're saying, but I don't think it's solid enough to really put you in a lucid dreaming state.

Anders Arpteg:

I mean, I've been into meditation, you know, and all this kind of technology for trying to understand your own brain is really interesting. But perhaps we should describe what lucid dreaming really is. Do you have a good definition?

Jesper Fredriksson:

Lucid dreaming is when you're sleeping and you're aware that you're dreaming; you have some consciousness in your dream that this is a dream. If you get to the stage where you realize you're dreaming, then you can sometimes also control the dream. Have you had a lucid dream? Yeah, me too. It's very cool when you get into that stage. And it would be cool if you could use a device to control it.

Henrik Göthberg:

And is there a thing here where, you know, the human brain needs to process stuff when we sleep, like you need to sleep on it?

Jesper Fredriksson:

Are you talking about the meaning of sleep?

Henrik Göthberg:

No, I was going down the path of how lucid dreaming could potentially be useful as a way to process complex topics. Yeah, maybe.

Anders Arpteg:

I mean, I think it's a way to try to understand yourself a bit and get a closer understanding of your mind.

Henrik Göthberg:

Perhaps some thoughts. But is it gimmicky, or is it useful?

Jesper Fredriksson:

Are we talking about lucid dreaming? I'd say it's a fun pastime. I don't think it's particularly revelatory; it doesn't bring any new insights about me, but maybe that's just me. It's like a good VR show, in a way.

Henrik Göthberg:

I'm not sure. I'm thinking that I'm processing stuff sometimes; maybe you can unlock processing power in your brain.

Anders Arpteg:

I think we can all agree that if you don't sleep, you're going to go crazy and die very quickly.

Jesper Fredriksson:

Sleep is one thing, but lucid dreaming is totally different.

Anders Arpteg:

I'm trying to get to the point. The question is really: can we start to sleep in a more efficient way? If you think about sleeping as some kind of defragmentation of the brain, a way to reset the brain and make sure it's operating properly, could you then have some more conscious part of it? Because this is really moving between the land of being unconscious and being conscious, and finding the intersection between the two. That's an interesting land where you're trying to make sleep more useful, exactly.

Jesper Fredriksson:

So how can I?

Henrik Göthberg:

get to 23 hours of work?

Jesper Fredriksson:

Yes.

Henrik Göthberg:

All right, let's go for the next one.

Anders Arpteg:

This is awesome. It's very cool that AI can potentially give you more access to lucid dreaming, perhaps even control it. Awesome. Should I go, or do you want to?

Henrik Göthberg:

I have two, actually two very small stories; one small story and one we could discuss. I can start with the small story, then you can do yours, and we'll see. Did you all see that there was some announcement around Neuralink? They've been doing a...

Anders Arpteg:

very big one.

Henrik Göthberg:

I thought about that earlier. You know, we're still in brain land now. But so Neuralink, they started human trials and they have some kind of first results or whatever. That's what I know, I was just going to say.

Anders Arpteg:

I think it's a fairly big topic the first human implant and the person is now recovering from that surgery.

Jesper Fredriksson:

Do we know who it was? Do we know what? Was it somebody who had disabilities?

Henrik Göthberg:

or why? The storytelling has been to cure Parkinson's or different things. That's the storytelling, of course. You're focusing on…

Jesper Fredriksson:

how can we make people who are lame walk again? I think what they said about this was that the purpose of this implant was to control a cursor.

Henrik Göthberg:

Yes, that was the experiment, but the bigger-picture storytelling, that's what I was referring to. Someone who cannot walk: can we make them walk again?

Anders Arpteg:

Someone who had this kind of spinal cord injury or they cannot really move their limbs in some way For them to start.

Henrik Göthberg:

But of course this is a big-picture topic where, if you figure this out, the use goes in many directions. This particular one was to steer a cursor, I think it was.

Anders Arpteg:

It's a really big milestone. We've seen the videos of the pigs that had the implants, and the apes and whatnot, and now it's actually the first human that has a chip in the brain that they can read from and control the brain with. It's a super big milestone, I think. We'll see what happens with it.

Henrik Göthberg:

We'll see when we look back at this time capsule. Yes, Kevin, so that was it.

Anders Arpteg:

Okay, let's see which one should I take. Okay, quick one, just to continue on the Elon Musk and Tesla theme. They actually released the big release now, version 12 of their Full Self-Driving. So now it's released, not to everyone, of course, but to a selected set of beta drivers. And the big news with version 12 is this:

Anders Arpteg:

It is the first end-to-end neural network version. Before, they just had neural networks for the perception part of Full Self-Driving, not for the control or planning part. That was hard-coded with rules saying: if there is a traffic light that is green, then accelerate, otherwise do not. So they had humans adding these kinds of hand-coded rules. Now they've removed them. Now they have networks that handle the perception, the planning and the control, and that opens up so many new possibilities, because now you don't need to annotate anymore, saying: this is a lane in the street, this is a traffic sign, this is a human or pedestrian on the street, this is a car. You don't need the whole object-detection kind of annotation that otherwise is necessary. It can just learn end to end, from braking, accelerating and steering back to the sensory input.
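The shift Anders describes, from hand-coded planning rules on top of a perception network to one learned mapping from sensor input to controls, can be sketched roughly like this. Everything below (the rules, the stand-in linear "policy", the weights) is a hypothetical illustration, not Tesla's actual code:

```python
# Contrast between the two approaches: hand-coded rules vs. a learned policy.

def rule_based_control(perception: dict) -> str:
    """Old style: humans write explicit planning rules on top of perception output."""
    if perception.get("traffic_light") == "green":
        return "accelerate"
    if perception.get("pedestrian_ahead"):
        return "brake"
    return "hold"

class EndToEndPolicy:
    """New style: a single learned mapping from raw sensor input to controls.
    A toy linear stand-in; a real system would be a deep network trained on
    recorded driving (steering, braking, acceleration)."""
    def __init__(self, weights):
        self.weights = weights  # learned from data, not written by hand

    def __call__(self, pixels):
        # Weighted sum of the "sensor" input decides the control output.
        score = sum(w * x for w, x in zip(self.weights, pixels))
        return "accelerate" if score > 0 else "brake"

policy = EndToEndPolicy(weights=[0.5, -0.2, 0.1])
print(rule_based_control({"traffic_light": "green"}))
print(policy([1.0, 0.3, 0.2]))
```

The point of the contrast: in the first function a human can change behavior by editing a rule; in the second, behavior can only be changed by training on more data, which is exactly the control problem the hosts go on to discuss.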

Henrik Göthberg:

Yeah, and of course, one of the key unique things has, of course, been the smartness of Tesla, you know, recording and building up the training data sets, that is.

Anders Arpteg:

Yeah, more training data than anyone.

Henrik Göthberg:

By orders of magnitude, by orders of magnitude. It put them in a position where this is viable.

Jesper Fredriksson:

Do they do the annotation automatically, or is that out of the picture?

Anders Arpteg:

I think it's out of the picture but I think they start with still having the old models that have the annotations in them and that's like an initialization of the other models, so they don't have to train from scratch, because I think that would take too long, but at least they have a starting point. Then they just let it train end to end without having the annotations in place.

Henrik Göthberg:

Yeah, and we talked about this before and it is a fairly big deal.

Anders Arpteg:

It is a really big deal.

Henrik Göthberg:

Interesting.

Anders Arpteg:

And I guess it will take some time before it catches up. Because the problem with this is, if you want to control it, it's really hard now. Before, you could go in and say: if you see a dog, then don't run it over; if you see humans, then stop. Now you can't do that anymore.

Jesper Fredriksson:

Yeah, that's what I was thinking when I was asking about the annotations.

Henrik Göthberg:

I don't know the details on this level. You know how it really works.

Anders Arpteg:

I mean hard coded, then you can simply add the rules. But in your network you can't go in and change the parameters one by one. I mean it's too hard. So the only way to change the parameters then for the control part if you should break or not is really to add data to it.

Jesper Fredriksson:

Yeah.

Anders Arpteg:

And they still don't know if it will work or not. So it's much harder to control a system that you can't add hard coded rules to.

Jesper Fredriksson:

Do you know if they use synthetic data or if it's only real data that they're working with?

Anders Arpteg:

I don't know the details. I'm sure they use some synthetic data as well and simulate the shit out of this all the time. I would be surprised if not, but I don't know.

Jesper Fredriksson:

I was talking to a Swedish company who does synthetic data, and I think they did it specifically for the car industry. I don't remember the name of the company, but I remember seeing a video where they rendered photorealistic scenes where somebody in a car stretches over to fetch a mobile phone and then jiggles a little bit with the steering wheel, just to see what happens in a car when you do things that you're not supposed to do. I thought it was cool.

Anders Arpteg:

Did you have one more? I have one more. I think it's interesting given the super election year that we will have in 2024. And, of course, we had the big election in Argentina in October, and there was so much generative AI and fake images and fake videos, people putting the opposing candidate in like a zombie picture and themselves with a teddy bear.

Anders Arpteg:

I mean, it was like a weird thing. Of course this will be a very interesting year with all the elections, and not least in the US. Now there was a fake Joe Biden robocall. Basically, they had an AI that called citizens in the US, sounding like Joe Biden. I can't really make…

Jesper Fredriksson:

Joe Biden's voice.

Goran Cvetanovski:

You know how he sounds?

Anders Arpteg:

I guess you should be slurring a bit with the voice. Sorry, I shouldn't. Anyway, it sounded like Joe Biden and it was telling voters not to go and vote in the New Hampshire primary election, because the real election was later, so you're not supposed to vote. It would just help Trump if they went to vote, so don't do it. And it is like crazy. I actually listened to the phone call.

Anders Arpteg:

They had a clip of it so you could listen to it. It was actually kind of poorly done, a very distorted, very noisy kind of sound, but it sounded like Joe Biden, and it's telling people not to go to vote.

Jesper Fredriksson:

Did you hear what happened after that? So they figured out that the company behind this robocall was ElevenLabs, I mean, it sounded like that. And then they talked to ElevenLabs and they told… Suspected.

Anders Arpteg:

Yes, suspected. ElevenLabs does so much better than this, but okay, at least that's what I heard.

Jesper Fredriksson:

So they found out that it was done through ElevenLabs, and then they reached out to them and they found out who it was.

Anders Arpteg:

Really. Who was it then? I don't know.

Jesper Fredriksson:

That was not in the story, but they found… I think it's interesting if we're at that stage where people do this fake news, they use AI to produce fake things, but we're actually catching up with it.

Anders Arpteg:

Yeah, I saw in the story that I read that they will certainly investigate this and prosecute them to the fullest extent of the law or something, because it's clearly illegal to try to influence democracy and voting in this way. So, yeah, interesting.

Henrik Göthberg:

I have one more, on a different note, to end this.

Anders Arpteg:

Oh sorry. I mean, it's the first one of the bigger ones. I'm sure we will see so much more of this in the coming year.

Henrik Göthberg:

And all the different variations on this. Here you have the call not to vote. Okay, what else can you do to create mayhem? Interesting ideas, anything.

Goran Cvetanovski:

There's so many variations.

Henrik Göthberg:

Yeah, on a different note, there was an announcement on the 23rd of January that Germany, France and Poland announced the Weimar Triangle for artificial intelligence.

Anders Arpteg:

Once again.

Henrik Göthberg:

The Weimar Triangle, and the term Weimar Triangle was first coined and used between these three states in 1991. Literally, that was about Poland joining the EU, and France and Germany stepping up to really help them accelerate, to come up to Western standards and all this. And now they are joining forces, literally going back, and now Poland is not the state they're helping, so now they're more on equal terms. They said the core idea is that three leading members have formed a political alliance to push for better coordination between national plans and investments in artificial intelligence, within EU policies relating to the sector. So of course we are doing so many things at the EU level in one way, sort of the bureaucratic way, with Horizon, you know, whatever it is called.

Henrik Göthberg:

But I think this is an interesting topic, right? If we want to stand up and be competitive in Europe, there needs to be some real hard work and money and focus, you know, if you contrast with how China is doing things or how the US is doing things. So, you know, will we see more of this? I think this is an interesting angle, that the states are sort of starting to. Can it be done through states? I'm not sure. I'm really not sure. That's what I thought was interesting, right?

Jesper Fredriksson:

I mean, Mistral is super interesting in this respect, that there's finally a European company that does generative AI, and they're really at the forefront. But I mean, there's so many of these government initiatives that seem to not do so much, or what do you think, Anders? Maybe this is more your game.

Anders Arpteg:

Yeah, I had one last story then, trying to keep it short. But I was yesterday at this big conference in Sweden, on the theme of AI for the good of the nation.

Goran Cvetanovski:

Good of the nation, which is basically the Swedish nation.

Anders Arpteg:

Yes, yeah, the good of the Swedish nation. Yes, thank you. But the core story, and it was a story published in the Swedish Dagens Nyheter, also connected to this, was saying lagom is not enough for Swedish AI. Lagom is a Swedish word which is hard to translate into English, but it's basically saying, you know, being halfway there, or doing it semi-well, is just good enough.

Henrik Göthberg:

Good enough is not enough.

Anders Arpteg:

Good enough is not good enough. I guess that's a way to phrase it. And then we had the minister of digitalization there, Erik Slottner. We had the CIO of Estonia, Luukas, and so many others.

Anders Arpteg:

The AI Commission, and I spoke a bit as well, and we were trying to see, you know, for one, to recognize that Sweden has actually fallen behind a bit. Previously, like 10 years ago, we were really leading, or were at least at the top in some aspects when it comes to digitalization, but now, when it comes to AI, we are trailing a bit behind, and that is if you compare to other countries in the Nordics and Europe. So even in Europe, we can say we are average, basically. We are certainly not in the lead, but we are average or a bit above average. If you then think that Europe on its side is behind the US and China, then, compared to the rest of the world, we're even further behind. And I think it was really good to hear that the minister also recognized that and said: this is not okay, we should fix this. For me, and for all the other people who are interested in these fields and believe AI will do society so much good, it was really pleasant to hear that they all recognized this.

Jesper Fredriksson:

More AI engineers.

Henrik Göthberg:

Yes, AI engineers for the world. But the interesting topic is, I think, back to the core question: is this a state matter or isn't it? And what is the state's part

Goran Cvetanovski:

in this and I think, that's the core crux here.

Jesper Fredriksson:

What can, and what should, the state do?

Henrik Göthberg:

Should the state become a startup? Should the state become a hyperscaler? I do not think it's about that at all. It's about creating the environments.

Anders Arpteg:

Isn't that the state's purpose, then?

Henrik Göthberg:

Not to create the startup, but to create the environment.

Anders Arpteg:

Yeah, but that's versa.

Henrik Göthberg:

Yeah, I think so.

Anders Arpteg:

It's just a question of what they should do Exactly.

Henrik Göthberg:

That would be my take. I'm on the other side here. It's definitely a state matter, but it's about what the state's part in it is.

Anders Arpteg:

That's the interesting question, really. I shouldn't linger on this topic, but still, because I get this question so often. Some people ask: as the small country of Sweden, what can you ever do, if you compare to China or the US or whatnot? But then I would like to quote, I think it was Eric Schmidt, when he was leading Google, who said: we don't fear the enterprises, we don't fear the Apples or the Microsofts of the world. The one we do fear is the innovative startup, the small startup that has the really innovative idea that we haven't thought about, that can really revolutionize how we see the world. And in that sense, a small country like Sweden can, if we are really innovative, make a big change. So I don't think we should count size necessarily as the most important thing.

Henrik Göthberg:

And then, back to my news story on the Weimar Triangle: are we better off being nimble and just focusing on Sweden as a nation state, or do we want to collaborate with Norway and Denmark? You know, there is a logic behind that argument. It could be that language is so profound in this world, you know, there is an argument here that the large language models are biased by the English language towards certain cultures. So if you really want to have something that is AI for good, that sort of works for us, it's most likely super important to have the strength of our languages as part of this. And, you know, so there are many angles here. But are we better off doing it alone, or are we better off doing it with our Scandinavian friends, or should we, you know, tag along with Germany? Is the EU the right place? Yeah?

Henrik Göthberg:

I think, yeah, we should ask ChatGPT. Good, I think it's enough.

Anders Arpteg:

Okay, should we move back to… Yes? No, go ahead, go ahead, go ahead. First a little bit of a comment.

Goran Cvetanovski:

It's super exciting to hear that that event happened yesterday. Maybe they should listen to episode 01 of the podcast, three years ago. We have been discussing this for three years. We brought so many people here, politicians, et cetera, to discuss just that and how we can find solutions. So maybe they should listen to this. They should have. But that is another story for another day. Yeah, coming back, a few small news items. This is the freshest one, it's only six hours old, and that is that Google basically announced today that Bard is now powered by the Gemini Pro model globally, with support for over 40 languages. They also announced quite a number of new additions. I didn't see Swedish, though, of course, because Sweden, as you mentioned, is a small country; they just started talking about how they should take this seriously, and I'm being a little bit sarcastic here. But I think this is only part of the languages, and it's 40 in total, so I'm sure that Swedish is there as well. I don't understand.

Jesper Fredriksson:

Wasn't it powered by Gemini Pro all along?

Anders Arpteg:

And Bard was powered by PaLM 2 before. Now it switched to Gemini. Yes, so Gemini was announced in December.

Goran Cvetanovski:

No, no, Gemini was announced in December with, you know, the story about the marketing video.

Jesper Fredriksson:

Yeah, I know Right.

Goran Cvetanovski:

So now it's powered by Gemini Pro. They also announced an update to MusicFX. So now, basically, you can generate clips up to 70 seconds in length, and music loops, and this is actually making quite a lot of noise in the music industry, because now you can actually do quite a lot of new loops, et cetera. So those who are working as freelancers making jingles: goodbye. And another interesting story was actually regarding OpenAI. You remember that Italy banned ChatGPT at the beginning, and since then they opened it up again, but they have been investigating how to catch them in court. So finally now they have sent a cease-and-desist type of letter where they're saying that OpenAI is working against the GDPR in Europe. Wow, yes. So actually, if I understood correctly, they haven't defined on which grounds, basically, they are accusing OpenAI. But they demanded an answer and, of course, OpenAI said, like: oh, we are always working towards that.

Goran Cvetanovski:

But as you can see in this text, actually, most likely it's about the data that they have been gathering, which is scraped from the internet. It also includes personal data of individuals, and then there is also a violation in the output, which is basically hallucination. There was a concern about, I think it was child safety, or something like that, I'm not sure about that. So, yes, they also flagged child safety as a problem, or whatever. So they have some explaining to do, and I think eventually they will get fined as well, because in Europe we are so good at collecting money. It's like the kings in the past.

Anders Arpteg:

That's how we survive. We sue all the big tech giants. Yeah, I mean, why should you?

Henrik Göthberg:

You know, as a European business owner, right? Why should you, you know, when you can just block it?

Goran Cvetanovski:

You know, somebody needs to be a policeman. You know, if everybody is parking on the wrong street, you can actually punish them and get money from that as well.

Anders Arpteg:

I mean, imagine, you know, one part of GDPR is the right to be forgotten. So if someone comes, you know, remember the Spotify days, some person came: remove everything about me. And of course it's super hard to do that, even for Spotify, even though they didn't have an AI model to remove it from. Well, they partly did, but anyway. How should ChatGPT, or OpenAI, remove a person, or personal information about a specific person, inside the parameter space? It's almost impossible. It is, yeah.

Goran Cvetanovski:

Also if you're dead. We didn't cover it in the last episode, but actually there was a discussion about the New York Times, you know, like.

Goran Cvetanovski:

Los Angeles et cetera, and Sam Altman went out and just frankly said, like how do you think it's going to be possible to build a large language model of this size without actually using any intellectual property data? It's impossible. Then you will have like completely different and it basically it's a question so like, if we are going towards artificial general intelligence and large language models of this size, you cannot build that without basically having all the human knowledge gathered into one. It's impossible, right?

Anders Arpteg:

I wish someone were to sue me, forcing me, in my human brain, to remove all information.

Goran Cvetanovski:

But yes, for FedExon, yeah, and if I don't?

Anders Arpteg:

do that, I get fined 4% of my revenue.

Goran Cvetanovski:

Well, I mean Elon Musk probably will.

Henrik Göthberg:

Oh when you have Neuralink, they will come in. Yeah, Elon Musk is probably working on that.

Goran Cvetanovski:

And lastly, basically, to finish with: there are different ways to do deepfakes. So this Wednesday, this was actually the biggest news. As you know, Taylor Swift is in the news, for the wrong reason this time. So somebody has produced a number of explicit pictures of her and spread this over social media channels, especially X and, I think, Reddit, et cetera. So now, all of the social media, this was the interesting part, all the social media basically blocked or canceled all the searches for such pictures, and they don't know exactly where the pictures are coming from, because probably it's either Stable Diffusion or DALL-E or Midjourney, one of them for sure, and they will probably find it if they want to.

Goran Cvetanovski:

But it raises a concern: how far do we need to go in order for us to start thinking, okay, maybe there is a misuse of this? And I don't know if you have worked with Midjourney version 6. Actually, nudity is back, yeah. So it's very, very interesting, because even if you don't want it, at some point in time it just pops out. It's very interesting. I think it's scary, because at the beginning it was like that, then it was removed, and now with version 6 it came back. So interesting. So that's basically it for me, short and sweet, so let's continue.

Anders Arpteg:

Yeah, really good news. Okay so we have little time but a lot of topics to cover. If I go to pick one, unless you have something.

Anders Arpteg:

Henrik. But if we take, you know, the whole RAG and autonomous agents thinking: you have the basic model, if it's a Llama 2 or a GPT-4 or whatever it is, it's trained once, it's not really changing. Then you add things on top of that: RAG, like a vector database or whatnot, or you add some kind of actions that you can choose from to take, like plugins in GPT-4 or in ChatGPT, et cetera. I guess the question is: should you go the Tesla approach, saying it should be end to end, where you train them all together, or should they be trained separately?
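The setup Anders describes, a frozen base model with retrieval bolted on outside it, is the standard RAG pattern. A toy sketch follows; the word-overlap "embedding", the two documents, and the prompt-building "model" call are all stand-ins for a real vector database and LLM:

```python
# Minimal RAG sketch: the base model is trained once and frozen;
# fresh knowledge enters only through the retrieved context in the prompt.

def embed(text: str) -> set:
    """Toy embedding: a bag of lowercase words (real systems use dense vectors)."""
    return set(text.lower().split())

DOCS = [
    "Volvo Cars is headquartered in Gothenburg",
    "RAG augments a frozen model with retrieved context",
]

def retrieve(query: str, docs=DOCS) -> str:
    """Pick the document with the largest word overlap with the query."""
    return max(docs, key=lambda d: len(embed(d) & embed(query)))

def answer(query: str) -> str:
    """Stuff the retrieved document into the prompt; a real system would
    now send this prompt to the frozen LLM."""
    context = retrieve(query)
    return f"Context: {context}\nQuestion: {query}"

print(answer("how does rag augment a frozen model"))
```

The point is that the retriever and the model are separate components, which is exactly why RAG can be called a hack: it routes around retraining rather than changing the model itself.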

Anders Arpteg:

So let me, I think you know where I'm going here. If you take the Yann LeCun kind of approach, with the JEPA paper, he basically outlines different modules. One is the perception part, another is the planning part, the third is like the configurator, as he calls it, and so forth. So he has different modules. They seem to be separately trained, like separate modules in some sense, and I'm thinking potentially that is a good approach to take, but then you can train it end to end. What do you think here? If you take like a full autonomous agent in the future, is that going to have one module that is trained and built and constructed completely separately from another module that is handling the memory part, perhaps, and a third module that is the perception part, and so forth?

Anders Arpteg:

So you have a set of different, separate modules. Or is everything going to be… I guess, you know, is GPT-5… Okay, let me phrase it as a question. Sorry for being very fuzzy here in the description. If we take GPT-5, is that going to be a single model that has the reasoning, the taking of actions, everything done as a single system? When you call the API to GPT-5, is that going to do everything for you, or is it just going to take one part, the perception part, you get an answer, then you call another API that's going to do the RAG thing, and a third API is going to take an action: searching on the internet, calling an Uber, booking a hairdresser appointment or whatnot. What's your thinking, Jesper? Super easy question.

Jesper Fredriksson:

Oh, it's so easy. So I have my opinions here, and I think it's hard to give a right or wrong answer. In my experience, usually those end-to-end models are, in the long run, the better solution. It's usually better to have that. But then again, we were talking about RAG being a hack, and hacks are very useful. I think we should be pragmatic when we do this, and I think we should use the tools we have and combine them in all the different ways we can. I think in the long run it will probably evolve so that it will become more contained. That's what I would say.

Jesper Fredriksson:

But, and we could talk about this just from an agent perspective, I think that's what you alluded to: when training a large language model, we're training on, I guess, the next word all the time, and that is not really good if you're thinking about actions. So maybe you need to do it another way, to be able to capture a better sort of trajectory of what the agent will do. Maybe there will be an agentic version of GPT-5, I don't know. That would be one solution, that they train on that.

Jesper Fredriksson:

Let's say, I think again, the Rabbit R1 is an interesting example in this respect. We still don't know exactly what it's doing, because there's no description of it, but they're hinting a little bit at what they're doing, and they call it a large action model. And I think there's something to that: if you want something to perform an action, as in their case, to call a number or whatever it is. They're training on interactions, if I understand correctly, interactions with computer interfaces, and then it's not so much about guessing the next word, but it's more about guessing the next action, the next point on the trajectory towards achieving a goal. If you're training on that, then I think you have a better chance of achieving something more agentic in the end. That would be answering your question, but not answering your question.
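Jesper's "large action model" idea, predicting the next step in a trajectory of interface interactions rather than the next word, can be caricatured like this. The policy table, UI states, and action names below are entirely hypothetical, standing in for what such a model would learn from recorded sessions:

```python
# Next-action prediction: given (current UI state, goal), emit the next action,
# and follow the resulting trajectory until the goal is reached.

# A stand-in for a learned policy: (state, goal) -> predicted next action.
LEARNED_POLICY = {
    ("home", "call mom"): "open_contacts",
    ("contacts", "call mom"): "select_contact",
    ("contact_page", "call mom"): "tap_call",
}

# How each action changes the UI state (the "environment").
TRANSITIONS = {
    "open_contacts": "contacts",
    "select_contact": "contact_page",
    "tap_call": "done",
}

def rollout(goal: str, state: str = "home") -> list:
    """Greedily follow the predicted next actions until the goal is done."""
    actions = []
    while state != "done":
        action = LEARNED_POLICY[(state, goal)]
        actions.append(action)
        state = TRANSITIONS[action]
    return actions

print(rollout("call mom"))  # a trajectory of actions, not words
```

The contrast with next-word prediction is in the training target: here the model is scored on whether the trajectory reaches the goal, not on reproducing the next token.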

Anders Arpteg:

Yes, so two thoughts on this. I think you're right, for one, that we need to move on from training on next-word prediction. I think also that having some kind of reasoning, and we even heard Sam Altman saying this: reasoning is the next part of GPT. Multimodal, yes, but reasoning is really the big thing. And to reason, you need something more than just scanning through the tokens in the context window and taking the next word. It needs to move beyond that. I guess that's what he's saying.

Jesper Fredriksson:

But he's also talking about, it's a little confusing, because he's also talking about inference-time compute. So we're doing more things at inference time. It seems that that's going to be the plan forward, all this talk about Q* and whatever. It seems like we're going to simulate a lot of different stories.

Anders Arpteg:

And it could be simply that they only do it at inference time and take steps and just see what happens. But then I think they need to change the training as well. I think that's what you're saying too, that they can't simply train on the next word all the time. They need to have at least a set of actions to take, and then use that reward, or that state, to train on.

Jesper Fredriksson:

That seems like an interesting direction to go into. Maybe it's difficult to do it. Maybe it's too hard to combine it with the kind of knowledge model that GPT-4 is Could be too slow to train actually.

Jesper Fredriksson:

Definitely. It could be that we're going towards two different models. If we're doing that, as you were saying, maybe it's a different thing. So I think we're going to see a lot of "whatever works". That's the lesson from ChatGPT: we just scaled it up and all of a sudden it works. The model is, in a way, super impressive, but yet sort of basic. It just works.

Henrik Göthberg:

Is there another topic here stemming from this question? It's like, when we now start operationalizing stuff, like the RAG example we took. From that perspective, for the first baby steps it's probably easier to get something off the shelf, off the ground, that is sort of one end-to-end solution, because you're pragmatically duct-taping together a shitload of different technologies, and this is now one duct-taped system. But when you get to robustness, and scaling this out, and actually having an architecture that allows us to, in a very fast way, change the actual agent objective, then most likely you get into the whole modularization-as-a-means-to-scaling topic, which kind of implies the JEPA paper at scale. So is this also a difference in maturity, where we are now, getting shit to work, versus where this needs to scale? What do you think about that? Both are then true, on different horizons.

Jesper Fredriksson:

Yes, I think that's exactly what I was trying to get at in the beginning. I think probably taking a more holistic approach is the best way to do it, and we will get there in the end. We will get to version 12, or whatever it was, like with the Tesla self-driving, in the end. But we have to find the next step from where we stand now. It's again engineering: we have something that works, we have a way forward from where we stand now, and we try to not do too much science at one point.

Henrik Göthberg:

Can we learn something about this trajectory by simply looking at the big data evolution, and how we started to build the pipelines in 2012, and how we are now trying to build something much more modular and distributed? As long as you only have one pipeline, you don't really worry about it, you build it as one system. The problem starts when you have hundreds of pipelines. Can we do some sort of analogy here?

Jesper Fredriksson:

Yes, I guess. I have no other answer to the question.

Henrik Göthberg:

Where do you stand on this? Because I think long term, I would understand it's like the brain, like the JEPA paper, but there's a pragmatism here.

Jesper Fredriksson:

That's what I'm thinking about when it comes to the brain. I mean, you have to think about evolution. It's been forced into this shape because of evolution, and without the constraints we don't know what it would look like. So there's no saying that the brain is optimal in any way. That's just the way it happened to end up, from a starting point, with evolution and the constraints of the skull. And I think people don't recognize that the human brain is so limited in so many ways.

Jesper Fredriksson:

There's a lot of constraints if you compare to a large language model.

Anders Arpteg:

Especially my brain at least.

Jesper Fredriksson:

It's like 30 watts or whatever it is that the brain runs on.

Anders Arpteg:

That's like amazing. When we actually get neuromorphic computers working properly, in an efficient and economical way, it will be a completely different game. But we're not there, and therefore I think we need to find other pragmatic engineering solutions to the problem. So I think it still will be modular in the beginning. Think also about OpenAI: if you just had one API, the only thing you could do is ask it in the prompt, like: how should I get from point A to point B? And then it could order you, you know, a taxi or whatever, and do so many things. I don't want it to do everything. I want to have a bit more control as a user. Do you really?

Jesper Fredriksson:

Once you trust the system, you will be fine with just taking anything.

Anders Arpteg:

But at least, you know, it could… Let's say I'm building an application. I mean, OpenAI just has the AI part of it; the application has a lot of logic on top of it. You want to give the user the ability, in the user interface, to choose, to give some preference to it.

Jesper Fredriksson:

I don't think so. I think that's what's beautiful about the Rabbit R1, that we're getting rid of all the user interfaces. It's just text, it's just speech. I don't want any options, I just want somebody to fix it for me.

Henrik Göthberg:

Yeah, maybe this is also interesting, right? Because, in the end, what is useful, what is adoptable, what is actually boosting our productivity or efficiency? And sometimes maybe the simple way wins. I mean, which is the most loved app in Sweden? I would argue Swish.

Jesper Fredriksson:

I would say Spotify, but Swish is probably up there.

Henrik Göthberg:

But you mean, like, I know those conversations, I've heard them: oh, we could put this feature in, we could put this feature in, we could put this feature in. And they're like fuck, no, let's keep it real to the core, what it's supposed to do, right?

Anders Arpteg:

But also I think, you know, there is a danger if the only way people build businesses in the future is simply to ask one big large-tech organization called OpenAI to do everything for them. They need to open up for people to build new businesses and be innovative, to have some freedom there to say how they're going to use the output from GPT-4 or 5. Otherwise that's going to be a dangerous thing.

Henrik Göthberg:

So I think, well, in the end you are moving away from differentiation. What is the competitive edge if everybody has exactly the same?

Anders Arpteg:

That's what I'm trying to say: it's dangerous if they do not have a large number of APIs that you can get a first answer from, and then you can choose, you know, for my use case I want to do this or that. I think it would be really dangerous if they make everything too end-to-end directly.

Jesper Fredriksson:

I mean you could hope that there would be more than just open AI. It would be scary if it's just one company, but I mean, historically it's usually one or two companies, maybe three.

Anders Arpteg:

But I think they should open up for being able to build a business model that has, you know, added value on top of it. If there's too little value you can actually add on top of it... What do you mean with added value?

Anders Arpteg:

I mean, if you're a hairdresser and you want some way to let customers choose their hairstyle. Let's say you just have an API and you ask it: this is a picture of the user, choose what hairstyle that person should have. Okay, you get some answer. If you have no way as a business, the hairdresser or whoever, to add some kind of special value on top of what that model produces, that's going to be really dangerous, I would say. Then they're going to choose everything for you. Businesses need to be able, either through just prompting or in some other way, to add value on top of the language model and say: this is what we specialize in.

Jesper Fredriksson:

You mean that there's got to be room for somebody else to do something on top of OpenAI, like there is right now? I mean, when we get to AGI, I don't know, but today a lot of companies use OpenAI and produce something on top of it.

Anders Arpteg:

Today OpenAI does such a small part, it's just the perception part. If they get increasingly into the control and planning parts, that I think would be dangerous.

Jesper Fredriksson:

It is scary, this thing.

Henrik Göthberg:

It's wonderful and scary, so it matters. How they start tackling those topics really matters.

Anders Arpteg:

I think it would be dangerous for OpenAI to do it, because they would lose money on it, so I don't think they will do it. I think they would earn more money if people have bigger freedom and ability to add value on top of their language model. I don't think they will go too much end-to-end. I think they will allow a lot of freedom on top of it so they can earn more money.

Henrik Göthberg:

We'll see. Do we know anything about the next release?

Jesper Fredriksson:

I heard speculations. It's obviously going to take some time to train, and we don't know when they started. We have some indications, because they released these new embeddings, which are probably part of the new model training. They're supposedly much better, especially multilingual, which is a good thing, and probably going to be even more important in the newer model. The speculation I heard was probably after the election, for obvious reasons. It would make sense, but I would hope they release it sooner, because that would be more fun.

Anders Arpteg:

I think there will be a connection to the election as well.

Jesper Fredriksson:

Do you think it would be after or before?

Anders Arpteg:

I don't know.

Jesper Fredriksson:

I think it would be after, if anything.

Anders Arpteg:

It's a risk.

Henrik Göthberg:

I can even see how they get bullied, lobbied into it: we have barely gotten control of what you have released in the past, so please wait.

Jesper Fredriksson:

Don't you think that this pause on training aged very badly? It feels like so long ago that they talked about how we need to halt all training of large language models. Was that last century, or when was that? That's crazy.

Anders Arpteg:

I have to quote someone from the conference yesterday. I laughed so much regarding what you just said. I'm trying to think who said it. He said basically we have a lot of success stories in Sweden.

Anders Arpteg:

We have the Spotifys, we have the Skypes, we have the Minecrafts, we have the Klarnas, the Truecallers, so many awesome startups that we built in Sweden. What do we have today? We are leading on the AI pessimists, the Nick Bostroms and Max Tegmarks of the world. Everyone started laughing like crazy. I love Max Tegmark, I must say, and Nick Bostrom as well, but in some way we're lacking some of the enthusiasts, some of the real visionaries. How can we take it to the next level and not be an AI doomer? It was just a funny story. Okay, time is flying away. It's a philosophical question, then.

Anders Arpteg:

Yes, yes. Should we skip the ethical and societal kind of implications thing? I think so right.

Jesper Fredriksson:

I think so. Specifically for agents, it's obvious what the risks are, and we don't know what to do about them because we don't have the agents yet. So it's hard to say.

Henrik Göthberg:

Okay, I have an angle on that question.

Jesper Fredriksson:

So we're not skipping it.

Henrik Göthberg:

No, we're not completely skipping it. For me, the philosophical question here is: we will clearly need to have engineering conversations on pragmatic ways of doing this, but what other conversations should we have in parallel? Or after, or before? What are those questions? We don't need to solve them, but still. I think the agent topic opens up a fundamental, deeper question about principal-agent theory. But I think it's out there.

Jesper Fredriksson:

To me it's a subset of the AGI dilemma. What do we do when we have AGI? What do we do when we have agents? I think it makes sense to think about what problems we will run into, but it's also something that is very hard to think about before it happens. It's like all these doomer things about AGI. It's easy to come up with stories of how it could happen in a very simplified and weird future, but it's so disconnected from reality that it doesn't really make sense.

Henrik Göthberg:

So let's go down exactly this path, because literally the doomers' hypothesizing doesn't help. So it's more about focusing on the imminent problems. That, I think, is the number one core answer here.

Jesper Fredriksson:

What are they? I think you could have an argument like: if you're seeing that this is definitely taking a wrong turn, then maybe somebody needs to step up and say that this is going the wrong way. I don't think we're there yet.

Henrik Göthberg:

That's why it feels weird. And this is also, if I connect it now to the regulation topic: we are trying to regulate something, and it's the wrong way around, right? You need to have regulatory frameworks, yes, but how can you regulate something when you don't really know yet where it's taking place? If you look at regulation that really helped us, and the order it came about in, take the aviation industry and all the regulation around flying a plane and how that works. Of course that had to be regulated, and of course that happened in parallel. But could you have invented that regulation in 1927? We are literally at the point in aviation where we did the first flight. It's ridiculous, right? So how do we tackle this now?

Jesper Fredriksson:

It is a deep rabbit hole, this, but you could say the same thing about the large language models, the GPT-4s, that are now being sued by the New York Times and so on. Because it happened so fast, and it happened behind the scenes, we didn't know it was being trained, we didn't know what it would end up as, and now we're standing here and we already have it in place, and it's a fait accompli. What do we do now? That is, I guess, the problem with this.

Henrik Göthberg:

And this is maybe the doomers' approach to this: that this is so fast now, and we don't want the wrong things to become a fait accompli.

Jesper Fredriksson:

So there is something to it. I don't know what it is, but I can agree that it is a problem that this is happening so fast, because that makes it harder to regulate. Once it's there, it is already a fait accompli.

Henrik Göthberg:

But there's another angle to this, then. Don't we need to innovate how regulation is done? Is it a fundamental problem, the time it takes to get bills and regulation in place, in relation to the fast-moving productivity frontier? With accelerating returns, this problem is just going to increase, right? So is the regulation process itself in need of innovation as well? I

Anders Arpteg:

don't know. Actually, another topic that came up privately yesterday was: why don't we have agile legislation? This is what I'm talking about now.

Henrik Göthberg:

we're getting there. This is exactly right. What would that look like? How can we have waterfall legislation in an agile?

Goran Cvetanovski:

way in the world.

Henrik Göthberg:

That's a very fundamental question.

Anders Arpteg:

I don't think it's impossible actually.

Henrik Göthberg:

No, if you think about how we can sandbox. We're talking about regulatory sandboxes, so we're going in this direction already. It means we need an iterative, agile view on how this is done, and we don't have that.

Anders Arpteg:

Actually, I think one really small step in this direction comes from the AI Act. The way they're trying to be agile, at least partly, is that they have appendixes. For the set of use cases, or for the techniques included in the definition of AI, they don't put it in the core text. They have an appendix, and the appendix can be updated more easily than the legislation.

Henrik Göthberg:

This is a profound conversation because actually the regulatory process is equally in need of innovation as the actual discussion here.

Anders Arpteg:

I think it would be a very interesting discussion to have. Okay, let's move to the final and, I guess, mandatory question that we need to have as well.

Henrik Göthberg:

It has become your mandatory question. I noticed this and I was thinking: I wonder if Anders is keeping some sort of tabs. Is he doing a questionnaire, a macro research study?

Anders Arpteg:

I'm trying to keep some statistics on it and I think you are, aren't you yes?

Henrik Göthberg:

I am, and I think it's very interesting how many doomers, how many.

Anders Arpteg:

Yeah, and it's very clear statistics so far. Anyway, yes: if or when we have AGI, what kind of world do you think we will have? Will it be the dystopian nightmare, like the Terminator and the Matrix, where the machines are killing us all? Or will it be more of the other extreme?

Anders Arpteg:

The utopian paradise, where humans are free to pursue their passions and creativity. We don't have to work 40 hours a week anymore; we can perhaps go down to 20 hours or less and still have all the luxury and abundance, as Elon Musk called it, that we need. Where do you think we will end up, and do you think we will end up there?

Jesper Fredriksson:

I have to start with this: the last time I was speaking publicly about it, I ended up saying something like "saving the world", and then I got the remark afterwards from somebody that I should have said "sex, drugs and rock and roll". So that's probably the best way to answer this question. That's the utopian. That's what will happen.

Henrik Göthberg:

Not very PC I love this answer.

Jesper Fredriksson:

Anyway, it's of course a very philosophical question. I think it's about looking at yourself in the mirror. Who are you? What is humanity? What will happen if we get this kind of windfall that falls into our lap? It's very existential: what do we want to do when we have everything and can decide? That, to me, is why I'm not a doomer, because we are in control. We are the ones doing this. It's not the Terminator coming to kill us all. We can, of course, choose what to do with it, and let's just hope that we choose wisely.

Henrik Göthberg:

I was about to use exactly those words. Let's choose wisely.

Anders Arpteg:

Okay, but if you still were to try to put some probability on it: what's the chance that we have machines that create a new coronavirus that kills us all, or someone abusing AI to create a fleet of drones that uses nuclear weapons to kill us all? Or are we going to have an AGI that solves our future energy needs, free energy for everyone, solves cancer and all the medical problems we have, and gives us abundance of everything?

Jesper Fredriksson:

We've had weapons of mass destruction for a long time, and we seem to be doing fine. I think there's no difference this time. I'm afraid of what some people will do regardless of whether we have AI or not. And yes, that risk becomes a little bit bigger with AGI, because it could potentially be within reach for more people than, let's say, nuclear weapons. And yes, it will be abused by people, but there will also be people fighting back. I'm positive about this. I don't think it's going to be as bad as the people guessing the worst scenario; it's hard to get that right. Anyway, I don't think it's that different this time. I think it's like all the things that happened before, even though this is maybe the biggest invention ever.

Anders Arpteg:

Maybe, maybe it's not certain.

Jesper Fredriksson:

No, I'm not certain about it. I think it's definitely one of the biggest inventions.

Anders Arpteg:

If we compare it to Max Tegmark's comment that his son will not survive because AI will kill him.

Jesper Fredriksson:

I didn't hear exactly what he said. I'm not going to comment directly about it, but no, that's not what I think. I have kids. I'm not afraid for their sake. I think they're going to do just fine.

Henrik Göthberg:

But Max Tegmark, in his book Life 3.0, tried to, quite well I think, paint a picture of the different scenarios and the different arguments. And one of the key discussions, which has quite a big impact on how scary this is: do we believe that AGI will be a scenario where it comes out of nowhere and then goes super fast, so the whole event, when it happens, is like the flick of a switch and plays out in days? Versus a scenario where this is going to be evolutionary, and we're almost going to look back at it and say: well, I guess we kind of stepped over that definition last year; I didn't feel it, but we did. So for me that is a sub-question here around AGI, which then takes us into different types of risk scenarios. Do we think it's going to happen super fast when we reach AGI? Will that be a moment in time, a 7-11 moment, or will it just be evolution?

Jesper Fredriksson:

I think that's a good question. If you had asked me 20 years ago, if you had shown me this 20 years ago, I would have said that this is AGI, that we've reached the end point. And now the goalpost has moved so far already; it's something completely different. So my guess now is that it's going to take longer than we think to reach the final end point.

Anders Arpteg:

What's the estimate if you were to guess a year?

Jesper Fredriksson:

We talked about this at some point, and I still think that estimate wasn't too bad for one definition of AGI. The way I think about AGI is economically profound AGI: when basically most work becomes, let's say, 100x less expensive. So everything you do, AGI can do, more or less, and faster and cheaper. Then it will have a profound economic and societal impact, and that I think will happen relatively quickly. But then, what was it we talked about?

Anders Arpteg:

Like Kurzweil, 2029?

Jesper Fredriksson:

Something like that. I think maybe before that, depending on exactly where you draw the limit, on what you require of an AGI. To take what is, for me, the most common job in Stockholm, a programmer: when will AI do better programming than I do? I think that's probably going to happen much faster than that.

Henrik Göthberg:

So it becomes this topic, then: is AGI even a good definition? We're also talking about superintelligence, and we have, is it the Google paper, that uses a five-levels argument, which is probably the best definition we've seen so far. And we are moving the goalpost; I think that's a good way of looking at it. In my opinion, we then need to prepare ourselves for the risk inherent in the evolutionary approach, because that is essentially what is playing out in front of our eyes right now. Even Sam Altman is changing his rhetoric, I think, towards this.

Anders Arpteg:

I still stand by what I've said all the time. I just can't wait for AGI to happen, because I feel that will be a safer place to be. What I'm really scared about is the AI of today in the hands of bad people.

Henrik Göthberg:

That's the scary part. But what do you think? Do you think there will be a moment in time, this flick-of-a-switch moment where something accelerates super fast, the fast takeoff thing? Do you see the AGI scenario like that, or do you believe more in an evolutionary scenario?

Anders Arpteg:

I think if we do nothing, it can certainly be a fast takeoff. I think we need to make sure it doesn't happen. I think a fast takeoff is a really nasty scenario.

Henrik Göthberg:

That is the nasty scenario, right?

Anders Arpteg:

Because then you get a concentration of power that is so dangerous. I think we will control it; it's the super fast takeoff that you can't possibly control.

Henrik Göthberg:

That's the fait accompli you don't want, I guess.

Jesper Fredriksson:

It depends. Let's say that we let the AI program itself and we get to a point where... I mean, this is the classical argument. I think that would be one way of thinking about the fast takeoff: letting it out of our control. I think we're not going to do that, because we're not stupid.

Henrik Göthberg:

We already understand that risk.

Jesper Fredriksson:

Yeah. So I don't believe that will happen, but it could happen, and then it could get out of hand. I can't see how this fast takeoff would happen today. I don't see a scenario where that could happen, but I'm sure there are scenarios. My imagination is limited.

Anders Arpteg:

The crazy thought is that they actually have a Q-star implementation, that is, an AGI, in the lab already now, a prototype that is potentially doing this. If they just put it out there, it could actually be the thing. Wouldn't that be the fast takeoff, then?

Jesper Fredriksson:

But how would that happen? Let's say they have this prototype. Then they can do something right now, but they wouldn't, because they're not stupid. Exactly, okay, they could release it, but what would happen then?

Anders Arpteg:

Say somebody gets into their lab and steals it.

Jesper Fredriksson:

Almost nobody else can operate it anyway. There are very few places where you can scale it to the level where it becomes something really big. So I have a hard time thinking up scenarios where this would happen. It's a little bit like the Max Tegmark stories, which to me are just too far away. I don't have that kind of imagination.

Henrik Göthberg:

I wish I did. Still, science fiction is a good bedtime read. The evolutionary approach is much easier to grasp, and history has also played out that way.

Jesper Fredriksson:

Yes, that I think is a good argument. Usually things happen the same sort of way they always did. I think it's useful to think about the industrial revolution and those things: how did things happen then?

Anders Arpteg:

Things move slowly. That's a good thing.

Henrik Göthberg:

And still, of course, we will say it's super fast compared to then, but it's not fast fast. Humans will still have control, and actually humans will set the pace in this. We can't forget that. In the end, we will be limited by our capacity to imagine this.

Anders Arpteg:

Unless we have a new rolling ship in our way. Well, we need to break it here. Thank you so very much, Jesper Fredriksson, for coming here and discussing deep philosophical topics, as usual, and I hope we're going to continue speaking even more about that after the cameras turn off. Thank you very much for coming here. Thanks for letting me be here. Thank you.

Henrik Göthberg:

Jesper.

Voice Assistants for Visually Impaired Individuals
Science and Engineering in AI Discussions
Brain Imaging to Data Science at Volvo
Using Slack for Natural Language
Autonomous Agents and Their Generality
Generalist Autonomous Agents and Ethical Considerations
Exploring Sleep, Neuralink, and AI Driving
State AI Collaboration for Language Models
Future of GPT-5 and End-to-End Models
OpenAI Language Model and Business Innovation
Regulation and Future of AI
AGI and the Potential Risks