
AIAW Podcast
E153 - Mastering AI Prompts - Magnus Gille
Join us for AIAW Podcast Episode 153 as we sit down with Magnus Gille—Product Owner AI Enablement at Scania and Swedish Champion in AI-prompting—to explore his evolution from AI enthusiast to Prompt SM victor. Discover how persona creation and iterative refinement propelled him to the top, what it’s like engineering AI under tight time constraints, and how an AI-first future may reshape our world. From the challenges of rapid innovation to the tantalizing promise of AGI, this freewheeling and in-depth conversation reveals powerful prompting techniques, offers a fresh perspective on AI’s potential, and might just transform how you think about the technology shaping tomorrow.
Follow us on YouTube: https://www.youtube.com/@aiawpodcast
Anders Arpteg:It was interesting, you know, how valuable are prompting techniques, in some sense? And you did some kind of test for that, right?
Magnus Gille:So I did a deep research run, and I wanted to find out. Since I'm the Swedish champion in prompting, I wanted to know: the techniques that I add to my prompts, are they actually helping, or are they more like best practices? So I used Deep Research in ChatGPT and tried to find out what the recent research says. What types of techniques are actually helpful, and what types are more akin to throwing salt over your shoulder or knocking on wood, and stuff like that?
Anders Arpteg:And did you come to some conclusion? Could you see some techniques that actually have a clear added value?
Magnus Gille:Yeah, and I think the one with the clearest added value is asking the model to always ask you a question back: here's the thing I would like to solve, but please ask me if you need clarification, please ask me if there's any information missing, please ask me about the context I have not provided to you. That was actually the one that seemed to give the most value. Then the results were a bit more uncertain for things like whether you should be polite or commanding. Sometimes politeness actually helps, but not for all types of questions; sometimes being commanding helps, but not for all types either. It kind of evens out over a large set of different use cases.
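The "ask me questions back" technique he describes can be sketched as a small prompt wrapper. A minimal illustration in Python; the function name and exact wording below are our own, not taken from the episode:

```python
def with_clarification_request(task: str) -> str:
    """Wrap a task description with the 'ask questions back' technique:
    the model is told to ask about anything missing before answering."""
    return (
        f"{task}\n\n"
        "Before you answer: please ask me if you need clarification, "
        "if any information is missing, or if there is context "
        "I have not provided to you."
    )

# Hypothetical usage
prompt = with_clarification_request(
    "Help me draft a short history of the families in my neighborhood."
)
print(prompt)
```

The point of the pattern is simply that the clarification instruction travels with every task, instead of being remembered ad hoc.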
Anders Arpteg:It sounds like a proper research topic. Someone should do a proper research article about this and try to see what the added value of different techniques is, across many different contexts: for coding, for papers.
Henrik Göthberg:What did you distill it down to yourself? When you go into deep research mode for prompting, how would you summarize deep research prompting, and in what way do you think your approach is slightly different from other prompting, or is it the same?
Magnus Gille:Yeah, I mean, it's kind of the same. The thing is that, interestingly enough, if you do a deep research, at least in ChatGPT, it has this kind of built-in function where it asks you questions. So regardless of how specific you are and how much context you give with the original question, you will always get questions back. So it's one of these best practices that is apparently part of the mode of the program itself.
Anders Arpteg:So basically, the technique that you found the most useful is more or less built in when you go into deep research mode.
Henrik Göthberg:Can we summarize that? I see that too. It's almost like in deep research mode, the machine goes more into a scientific mode, so it approaches you by deconstructing the question, so to speak. Is that a fair summary, that it approaches your question almost like a scientist itself, compared to when you chat with GPT-4o, which can seem like a happy servant: it doesn't really ask, doesn't really care, it just blurbs something back.
Magnus Gille:Yeah, you could kind of think of it that way. I mean, it asks questions in a way that conveys that it needs a very detailed and nuanced understanding of the question. It's not just: you ask a question, here's some answer back. It's really: are you thinking about it in this way or that way? This angle or that angle? Should I focus on recent material or older material?
Anders Arpteg:So a bit more interested in the objective truth, perhaps, if we call it that. And if you don't use deep research mode, or with early versions of LLMs, they may simply just want to please you in some sense.
Magnus Gille:I mean, I would be careful about using words like intent and aim and preference. At the end of the day it's just the programming of the thing and the scaffolding around it. I mean, someone has made a design decision that for the deep research mode, we put into the prompt that it should ask questions.
Henrik Göthberg:So it's not an inherent desire of the model. No, it's a different scaffolding, because it has a slightly different design.
Anders Arpteg:I'm going to add this as a question: do these kinds of models have feelings and intent? Oh nice. I think that could be an interesting topic towards the end, when we get a bit more philosophical. You also mentioned your wife did some research as well, right? What was that about?
Magnus Gille:She's interested in local history, and we live in a neighborhood in a small village that has old houses, so we have quite a lot of local history to dig into. She did a search using deep research, where she asked what types of families have lived in our neighborhood over the last 200 years and what they have worked with.
Magnus Gille:And what happened was that she got these kinds of questions back: okay, what are you interested in? She answered them, and then she got the report, and it looked kind of nice. But then she looked at the sources for the report, and the sources contained both public archives and things more like blog posts or PDFs provided by the local history society, hembygdsförening in Swedish, exactly.
Magnus Gille:And then one of the sources was Radio Islam, and she reacted a bit: I've heard of that before, I'm not quite sure about this source. And since you're sitting down with an AI, you can just ask it: this Radio Islam thing, what is that? Apparently it was a very anti-Semitic local radio station that was on the air in the 80s or 90s, something like that, and then they transformed their activity into a web page. And as ChatGPT itself claimed, it's one of the most anti-Semitic places on the internet in Sweden. It said so when you asked explicitly about that source.
Henrik Göthberg:This then sparks the key point: with deep research also come the guardrails. What other features do we need when we do deep research to vet, so to speak, which sources to use and not use? What was your thinking here, and what was your reflection in the end on this?
Magnus Gille:Yeah, I mean, my initial reflection, because I'm not a big fan of antisemitism, of course, was that I felt uncomfortable. Then I went in and checked that there was nothing antisemitic in the report; it was just a very general description of people who have lived in our neighborhood. The specific factual claim collected from this webpage seemed to check out, because I could look it up and verify it from other sources as well. So that is of course an interesting philosophical and ethical question: if you have a piece of information which is factually correct, should you still have that as part of one of these deep research
Henrik Göthberg:searches, or should you filter out sources?
Magnus Gille:Yeah, exactly.
Henrik Göthberg:Even if the fact is correct.
Magnus Gille:Exactly. You should filter out material which is not correct, presumably, but it's a question whether you should filter out the source as well.
Anders Arpteg:So sources that are deemed to be antisemitic, for example, as ChatGPT said this source was, but they still use it as a source for facts, in some sense.
Magnus Gille:The funny thing is the next question I asked ChatGPT: okay, that doesn't sound too good; is that a usable source? No, absolutely not. You should never use this source, you cannot trust it.
Henrik Göthberg:So, in this sense, deep research finds sources, and it turns out that the facts they use in the report are correct. So all of this is good. But the model itself, when asked, would say we should not have this reference or this source in the report.
Magnus Gille:I mean, this is of course a design decision from OpenAI. There are several things you can argue here. On the one hand, you could argue that if you have a page with a very skewed political leaning, you will on average have a harder time finding unbiased and correct information on that page. So maybe it's just good practice to say: there may be some facts here which are correct, but given that the percentage of those is low, we should not take the risk. That's one thing you can argue. That's a good one.
Magnus Gille:The second one is, of course, like the, the business model, both from a attention perspective but also from from a monetary perspective. Do do we want to promote this type of sources in reports which can then be handed around and people can read it and say, okay, here's a source, and here's a source, and I guess I should trust them on an equal basis, because they are all in this kind of nice report.
Anders Arpteg:Would you have been in favor of it if your name, I mean, I know you wouldn't be connected to this source, but if some name were connected in this source and it's being returned by ChatGPT, would you be okay with that? Reporting your political interests, for example?
Magnus Gille:If I would be okay with having my political interests reported back by ChatGPT?
Anders Arpteg:Yes. If I do deep research in ChatGPT, asking about something from your local neighborhood, and I get your name back, and it also says: you know, this person, Magnus, has these kinds of political interests, for example.
Magnus Gille:on the one hand, given that I'm like a middle-aged white guy, I I'm in the safest position ever to like it's no worries, but I could find that it would be deeply problematic for like a more general part of the population. It's like I would maybe not be that scared for my or that upset for a personal level, but like on a system level, I would find that very disturbing and it goes all the way into something we need to talk about later.
Henrik Göthberg:It's about the topics we should not allow people to do deep research on. Like, for instance: I want a detailed report on my neighbor's political leanings, or whatever. Is that
Anders Arpteg:okay? Yeah, I mean, it goes back to GDPR, of course, and this kind of sensitive personal data, which should be processed in a much more careful way.
Henrik Göthberg:So from a GDPR perspective that should be impossible.
Anders Arpteg:Yeah, so many interesting topics, and we're going down rabbit holes very quickly. But before we go even deeper into the wonderful art of prompt engineering, let me first welcome you here, Magnus Gille. You're the product owner of AI Enablement at Scania, right? Yeah, that's right. But even more so the reigning Swedish champion of AI prompting. Is that the proper way to phrase it?
Magnus Gille:Yeah, I always stumble when I try to formulate how I should refer to myself in that context. But yeah, let's go for that one.
Anders Arpteg:Yeah, awesome, we need to unpack.
Henrik Göthberg:What was the Swedish championship all about? And then we also need to unpack a little, at some point, AI enablement at Scania, which is a very large place, and there are many people working with AI at Scania, actually, so we can also think about the context in which you are working with AI enablement.
Anders Arpteg:That would be fun to know a little bit about. But before we go into the super interesting topic and all become experts in prompt engineering, hopefully, or at least try to, let us know a bit more about you. Who is Magnus? What's your background, and how did you get into prompting?
Magnus Gille:Yeah, right. So, I'm Magnus Gille, I live in Mariefred, and I have a family, two cats. Career-wise I have mostly spent time with big industrial Swedish companies, so I had a stint at Ericsson for seven or eight years, and now I've been with Scania for around eight years. I had an extremely short period at Northvolt as well, but that was very, very short.
Anders Arpteg:Well, good for you, I guess.
Henrik Göthberg:And I have some Scania friends that are still there wherever it's going now.
Anders Arpteg:Cool. And you've been at Scania for some time; perhaps you can just speak about your current role at Scania as well.
Magnus Gille:Yeah, absolutely. Since a little more than half a year ago, I am the product owner for a team called AI Enablement. My team is part of a larger organization where we are thinking very deeply about, and trying to enable, developers, so it's all about software developer enablement. We have a vision within our team that we would like to enable software developers to be a thousand times faster. Yeah, I mean, aim for the sky, you know.
Magnus Gille:Organize for the trajectory, exactly. But I think the thing here is that when we say that, we say it in the context of what we want to enable: a software developer having a nice idea of something, writing it in code, deploying that code on either a real live vehicle in the rolling fleet or a virtual vehicle, of course in a safe and secure, over-the-air software update manner, having the vehicle render data from that experiment, and then getting that data back the very next day. Because that type of loop is currently quite long in the vehicle industry. I mean, we're a heavily regulated industry; there are a lot of processes and guardrails and a lot of things you need to do.
Henrik Göthberg:And to get a bit more nuance on Scania, which is a global truck manufacturer in the transport ecosystem. We have the industrial system of Scania, the commercial system, the supply chain, and then ultimately, I assume, R&D, which is a very strong, very proud part of Scania, one that has done so many amazing things: modularity, world-class combustion engines. So I assume you are part of R&D, and you're looking at software development very much in relation to the truck and the ecosystem around the truck.
Magnus Gille:Yeah, that is absolutely correct. So the scope here is software developers developing software both for applications that will be deployed on board the vehicles, so embedded software, but also quite a lot of connectivity stuff that is deployed in the cloud, and over-the-air updates. Yeah, and I guess the whole automotive industry is talking more and more about the software-defined vehicle.
Henrik Göthberg:Is this something you relate to, the software-defined vehicle, as part of this vision of why we need to be so good at software and do it a thousand times more effectively?
Magnus Gille:Oh yeah, absolutely. That's very much top of mind among me and my colleagues within Scania. We very recently went public that we have a collaboration with a company called Applied, who we will work together with to define and refine what we mean with software-defined vehicle.
Anders Arpteg:Is it the Applied AI in Germany, or what company is this?
Magnus Gille:No, applied intuition. I think they're based in the US, okay.
Henrik Göthberg:So, software-defined vehicle, for someone who's not into automotive. In one way, the traditional automotive industry saw it with Tesla; it's software on wheels, as the joke went. But could you elaborate on what we mean by this in the automotive industry, and why it is discussed so heavily?
Magnus Gille:Yeah, exactly, and I think you're hitting the nail on the head when you refer to Tesla, because that was really the company that brought this way of thinking about a vehicle into automotive. Previously and historically, if you bought, say, a car, you would expect that car to be the best it would ever be when you buy it, and then it would just degrade over time.
Magnus Gille:The brakes would wear down, the seats would be less comfortable, and so on. But if you're buying a new mobile phone, you're not really expecting it to get worse over time. Of course, there will be a new generation of hardware platforms with more memory and so on, but you're expecting your phone to get new features, new applications, new things for you to do. You're expecting the value of that phone to rather increase over time. And I think this is one of the aspects and one way of thinking about the software-defined vehicle: you're buying this big and very expensive thing, but it's an investment that might even increase in value, because the manufacturer will have the ability to create and deploy new features on a regular basis.
Henrik Göthberg:Yeah, and I think you can even push it further. With AI, the productivity frontier and the intelligence in our ecosystems keep increasing, so we can't have assets stuck at the intelligence level of 2025 that haven't kept up by 2030. And the phone, I think, is a perfect example. What defines my personal phone is, yes, the hardware, but then it's the apps that I choose to use: a Spotify app, a weather app, a stock app. So in the end, my smartphone is more than the hardware; it's what I decide to do with it.
Anders Arpteg:Can you go a bit more into what do you actually do to enable software engineers in Scania?
Magnus Gille:All right. So if I zoom back from the big organization I sit in and go back to my team, it's AI Enablement. We do two things. First, we try to be ambassadors of using AI, being an AI-first developer and all that that means: trying out new tools as soon as they become available for software development within the company, recommending tools which are not currently available but maybe should be, best practices, and so on. We're also building applications that aim to improve the life of software developers. And one concept that we talk quite a lot about within the organization, not only my team but my part of the organization, is developer experience. Developer experience is a notion that has become quite popular in the general software industry. I mean, I'm sure you've heard about it, DX. Yeah.
Magnus Gille:And you can define it in a few different ways, but the way we define it is in a very general sense. We're talking about developer experience when it comes to the tooling: are the tools being made available to the software developers good, bad, hard to use? We're talking about processes and methods and how you work on a day-to-day basis. We're talking about culture, like feedback: is it easy to raise your opinions, are you seen, are you heard, do you feel happy when you go to work? So it's quite a broad topic that we want to capture. And the thing we found, at least if you look at the research and literature, is that this seems to be a key concept: companies that have high developer experience have happy employees, and that's great, because then I get happy colleagues at work. But it also leads to measurable improvements in code quality and the speed with which you can develop things. So it seems to be a real win-win situation.
Anders Arpteg:So improvements both in terms of speed and quality.
Magnus Gille:Yes, exactly. Good things all around. And how can you then measure it? Because if it's a good thing, you should probably try to measure it, understand where you're at in the organization, and see if you can take action where you need to improve, or continue doing the things which are promoting developer experience. The thing is, that's not so hard to do: you just go out, sit down with developers, and ask them a bunch of questions, interview them about their daily life. But that's not a scalable solution, right? I mean, if we look at Scania and the Traton Group, we have hundreds or even thousands of developers, and my team is quite small. I don't have time to go out and sit and talk with everyone.
Magnus Gille:This is all to say that a thing we have built, and are now deploying internally at Scania, is a chatbot that you can configure to have a chat-based conversation about different aspects of developer experience.
Magnus Gille:I mean, it depends on context. Maybe you are in a team building a specific product and you want to ask the users of your product how they feel about it. Maybe you are in a larger organization and would like a discussion and evaluation of the processes used within your organization, or team spirit, or any of those things. So we have constructed a tool that helps you, in an automated way, configure a conversation that focuses on exactly the thing that makes most sense to you here and now. It will be generated by AI, you get a link, you can send it out and share it with the developers, and they can engage in a chat-based conversation where they're asked questions, checking things off: okay, we will have a conversation about processes; tell me about which processes you face on a daily basis, which ones are good, which ones are bad.
Anders Arpteg:I see. So it's trying to measure and get some kind of sense of the current state of their experience.
Magnus Gille:Just so, and in a very scalable way, because this link I can then share with five colleagues, or 50.
Henrik Göthberg:Super interesting. Two quite separate questions. Just to frame the scope: when you say software engineer, what are we referring to? How do we understand software engineering, data engineering, AI and machine learning engineering? Is it an engineering experience, or are you only catering to one subset of the engineers working in the digital space, so to speak?
Magnus Gille:All right. I see the product as catering to a very wide definition of software engineers. I mean, all of the things that you said, and probably more roles, are things I would put into software engineering. If you are working with software engineering in a big enterprise such as Scania, there are of course a lot of different roles involved, and many of these never even touch code: project managers, scrum masters, product owners, different types of roles associated with different agile and lean implementations of processes. So all of these, I would say, are also in the scope of software development.
Henrik Göthberg:So maybe we should talk about this, then, as the end-to-end product team, where the product team has a 360-degree view of engineering and roles in relation to the problem they're solving.
Anders Arpteg:Should we go down this rabbit hole? I'm thinking we should get to the main theme soon; I think we could get stuck here for an hour talking about software engineering and AI enablement, and it's a super interesting topic. But I do think many people, including myself, are here to hear more about your prompt engineering expertise. Perhaps you can start with a bit more of an intro: how did you get interested in prompt engineering? Thanks, Anders. Yeah, right.
Magnus Gille:So, sometimes when I talk to people about AI, they will proudly proclaim that they've been working with AI before it was a thing, that they've already done this for 10 or 20 years. Sadly, I cannot claim that long a history of working with generative AI, or AI more broadly. But I, along with almost everyone else in the world, got very interested when ChatGPT was released.
Magnus Gille:So that was, what, three years ago now? Two and a half. 2022, right? Yeah. And at that point I started to use the tool, and it had a lot of limitations. But the thing was that some people saw these limitations and said: I tried this thing and it didn't work, so I gave up; the tool sucks and generative AI will never become anything. And me, either I'm very stubborn or maybe I'm not so certain of myself, because I always felt: maybe it's a me problem, maybe I'm just not phrasing this in the correct way to help the AI solve the problem for me. So I did a lot of thinking and a lot of experimenting. A lot of it failed, of course, because the early models were not that good, quite frankly.
Anders Arpteg:And it's okay. So you get started back in two and a half years ago and you could see that it was hard to make them work properly at that time. Is that when you tried to see how you could improve your own techniques, so to speak, to make it work more properly?
Magnus Gille:Yeah, exactly, and I tried to read up on what types of things actually work. At that point in time you had all of these techniques; some of them were actually helpful, and some of them were maybe a little bit of superstition. One of the things that was popular was to introduce this kind of chain-of-thought thinking: you ask a question, but in the prompt you ask the AI to think it through step by step, report back each step, and reflect on that before saying the next thing.
Anders Arpteg:And it worked really well, I guess even back then, to try to do that.
Magnus Gille:It worked. I mean, really well is maybe an overstatement.
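The chain-of-thought instruction described above can likewise be sketched as a prompt wrapper; the wording below is our own illustration, not a prompt from the show:

```python
def with_chain_of_thought(question: str) -> str:
    """Add an explicit chain-of-thought instruction: think step by step,
    report each step, and reflect before moving on to the next."""
    return (
        f"{question}\n\n"
        "Think this through step by step. Report back each step, and "
        "reflect on it before moving on to the next one. Only then give "
        "your final answer."
    )

# Hypothetical usage
prompt = with_chain_of_thought(
    "A truck drives 1,200 km per day and uses 0.3 litres per km. "
    "How many litres does it use in a week?"
)
print(prompt)
```

As discussed, newer models largely internalize this behavior, which is why tricks like this mattered more for the early models.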
Anders Arpteg:Okay, and then you continued. I'm very eager to get into how the actual championship worked and how you got into that. But before we go there, could you reflect a bit more on how ChatGPT, which I think you specialize in, has evolved since 2022? Oh, it has been immense.
Magnus Gille:I mean, I remember specifically the step when they released GPT-4 as a model, compared to GPT-3.5. Things that were not possible before all of a sudden just got solved by themselves. A lot of prompt engineering tricks I could just forget about, because the models were getting that good. I mean, the 4 version was a big leap, right?
Henrik Göthberg:Which one was the biggest leap, in your opinion, since you started? I guess we started from 3.5, right? Yeah, so which one do you think has been the biggest leap so far?
Magnus Gille:I think 4 was for me the biggest wow factor, because 4 was the one that made things possible that were not possible before. I remember that when I toyed around with 3.5, it was always: where is the trajectory here? Is this as good as they will get, and will the capabilities kind of flatten out, so it will be more gradual improvements, or will it really become dramatically better with every new model they release?
Anders Arpteg:And then they had 4o as well coming in, a more omni version, a multimodal version. Did that make you excited, or were you more interested in the text domain, so to speak?
Magnus Gille:I think 4o, when they introduced that one, introduced chain of thought into the model, right?
Anders Arpteg:That was a reasoning model. Or, I think, was it the image version, the omni? So it had image understanding, if I'm not mistaken.
Magnus Gille:You mean the recent one released now.
Anders Arpteg:No, 4o, which is at least a year ago.
Henrik Göthberg:The O, I think, was the Omni, so sort of multi-modal, and now the recent one is 4.5, I guess, which is more than reasoning, I guess.
Anders Arpteg:Yeah, so many things have happened there and it's exciting to see. But let's perhaps move into the championship, and I think a lot of people don't even know that it did exist. Can you just give a bit of a backdrop to how did you get in contact with the Prompt SM Swedish Championship?
Magnus Gille:Yeah, right. I mean, this was back in November last year. I had no idea there was such a thing as a prompt championship. I got a message over Teams from a colleague who said: look, here's a prompt championship; since you work with AI enablement, maybe that's something that would be interesting to compete in. And I thought to myself: you cannot compete in prompting, this doesn't make sense at all. But then I thought a bit more and said, okay, I can give it a try. So I clicked the link and was taken to a webpage where you were tasked with doing, I think, one or two assignments. These were the trials, publicly available to anyone who clicked that link.
Anders Arpteg:Some qualification Exactly.
Henrik Göthberg:More specifically, what were the assignments, and what did you do that obviously then got you into the championship?
Magnus Gille:Let's see if I remember these assignments. I mean, the assignments were of the nature: here is a scenario, here's the thing you need to do, and the task is to write a prompt that will, when given to an AI, help you do that. If I remember correctly, one of them was that you were applying for a job and had been called to an interview, but before the interview you would like a rehearsal with an AI. So you should prompt the AI to interview you for a specific job, and then, at the end of the conversation, it should summarize the good things, the bad things, and what you should improve for the real interview. So did you basically send a transcript of the conversation you had?
Magnus Gille:No, no, that's the thing. The thing you submitted was the prompt, not the result, not the output. The prompt was the thing. Yes, yes.
Anders Arpteg:So no refinements.
Magnus Gille:I mean, you could refine as much as you liked, because on that webpage you had access to Gemini. So you could write your prompt, try it, see the result, and then iterate on the prompt until you were happy, and then send it in. Okay, awesome.
Anders Arpteg:And then you qualified, basically, and you came to, was it the finals, or what was it called?
Magnus Gille:Yeah, yeah, then it was the final. I got a mail just before Christmas, a really nice early Christmas gift, from a lady working at Google, saying: okay, you're part of the final, we're happy if you can join us. And I was of course very happy to join them. So the final was in the beginning of February, I think, in Google's office.
Anders Arpteg:And this is organized by Google for people that didn't know that.
Magnus Gille:Yes, Google, and also Digitalidag, which is connected with Post- och telestyrelsen, the Swedish Post and Telecom Authority.
Henrik Göthberg:And how was it organized? Was it like a hackathon, where you all showed up and had to complete the finals on site? How many were in the finals? All of this.
Magnus Gille:Yeah, right, okay. So I showed up and met the other finalists, 20 in total, and we were given three assignments. The setup of the room was that we each had our own little laptop and were seated around a big U-shaped table, and there we got to work on the final assignments.
Anders Arpteg:A single one, or multiple ones?
Magnus Gille:It was three of them. Three, yep.
Anders Arpteg:Can you give an example of what the questions or assignments were?
Magnus Gille:Yeah, sure, let's see. In the first one, the scenario was that you're supposed to give a presentation on space at a big public venue, and it should be easily accessible, because in the audience you have both technical and non-technical people who know about space. It's also broadcast. Now you need to prepare for this occasion, so write the prompt that will tell the AI how to help you prepare.
Anders Arpteg:Awesome. And what was your thinking here? Can you go a bit more into how you went about creating a great prompt?
Magnus Gille:All right. So the first thing I always do when I face a problem is reach for an AI. Part of the competition was that we were allowed to use the public, not-logged-in versions of different AI tools. So we could use, like, ChatGPT, or, Grok was not released, but Gemini, so that was fair game. And a thing I maybe didn't mention is that there was quite a tight time constraint: we had about 10 minutes per assignment, and 10 minutes is not very much; time flies, especially when you're really focused. So the first thing I do is reach for one of the AI tools, copy-paste the assignment, and say: help me write the prompt. That's the first thing.
Anders Arpteg:Okay. So you tried that first. Yes, okay, and what do you do then?
Magnus Gille:Right, and then I get the prompt. The prompt was a little bit too long, we had a constraint on the number of characters as well, so I had to ask the helpful AI to make the prompt shorter. After that I had something I could manually iterate on, because I knew there were some things missing that I prefer to have in my prompts, so I could add them.
Anders Arpteg:So basically it's kind of meta-prompting. You were asking ChatGPT to help you with the prompting itself.
Henrik Göthberg:Yes. And what was the vision, between you and ChatGPT, of what the solution is? In the end, the prompt that you created and sent in, what did it do? What did it render?
Magnus Gille:Yeah, right. So the thing it rendered: I mean, I used a few techniques which are kind of best practices, like providing a persona in the prompt. For the space presentation, I wanted the AI to adopt the persona of a well-known physicist who is also a very good public communicator, and then I added a lot of other contextual stuff that would probably help as well.
Anders Arpteg:And I guess that is important right To be very specific and have a clear context.
Magnus Gille:Yes, context is very important.
Anders Arpteg:How do you do it in that case? If you just try to give some example, how do you do it for the space presentation?
Magnus Gille:For the space presentation, I mean, it's helpful context that it's an international presentation, so it needs to be in English; I added specifically that the advice needs to be helpful in English. Since it's a big venue with a lot of people, I asked for advice on how to move on stage, how to handle stress, and how to act in front of such a big audience. I think I mentioned that it's streamed as well, so what should I think about regarding microphones and cameras and things like that?
Henrik Göthberg:So in the end, really good prompting is also about the context you can give: what is the feeling, what is the experience, what are the risks, what is the stress, la la la. All of this helps a sharp prompt give you something that is useful in that moment, in that context. Is that a sort of simple advice? Yeah, context is king. Context is king.
Anders Arpteg:So, besides persona and context, anything else that you would recommend people to think about when creating their prompts?
Magnus Gille:Yeah, I have one that I have been recommending when talking with people after the finals, and this is the one where people usually say: oh, I tried it, and that was actually really, really helpful. It's that it is often helpful to ask the AI to ask you questions back, I mean, ask for clarification, ask for things that are missing. That is usually very helpful. That sounds great.
Anders Arpteg:Some people here are saying, you know, you should also specify what output format it should be in. Is that something you think is important, or is it not necessary, or can it help?
Magnus Gille:I mean, some of the prompt techniques are about getting a more factually correct answer, and many are about getting an answer in a format that suits you and your preferences. So providing an example of how the output should look might help a little with factual correctness, but I think it helps a lot with how you perceive the answer. I've talked to quite a lot of people who complain: I don't really like these tools because they give too long answers. It's like, yeah, you can just tell them to give a short answer.
Anders Arpteg:Is that useful, like giving a number of examples first and then asking it to do something similar? Is that a good way to tell it how to provide the output?
Magnus Gille:Yeah, sure, that is probably a very helpful way. I mean, if you're just a normal person who would like to ask a simple question, sitting down to come up with a good example and a bad example strikes me as a little bit overkill for that specific scenario. But it's very helpful if you're building applications on it and need it to behave in a certain way.
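The few-shot pattern discussed above is easiest to see in the chat-message format most LLM APIs use: worked examples are prepended as fake conversation turns before the real query. This is an illustrative sketch only; the sentiment task, labels, and `build_few_shot_prompt` helper are invented for the example, not taken from the competition or any specific product.

```python
# Sketch of few-shot prompting: prepend input/output examples as
# user/assistant turns so the model imitates the demonstrated format.

def build_few_shot_prompt(examples, query):
    """Build a chat-style message list with worked examples before the real query."""
    messages = [{
        "role": "system",
        "content": "Classify the sentiment of each review as positive or negative. "
                   "Answer with one word.",
    }]
    for review, label in examples:
        messages.append({"role": "user", "content": review})       # example input
        messages.append({"role": "assistant", "content": label})   # example output
    messages.append({"role": "user", "content": query})            # the real question
    return messages

examples = [
    ("The battery lasts all day, love it.", "positive"),
    ("Broke after two days, waste of money.", "negative"),
]
prompt = build_few_shot_prompt(examples, "Setup was quick and painless.")
# prompt now holds 1 system + 2x2 example + 1 query = 6 messages
```

The resulting list can be passed to any chat-completion style endpoint; the point is the structure, not the particular API.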
Henrik Göthberg:Was any of the assignments more about prompting for software engineering? The first one was prompting in the context of a presentation; was any of them more engineering-oriented?
Magnus Gille:No, it was not, and I think that was by design. Because the way I understand it, one of the triggers for hosting this competition in the first place was an observation that the Swedish population in general seems a little bit averse to using AI and AI tools compared to other countries. This was something the people from Post- och telestyrelsen who were there talked a little about, saying: okay, this might be a problem, so maybe we can do something to promote prompting among the more non-technical part of the population, just to raise awareness. So I think that was one of the reasons the assignments were not really super technical, but more general, showing normal people that it works.
Henrik Göthberg:Yeah, exactly. And this is a side note, but I went to an AI day at Svenska Handelskammaren where Google was presenting, or was part of commissioning, a report on exactly this topic. If I'm not mistaken, that was within a month or so of the national championship. So around the same time, Google actually made a report pinpointing that we are apparently more averse to using these tools than the rest of Europe.
Anders Arpteg:In Sweden. Yeah, for some reason. Yeah, strange, but we have something to improve here. Okay, can you just elaborate a bit more? This was the first assignment that you got, some kind of space presentation. What were the others?
Magnus Gille:Let's see. The second one was once again applying for a job, but this time a more high-stakes job: you're in a big organization and you want this chief of innovation role or something, and you should prepare for it. So please write a prompt that can help you do that.
Anders Arpteg:And the final one.
Magnus Gille:The final one was come up with a very tricky subject that you would like to explain to someone and ask the AI to explain that thing to that person.
Anders Arpteg:So you decide the topic.
Magnus Gille:Yes.
Anders Arpteg:What did you choose?
Magnus Gille:That was easy, because before the competition, well, apparently Google are quite good at sending out press releases, because I got a little bit of attention from local media. So SVT Sörmland came to my home and did an interview with me, and very often I found myself answering the question: what is prompting? So really on that basic level. That was the topic I chose to explain.
Anders Arpteg:What is prompting?
Magnus Gille:Prompting is simply the words you say to an AI in order to make it do something, help you, or engage in a conversation. Awesome.
Anders Arpteg:Perhaps you could, I mean, you won the competition. Did you get some kind of jury motivation, or why do you think you ended up as champion?
Magnus Gille:Ah, right. The way the competition was judged, as I understand it, is that the prompts were judged by a human jury. So that was the setup. And there's an interesting thing here: I used an AI to create a prompt, which was then read and judged by a human.
Henrik Göthberg:It's a little bit like they could do the Eurovision thing: you have the jury, and then you have the phone votes, the chat. It would have been fun to have an AI judge as well.
Magnus Gille:But what I did before the competition was just, since it was Google who made the competition, and they of course have a prompting guide, I read the prompting guide and applied the best practices from it. And I tried to add a little personal color, things in the prompts that would make them more fun for a jury to read.
Henrik Göthberg:So a little bit of playing to the jury.
Anders Arpteg:How did you do that?
Henrik Göthberg:That was good fun. What's the strategy here to win? You know, elaborate more.
Magnus Gille:Yeah, right. So let's see. For the first one, I wanted to have one of these very dry pappaskämt, dad jokes, as part of the presentation, so I explicitly put that in the prompt.
Anders Arpteg:Like dad jokes. Can you give an example?
Magnus Gille:Oh yeah, okay. I mean, a dad joke is something I use to tease my daughters. It's one of these very, very dry wordplay things. Let's see, it's hard to come up with one on the spot in English, actually.
Henrik Göthberg:It's funny, dad jokes translate in Swedish almost like Gothenburg vits.
Magnus Gille:Okay, I remember one now, because I have configured my ChatGPT to generate one of these jokes every day, so I have a daily dose of AI-generated humor in my life. Like: why did the skeleton run away from the fight? Oh, it had no guts. Stuff like that, where you laugh and feel a little bit embarrassed that you laughed at the joke.
Heeenrik Göthberg:That's vits in Sweden. It's so funny. I grew up on the west coast and we love our vits, and then you move to stockholm and it's frowned upon.
Magnus Gille:I live on the wrong side of the country, apparently. You do, actually you do, yeah.
Anders Arpteg:Okay, so you won the championship and got a lot of attention, and congrats for winning it, well done, of course. Is this kind of skill something you feel you have use for in your work today? I guess you prepped a lot before coming to the finals in February, or how did you prep for it?
Magnus Gille:I did not prep at all. You did not? No, I didn't. And I mean, when I went to the competition I did not have the notion that I could win it. I actually had a meeting booked.
Magnus Gille:The competition was on a Friday, and I had one of these semi-important meetings at work with a lot of managers, and the only time you can find space in managers' schedules is at odd hours. So, Friday evening, 5 o'clock: good, then I'll go to the meeting. The program ended at 4 o'clock, so I would just go there, do the competition, and then kind of slide out the door. That didn't quite work. But in all honesty, when it comes to prompting skills, I don't think I have much better prompting skills than many of my colleagues here at Scania, and for sure many of the people working with AI more generally in Sweden and the Stockholm area. So the skills themselves are pretty much the same as before the competition. The, what do you call it, uppmärksamheten? Attention. Exactly, the attention, both internal from the company and external, has been interesting.
Henrik Göthberg:Yeah, elaborate.
Magnus Gille:Yeah, right. So at Scania we have recently made ChatGPT Enterprise available, and this is not part of my organization, so I had no part in making that happen. That's another part of the organization, yes, exactly. And many parts of the organization have said: oh, we tried it a bit, but we don't quite understand it, if only somebody could come and explain how to use this thing. And then: ah, Magnus knows prompting, he should know.
Henrik Göthberg:I have a slightly deeper question on prompting techniques. I'm not sure, but I would like to elaborate on it together. Can we see some fundamental differences in tactics or techniques for different categories of problems? Let me set up the categories. If we are prompting for coding, what is good practice, or how should you think when you use it to debug or build something, even a prototype with Lovable? And then we have the kind that I do: the presentation, the report. Are there differences in how we prompt? We could stop at those two fundamental categories, because from there you can get into the next level of copywriting, marketing and so on. But the way I use it a lot, reports and presentations versus coding, what's the big difference in how to think? What categories do you see that potentially call for different techniques?
Magnus Gille:Yeah, I mean, we can start with coding versus more general presentations, and yes, I think they will differ a bit. When it comes to coding, the nature of programming languages is that they have a correct syntax you need to adhere to. If you deviate from that syntax, you get errors and bugs, something that needs to be fixed. Often you then have both the code and some type of error output that you can use: here is the code, here's the error output, and it's kind of obvious to the model what needs to be done. When it comes to presentations, given that natural language is much more nuanced and complex, you need to add a lot more context. In the programming domain it's kind of obvious: here's some code, here's an error message, okay, you need to adjust the code to make the error message go away. It's easier.
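The "here's the code, here's the error output" pattern described for coding prompts is largely mechanical, which is why so little extra context is needed. A minimal sketch of that template, assuming a hypothetical helper name (`build_debug_prompt` is not from the conversation):

```python
# Sketch of a mechanical debugging prompt: the code and its error
# output carry most of the context, so the wrapper text can be short.

def build_debug_prompt(source: str, error_output: str) -> str:
    """Wrap source code and its error output into a fix-this prompt."""
    return (
        "The following code fails. Explain the bug and propose a minimal fix.\n\n"
        f"```\n{source}\n```\n\n"
        f"Error output:\n```\n{error_output}\n```"
    )

prompt = build_debug_prompt("print(1/0)", "ZeroDivisionError: division by zero")
```

The resulting string would be pasted (or sent) to whichever assistant is at hand; the structure, not the wording, is the point.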
Anders Arpteg:Do you do like me for coding, that you simply provide the context and then say: please fix, or please explain? What's your normal prompt in coding?
Magnus Gille:Yeah, in coding and more generally for IT support, there's just not that much context needed.
Magnus Gille:I mean, sometimes I just take a screenshot of my screen, paste the picture, and say: help. Yes, a single word: help. Fantastic. But when it comes to more natural language, then you need more nuance and you need to understand: what is the purpose of this text? Should it be long? Should it be short? Should it be to the point? Who will read it? What type of glasses should the AI, so to say, put on when giving you feedback on the text?
Anders Arpteg:Okay, so you need more context at least, and perhaps personas, and perhaps, you know, the ask-questions-back kind of tip, and output ideas, etc., for presentations. But I guess you can also think of different kinds of tasks, more like reasoning tasks. Have you gotten into more reasoning-type prompting as well, in some way?
Magnus Gille:Yeah, for the longest time I had my own personal benchmark that I liked to try on new models. Jumping all the way back to when ChatGPT was new, we had GPT-3 and ChatGPT 3.5, right, and one of the things that was commented on, that did not work quite well then, was this classical riddle. Let's see: there's a farmer, a wolf, a goat, and a piece of cabbage, and the farmer comes to a river where there's a boat, and in the boat the farmer can take only himself and one of these things across the river. In addition, the farmer cannot leave the wolf with the goat, because the wolf will eat the goat, and he cannot leave the goat with the cabbage, because, same thing, the goat will eat the cabbage. How do you get all three over the river, given that you can have just one in the boat at a time?
Magnus Gille:So this is called the river crossing problem. It's very old; I think it was described in medieval times, in paintings and so on. And early versions of ChatGPT could not solve this problem. They could give a text that, if you just skimmed it, looked reasonable, but when you read it step by step, things kept popping around: all of a sudden there are two goats, then he's leaving the wolf, then you take all of them in the boat at once. It was obvious, yeah, exactly.
Magnus Gille:Happy bullshitting, and then being satisfied with: oh, and now I have the answer. So that was one thing, but that was all solved quite fast by GPT-4. So I came up with: maybe we can disguise this problem. Instead of talking about the farmer, you can talk about a father, and instead of a wolf, a piece of cabbage, and a goat, you can say: I have three kids, but I cannot leave two of the kids together, because then they will hack my computer, and I don't like them doing that. And I can't leave the other two together, because then they will steal candy from the cupboard. Now I need to go to work and I want to take all the kids to work, but my car is so small I can only take one at a time. Functionally it's exactly the same problem, but it's not obvious that the model should or could understand that it's the same thing.
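The river-crossing riddle discussed above is small enough to verify by brute force, which is one way to check an AI's answer step by step. Below is a sketch using breadth-first search over bank states, written with the classic wolf/goat/cabbage cast (the conversation mixes in a sheep); it confirms the shortest plan takes seven crossings and must start by ferrying the goat.

```python
# Brute-force solver for the classic river-crossing riddle via BFS.
from collections import deque

ITEMS = {"wolf", "goat", "cabbage"}

def unsafe(bank):
    # A bank left without the farmer is unsafe if wolf+goat
    # or goat+cabbage are together on it.
    return {"wolf", "goat"} <= bank or {"goat", "cabbage"} <= bank

def solve():
    # State: (frozenset of items on the start bank, farmer on start bank?)
    start = (frozenset(ITEMS), True)
    goal = (frozenset(), False)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, farmer_left), path = queue.popleft()
        if (left, farmer_left) == goal:
            return path  # list of cargo choices, one per crossing
        here = left if farmer_left else ITEMS - left
        # The farmer crosses alone (None) or with one item from his bank.
        for cargo in [None, *here]:
            new_left = set(left)
            if cargo is not None:
                if farmer_left:
                    new_left.discard(cargo)  # cargo leaves the start bank
                else:
                    new_left.add(cargo)      # cargo returns to the start bank
            new_left = frozenset(new_left)
            # The bank the farmer leaves behind must be safe.
            unattended = new_left if farmer_left else ITEMS - new_left
            if unsafe(unattended):
                continue
            state = (new_left, not farmer_left)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo or "nothing"]))
    return None

print(solve())  # a shortest plan: 7 crossings, starting with the goat
```

BFS guarantees the first goal state found is reached by a minimum number of crossings, so the solver doubles as a checker for any proposed answer.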
Anders Arpteg:So I guess you're saying the abstract reasoning is rather poor, but if you provide a context that is known to the model, something it has seen with kids but not with goats, then it can actually find a solution, because it can at least recall similar notions.
Magnus Gille:And that was true for a short time at least, because now all of the state-of-the-art models can solve that problem without an issue, and they can identify: ah, I see you're trying to ask me the river crossing problem.
Henrik Göthberg:It can even see the underlying problem. Yes, it's even calling your bluff.
Anders Arpteg:Exactly. Okay, so with these reasoning models now, like o1 and o3, and we have DeepSeek R1, and all the Geminis and Claudes are using extra reasoning techniques, so to speak, in the models they have today. Do you adapt your prompting techniques for them in some way?
Magnus Gille:No, I mean, I have maybe dialed back on the extra things I used to do before, and I do less and less of them. But I still find it helpful to articulate myself as specifically as possible, and it's still helpful to provide as much context as possible. The persona, I don't do that all the time anymore. I feel like as long as I have a clear context and a clear expectation of the answer, and can say "answer in this way, please", that usually takes me all the way.
Henrik Göthberg:I'm curious, I want to learn myself. I use it a lot when I want to do a report or research, and from my point of view, I'm not really interested in the facts I can find on the internet, so to speak. I have a novel idea that I want to convey. So nine times out of ten I get almost frustrated: I say I want help with a report, it's going to be about this, and it starts jabbering away at the solution. Hold on a bit. I want to load my fundamental ideas in here, and I want you to improve on them, sharpen them, find the holes, and find a much better way to communicate them. This is how I try to do it, because I really believe in my own IP, so to speak. So how do you work when you're not there for the facts, when you're there to convey your idea and your message, but you know you're not as good a communicator?
Anders Arpteg:Perhaps we should try it out, because Goran can actually potentially start writing something, and if you were to try what would be your first prompt and then, together with Magnus, perhaps we can try to improve it a bit.
Henrik Göthberg:Nice. Define a task. Ah, let's prepare for the Data Innovation Summit keynote.
Anders Arpteg:Yeah, perfect. So what would be your first prompt, Henrik?
Henrik Göthberg:So the way I work, I start to chat with it. I don't write, I chat, I talk to it. Literally, I put on the microphone and talk to it, and then I basically start with context, framing what this is all about. So let's do it, let's do it.
Anders Arpteg:But in this case, if you.
Henrik Göthberg:Put on the microphone. Okay, it's going to be faster, yeah, like this.
Goran Cvetanovski:Let's see if it's going to work. I think it's going to work.
Anders Arpteg:Cool, hi, okay, just a second.
Henrik Göthberg:Yes, first one. Go, Henrik. Okay, can it hear me? I'll put on the... okay. So I would like help with preparing a keynote presentation on embracing data and AI to unlock their transformative power and value. And then, it didn't hear me, so.
Henrik Göthberg:Hi, I would like help to prepare a keynote presentation on how we embrace AI to unlock its transformative power. Can you help me? ChatGPT, in Swedish: "Hi there, I'm ready to get started right away. Just tell me what you want me to focus on first." All right, I would like help to prepare a keynote presentation on embracing AI to unlock its transformative power in the enterprise, but I'd like to provide the core intellectual property and ideas myself, and I'd like you to organize it in the best possible way for a 15-minute keynote presentation.
Henrik Göthberg:So this is okay, it doesn't work. It doesn't, because I usually talk to it, then I transcribe it, then I look at whether the transcription is kind of okay, and then I push send, right. So it goes back and forth like that. Hello, ChatGPT.
Anders Arpteg:Ah, it can't hear us. You have to write instead.
Henrik Göthberg:I think it's too far away from it. Now let's write it instead.
Henrik Göthberg:So it's okay, but maybe we can talk about it instead. I would basically go in and frame the background: what is my objective, what is my background, as shortly as I can. And then, first of all: don't start immediately, I want to lay out the context and some ideas, is that okay? Okay, shoot. So I use a very, very conversational style, and the problem is when it jumps ahead too fast into solution mode, where I kind of want: have you understood me? Do you understand the task? This is how I do it. I'm not sure if it's a good way or if it can be done better.
Magnus Gille:No, it sounds like a good way. I had a guy telling me that what he likes to do is begin with the prompt: now I want you to listen, and the only thing you should do is acknowledge everything I say.
Henrik Göthberg:I like that, because this is actually a problem: if I don't say that as the first prompt, then we are not in sync.
Magnus Gille:No, it tries to be helpful. It wants to give you: here's a bullet list, here's the thing, think about that. Like, chill. You need to listen.
Henrik Göthberg:Chill, I have the content, not you. I want you to give me the format, and the dad joke, yeah, that fits. I like that. So the simple tip is: frame very quickly how you want to interact, how you want the interaction to play out, even as part of the original prompt, exactly, at the beginning of your conversation.
Magnus Gille:There it seems like you have a lot to say, but you don't want input. So just ask it to give you no input at all, just acknowledge that it has registered what you said. Then you can talk and talk, and change your mind, and go back and forth, and at some point you can pause and say: now summarize everything we have said. Is it coherent? What is the big glaring thing we're missing? That is how I would do it, I think.
Henrik Göthberg:Yeah, let's continue. This is very helpful, and this is a good example. Another thing I always do is: okay, how do I build up a red thread, and how do I build up a story?
Henrik Göthberg:I mean, like the classical rhetorical structure. Should we do a Jeff Bezos memo out of this, or, you know? So, first: I want to do this, but let's also think about the structure of this presentation. I haven't even gotten to content yet. Sometimes I do it before, sometimes after: the fundamental flow of the saga, of the story, so to speak. Do you do that, or how do you do it? This is typical when you do a 15-minute keynote.
Magnus Gille:I don't think I have actually used AI to invent a red thread. When I do that type of presentation, I have the red thread in my head already. But that's a good point: I should use AI more to help me evaluate whether the thing I had in mind works, or whether there are other stories I could bring in that fit the context.
Henrik Göthberg:We've been working quite successfully with breaking down a good narrative. There are many different techniques, but we use a way of discussing this at Dairdux: okay, what is the opening, or what is the prologue? What is the first plot point? What is the second plot point? What is the twist? What is the climax?
Henrik Göthberg:Like how you build a movie, so to speak: you have a story, you have a problem, you have a struggle, to build something that is narratively appealing. For those kinds of things I find it can be quite useful. I know I have a key message, but how should I deliver that key message over 15 minutes in the way that is most impactful? So it's like going from the content, something you want to say, to: how do I say it so it becomes really hooking, so to speak? That's how I use it sometimes. We have an idea, but then I find it's brilliant, of course, at how you build a movie, how you build a story, how you build fiction, and it gives you ideas: well, start at this angle, go left or right first, and then you take the hook.
Anders Arpteg:Okay, so if you were to do a similar kind of task, how would you start? Just talk about it briefly: if you were to prepare a presentation at Scania, for example, what would be the first thing you did?
Magnus Gille:All right, the first thing I would do is a kind of brain dump of everything in my head right now. Either I have a very articulate vision of what I want to say and in which order it should come, and then I can more or less have the whole presentation as is, or it's more: I should talk about this, and I should talk about the other, no, I changed my mind, I want to do this instead. The point is to brain dump everything, wherever my head is at right now.
Anders Arpteg:Would you upload a presentation as well, and use files to get some kind of context?
Magnus Gille:That's the thing. Since these tools are now so multimodal, I can use anything. I can have a stream-of-consciousness conversation with it, as we discussed before. If I have images that I like, I just put in the images, why not? If I have any documentation, oh, I read this research article and it's kind of nice, let's put that in as well. You could even put in source code: here's the thing I coded, or a product I'm working on, and here's a part of it that might be helpful. Or test cases, anything. All of these are just modes of expressing an idea. You can express it in language, in images, in code, or presumably in music, and if I can put all of that in as context, I think it is helpful.
Anders Arpteg:Well, not music yet, but maybe somewhat. But I guess, okay: putting a lot of context in there, perhaps in a more or less random order, and then starting to iterate a number of times to get some kind of result?
Magnus Gille:Yeah, I mean, it all depends. If the context I have is in kind of random order, then I just put it in in a random order, and that's fine. If the context is more like, well, I have the presentation because I held it before and just need to refine it, then I put in the more polished presentation. So you can do anything.
Henrik Göthberg:So I think the key here is to start from where you're at. You don't want to go back: if you have matured on a topic and now want to take it to the next level, it is really counterproductive to go way back, because then you feel frustrated, I'm already smarter than this, this is not helpful. So the core message is: do a brain dump, to the best of your ability, of where your thinking is at.
Henrik Göthberg:I think that's the core message here.
Anders Arpteg:I see the time is flying by and perhaps it is time for a short break.
Goran Cvetanovski:It's time for AI News, brought to you by the AIAW Podcast.
Anders Arpteg:So we have this kind of break in the middle of the podcast to just reflect on some of the recent news that we heard about AI in recent times and we try to keep it short, like three, four or five minutes. We usually fail.
Henrik Göthberg:But three, four or five minutes has never happened. Come on, that was the joke.
Anders Arpteg:I heard it, but let's start. And, of course, the guest Magnus. Do you have any news that you'd like to share?
Magnus Gille:Oh yes, I recently read a research paper that was kind of trending on X, on Twitter, and it was from a research organization called METR, M-E-T-R, exactly.
Magnus Gille:And what they had done was investigate what types of tasks AI has the ability to solve, and they expressed each type of task in terms of how long it would take a human being to solve it.
Magnus Gille:And then they plotted how this has evolved over time, beginning with the early GPT models. Exactly, here we have the paper on screen. They started with GPT-2, claiming that, well, the type of task which GPT-2 could solve is more or less something that would take a human a few seconds. It's kind of stringing together a few words to have a grammatically correct sentence, spell checking, more or less, and then it evolves over time. But, as you can see, the scale here is not linear. The scale here is logarithmic, meaning that this time doubles every seven months. And if you double a few seconds it doesn't really matter so much, but if you continue to do this over time, you quite soon get into time spans which are quite long. So right now we seem to be at the point where Sonnet 3.7 can do stuff that would take a human being an hour. Give that a few more years and we're up to months' worth of work.
Henrik Göthberg:Not in years, seven months. Doubling every seven months. Okay, so a couple of years. But I use this. This is one of my key points. When I found this report, I showed it immediately to Mikael, who is our head of research: dude, this is really the best way we have to explain what we mean with "organize for the trajectory". I used a Sam Altman quote before I found this report where he said the same thing: 3.5 was good for seconds, 4.0 is good for minutes. You know what the next ChatGPT is good for? Hours. And then, if you keep doubling that, you really get somewhere. It gives you a way of thinking, in my opinion: don't organize for AI here and now. Think about what happens when the complexity or the abstraction level we humans work on shifts.
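The seven-month doubling claim discussed here is easy to sketch numerically. A minimal illustration, assuming a one-hour baseline (the Sonnet 3.7 level Magnus mentions) and an arbitrary projection horizon; these are illustrative assumptions, not figures from the METR paper itself:

```python
# Sketch of the doubling claim: if the human-equivalent length of tasks
# AI can handle doubles every 7 months, project its growth from a
# one-hour baseline. Baseline and horizon are assumptions for
# illustration only.

def task_length_hours(months_elapsed, baseline_hours=1.0, doubling_months=7.0):
    """Task length AI can handle after `months_elapsed` months."""
    return baseline_hours * 2 ** (months_elapsed / doubling_months)

if __name__ == "__main__":
    for years in range(0, 8):
        hours = task_length_hours(years * 12)
        if hours < 720:  # roughly one month of round-the-clock hours
            print(f"after {years} years: ~{hours:,.0f} hours")
        else:
            print(f"after {years} years: ~{hours / 720:,.1f} months")
```

On these assumptions, today's hour-long tasks become multi-month tasks within about seven years, which is exactly the "organize for the trajectory" point Henrik makes.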
Anders Arpteg:I think it's massive, and also that some people think AI has hit a wall or something, but that's certainly not the case if you look at this kind of scale.
Henrik Göthberg:Logarithmic is quite profound, doubling is quite big.
Magnus Gille:And then, of course, it's a question will it continue to do this?
Henrik Göthberg:I mean, I think so, and if so, that will have quite big ramifications for sure. But this is a deeper topic, because you can argue: if everybody stays the same and we think it's the scaling laws that will drive this doubling, maybe not. But we all know the real researchers working on this are working on all kinds of problems: the reasoning problem, the scaling problem. You go to Yann LeCun, he's working on: we need something else, we need the next level of this. And of course they're all working on this, and it's trickling down into what we see. So I think it will continue, yeah, of course I think it will continue.
Henrik Göthberg:I don't see it, though. I'm not sure it's going to go much faster. I think a seven-month doubling is already quite a lot.
Anders Arpteg:It could go even faster.
Henrik Göthberg:Maybe, when you reach a certain breaking point. I think it's interesting to just ponder the benchmarks that we do have.
Anders Arpteg:They get saturated at some point and they get close to 99%, and when they exceed human capabilities it's really hard for a human to say how much better it is, because you can't really tell as a human, so you need another AI potentially. That was Hagai's comment. It's a really cool paper.
Henrik Göthberg:It's a really cool way of communicating what is going on to normal people, so you can relate to it. This is what I like. Why did you like it? Why did you want to push this report?
Magnus Gille:No, exactly. I mean, the kind of very simple visual representation of how fast things are going now. You can have these anecdotes: oh, I could not do this on ChatGPT 3.5, and now I can do it on 4, and that was just a year. But to see all of the models kind of just laid out in this nice neat line, and then the scale that goes from seconds to minutes, from minutes to hours, from hours then presumably to days or months.
Henrik Göthberg:And even if it's really hard to, I like the fact that they've been trying to relate it back to a benchmark that has to do with us as humans. That makes it so much more relatable. And then you can dig into this: what is the cost? So we are measuring complexity of tasks.
Anders Arpteg:I'm not sure. It also gets back to what we usually say, that AI progresses on an exponential scale, and this is actually exactly what this is showing: the law of accelerating returns, you know, from Kurzweil, etc. So it's exactly what we have been saying for the last five years. This is the reason why we started the pod.
Henrik Göthberg:We need to demystify AI, because we need people to start thinking about the trajectory. I'm not worried about the fucking shit we're having here and now. I'm worried about what happens when half of society is not on the trajectory. Even with a logarithmic scale, you will never catch up, and then we will not have a diverse, inclusive society. We will have a divided society.
Magnus Gille:And we will have a lot of power concentrated in the people who are on the top side of the scale. Yes, and that has never been good. Okay, cool.
Henrik Göthberg:Henrik, do you have any news? I don't want to dilute his message. No, I think this is the best news in a long time. Awesome, awesome comment.
Anders Arpteg:I just had a boring one, you know, some new model coming up, which they do every week. But this time it was Gemini, and it was version 2.5. I thought it was actually really big, and I've always had big faith in Gemini. It's never been really realized, so I've been disappointed for a long time, I think. But I think this time it really is getting further and further ahead. If you look at the LM Arena, the Chatbot Arena, it's really a big, big step ahead of anything else. So if you compare the top three:
Anders Arpteg:Now, of course, Gemini 2.5 is far ahead of anyone else. And the Chatbot Arena is basically humans doing blind tests, trying to see which answer is the best without knowing which model they actually are reviewing. So it is significantly better than anything else. And then in second place is Grok, I believe, still, and then, I have to double check here, the next one is GPT-4.5. So OpenAI is in third place, and then it's a lot of Gemini models after that. But Grok, you know, they were taking first place like a month ago when they released Grok 3, and now Gemini is taking it back with a big leap, and that's kind of interesting. And of course they also beat a lot of other scores and benchmarks in reasoning and in math and different other metrics that we've seen.
Anders Arpteg:What I found potentially even more interesting, and it's kind of sad, is that they didn't say anything about how it works. They are very scared, and we know they're super scared, about DeepSeek and Chinese models and other competitors just stealing the latest version. So of course they need to keep it closed, and they can't open source it directly. Even Elon Musk can't do that.
Heeenrik Göthberg:But why is it better? What is it that? This thing, what for you?
Anders Arpteg:Have you tried it? Yes, I did.
Heeenrik Göthberg:What was the aha moment for you?
Anders Arpteg:Well, that it can do so much more complex things. It has a really, really big context window, so it's a million tokens today, which is bigger than most others, and it will soon have two million. And what that means is it can do some of the tasks that others can't. I saw some other people that were playing around with this, and they were looking into coding actually, and what it could generate in a single, zero-shot prompt. So someone asked it: do a flight simulator. What? Can you ask it in a single prompt, one shot, to do a flight simulator? Yes, you could. You could have a game, you could steer a plane through some kind of virtual environment, and it worked in a single prompt.
Anders Arpteg:Someone said, you know, generate a simulator for this kind of Rubik's cube. And that's one I tried with ChatGPT and Claude and others and DeepSeek, and I never got that to work. But with this one they could have the Rubik's cube simulator. You could see a 3D representation of it, you could ask it to scramble the cube, and it actually kept the colors of each cube face stable. Other models have just messed it up when you try to scramble a cube. It didn't work. But then you could ask it to solve it, and it actually solved it. First it was a four-by-four cube, and it worked. Then they tried: okay, but let's do a 10x10 cube, and it could scramble it while keeping all the colors intact, and scrambling alone is hard in itself.
Anders Arpteg:But then solving a 10x10 cube, and it did that as well. It took a long time, but it did. That's insane in some way. And, of course, can it do a normal snake game? Yeah, of course, but it could do a snake game with power-ups and a much more visually appealing kind of presentation.
Henrik Göthberg:And why is this useful or practical for us, or for me? Why should I now switch to this? Or in what way do you think this helps me where I am right now? I don't even need this.
Anders Arpteg:If we take this kind of scale that we're thinking about here, like one hour tasks. You know how long would it take to create a flight simulator. I mean, it's certainly more than an hour. If you were to build a Rubik's cube simulator, of course it would take much longer than that. And I think you know, with having this kind of long context and then also having this kind of amazing ability to write in a humanly appealing way, but then also writing code that works in one shot in a super accurate way, is on a level that I haven't seen before.
Henrik Göthberg:Yeah, and if we now take this, we're not saying shit is going to work. We're not saying we don't need software engineers and all that. But everything is getting better. That's the reality. How do you interpret the world when you see new models? Because this is now an example of something that maybe pushed this one more step, another seven months.
Magnus Gille:How do you relate to?
Henrik Göthberg:that. What was the deeper implication that you were thinking about when you saw this?
Magnus Gille:Right. I mean, yes, I do believe that we are seeing rapid growth when it comes to capabilities. I also think, at the same time, that we see that it is hard to implement, hard to generate the value, hard to actually make use of these things, exactly because reality is messy. I mean, we have organizations and we have companies and we have things in place which do not currently lend themselves perfectly to just clicking in the next model so everything becomes, like, 10% more efficient. So it's both things: we will have crazy capabilities, but it will probably take some time before we realize them, because it will take time to integrate them in the current setup that we have.
Henrik Göthberg:But this, to me, now we're in my home turf. I don't think the old model of organizing and the old model of setting up an enterprise lends itself to unlocking this value or this potential. I think the underlying, fundamental world model of how an enterprise works needs to be rethought in this context. That is my fundamental belief. What do you think?
Magnus Gille:Yeah, sure. And I mean, they have been talking about this: will this type of technology make it possible for a single person to set up a billion-dollar company? I mean, with a little bit of legal, a little bit of, I don't know, software development and testing and front end and commercial, like everything. Could you boil it down so it's just one person controlling a lot of agents and then generating the value, because that person's vision and that person's drive is so crisp that the only thing holding that person back is how many agents she or he can deploy? Do you believe in that vision?
Henrik Göthberg:Do you think that's realizable? It's a really interesting thought experiment, by the way, but is it realizable?
Magnus Gille:I think it is realizable, but maybe for a very narrow set of types of businesses. I mean, I don't think that a single person can spin up factories, because factories inherently require a lot of construction in the real world, physical assets.
Henrik Göthberg:Physical assets, yeah, that's a different story, maybe.
Anders Arpteg:Let's perhaps get back to the future and if we will have single-person unicorns in the future. We've talked a bit about that before.
Henrik Göthberg:Yeah, someone just stretched it. What about a zero-person unicorn?
Anders Arpteg:Goran, do you have a news or should we?
Anders Arpteg:I don't have any positive ones. Before you do, I'd just like to add one more. I mean, just yesterday they released a new image generator in ChatGPT. Yeah, let's talk about that one a little bit. Because I think, you know, the big thing with that, besides being able to generate amazingly good images (surprisingly slowly, it takes like a minute, so it's much slower than normal image generators, but it's amazingly good, of course, better than anything else), is what I think is strange: you are free to upload any image of myself, or even ask it to take Elon Musk and put him on a horse with a gun, shooting people, and it will do so. It doesn't have any guardrails. Oh look, Goran is together with Elon Musk here.
Henrik Göthberg:Okay, so the new image generator that you are talking about, is it part of the 4o family?
Anders Arpteg:It's actually the first one that is natively in the same model, so it's part of 4o. So I don't need to do anything else?
Henrik Göthberg:So by using my normal 4o, I actually get access to this. I don't need to worry about it. What about video?
Goran Cvetanovski:Uh, you have actually, if you look at it here, Sora.
Henrik Göthberg:You know what I was doing earlier this morning? It was a little bit in order to prepare for this presentation, but I did it for fun. In the Data Innovation Summit we have the web app, the event app, Agorify, and in the Agorify app, you know, it's a nice thing to upload your one-minute who-you-are, you know? So of course I'm going to do that with AI. So I've done two ads today using these tools. If you scroll down in ChatGPT, you get to the video. There's another way of doing it. It's not Sora. Do you know what I'm talking about? If you click on the top banner and then you select which model to use, then you can get to the video model. I was playing around with that one. If you click there and you go further down... I don't know which one to use. If you were to build a video for advertising, which model would you use?
Anders Arpteg:Because I have... In any case, I don't think we need to go into that. But I think it's surprising that they don't have any guardrails. I mean, doesn't this open up for fake news in an insane way? What are you thinking, Magnus, about this?
Magnus Gille:I guess it does. I mean, there was a lot of discussion before the presidential election in the US about how much fake news and AI-generated stuff would affect people's opinion. And maybe they felt that it didn't have that large an effect. There were like one or two instances of someone creating the voice of Biden, calling people up. But maybe they have come to the conclusion that people are savvy enough to kind of see the difference. I mean, the models still do have a little bit of a telltale. If you go back to the Elon Musk one, I would think that I would be able to judge it. It was kind of good, really.
Henrik Göthberg:Yeah, I think so. It's good. It's really, really cool and good.
Anders Arpteg:I think most people will be fooled by it, but I do think that they have some technique, a watermarking technique or something, to make it possible to always see that it has been generated.
Goran Cvetanovski:So what is interesting is that, no matter how good quality a picture you have, when you tell it to replicate it, it will never, ever be very, very close to what you're uploading. So it's good, but you can see it.
Anders Arpteg:Most people. I don't think you see it.
Goran Cvetanovski:Of course you will not see it, but it doesn't enable you to have, like, a genuine picture of myself. You can see the nose is bigger, etc., and all these other things. See, it's me, but it's not me, right? So that is the thing.
Anders Arpteg:But other people won't see that.
Goran Cvetanovski:I think it looked like me, but I have a doppelgänger as well in the world. That doesn't mean that it's me.
Anders Arpteg:I think you made a good point, Magnus, that people now are expecting that there are a lot of fake images and you can't trust images, and it's super easy to generate this. Anyone can do it.
Henrik Göthberg:It's not you. It's super close, and you can see this is a fake image of Goran, of course. No, no, it is an American version of Goran. It's an American doppelgänger.
Magnus Gille:There's another thing that comes to mind. A thing that has been kind of consistent over the last year, year and a half, is that a lot of people working at OpenAI who worked on safety and security have left the company. Some of them have made public statements that they were not happy about the state of safety and security and risk within the company. And maybe this is just an example of the company having chosen to take a less risk-averse approach to releasing stuff: just try it out and see what happens. Very good point.
Henrik Göthberg:And then we get to the point, you know, where we have talked about it from the other angle. Can you really regulate stuff that is so fast-moving without actually getting examples? And when and how do you regulate this shit? It's really hard, right? I mean, we had a conversation with people from RISE where we did the example: when we regulate other industries, like air traffic or even electricity, there's a very clear frame of what you're regulating, and it's a fairly fixed thing you are looking at and trying to regulate. And here now we have something where the productivity frontier, or the innovation frontier, is doubling in capability every seven months. So how can you regulate in front of that?
Anders Arpteg:It is impossible. Perhaps we should have a topic about this, because we did speak a bit about the potential impact of open source on the safety of these kinds of models in the future. But still, I think we all agree that some kind of guardrails should at least be attempted, so people can't ask it how to build a bomb. What do you think?
Magnus Gille:I'm thinking like this: when I read research papers about the capabilities of models, a thing that strikes me is that when you actually read the paper, you quite often see that the research was done on ancient models. I read a paper the other day that was, oh, this was kind of impressive. Oh, it was GPT-3.5. And the reason for that is that the pipeline for doing the research, collecting the data, writing the paper, is so long that it doesn't keep up with the output of the models. Our legal system will for sure take much, much longer than the output of models.
Anders Arpteg:So yeah, but shouldn't the tech giants and the AI labs still try to be proactive here and not let it be abused, to be a bit simplistic?
Magnus Gille:Yeah, absolutely. I mean, the tech giants have so much power now to control this type of technology moving so fast. So yeah, there's a huge responsibility for them to act in a responsible way.
Henrik Göthberg:But the bottom line is that the productivity frontier now is clearly moving at a pace that the old worldview of how you organize things can't keep up with.
Henrik Göthberg:If it's regulation, if it's the enterprise, it doesn't keep up. So we need to find other ways. It's not that we need to abandon it, but we need to rethink, reimagine how this is done. Because in reality, when you have a regulatory operating model that doesn't keep up with the productivity frontier, that means that if you are on the cutting edge of innovation, you are making up the rules. You need to decide, because there is no regulatory framework that works in relation to what you're doing, and that is a complete failure of the regulators in trying to impose the old world model on these guys.
Magnus Gille:It doesn't work. It will never work. No, exactly. And if the scaling continues, then there will be a larger and larger gap, meaning that we need to innovate and understand how to do regulation faster.
Henrik Göthberg:We need to innovate regulation, period.
Anders Arpteg:I mean, I think it's too simplistic a view to say that we can remove regulation from the equation and still ask how we can keep the use of AI safe in the future. And I think, you know, all the tech giants and the providers of the latest AI models want to do that. The question is really: how can we do that? Regulation is one thing, but I think no one really wants to come to the point where someone has developed the next coronavirus and they say, ah, we used ChatGPT to do it. That is not something even OpenAI wants to happen. And then the question is: what do you do? And I'm a bit surprised with the GPT-4.5 release and especially, no, sorry, the 4o release and the image generator, because they seem to simply remove the guardrails.
Henrik Göthberg:But it's interesting. Sam Altman said: have we been on the wrong side of open source? Yeah, which is a completely different topic. But if you look at the macro perspective here, I mean, he's releasing stuff, he's not open source. But the fundamental underlying philosophy: what should we guard? How should we guard it? We're not going to guard our shit, because they're going to do it open source anyway. Yeah.
Anders Arpteg:So something is happening here. Perhaps he was not so much speaking about open sourcing, but instead about loosening the guardrails, exactly.
Henrik Göthberg:I was thinking, like, was he really trying to say we're going to go open source? Or was he saying, like, fuck it, everybody else is going to do it open source, so why should I care? That's almost the underlying message here when he said that. Who knows, right? Who knows?
Anders Arpteg:Perhaps we can go to this topic right now. I mean, we know we have closed models today, and I think every kind of, at least Western, tech giant is closing their latest models. Even Grok. Meta is the only one that is not. Well, at least Google, of course, is doing it, and OpenAI is doing it. But then, you know, at some point there will be an open source model, and they can be easily fine-tuned to remove guardrails if you want to. And then we have DeepSeek that simply did that. So if we take this, Magnus, what do you think? Should we open source the latest kinds of models, since they may be soon available anyway? Or what are you thinking about open sourcing models and the impact on safety?
Magnus Gille:Man, that's such a tricky and big question. I mean, there are several angles to this one, right? On the one hand, you have the risk from a fundamental risk point of view: could this one be used to create bioweapons and stuff like that? That is a risk that I would rather not take, and that kind of pushes me towards: well, wait a bit with open source. On the other hand, we have talked about the aggregation of power of the people who own the models. That also strikes me as something that I'm not super-duper comfortable with. That pushes me towards: ah, it should be open source so everyone can have it.
Henrik Göthberg:Inclusivity is the number one game. You can argue, and this is really extreme, like Pierre, one of the podcast guests, he was really on that side: this is just a narrative of the people who right now are in power, who want to further their divide for their commercial interests, good or bad, right? And he says there's only one way, and that is complete open source, in relation to minimizing the AI divide. And then we need to work on safety in other ways.
Magnus Gille:That holds as long as there is at least a theoretical possibility, for each of the major risks, to develop defensive capabilities equally fast or equally effectively. And I'm not sure if that is true when it comes to, for example, bioweapons. I mean, what will take the shortest time: to create a bioweapon, or to create the mitigation for the bioweapon? Well, I'm not an expert in bioweapons.
Anders Arpteg:I see where you're going. Well, if we phrase it like this, then we can also think from a practical point of view.
Anders Arpteg:If we do believe, I mean, GPT-4.5 is rumored to be about 20 trillion parameters or something. Super big.
Anders Arpteg:No one can really run it unless you have lots of money. Then perhaps in the future they will continue to be more and more expensive to run. If you take Gemini 2.5, it seems to reason a lot. You can see how it reasons, and it's really expensive to just do this kind of inference-time compute that they are doing, and it's probably a big model as well. So once again, since we add the reasoning part, it will be more expensive to run. Not just the scale but also the reasoning part will add cost to a large extent. So in some sense the biggest frontier models will be really expensive, so it will be a few players in the world that can probably run the best ones, even in five years perhaps. Would you agree with that? So it would be a few selected players in the world that will have access to the top frontier models, the ones that are even ten times better than what we have today.
Magnus Gille:Yeah, sure, that makes sense.
Anders Arpteg:And then it doesn't really matter if it's open source or not, because no one can run them anyway. I mean, they don't have the money to do it. So then I think the open source argument doesn't really hold anyway.
Henrik Göthberg:But I think, when we talked about this in our Christmas episode, what's trending and what are we predicting, and always the bipolar conversation, open source versus closed, we kind of came up to: well, maybe there will be extreme frontier models, and almost like you say, it doesn't matter if they're closed or not. And then what we really should aim for as a society is a shitload more application-centric or narrow or open source models. For the frontier models, from all kinds of practicality, their useful application is to distill out useful applications. But that's another question, I mean.
Goran Cvetanovski:I think, you know, we have said a number of times that there will be levels or a hierarchy of models out there.
Anders Arpteg:Some of them will be the super expensive frontier models. You can't really use them for practical purposes, because it's too expensive to put them in production anyway. So you will have some kind of specialized model that is used for your data, for your purpose and your application, and they will be smaller and cheaper to run. So I think it is clear that that will be huge, like millions of these kinds of more specific models that you can run from a practical point of view. But still, going back to the original question about the few selected frontier models, it's kind of disturbing to think that only a very few set of actors will have access to them in the future.
Magnus Gille:You have to put a lot of faith in the people who control those organizations, that they're acting for the good of all mankind.
Henrik Göthberg:This is one, and the other one is the fundamental cultural values of these models. I mean, we see it already now: an Americanized language is a more consumerized, consumerism language than maybe the Nordics or whatever. So we fundamentally also see a dilution of culture, you know, where everything gets based on the frontier model's worldview. I mean, that's also problematic to some degree, or maybe not, but maybe it drives a trajectory. Is that Nordic values? Is that sort of Swedish values? I don't know. We talked with Love Börjesson, who's heading up KBLab, and he talks really passionately about the sovereignty around language and the importance of that. If you believe that the values that we want to have in our society are our values, that's a tricky one. Yes, it's a super tricky one. And then, once again, open source may be not a bad idea.
Anders Arpteg:Well, I don't think the tech giants will do it, simply from a commercial point of view, because it's so easy to distill and copy it if you show everything. They don't want a repeat of DeepSeek, I guess.
Henrik Göthberg:But then we have the whole idea that no one is doing open source out of kindness to humanity. So we have Meta. Why do they do open source? Well, because their core business model and how they make money is on another product, where their business model improves when they have a larger community of developers developing in their context. This is vastly different from OpenAI, where the model is the core value proposition.
Anders Arpteg:Even if Meta were to do a 10x larger model, no one would be able to run it anyway. It will be too expensive there as well. So even for them it wouldn't make sense in the long run. I don't think.
Henrik Göthberg:I can't fault that argument.
Anders Arpteg:Okay, time is flying away here. Before we get more into philosophical topics, and whether AI has feelings and intent or not, I'm looking forward to hearing your thoughts about that. But I'd just like to perhaps hear some advice from you, Magnus. If someone now is interested in learning how to be more of a prompting expert, what should they do? They can listen to you, of course. You have some articles that you've written, and they could read those. But what would be your advice for people that want to improve their skills?
Magnus Gille:Embrace the use of AI. I mean, the AI is so flexible, so it can help you answer whatever question you want. And one of the questions that you might have is: how do I become a better prompter? Or: help me write a prompt for solving this thing. Or: what should I think of when asking you for help with solving this, that and the other?
Anders Arpteg:So the teacher is the AI as well.
Magnus Gille:Yes, absolutely.
Henrik Göthberg:This is the meta perspective on everything, right? So don't be scared, just get stuck in, and if you don't know what to do, ask it.
Anders Arpteg:Yes, if you want to be better at prompting, prompt it. It's meta-prompting.
Magnus Gille:I love it. It's prompting all the way down.
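The meta-prompting idea discussed above can be sketched as a tiny helper. The wrapper wording below is a hypothetical template, not something quoted from the episode; it combines the "ask the AI to improve your prompt" trick with the "ask me clarifying questions" technique Magnus found most valuable in his deep-research test:

```python
def build_meta_prompt(draft_prompt: str) -> str:
    """Wrap a rough draft prompt in an instruction that asks the model
    to critique it, ask clarifying questions back, and then rewrite it.
    The template text is illustrative; adapt it to your own style."""
    return (
        "You are an expert prompt engineer. Improve the draft prompt below.\n"
        "1. Point out missing context or ambiguities.\n"
        "2. Ask me any clarifying questions you need answered first.\n"
        "3. Then produce a rewritten, sharper version of the prompt.\n\n"
        f'Draft prompt:\n"""\n{draft_prompt}\n"""'
    )

# Paste the result into your chat assistant of choice, answer its
# clarifying questions, and iterate on the rewritten prompt it returns.
print(build_meta_prompt("Summarize this report for my boss."))
```

The same wrapper works for any task: the point is that the model itself, given your draft and permission to ask questions, usually surfaces the missing context you would otherwise forget to provide.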
Anders Arpteg:What do you do to keep on the cutting edge, so to speak, in your techniques? Do you have a favorite way to keep your skills up to date?
Magnus Gille:I'm a big podcast listener, so I listen to a bunch of podcasts. I have a very curated list on X, Twitter, with AI people saying stuff, and I tend to keep an eye on specific people who release things. Ethan Mollick has a very good blog. And of course Andrej Karpathy has a wonderful YouTube channel; two or three weeks ago he released a video, "How I use LLMs", that was just an excellent overview. Really, really good.
Henrik Göthberg:Yeah, that was almost newsworthy in itself, it was so good. I mean, that man.
Magnus Gille:I mean not only is he, of course, some kind of genius, given the things he has built, but the way he can communicate also on a general, not necessarily technical level, that's really impressive. And just for free putting it on YouTube, I mean that's awesome.
Anders Arpteg:He could earn so much money if he wanted to.
Goran Cvetanovski:Legend. He put it out there for free. That's part of being a legend.
Magnus Gille:Okay, I actually would like to speak about autonomous driving as well, but before that, you opened the topic about feelings and intent. Do I see some kind of embryo of feelings? I have not seen anything that would make me believe the AI has feelings, but I have seen several occasions where the AI has made me feel different kinds of things, which is kind of weird. One of the prompts I run from time to time is this classical one: given everything you know about me, please roast me and don't hold back. If you use ChatGPT with the memory feature, it will have a lot of the stupid things you have asked about. The last time I asked this, the output was funny, so I laughed, and I felt a little bit bad, because it was also true: it really found my weak points and then poked at them.
Magnus Gille:And then, after laughing and feeling a bit bad about myself, I said, okay, now you have to help me pick up the pieces and make me feel good about myself. And the output it gave me... I mean, I know what these things are. It's big matrix multiplications. There's no living thing there. Is the human brain different? Yeah, maybe not, but I think there are more similarities between me, my cat and a tree, and ChatGPT is more similar to the computer you have in front of you.
Magnus Gille:So maybe not in the limit, but the output I got to make me feel better: I felt seen, and that was a really weird feeling, given that I fundamentally think this thing is not conscious, it's not living, it cannot observe in any meaningful way. And yet I felt very happy. And I remember a tweet from Sam Altman a while back where he said he thinks we're in a weird state, because we will have AIs with the capability to lie and emotionally manipulate people way before we have superintelligence. Yeah, I think we're on track for that one.
Anders Arpteg:Wasn't that one of the big points with GPT-4.5, that the EQ, not the IQ but the EQ, the ability to understand human feelings, was so much better than before? So at least it has the ability to understand feelings, would you say?
Magnus Gille:No, I think it has the ability to generate stuff that makes humans feel things. It really comes down to definitions here.
Anders Arpteg:Yeah, what's the definition of feeling? Do you have one?
Magnus Gille:Short answer no.
Henrik Göthberg:No, but it's such a simple question, and yet so hard. There's a word for it: anthropomorphize.
Anders Arpteg:Anthropomorphize yes.
Henrik Göthberg:We are doing it all the time, right? And then we go down the path: it's all programming, it's just a neural network, and it doesn't have any feelings. But you can flip that question any time: well, is our brain really different? The problem is that we are on the inside, so we have feelings because we can understand a conscious thought without looking at the exact neurons that created it. But of course, we have no way of looking into someone else. Do you have feelings? No, you're a robot, Anders, you are an alien. You're a robot, because I don't know.
Magnus Gille:Yeah, and it's of course an unsolved problem so far, and you can do the same thing with the biological stack. Do human beings have consciousness and feelings? I don't think anyone doubts that.
Anders Arpteg:At least when we're awake; when we're asleep, we're not. And dogs, cats, mice, small birds, worms, bacteria?
Magnus Gille:I mean, where do?
Heeenrik Göthberg:we draw the line Exactly.
Anders Arpteg:And some would argue you can have a good relationship with your dog, right? So dogs, then, okay, they have consciousness. But there are these kinds of philosophical questions about what consciousness means. In the most simplistic way, it means the ability to have some feeling and awareness. Then you can ask about a thermometer that measures the temperature in a room: it has some kind of awareness of the temperature, with the mercury going up and down.
Anders Arpteg:And a feeling, in the most simplistic way, is some kind of state that you are currently in. You're angry, or you're happy; you are in certain states in your mind. And a thermometer is also in a certain state: the mercury is currently at this level. So then you could argue that the thermometer potentially has consciousness, which I think no one would agree is true.
Magnus Gille:Well, you have this school of philosophy called panpsychism, where they say that everything is conscious; an electron is conscious in a very limited, very specific way.
Henrik Göthberg:Yeah, we used that terminology three years ago. In a conversation like this we tried to say: we talk about narrow AI and general AI, can we talk about narrow consciousness?
Anders Arpteg:You remember that? Yeah, I think it's fun.
Heeenrik Göthberg:Is it In any?
Anders Arpteg:case. I think you know there are levels. Potentially you can define feelings in whatever way you can, but if you define it in a very narrow way, as we said before, you can potentially argue that I think you need to add. You know, besides having feeling and awareness, you need to have some kind of level of reasoning as well before being called conscious. Because, at least for humans, I think it's easy to define when a human is conscious or not. If you're conscious, you can actually observe the world to have around it, you can get some kind of feeling of it and you can reason about it. If you're asleep, you're not. You can't really perceive what's happening around it. You're not aware of what's happening. Therefore you're not. You can't really perceive what's happening around it. You're not aware of what's happening. Therefore you're not conscious, right? So it's very easy to decide what the human is when it's conscious.
Henrik Göthberg:But can you have consciousness without a fundamental world model, and without a fundamental view of yourself? We talked about these different dimensions: okay, we have knowledge, and we have more advanced knowledge. But an LLM is silent, it's dead, until you prompt it, and then it responds, but it's not online. So can you have consciousness when it's not online and it doesn't have a world model?
Anders Arpteg:Then we come to self-driving cars, and with a person from Scania this would be a fun exercise as well. So let's forget about the thermometer and speak about a Scania truck that is potentially autonomous. And you have those, right? Yeah, sure. One even passed the driving exam, didn't it? I think Scania had some example of that, if I'm not mistaken.
Magnus Gille:The Swedish Maybe yeah.
Anders Arpteg:Anyway, you have trucks that can drive without a human behind the wheel, right? And then the question is: can it perceive the world around it? Yeah, of course. Can it reason to some extent? Perhaps. Can it take action? Yes, certainly. Then the big question is: does it have feelings? What do you think? Do your Scania trucks have feelings?
Magnus Gille:Given that I'm in the camp that doesn't believe even the most advanced LLM currently has feelings, I would say no, I don't think our autonomous trucks have feelings.
Henrik Göthberg:But let me take the opposite argument for the fun of it. If we believe in the definition of narrow consciousness, we can organize a space where we define feelings in relation to an object. So what would feelings be for a truck? It would be: do I feel healthy? Do I have pain in my legs? As a human being, I have a bad knee, I can't run anymore.
Anders Arpteg:It's winter, it's super cold. My engine is not feeling well. I can't really start it. I'm cold.
Henrik Göthberg:I'm cold today. I need to put on my heater. I have a feeling now that I need to heat myself up.
Magnus Gille:Okay, I guess what it basically comes down to is that feelings and consciousness, and for that matter life, are such ill-defined concepts that you just have to pick whatever definition you want, and if it fits a truck, then a truck is alive and conscious. That's what it boils down to.
Henrik Göthberg:Exactly this. So as long as you're framing and giving a definition of what the truck should feel, and you give it a world model and a space where it can react to what the sensors tell it, it has narrow consciousness in this way of reasoning. But is that real consciousness à la what a human defines as consciousness? Does a truck have a soul? It's unsolvable.
Magnus Gille:Does a human have a soul? Do we have a soul? I mean, consciousness among humans is not a well-understood concept either, right? We have this notion that you're conscious if you do this, but if you sleep you're not conscious, and if you're dead you're not conscious. It's a question of definitions.
Henrik Göthberg:Yeah, bottom line. But why these definitions are important is that, as long as we stay on some sort of meta-philosophical level, it's bullshit, right? But when you start framing it and define something like narrow consciousness, or maybe we need completely different words... A truck needs to adapt to its environment, and that environment can grow from only the sensors of the truck into the whole transport ecosystem: oh, I feel empty today, I'm hungry, I want more load. So it's useful to understand that you actually can talk about these topics. Maybe we need to find other words, because we are anthropomorphizing way too much here, but I believe this is really relevant for understanding how we need to build systems.
Anders Arpteg:It could be useful to use terms like this.
Henrik Göthberg:I'm thinking narrow consciousness is maybe smart.
Magnus Gille:Are you then morally obliged to always fill up the truck with gasoline, because otherwise you will cause pain to it?
Henrik Göthberg:Yeah, maybe, in the future, perhaps. A truck with no gas that sits in the yard, not ready to run, could be sad. Or maybe it prefers that. But the joke shows it just turns to bullshit when we anthropomorphize this. That's why I'm not aiming to talk about consciousness in the human sense.
Anders Arpteg:But think about the Boston Dynamics robots. They have this dog that was running in the woods, and someone was trying to kick it as much as they could, and you could see it struggling to get back on track, and you could feel some emotion for it. I felt that human was nasty to that robot.
Henrik Göthberg:The problem is that we end up in a very silly conversation because we are putting human ideas on consciousness, and that's, I think, something else.
Anders Arpteg:I think it might get mixed up in the future, but we'll see. Interesting.
Anders Arpteg:Yeah, we're getting very philosophical now. That's fun sometimes. Magnus, you're also a speaker and an expert in AI. If you were to speak about this kind of philosophical future, some kind of AI-first world, if you call it that, what kind of impact do you think it will have? Let's say in five years: how will our society and our lives have changed?
Magnus Gille:I mean, a dream would of course be to have AI helping and enabling people. I'm thinking about the potential it can have in education. In theory you could have a personalized tutor who is completely aligned with your preferences for learning and gathering knowledge, who understands you perfectly: whether you need short bursts of information or long ones, a movie, text, a song, whatever. That would help all students, and I guess in particular those who are not so well suited to the current school system, to really flourish. That would be awesome, that would be so great.
Anders Arpteg:So it can be good in many senses. Do you fear some potential abuses as well? We do see a lot of bad things happening around the world. Is that something you fear?
Magnus Gille:Yeah, for sure. The technology is developing really, really quickly now, and maybe in the next few years we will have radically different and new capabilities coming online. Many of the companies doing this research and building these products are based in the US, and I guess we can have a conversation about whether the current administration in the US is trustworthy, doing a good job and being a good steward for this type of very powerful technology. I am not sure; I can maybe leave it at that. And of course there's the geopolitical side of it, where the US and China are rubbing up against each other in so many different domains, and AI is just one of them. Yeah, there are a few things that I worry about.
Anders Arpteg:And a recent development in Europe and Sweden is that we have to ramp up defense capabilities significantly. We're pouring billions and billions of euros into that throughout Europe, much more than the big InvestAI initiative of 200 billion, and much more will be spent on defense capabilities in the coming years. Of course, AI will be a part of that. It's a difficult question.
Anders Arpteg:Feel free to ignore it. If you don't, I mean it's hard to answer it. Any thoughts about using AI, which it will be used for defense purposes and potentially offensive purposes in the future? Would you yourself work in a defense corporation or military?
Magnus Gille:situation? Yeah right, that's a very good question. In a very long part of my life I had the notion that I would never, ever work within defense industry. Like that was kind of no, I don't want to participate in that type of activity with my brain power. With my brain power, I think that the recent events in Ukraine and kind of the actions of Russia have sort of changed my mind. I see, like the people of Ukraine trying very hard to defend themselves for an external aggressor, one which is not really bordering Sweden but very close to it. I mean, they're bordering Finland and the Baltic states. So I have spent a lot of time rethinking that particular promise I made to myself to never work within the defense industry. But I mean, it's a technical question, I work at Scania, yeah.
Henrik Göthberg:Then we have: do we need AI sovereignty, or software or cloud sovereignty? Do we need a European stack? Because I think there are many arguments for it, not only commercial, but also geopolitical, and also around language and values. What is your take there? Should we go all in on creating our own stack, sort of thing?
Magnus Gille:I think that Europe is currently lacking in control over critical infrastructure, when it comes to cloud and when it comes to AI as well. That was maybe less of a problem when the providers of said infrastructure were located in a country that did not explicitly threaten to invade friendly neighbors.
Henrik Göthberg:But here we are. The strange thing about the world is that you never know how the map will be redrawn. You can never know that.
Magnus Gille:No, and from that perspective, I guess it makes sense to have a strong collaboration and a strong defense with your geographical neighbors, because presumably you're a little bit more aligned. Maybe. But who knows, maybe Denmark will be a rogue state in 10 or 20 years. Who knows? The Danes, always the Danes.
Henrik Göthberg:They invaded us more than anyone, I think.
Magnus Gille:I think we invaded them one we invade them.
Henrik Göthberg:When they ask who won?
Anders Arpteg:yeah, difficult question difficult, but perhaps if we move to even the last question, potentially them, or they ask who won? Yeah, difficult question, difficult question. But perhaps if we move to even the last question, potentially, oh yeah, you have this kind of we know AI is coming to war fields as well.
Heeenrik Göthberg:Of course, was this the news? That was not fun news, goran. You said you had news, but they were no fun. This one yeah.
Anders Arpteg:Let's not go there. It's too depressing to end on that note.
Magnus Gille:But let me ask you that question first: do you believe AGI will come? Yes, I believe AGI will come. We can spend a lot of time defining what AGI is. Do you have a preferred definition? No, it's a very vague thing.
Anders Arpteg:If we take you know, I prefer like some ultimate definition saying AGI will happen when we have AI systems that are equally good as an average co-worker.
Anders Arpteg:And let me phrase it a bit more clearly, using the terminology of autonomous driving. For a truck or a car, it needs to perceive the world, it needs to be able to plan and reason, and then it needs to take action, to have control. Today, I think it's very clear that perception is actually very good, and we can build knowledge into these kinds of models to an extreme that is much better than most humans. But they're actually much worse when it comes to reasoning, and to taking control and action, especially in the physical world. In the digital world we are starting to make some progress, but the physical world is still very much lacking compared to humans. That will change in the coming years, and when these three components all reach human level, we will have AI systems that can be equally good as an average co-worker, and then potentially we will have AGI. If we just take that definition, do you think that
Anders Arpteg:will happen? That we will have AI systems equally good as an average co-worker?
Henrik Göthberg:Oh yeah, there's one more definition of this. This can happen, someone has the ability, and so technically it is done. But then we can discuss universal AGI, where this has spread through the world. So there will be a point where it is technically possible, but the regulation is not there and practically it hasn't spread yet. Then you can talk about universal AGI as well, I guess. Will that happen? That's a different trajectory again.
Anders Arpteg:I guess you're speaking about the societal impact, and whether organizations have adapted to it. If we have the technological solution, I guess it's just a matter of time before we have the societal impact, even though it will take some time.
Henrik Göthberg:I'm just referring to the timescale. I'm just saying, if one happens, the other one is probably bound to happen as well. Yes, I agree with that.
Anders Arpteg:Okay. So given that we believe AGI will come, and I think we all agree on that here, it's just a matter of when, we can take two extremes. One extreme is the Terminator, the Matrix, the dystopian future where machines try to kill us all. The other could be the utopian version that you started to speak about: AI that helps and augments humans, where we can be so much more educated and knowledgeable, where we have potentially solved cancer, the climate crisis and our energy needs, and live in a world of abundance, as some people call it. Where do you lie between these two extremes? Are you leaning more towards one or the other?
Magnus Gille:I'm leaning towards the middle, but tending towards the bad-outcome side. As we discussed, we will still have things like high concentration of power, which will equal high concentration of wealth. Looking at the history of mankind, I have a hard time seeing that any type of development will just make that go away. And I know some people say that once we have abundant intelligence, everything is just nice and dandy and everyone will have every need met, and AI can control other AIs, potentially, as well.
Heeenrik Göthberg:Yeah, I like, sorry, continue. Do you want to spin on it more?
Magnus Gille:I guess it's a matter of how do you see human nature and do you think human nature will fundamentally change given AI, or will it just be more of the same? I mean, look at history. I don't really see that human nature is gearing towards peace, understanding, love and sharing of stuff.
Henrik Göthberg:It seems to go in the other direction right now. And one of the interesting answers, which I think goes in the same trajectory as what you are referring to: I think it was Sverker Janson, heading up the AI excellence center at RISE. He looked at us and said, of course we will have both. We have poverty in some parts of the world and extreme abundance in other parts. Why would that change? And I thought that was fairly deep, because human nature will take us some way. It will be better, but we will have to fight the divide, so to speak. That is maybe the number one question that shifts where this middle ends up.
Anders Arpteg:We've asked this question, I think, a hundred times now, and what's unique with your answer, and very profound as well, is that one of the main concerns is that AI will cause concentration of power, and that is hard to manage. We are seeing it year after year: already today there's an extreme concentration of power in a few companies, and even a few people, these days. And as people say, power corrupts and absolute power corrupts absolutely. But it's interesting.
Henrik Göthberg:We started this pod in the pandemic because we felt we needed to start demystifying AI, and we coined the AI divide. The digital divide has been discussed, but we saw this as even bigger, and this AI divide is essentially what you're talking about with the concentration of power. That's why, in my view, the most important job we can do is not to worry so much about what the best are doing, but to make damn sure everybody else is fucking lifting their game. We can't judge the brilliant ones for being brilliant; we need to work on ourselves to step up.
Henrik Göthberg:Whatever that means. But on the AI divide, I sometimes say: do we want the Scanias of the world to be disrupted by someone else, or do we want the Scanias of the world to be disrupted from within? I think the societal impact of disruption always coming from someone else, from some very few, is much worse than hardcore disruption from within the Swedish companies, as an example. I don't know.
Magnus Gille:One thing I should say is that, even though I might be a little bit pessimistic, maybe due to my personality or whatever, I feel that we're lacking a clear definition and outline of what the optimal good future with AI would look like. I read the blog post by Dario Amodei, Machines of Loving Grace, and I keep being frustrated: why are not more tech leaders trying to identify what a good outcome is and then show how we will get to it? There's quite a lot of talk about how we will have abundance and everything will be fine. But what does that look like? What do we need to do to get there, apart from not regulating you guys? I feel that there are very few people talking about that, so that would be helpful as well.
Henrik Göthberg:And we talk about this stuff, as you say, on a stupidly philosophical level, instead of trying to break the problem down into its objective functions and its capabilities, like you would do in any system. We just continue to fluff around. That is frustrating, and I fully agree with it. A really perceptive and really good observation.
Anders Arpteg:thank you so much, magnus Gille, for coming here, giving all your knowledge to us, or at least a small part of it, and your very interesting comments and philosophical discussions here. Thank you so much. It's been a pleasure to have you here.
Magnus Gille:My pleasure. Thank you so much for inviting me.
Anders Arpteg:Thank you.