AIAW Podcast

E117 - AI and the Future of Software Testing - Vilhelm von Ehrenheim

February 09, 2024 Hyperight Season 8 Episode 3

 Dive into a captivating discussion on the AIAW Podcast with Vilhelm von Ehrenheim, Co-Founder and Chief AI Officer at QA.tech. In this episode, we explore the intersection of AI and software testing, delving into Vilhelm's journey from EQT's Motherbrain Intelligence to transforming QA practices at QA.tech. Highlights include insights into QA.tech's business model, case studies on AI-driven testing, speculations on a QA-driven development future, and strategies for startup success. Join us for a profound look at the future of software development in an AI-first world, available now on all major podcast platforms.

Follow us on YouTube: https://www.youtube.com/@aiawpodcast

Speaker 1:

Tell us a little bit about that whole initial process. There were two, three... you're four co-founders, right? Yeah, exactly. And they had a background, they'd been funding startups before, and this was an AI startup, so they needed the AI contingent in the founder family, so to speak. Could you elaborate on that a little bit?

Speaker 3:

Yeah, so they had this idea. It's very AI-first: use AI to ensure quality. So they naturally wanted to speak to people working in the field, to see if it was a good idea, if it was possible, those kinds of things. That's why they were interested in talking to me.

Speaker 1:

So they reached out to you, and I guess you had a mutual friend, you said? Yeah, exactly.

Speaker 3:

It's always good to have contacts.

Speaker 1:

And then, what was that starting conversation like? Take us back, fly on the wall.

Speaker 3:

The first conversation was very much about how you could potentially address this: how could you structure the problem in a way so that we could actually execute it as an agent, essentially interacting with the web page. And we also talked a lot about how to break different things down into components and how to validate them, and all of this. So there were a lot of hypotheticals, but super interesting.

Speaker 4:

What really made you decide to jump on the startup, going from what I guess was a rather safe, rather secure place?

Speaker 3:

First of all, I really like building things. It's something I've always wanted to do, even in larger settings. Even at EQT, Motherbrain for me was about building that platform up, and after doing that for seven years, I felt like I was ready to start building something new.

Speaker 4:

Awesome. And if you were to just elaborate: you've been doing this for about half a year now?

Speaker 3:

Yeah, exactly. So I had this little grace period before I left EQT, and I started full time in November.

Speaker 4:

Ah, in November. So what's the difference then?

Speaker 1:

Yeah. Can I ask the core question I was going for: how does it feel so far?

Speaker 4:

Is it as you expected? Yeah, how does it feel?

Speaker 3:

It feels very liberating. I think being a very small group of people building something together is exhilarating, and I think it also brings me a lot of energy. So, even if you put a lot of time in and you spend a lot of time together, it's so much fun building things.

Speaker 4:

So, yeah, and do you spend more time than before?

Speaker 3:

I would say it's probably the same, but it's more time on the building side, thinking about how we should attack different kinds of problems in order to be able to build a product.

Speaker 4:

If you were still... I mean, if someone is listening to this, perhaps they're thinking: I have a lot of ideas for doing a startup, but there are a number of pros and cons, and perhaps it depends on where you are in life, et cetera, whether you have kids or not, how secure your economy is, and so forth. When, or to whom, would you recommend actually going for a startup, and perhaps who would you not recommend do it?

Speaker 3:

I don't know, I find it hard to say who shouldn't do things. But in general, if you really like building things, and you have this inherent drive to do that, and you have it as a dream, I think you should go for it.

Speaker 1:

But I think a more fair question is: what was your thought process? So you're living in Täby, you have a family.

Speaker 3:

Yeah, two kids, two kids.

Speaker 1:

So there's a spectrum from a trust junkie to a risk junkie, I guess. Where would you put yourself on that scale, or what was your thought process? To leave, sort of, EQT... Motherbrain is an awesome setup, of course, as well.

Speaker 3:

It is totally, and it's been a tremendous journey, and I've been there for seven years as well.

Speaker 1:

So what was sort of your thought process? Like, I need to take the leap now. How were you reasoning with your wife and stuff like that?

Speaker 3:

I think it's much about finding things that you really love, and doing that brings more joy, right? So I think for the family it's also very important. Both me and my wife talk a lot about this: what can we do to feel like we're living the life, right? That's kind of the thing. This is one of the things I've been wanting to do for a long time, so she's super supportive.

Speaker 1:

But did you sort of weigh risks, pros and cons? Was that a deep conversation, or was it just "now it's time to do it"? Maybe you were done, sort of thing. Seven years is still a long time, not long long, but sometimes you feel like, OK, it's more about: I have to do the next thing now, or else.

Speaker 3:

Yeah, exactly, and I think it was down to me being more of a builder than a maintainer, right? We built this up, but it was essentially a small startup inside of EQT, and we scaled that to being 40 people, like a large organization. An intrapreneur-type journey you did with Motherbrain. Exactly. Even though maybe it's slightly less risky if you do it internally at a company, you still have the same kinds of fights, and you have to work with internal customers instead.

Speaker 4:

I guess we've already done a lot of the introduction here, but let's still go for the formal introduction and welcome you here. Thank you very much for coming a second time, Vilhelm von Ehrenheim.

Speaker 3:

Ehrenheim, Ehrenheim. That was the smoothest presentation.

Speaker 1:

Vilhelm von Ehrenheim. That's one for the Oscars, exactly.

Speaker 4:

But you're here now in a new role, right? You were previously here as the head of Motherbrain Intelligence at EQT, and now you've actually started a startup. That's the focus for today as well.

Speaker 1:

So I think we did a portrait episode. Larsson, do you have the number? What was the episode? If someone wants to go deeper into understanding Motherbrain, which is a super cool thing, and learn more about Vilhelm, you can look at that episode. But this time it's more of a theme, and of course we're talking about starting an AI startup in 2024. 35? Episode 45? It's longer ago than that... episode 45. And we are now at episode 117. So now we have a topical pod, which is sort of: an AI startup in 2024, how is that? With a very strong focus on AI-driven, or AI-first, testing in software development. And I think the overarching theme you can put on it is: where is software development going, end to end, with AI? So that's the overarching theme now. Let's go into this. Was that fair?

Speaker 4:

And perhaps let's not repeat your whole background, but still, for people to get a small feeling of who you are: how would you describe your background and interests?

Speaker 3:

Well, I've been working with AI in different kinds of settings for most of my career. I studied engineering physics in Lund, and then I worked for some time at Klarna building their credit and fraud risk models, where they take automated decisions. And then, most recently, at EQT, building the Motherbrain platform from pretty much an idea.

Speaker 1:

You were more or less there from day one, right?

Speaker 3:

Yeah, exactly, I was number three, so it was a full-stack developer, a manager and me.

Speaker 4:

But you've always also been focusing on AI right.

Speaker 3:

Yeah, exactly.

Speaker 4:

And now you made the big jump then. And what's your title?

Speaker 3:

Chief AI Officer. Is that right? Yeah.

Speaker 4:

Cool. Is that becoming more and more of a common C-suite kind of title?

Speaker 3:

I've heard that it's becoming more and more popular, but I really like the idea of the title.

Speaker 1:

So let's talk about your C-suite. You have a C-suite. How have you structured it? CTO and CPO?

Speaker 3:

Exactly. We have a CPO, a CEO, a Chief AI Officer and then a CTO as well.

Speaker 1:

And have we mentioned the name of the startup, QA.tech? QA.tech, what's in that name?

Speaker 3:

Well, QA is for quality assurance. Quality assurance, exactly. QA.tech. And we were discussing the name quite a bit, and I think naturally you would go for something more like this-AI or AI-that.

Speaker 1:

QAI, exactly.

Speaker 3:

But it felt like QA.tech sounded better. It was nice and short and felt like an interesting kind of name.

Speaker 4:

Should we put it to the test now and say: if you were to make an elevator pitch in 30 seconds?

Speaker 1:

The bar test. The bar test.

Speaker 4:

The bar test, or the elevator pitch: what is QA.tech really about?

Speaker 3:

So QA.tech is about testing end products, web platforms being the primary focus, and testing the functionality of the product. We are developing autonomous agents that can interact with and try your product in different ways. We do automatic discovery of the functionality, and then we ask the agents to perform different kinds of objectives. That could be things like logging in, signing up, changing billing information or, depending on what your product does, the main core things there: adding a new customer if it's a CRM system, adding a new pet if it's a pet platform, or whatever.

Speaker 4:

We actually had an episode last week speaking specifically about autonomous agents, so it follows rather nicely.

Speaker 1:

It's super nice that we are now on this path from RAG to agents, and my prediction is, and we're probably going to talk about it at the Data Innovation Summit, that we need to frame agent thinking. And here we now come to the concrete topic of agents in relation to quality assurance and testing of web applications.

Speaker 3:

Exactly. I think agents are all the rage in 2024, right? That's what I said, right?

Speaker 1:

It's the year of the agents, I've already predicted it.

Speaker 4:

I gave a course in agent-oriented programming back in 2004.

Speaker 1:

Yeah, but it was not fucking hyped, I'm sorry, man. That's the market, we know, my friend.

Speaker 4:

Well, it's certainly picking up now and I guess it's also making use of large language models and, of course, the recent rage in generative AI in different ways.

Speaker 1:

Yeah, so this is also a key topic then. This is, to some degree, also an approach, which we will come into, that is generative-AI oriented. Yeah, of course.

Speaker 3:

I think agents as a concept are super old, right, and more traditionally you would phrase it as a reinforcement learning problem, maybe, where the agent has to try a lot of different things and learn from that. The problem is when you have a very open-ended environment. That is very problematic, because it's hard for the agents to adapt and learn new things when they only have this value function at hand. But large language models are opening up a different kind of approach, because they have so much more world knowledge. If you, for example, were to have an agent that should learn how to open a door, it doesn't necessarily have to knock on all the places on the door before it accidentally touches the handle and realizes that this is how you open a door. If you ask a large language model today, like if you just ask GPT-4…

Speaker 1:

How do you open a door?

Speaker 3:

Yeah, exactly, it would just say: use the handle.

Speaker 1:

Come on.

Speaker 4:

Come on. But if you were to try to do a problem description: what are companies potentially having problems with, that they could use autonomous agents to do some kind of quality assurance for? Can you just describe why it's so difficult to do quality assurance without your technology?

Speaker 3:

Yeah, I think first of all it's usually broken up into two different ways to handle this. One is to do more programmatic end-to-end tests, where you essentially hard-code a path through your application using a browser. That would be like: open this link, wait for 300 milliseconds, then press this selector, and so forth. But the problem with that is that when you develop a product you change it all the time, and then the tests will inherently fail. If you have 100% test coverage with all of these things hard-coded and you change anything, a lot of the tests will break, not because the intended functionality isn't working, but because the tests are coupled to the implementation.
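For readers who want to see what that brittleness looks like, here is a minimal sketch of such a hard-coded browser test, written with Selenium in Python (the URL, selectors and timings are illustrative): rename a CSS class or change the redirect and the test fails even though the login feature itself still works.

```python
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

# A hard-coded path through the application: every selector and delay
# is an assumption about the current implementation of the page.
driver = webdriver.Chrome()
driver.get("https://app.example.com/login")   # illustrative URL
time.sleep(0.3)                               # "wait for 300 milliseconds"

driver.find_element(By.CSS_SELECTOR, "#email").send_keys("test@example.com")
driver.find_element(By.CSS_SELECTOR, "#password").send_keys("secret")
driver.find_element(By.CSS_SELECTOR, "button.login-submit").click()

# The assertion is tied to a specific redirect rather than to the objective
# "the user can log in": change the dashboard URL and the test breaks too.
assert driver.current_url.endswith("/dashboard")
driver.quit()
```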

Speaker 1:

And the variations… Because we will have blind spots: the developer will test what the thing was intended to do, but how do you think about all the variations of what a user who hasn't understood the purpose will try to do?

Speaker 3:

Yeah, exactly. And then the other side of it is to use humans to test. They are very slow and expensive, right? So asking a human to click about and ensure that everything is working every time you do a release is very cumbersome and it takes a very long time.

Speaker 4:

Okay, so comparing to the traditional way, either human testing or, let's focus on, programmatic testing, adding a set of actions or instructions to Selenium or some other browser-controlling software: how do you approach it differently? If a new company wants to use your technology, what do they have to do?

Speaker 3:

Very little, because we intend to discover the functionality on the page and then instruct our agents to test that functionality. But we're focusing much more on the objectives rather than the actual kind of exact path.

Speaker 3:

Go through an example of… Yeah, so the easiest one would be login, right? If you ask the agent to log in, it doesn't matter if you change the entire login process, if you start using Auth0 or something else, it will still work, right, because the login functionality works. The agent will try to do that and then evaluate whether or not the login is working. If all of the selectors and everything have changed since last time, it doesn't really matter.

Speaker 4:

So go a bit deeper now, because this is an interesting topic. How do you… if you take login functionality, let the agent discover how to log in on the web page without having instructions on how to do so?

Speaker 3:

So we do discover that there is a login form right, and then we give it an objective that it should try to achieve, with some kind of examples of how it should think about evaluating it.

Speaker 4:

And using an LLM in some way.

Speaker 3:

Yeah, so we use LLMs both to construct the agent that is actually performing the actions, and also to generate these kinds of objectives.

Speaker 4:

And I'm sensing that you're trying to hide some of the secrets, and please don't be afraid to say so if you want to keep some of the secret sauce secret. But okay, so you give it some kind of objective. You may have a starting point then, like a login form that you provide, or…

Speaker 3:

Yeah, so the configuration for a new web page, if we're testing login, would be to provide a link and, of course, correct authentication details, otherwise you can't actually log in. And if the agent has those available, what we're asking it to do is: it goes into the web page, looks at everything that is available there, and then has to select what to do.

Speaker 4:

So you provide, in the prompt to the LLM, the basic HTML code that you see, or JavaScript code that you see, or…?

Speaker 3:

So yeah, the HTML and the JavaScript are too verbose, very nonspecific, so we do a lot of pre-processing of the HTML before actually providing it to the agent. When we have that available, the agent is essentially provided with a set of actions it can do, given this specific web page, and then it has to choose what to do, given the tools it has available, which would be to interact with the page, fill in information and those kinds of things. It both has the possibility to understand which actions are available, and it also uses a multimodal model to understand how the page looks and interact with it.
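As an illustration of that kind of pre-processing (a generic sketch, not QA.tech's actual pipeline), here is one way to boil a verbose page down to the interactive elements an agent could act on, using Python and BeautifulSoup:

```python
from bs4 import BeautifulSoup

def extract_candidate_actions(html: str) -> list[dict]:
    """Reduce a verbose HTML page to a compact list of candidate actions.

    Illustrative only: a real pipeline would also handle iframes, shadow DOM,
    element visibility, ARIA roles and so on.
    """
    soup = BeautifulSoup(html, "html.parser")

    # Drop content the agent does not need to reason about.
    for tag in soup(["script", "style", "svg", "noscript"]):
        tag.decompose()

    actions = []
    for el in soup.find_all(["a", "button", "input", "select", "textarea"]):
        actions.append({
            "tag": el.name,
            "type": el.get("type"),
            "label": el.get_text(strip=True) or el.get("placeholder"),
            "name": el.get("name") or el.get("id"),
        })
    return actions
```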

Speaker 4:

So, using information about the layout and the graphics as well? Or how do you make it understand, simply, where to put the username in the login form, so to speak?

Speaker 3:

So essentially you can think of it like this: the agent has this objective, saying your objective is to log in with these credentials, and here is the set of actions you can do. That could be: click on this button, fill in this field that is the email, fill in this field that is the password, and all of these things. Then it has to understand which ones to do in which order and figure that out. And that brings it back a little bit to LLMs being really powerful at reasoning about things. So you ask an LLM: given that you have these different actions to choose from and your objective is this, what would you choose? And then it does that and fills in the information.

Speaker 4:

Each action, is that manually provided, saying this is how you enter the username?

Speaker 3:

No, so the actions are provided as tools through OpenAI function calling.
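For readers who have not used function calling, here is a minimal sketch of exposing page actions as tools; the action names, parameter schemas and model name are illustrative assumptions, not QA.tech's actual setup.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Each page action is described as a tool the model is allowed to call.
tools = [
    {
        "type": "function",
        "function": {
            "name": "fill_field",
            "description": "Type text into a form field on the current page.",
            "parameters": {
                "type": "object",
                "properties": {
                    "field": {"type": "string", "description": "Field name or id"},
                    "value": {"type": "string"},
                },
                "required": ["field", "value"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "click",
            "description": "Click a button or link on the current page.",
            "parameters": {
                "type": "object",
                "properties": {"target": {"type": "string"}},
                "required": ["target"],
            },
        },
    },
]

response = client.chat.completions.create(
    model="gpt-4-turbo",  # illustrative model choice
    messages=[
        {"role": "system", "content": "You are a QA agent. Achieve the objective using the tools."},
        {"role": "user", "content": "Objective: log in with the provided test credentials."},
    ],
    tools=tools,
)

# The model answers with tool calls: which action to take next, with which arguments.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```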

Speaker 1:

But what's your understanding and vision for the user experience here? Are you thinking about this to become something that someone can do on their own, or plug in and add, or how is this?

Speaker 3:

So from the end user perspective, they don't have to do this at all. This is done by us on our platform, right?

Speaker 1:

Yeah, but let's say, okay, let's take the scenario that someone is doing web application development and they now want to use this service. Is this an engineering experience that you set up for, you know, the typical testing engineers, and they use this as a tool?

Speaker 3:

Also, if you're building a web app as a developer, all you have to do is essentially put an integration in your CI/CD workflow, and then have some kind of staging environment. Preferably, I mean, it could be a production environment, but usually you want to catch things before they go to production, right? So then this would trigger the set of tests that we have discovered on your web page, and the agent will try to do all of those.
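Conceptually, such a CI step can be as small as a script that triggers a test run and fails the build if the agents find a problem. The sketch below is hypothetical: the endpoint, payload and response fields are invented for illustration and are not QA.tech's real API.

```python
import os
import sys
import time

import requests

# Hypothetical vendor API; names and fields are illustrative only.
API = "https://api.qa-vendor.example/v1"
HEADERS = {"Authorization": f"Bearer {os.environ['QA_API_TOKEN']}"}

# Kick off a test run against the staging environment for this commit.
run = requests.post(
    f"{API}/runs",
    json={"target": os.environ["STAGING_URL"], "commit": os.environ.get("GIT_SHA")},
    headers=HEADERS,
    timeout=30,
).json()

# Poll until the agents have finished all discovered objectives.
while True:
    status = requests.get(f"{API}/runs/{run['id']}", headers=HEADERS, timeout=30).json()
    if status["state"] in ("passed", "failed"):
        break
    time.sleep(10)

print(f"QA run {run['id']}: {status['state']} {status.get('report_url', '')}")
sys.exit(0 if status["state"] == "passed" else 1)  # non-zero exit blocks the pipeline
```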

Speaker 1:

Yeah, so instead of doing programmatic testing that they have coded and run somewhere at this point in time, when you go through the software development cycle and come to testing, and now we come to the last, you know, user-acceptance-type testing, you are now plugging this in as an API in your CI/CD workflow, and this then kicks in and performs the tests.

Speaker 3:

Yes, and then if it breaks, it stops your deployment pipeline.

Speaker 1:

And what's the resulting experience? What's the output? If something breaks, can you see it?

Speaker 3:

Like, if you have GitHub Actions, for example, as your CI pipeline, then you will have that check turning red and saying that this is not functioning. You can click in, and then you get into our platform.

Speaker 1:

Which is more or less going straight back into the normal CI/CD flow.

Speaker 3:

Right, whatever you have there. Exactly. And then you would click on that link and you get into our platform, and you can see both the passing and the failing tests. In there you can see the steps and the reasoning that the agent has done, together with screenshots and recordings of the whole session. So you can see: okay, it's trying to fill in things, then it clicks on login, and then it breaks.

Speaker 1:

So the result is not only a red line, it's actually like a playback of the problem.

Speaker 3:

Yeah, exactly, and then we've bundled that together with other kind of information that could be interesting, like network calls and what happened in the background.

Speaker 1:

So this becomes like a package: when you click on a problem that is highlighted in the CI/CD process, this is red, you click on that, you go into your app and you get the package. This is the playback, this is the background data, whatever that is, all neatly packaged. Here you go.

Speaker 3:

I mean, if you have a human QA tester, what they would normally provide is maybe a screenshot, if you're lucky, and a description of it that is pretty random. Exactly, right. So we want to be developer-friendly here. The idea is that it should give you enough context to understand what went wrong.

Speaker 1:

And you can actually, in very simple ways, give more data around the problem than a normal human tester could ever do. Totally.

Speaker 4:

What's the business model of QA tech?

Speaker 3:

So we're primarily focusing on B2B SaaS at this stage. I think on a longer horizon we could probably aim at any kind of web development at all, but it made sense for us to start somewhere, and I think having a tight ICP makes it easier to focus on specific problems.

Speaker 4:

ICP. What do you mean?

Speaker 3:

So like initial customer profile.

Speaker 1:

But it's interesting here, because I can see how this goes into any web application development, any web development. And I'll give an example. I'm working with, or discussing with, a client which is a very large sports retail brand. Typical setup: American company, European head office, which then builds the whole web CMS, the whole e-commerce suite, and this e-commerce setup that they use is done centrally in Europe, and then you have 20 different variations in different languages. And, of course, what is supposed to work has never been tested in 20 languages and all the different variations. So is that a core case? Is that an ICP, or is that outside of your ICP right now?

Speaker 3:

That profile… The reason for us not focusing on those kinds: we spoke to some potential e-commerce clients early on as well, and I think it could definitely be a customer at this stage too. It's just easier for us to focus on some specific things, and when you look at SaaS applications, they generally need to function in a specific desktop setting and they're more similar to each other.

Speaker 1:

So you want to limit the variations here in order to stabilize?

Speaker 3:

It's more about building something successful and then kind of scaling from that right. So, if you try to do everything at the same time, it's harder to focus.

Speaker 4:

What do you charge for?

Speaker 3:

So we charge.

Speaker 2:

We're pretty early, so we don't charge that much yet, but the idea is to charge per developer, a seat license to use the plug-in tool.

Speaker 4:

To use the agent. Do you have any… I think you mentioned you had a number of beta customers or something. Can you give some examples, some case study or some example customer, and elaborate or just describe how they make use of your technology?

Speaker 3:

So all of them are using it from the same kind of perspective, as I said before: they have integrated it within their deployment pipelines, and they run the tests on their platform to validate that it works before release.

Speaker 4:

If you can, can you give some concrete customer examples, or do you want to keep some secret?

Speaker 3:

We have a customer called Fibble, one called Elative, one called Digital, and there are a few of those smaller B2B SaaS kinds of customers.

Speaker 4:

Can you, just from their point of view, share the potential value that they see? Is it that they don't have to write tests? Is it that they save time? Is it that they find more bugs? What is really the value that they see as a customer?

Speaker 3:

Exactly, I think it's a bit of both. First of all, you don't want to ship broken products, right? Especially if you're a SaaS platform, that's your core product. It's very important that it functions and that your users are happy. So that's one promise: we can help ensure that it works better.

Speaker 3:

And then the other one is that testing is very time-consuming, and developer resources are pretty expensive. You don't want your developers to spend time on testing that what they did actually works; you want them to spend time on building new, interesting functionality, right? And the QA process in general is very slow and cumbersome, which also hinders developer efficiency. So the idea is to make that process leaner and smoother, which I think will be very welcome, both from the QA engineer's perspective, where they can spend less time clicking the same button every day and focus more on the process and quality thinking as a whole, and from the developer standpoint, where they don't have to context switch as much, can get feedback much earlier on how well their stuff is actually working, and then ship more.

Speaker 1:

Can we zoom out a little bit now? Because I think we started by going into understanding QA.tech, and we've been dissecting what we're actually talking about here. If we zoom out, we're talking about AI-first thinking around software development testing, and more specifically the quality assurance part. So if we zoom out from examples and what that all means, let's talk about software development testing. We can start from a quality assurance perspective and then broaden it to software development testing end to end. How do we see the evolution here? Where are we now? You have clearly seen: okay, there is a direction and a path here, we are jumping onto it. So could you elaborate on what the opportunity is here that makes VCs jump on it? Pre-seed, that's an interesting one. You had no problem getting pre-seed in this case.

Speaker 1:

So could you elaborate a little bit on what this is really all about? Because there's so much value in here, but sometimes when you go into detail, you don't see the big picture.

Speaker 3:

If we zoom out a little bit, I think in general everybody feels the pain of broken web applications. It's very easy to relate to. Also, this is a huge industry; the QA industry in general is really big as well.

Speaker 1:

What is the QA industry? Could we put a number or a frame on that? Because I don't know this one.

Speaker 3:

I don't remember the number off the top of my head, but it is very big in general, both from an outsourcing perspective and from having people in your team. A very common setup is that when you have cross-functional teams developing these kinds of products, you will have maybe one QA per five engineers, or something like that.

Speaker 1:

So, if you think about this number: how many software engineers are there, how many software teams? And for every five software engineers there's a QA guy. That's what we're talking about if we only look at this topic. I can relate to this basically from working embedded in large clients, like an engineering manager at Scania, or a line manager working with the big consultancies, building up a team: how many QA engineers do we need, do we have QA engineers in the team? This is hardcore, what we're talking about: every single software team needs to take care of QA. Either you get people for it…

Speaker 4:

If we were to just elaborate a bit more about QA: QA is one thing, testing could potentially be something different. I'm eager to hear your thoughts about this. Some people say QA is not only about making the software work, but also making the code or the architecture look good. Do you agree with that, or what's the focus that you have for QA?

Speaker 3:

I alluded to that a little bit before. If you're a QA engineer in a team and you focus on quality, then doing repetitive testing, usually called smoke testing, where you click things and ensure that they work, that's not your primary focus area. You have to do it, and it takes a lot of time. If we can take some of that away from them, and also help with some of the discoverability of what they could actually test, it would make it easier for them to zoom out and look at the entire process instead.

Speaker 4:

More focus on the quality, so to speak, and less on testing. Yeah, exactly, that makes sense. Speaking about testing in general, and to Henrik's point here about the whole process and, I guess, the evolution of software testing in general, we can see that transitioning throughout the years in different ways. I guess now with AI, as you have introduced, there are new ways to do testing, but also to do coding, I guess, in some way. Yeah, totally.

Speaker 4:

If you were to start speculating a bit more, how would you say traditional testing has started to transition? It's a rather philosophical question.

Speaker 3:

It is, but I think, like you said, the development experience for a developer today is very different from what it was two years ago. Both in regard to people being able to produce much more, much quicker, but also you can generate unit tests more easily, and there are a lot of different things that GenAI can help you with.

Speaker 4:

Just to explain what you mean here: it's rather easy today to have a function that you've written and just ask ChatGPT to write the unit test for it, and that produces a new pattern of working. Yeah, exactly.
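To make that pattern concrete, here is a small hand-written function together with the kind of pytest tests an LLM will typically produce when asked; the example is illustrative, not a transcript of an actual ChatGPT session.

```python
import pytest

# The function you wrote yourself.
def apply_discount(price: float, percent: float) -> float:
    """Return the price after applying a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# The kind of unit tests an LLM will happily generate on request.
def test_apply_discount_basic():
    assert apply_discount(100.0, 25) == 75.0

def test_apply_discount_zero_percent():
    assert apply_discount(80.0, 0) == 80.0

def test_apply_discount_rejects_invalid_percent():
    with pytest.raises(ValueError):
        apply_discount(50.0, 150)
```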

Speaker 3:

And you have Copilot helping you actually write the function in the first place. So people get to be much more productive, and maybe a developer can do 10x the coding now compared to before, but it's not necessarily the same when it comes to end-product testing, right? So if you have a team that before could do so much and now they can do 10x more, then the quality assurance and those kinds of things have to scale with that as well.

Speaker 1:

But let me test where we are in the evolution now. I have my hypothesis and you can shoot me down. I propose to you that right now, in February 2024, we are at the point where, on an individual performance basis, coders and developers are starting to pick these tools up as part of their repertoire, how they code. But have we truly moved into the core workflow, where the industrial view of this has changed? So is it a process change, or is it an individual way-of-working change, where we are right now?

Speaker 3:

Well, I think it's mostly on the individual level. When you think about constructing an engineering team, making sure that they are building the right things and that everything functions as a whole, that's still pretty hard. So I think we'll see more evolution in that, as we see some tooling coming out now that can help you generate entire skeletons of projects, or even products as a whole.

Speaker 1:

Do you agree with that view of where we are right now? I guess?

Speaker 4:

one hypothesis you can make here. We have, in a team that builds some kind of software product, a set of competencies: back-end and database engineers, front-end engineers and so much more, and QA engineers. One hypothesis would be that AI simply makes all of them more productive in some sense, and you can ask a DBA, database engineer or data engineer to simply use AI to produce code, but also to solve problems in a quicker way.

Speaker 1:

As individuals, like when they're at home coding. Is that what we're talking about now?

Speaker 3:

Yes, but as teams as well, right? It depends on what you're looking at. If it's easier to write documentation about the stuff you've been building, easier to write code, easier to get feedback and structure things, I think you get more productive as a team. But AI is not really at the point where it can look at everything that is happening within an engineering team.

Speaker 1:

Let me try to, let me finish the point.

Speaker 4:

We have a set of people, they have different competencies, and AI makes all of them potentially more productive, including the QA person. And now, with your technology, one assumption could be that you try to remove the QA person, but that's not the case. Potentially, your software simply makes the QA person even more productive.

Speaker 3:

Yes, I think at least those that we have spoken to so far really welcome this, because it takes away a pain point for them. Also, going back to the developers in the team and telling them that their code is not functioning because they forgot something stupid is not necessarily the most fun part of their work. They want to think about the entire quality of the process instead.

Speaker 4:

That's cool. I was thinking.

Speaker 1:

Let me try to be sharp. This is my problem hypothesis: is there a big difference between using AI to boost the individual level versus embedding the AI as part of the core SDLC or CI/CD workflow? So we're elevating the AI topic: on one hand we have the same CI/CD, everything is the same, but we're all ten times more productive because we are using our tools individually, versus the fundamentally AI-driven SDLC, software development lifecycle. In my view, there must be a distinction here.

Speaker 3:

Yeah, there is definitely a difference, but I think we'll see more and more of that emerging as AI capabilities evolve as well. Just being able to have suggestions of what code to write on a line basis, like Copilot is doing, was unthinkable a few years ago. But now it's getting more and more common and everybody is using it.

Speaker 4:

I guess another change could be that before, you had a bit more separated responsibilities: you had one person doing the QA, perhaps separate people doing testing, and then other people doing the database part separately. Perhaps one change, unless you simply empower all of them individually, could be that they actually unite and you have fewer people doing more stuff.

Speaker 3:

Yeah, totally. I think that makes it easier to bridge the communication gap. That's the problem: when you have a lot of people involved in the same project, it's very hard to keep everybody informed on what decisions have been made and how things look. This is a goosebumps moment.

Speaker 1:

It's been a long time since I used that word, because this is profound. We've been on a path where first we were working in different disciplines, then we understood we need to have product teams. The complexity of the product teams has kept growing, because you need your AI engineer, your data scientist, your data engineer, your UX engineer, and now you need a QA tester, and all of a sudden the cost and the complexity of the product team has been on a trajectory the last couple of years. With the complexity and the differences in technology, if you want to do AI-driven, if you want to do what Spotify is doing, you have an engineering manager and then you have all these people, because no one can really master all of this. So with AI, are we now starting to combat the complexity of this growing specialist team?

Speaker 4:

We actually had this question up, I think one or two years ago, and we were asking: will there just be an increasing number of specialized titles, roles that you get hired for?

Speaker 1:

You have to have them because it's so complex, but…

Speaker 4:

Perhaps not. If you just take documentation: there were people who specialized in that and just wrote documentation for software, and only did that. But perhaps now, with AI, you have documentation generated from the code…

Speaker 3:

…as the norm, right? Totally. And also understanding large APIs, right? It's very cumbersome, but you don't necessarily have to do that the same way anymore, because you can just ask an AI to help you.

Speaker 1:

Because this is super important, right? We have been pushing this, and I think it has been one of the main divides between the normal enterprise trying to do stuff versus the real, AI-product-oriented ones. You can't do, in my opinion, proper AI without having a product-oriented approach. And if you go to a large enterprise, the downfall has been project steering: they treat this as a project, so they don't get it. Now, with this, it's actually starting to become simpler again for the rest of the world to catch up to what the hyperscalers have done at a nuts-and-bolts level, who had to figure it out in the end. You know, who did we have? Who was the head of engineering management for search at Spotify? Who was here? What's his name? I forget.

Speaker 4:

Oh yes, I should know that, Anders Nyman.

Speaker 1:

Anders Nyman. Beautiful, right? A fantastic understanding, to look at the core product of search within the Spotify platform and how they have organized that, with a CPO and an engineering manager. It's a fairly large team with psychologists and all this, and that is actually the blueprint, but it's a ridiculous blueprint for normal companies.

Speaker 4:

I think potentially, and please disagree if you want, Vilhelm, that we've been seeing increasing specialization in developer roles, and potentially with AI we can start to change that and have increasing generalization in developer roles. And some people are saying that the role of the developer will be gone altogether in a number of years.

Speaker 1:

I do not believe that. I do not believe that either.

Speaker 4:

No, no, no, but potentially, at least, we will have a bit more general roles as developers.

Speaker 3:

Yes, I think you can do it already today, right? Even if I'm not a front-end engineer, or a back-end engineer, or a data engineer, I can, with AI assistance, accomplish things that I couldn't do before.

Speaker 4:

Like, say I have this application.

Speaker 3:

I want to deploy it in Kubernetes, how do I do it? What should I write in this manifest, and so forth? And then I can, with AI's help, write that so it at least functions, and then bounce it back and forth and get to the solution. Before it would take days; now I can do it in a few hours.

Speaker 1:

Because I don't think this is a real problem for the hyperscalers, like the Klarnas and the properly set-up teams from the beginning, like starting from scratch with Motherbrain with people who really know what they're doing. But in the enterprise this is really problematic, right, because to recruit for and understand the complexity of all these different roles has been a big burden.

Speaker 4:

Totally and.

Speaker 1:

I think now one way of looking at this with AI is: you still need to have your specialist competences, but maybe you can do it a little bit differently here.

Speaker 3:

I think it's easier to get started, and then it's easier to understand what you need in order to actually take care of those things you have generated but can't take care of yourself.

Speaker 1:

But then to be AI-enabled, to be AI-ready, to dare to use these types of tools, is for the enterprise actually part of the solution. Because right now you have the blockage: oh, we don't dare to use this stuff yet.

Speaker 3:

But in general, I also think it's easy to be either too tech-focused or too little tech-focused from the business perspective in larger enterprises. You have a lot of problems that could be automated or improved with technology, but if you separate the two…

Speaker 1:

If you separate the two too much, it's very hard.

Speaker 3:

That's why the product roles have become increasingly important, because they bridge that a little bit, right?

Speaker 1:

And we in Dairdags have been talking a lot about this in different conversations. We talk about blind spots, to highlight the operational AI divide between the people in the know and the people still learning. And one of the key misconceptions, I think, is this idea that we need to have four different competencies. Then you go in and look at how Spotify works, and at the others, and then you have the idea from Anders, which is co-creation: teams and competencies working together on a daily basis, versus this handover between roles in the enterprise.

Speaker 3:

That's the promise of a cross-functional team, right? So cross-functional handover versus cross-functional true co-creation are two different things, in my opinion, where we all work together. I think you still have the problem, in larger enterprises, that you have to anchor the vision and the purpose of those teams in what you actually need and which processes to change, and changing processes, and change in general, is very hard.

Speaker 1:

But could you contrast Klarna versus Motherbrain versus now? Because even Klarna, in the end: you started at 200, I guess you were number 200 or somewhere in there. Yeah, I don't remember exactly. And how many were there in Klarna when you left? Over a thousand. Okay, so contrast when you started in Klarna at 200 to when you left: the complexity of making it work.

Speaker 3:

Yeah, I mean Klarna was also, like Spotify, and still is, a very engineering-focused organization, so that's a good one. It's a very good place to start, because you get to understand all of these different complexities and what you could potentially do with technology, right?

Speaker 4:

Could I just ask a crazy question here? Today, QA.tech is focused, and correct me if I'm wrong, on finding bugs, basically, and making sure that the functionality of the web page is working properly. But let's say you would like to expand QA.tech to say: I see the login is working in this way, but I want to add this functionality to it. I just say: I want to add not only a Google login, I want to have a GitHub login as well, add it. And you tell QA.tech to generate the code to do so. So you could have something test-driven. That could be a potential future, right?

Speaker 3:

I think there is always a risk in trying to do everything at the same time. We've talked a lot about how there should at least be a set of best practices for doing a lot of things right. If you're developing a login, it should function in certain ways; it should handle information in a way that is expected. If you have a capital letter at the beginning, or something the phone automatically adds, it should still be possible to log in. There are a lot of these different best practices that are not always easy to get hold of and know. And then, if we can connect the errors with information and help the users improve on those, the next logical step would be to suggest code to do that as well.

Speaker 4:

You know, normally the test-driven development type of programming is well established: you basically write a test before you write the code. But if you now have AI that, for one, helps you do the testing, what if you could simply have the user suggest the functionality and then move it back to the code, just as test-driven development does, but actually using the AI to produce the code?

Speaker 3:

Yeah, and I think the AI is not really good at evaluating itself, right, in some ways. So if you have something that generates code, you probably need some kind of counterpart that can help it know when it did something good or something bad, so it can adapt. Like if you try to write a function using ChatGPT, and it gives you code, and you try to run it and it doesn't work and you paste in the error, then it corrects itself, right?
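That run-it-and-paste-the-error-back loop can be automated in a few lines. The sketch below is a simplified illustration: the prompts and model name are assumptions, it expects the model to reply with a plain Python snippet, and a real system would sandbox the execution step.

```python
import subprocess
import tempfile

from openai import OpenAI

client = OpenAI()

def generate(messages: list[dict]) -> str:
    """Ask the model for code; assumes the reply is a plain Python snippet."""
    reply = client.chat.completions.create(model="gpt-4-turbo", messages=messages)
    return reply.choices[0].message.content

messages = [{"role": "user", "content": "Write a Python script that prints the 10th Fibonacci number."}]

for attempt in range(3):  # a few correction rounds, as in the anecdote
    code = generate(messages)
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
    result = subprocess.run(["python", f.name], capture_output=True, text=True, timeout=30)
    if result.returncode == 0:
        print("Success:", result.stdout)
        break
    # Paste the error back, exactly as a human would, and let the model correct itself.
    messages += [
        {"role": "assistant", "content": code},
        {"role": "user", "content": f"That failed with:\n{result.stderr}\nPlease fix the code."},
    ]
```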

Speaker 4:

So this leads I'm sorry to jump to a philosophical question now directly.

Speaker 1:

Let's do it.

Speaker 4:

But, for one, you now already have a way to test if it works or not. Let's say you add functionality to add new code to it, as you say. Let's say it doesn't work when you add the code, but then you have the original test functionality that fixes that code for you, and you just put it in a loop. Then you've built something really crazy. Potentially you can have users, or even other agents, creating their own functionality together without you even involved.

Speaker 3:

I think you can do that already today. There are projects like GPT Engineer, for example, with Anton, that try to generate entire projects.

Speaker 4:

But it doesn't have a loop though.

Speaker 1:

No, exactly, you need to… I see your point here, but because we are starting in the testing topic, it's the logical loop. If you start from the loop perspective first, you put the loop thinking in very early on, as a principle. It's an interesting thought: the testing is where the loop starts. Isn't that what you're saying right now?

Speaker 4:

Did I get it right? It's one part of the loop, at least, that can fix bugs. And then you add generation to it, and that has bugs, and you have testing that can fix the bugs. Then it can just put it out.

Speaker 1:

The way I hear this now, and I think this is interesting: so we're going to build a loop, great. Where do we start the loop? How do we get the loop started? And instead of looking at starting the loop at the beginning, maybe the loop starts at the end, at the testing.

Speaker 4:

That is the profound question. It could be using GPT Engineer, say, to do the first part, then use the testing, and then it goes back and uses GPT Engineer to just continue. We have no idea what it would end up with.

Speaker 2:

What else would we end up…

Speaker 1:

…with now?

Speaker 3:

It's super interesting from an AI perspective, I think, to have these reasoning loops. We're utilizing things like that as well, in order for the agents to actually improve and understand things better over time. There's a paper called Reflexion, another one called ExpeL, and a few of those try to borrow more traditional ideas from reinforcement learning but use them with large language models, so that you have one agent essentially evaluating the other one, which then reflects on its output and takes that into consideration the next time it's running, and so forth.
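A stripped-down sketch of that Reflexion-style idea follows: one call acts, a second call evaluates and reflects, and the reflection is carried into the next attempt as verbal memory. The prompts and model name are illustrative, and real implementations keep much richer state.

```python
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4-turbo"  # illustrative model choice

def chat(prompt: str) -> str:
    r = client.chat.completions.create(model=MODEL, messages=[{"role": "user", "content": prompt}])
    return r.choices[0].message.content

reflections: list[str] = []  # verbal "memory" carried across attempts
objective = "Log in to the staging site with the provided test credentials."

for episode in range(3):
    memory = "\n".join(reflections) or "None yet."
    # The actor attempts the objective, conditioned on lessons from earlier episodes.
    attempt = chat(
        f"Objective: {objective}\nLessons from earlier attempts:\n{memory}\n"
        "Describe, step by step, the actions you would take."
    )
    # A second call plays the evaluator and reflects on what went wrong.
    reflection = chat(
        f"You are reviewing a QA agent.\nObjective: {objective}\nAttempt:\n{attempt}\n"
        "Point out mistakes and state one lesson to remember next time, in one sentence."
    )
    reflections.append(reflection)
```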

Speaker 1:

Let me, can I, go down a little rabbit hole here, because this almost continues from a philosophical point of view but becomes actually super concrete. So, digging into agents and building agents, and taking your concrete topic of agents for quality assurance and testing: in my opinion, when you build an agent, one of the most important questions to get right is the agent's utility function, what is the objective? So here we're talking about you actually using LLMs even to frame the objective. Because I think the whole point with an agent is that it either goes really well, or you have unintended consequences, or you're fucking it up, or you still end up with blind spots, if you don't get your objective right first. So could we elaborate on the importance of, and how you deal with, setting the objective for the agent? How have you done that?

Speaker 3:

Yeah. So first of all, I think just doing it without any kind of validation is probably a recipe for disaster.

Speaker 1:

Right, you need to make sure that the objective is aligned. So we need to talk about objectives, but we also then immediately need to talk about validation. That's what you're saying, of course. So these are two things, two sides of the same coin? Yeah, of course.

Speaker 3:

The way you phrase the objective is very important in order for the agent to know what it's supposed to do. I think that's why different prompting techniques, like few-shot and a lot of those, show examples so that the agent understands what it's supposed to do.

Speaker 3:

So elaborate on few-shot for someone who doesn't know. Few-shot is essentially just, in the simplest case, like if you ask ChatGPT to do something: say, for example, that I have some input and I want to classify whether it has positive or negative sentiment. Then you just give it a few examples, like this is a positive sentence, this is a negative one, and then in the context, in the same window, in the prompt, you give an example.

Speaker 1:

Yeah, exactly.

Speaker 3:

And then you give the actual one that you want to classify and say, oh, what about this one? And then it knows, because it has seen examples. It understands especially the format, and those kinds of things are very important.
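In code, the few-shot pattern is nothing more than putting the examples in the same prompt before the real input; the sentences and model name below are illustrative.

```python
from openai import OpenAI

client = OpenAI()

few_shot_prompt = """Classify the sentiment of each sentence as positive or negative.

Sentence: "The checkout was quick and painless."
Sentiment: positive

Sentence: "The page crashed twice before I could pay."
Sentiment: negative

Sentence: "Support answered within a minute and solved my issue."
Sentiment:"""

reply = client.chat.completions.create(
    model="gpt-4-turbo",  # illustrative model choice
    messages=[{"role": "user", "content": few_shot_prompt}],
)
print(reply.choices[0].message.content)  # the examples teach both the task and the output format
```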

Speaker 1:

So basically, what you're saying… talk about your objective function. Are you doing this now, I guess?

Speaker 3:

Yeah, so the objectives, we're trying to make them as easy to generate as possible, of course. So it needs to be a free-text description of what it's supposed to do, and we also give it examples of how it should think about evaluating whether or not it works.

Speaker 1:

Is this what the dev user uses in the user interface? When they set up the test, do they need to be able to set up the prompt?

Speaker 3:

No, we want to do that for them. So you can think of the problem in three different buckets. We have the agent: the agent is just tasked with trying to achieve an objective and then evaluating whether or not it succeeded. That could be, log in with these specific credentials, right, and then it tries to do that, looks at the history of what it has done and the end state, and then it decides whether or not it actually worked.

Speaker 3:

No, I got to this error page instead. Or: yes, I'm now at the user dashboard, so I have logged in. But then the other buckets are about the generation. In order to know what kind of functionality exists on the web page, and also for the planning and reasoning about the page, we are constructing a graph that explains how the different user journeys hang together and what functionality is available. Using that graph and the different things we discover on the page, we then generate these kinds of objectives for the agent to try to achieve.
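Purely as an illustration of the idea (a toy sketch, not QA.tech's implementation), here is how a small graph of discovered functionality could be turned into free-text objectives for an agent:

```python
# A toy graph of user journeys: keys are pages, values are discovered
# pieces of functionality and the page they lead to.
page_graph = {
    "landing":   [("login form", "dashboard"), ("signup form", "onboarding")],
    "dashboard": [("billing settings", "billing"), ("add customer button", "customers")],
}

def generate_objectives(graph: dict) -> list[str]:
    """Turn discovered functionality into free-text objectives for the agent."""
    objectives = []
    for page, edges in graph.items():
        for functionality, target in edges:
            objectives.append(
                f"Starting from the {page} page, use the {functionality} "
                f"to reach the {target} page, and verify that it worked."
            )
    return objectives

for objective in generate_objectives(page_graph):
    print(objective)
```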

Speaker 1:

But bottom line, you need to go to quite some depth in understanding how to set objectives and how to validate objectives, and you need to make that a really strong part of your software. This goes for any software that is agent-oriented, I'd argue. Exactly.

Speaker 3:

So currently the validation flow is essentially that we generate things, but it also needs to be that, even if it's a valid objective, it still has to be easy enough for the agent to actually succeed with it, otherwise we can't ship it to the customer. But can you see the inherent risk now, when we start talking about agents, and we think, you know, maybe 2024 is the real year of the agent, where this conversation will explode?

Speaker 1:

Right? If you don't get this right, if you don't understand how to set up the right objectives and how the validation works, this is going to be tricky.

Speaker 3:

I don't think that changes too much from what ML has been for a long time, right? Before, you might have tried to optimize some specific hyperparameters, and you need to validate that they work. Now your hyperparameters are a string of text.

Speaker 1:

But regardless, and of course you don't want to have flaky tests, right, you want the tests to actually function. To be honest, I'm coming into this conversation from a point of view where I see project after project after project with super lousy project goals and super lousy views of what's supposed to be done, and from these lousy goals someone is trying to distill analytical objective functions, and it's literally bullshit the whole journey.

Speaker 4:

Yeah.

Speaker 1:

So I think the whole piss-poor project view of what it is you're trying to solve just gets accentuated. And I think I'm now speaking to the converted: if you're deep into ML, you know it's not going to work if you're not doing this right. But I think this is a major problem when people coming from a more traditional enterprise setting go into this, that they take this way too lightly. I don't know.

Speaker 3:

I think it is. I think it's easy to fool yourself in two different ways. We've slightly pivoted in the discussion, I think, but essentially, it's very easy today to test things up front. You could ask ChatGPT if it can answer something, and it would give you a plausible answer, and then you think, okay, all right, we could do a production deployment of this. But it's pretty far from that, because it's still just a proof of concept, and in order for the agents, or just the LLM at least, to take good decisions and answer in a good way, you need to phrase it in a very specific way to make it understand. It's all about the context going in, right? So you have a very valid point.

Speaker 1:

Do you understand? Do you think this is a problem?

Speaker 4:

Sorry, I got distracted by the phone so I actually missed out on the discussion a bit. So let's go.

Speaker 1:

Let's move on. So, yes, I have one nerdy question, I don't know, I'm just curious about the tech stack, and then I think we can move into other topics that we have on the list. But could we get a sneak peek: what's the large language model that you're using, and what's the tech stack you're using?

Speaker 3:

Several, and we want to be able to evaluate them easily, because they work differently in different settings, so you need to experiment. But we primarily use OpenAI models, and then we've used the Google ones for some things, and we've also experimented with open-source versions, like CogVLM, for example, which is like a…

Speaker 1:

So you haven't sort of settled on one; you're still validating them for different parts of this.

Speaker 3:

Yeah, I think, like most people, we're primarily using GPT-4.

Speaker 1:

Yeah, okay, so let's zoom out a little bit. How are you thinking about open source versus not open source, and why?

Speaker 4:

But let's keep on this tech stack a bit more.

Speaker 1:

I mean, the model is one thing, but there's so much more in the tech stack.

Speaker 3:

Thank you, we can go into the open source part soon as well. I'm a huge open-source proponent. But apart from that, we're using different kinds of tech to solve different kinds of problems. We're very TypeScript-heavy actually, because it's easier to build and integrate these models into a web application flow with that. We're using GraphQL to communicate between different services, and we have Python as well, of course.

Speaker 1:

But you have quite a bit of GraphQL, more than Python.

Speaker 3:

No, in terms of lines of code it's definitely Python. GraphQL is just, you know, this communication layer. And then we use Neo4j as well, to represent this graph of the web pages.
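A minimal sketch, with invented labels and credentials, of how pages and the links between them might be stored as a graph in Neo4j so an agent can reason about navigation paths; this is illustrative, not their actual schema.

```python
from neo4j import GraphDatabase

# Hypothetical connection details for a local Neo4j instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def record_link(tx, from_url: str, to_url: str, label: str):
    # MERGE keeps the graph idempotent as the same pages are crawled repeatedly.
    tx.run(
        "MERGE (a:Page {url: $from_url}) "
        "MERGE (b:Page {url: $to_url}) "
        "MERGE (a)-[:LINKS_TO {label: $label}]->(b)",
        from_url=from_url, to_url=to_url, label=label,
    )

with driver.session() as session:
    session.execute_write(record_link, "https://example.com/", "https://example.com/login", "Log in")
driver.close()
```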

Speaker 1:

So you have graph database in the bottom here as well.

Speaker 4:

Yeah, we do use Postgres as well, through Supabase, which is like Postgres on steroids with real-time and stuff. And then you run it on some cloud service like Google? Yeah, we use Google and Azure.

Speaker 3:

We use both.

Speaker 1:

But why both? Why?

Speaker 3:

Because you get credits as a startup and different ones.

Speaker 4:

Do you use more Lambda kinds of services?

Speaker 3:

We try to be as managed as possible when it makes sense. So we use Cloud Run for a lot of things, for example, and we're talking serverless in different ways. Cloud Run is like a managed Kubernetes that essentially runs your containers for you.

Speaker 1:

So you want to run at the highest feasible abstraction level, or how would you phrase this for someone like me who doesn't understand?

Speaker 3:

Yeah, pretty much right. You don't want to spend time on things that are already solved.

Speaker 2:

So that's kind of the point.

Speaker 3:

So if you want to build things quickly, then embracing the cloud solutions makes it much easier to move fast.

Speaker 4:

Yeah, I mean that's one really powerful, of course, way of working when you can make use of the all the tech stack that all the cloud providers do provide.

Speaker 1:

I'm sitting here jealous of this, I know.

Speaker 3:

It makes it easier to move fast, right like. It depends on your constraints, of course.

Speaker 1:

It's time for.

Speaker 2:

AI News brought to you by AI AW Podcast.

Speaker 1:

I don't think we really prepared Vilhelm for this new segment.

Speaker 3:

This was not part of it last time. No, no, no, this is fun.

Speaker 1:

So we started a new segment that we call AI News, where we all try to pick apart what's happening in AI news, because it's been so ridiculous in 2023, week by week. So, week by week: what did we find that we want to share? We didn't prepare Vilhelm this time, so you don't have to have a news item, but if you have something you want to talk about, feel free. But today I think you should start, Anders.

Speaker 4:

Yeah, I actually have a news topic that is very related to the podcast we're having. Google just released a new Gemini-powered service. Gemini is, of course, their latest DeepMind model, which is actually getting very close to GPT-4 performance now.

Speaker 1:

And which one? Is it the Gemini?

Speaker 4:

They don't say I guess it's pro, but yeah, the middle version.

Speaker 1:

The middle version is out in Europe now as well, or is that the one that's in Bard?

Speaker 4:

You mean in Bard? But this is not in Bard. This is another service that they actually use for bug fixing. They have previously released software called OSS-Fuzz, which is basically an open source framework for doing fuzz testing. Fuzzing is basically, if I understand it correctly, trying a huge number of strange inputs against whatever function you have and seeing if you can find some kind of memory leak or bug that can be exploited as a security vulnerability.
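A toy illustration of the fuzzing idea in plain Python; real fuzzers such as OSS-Fuzz are coverage-guided and hunt memory-safety bugs in native code, so this only shows the concept of hammering a function with odd inputs and collecting the ones that crash it. The target function is invented.

```python
import random

def parse_quantity(text: str) -> int:
    # Example target with a deliberate weakness: it trusts its input format.
    value, unit = text.split(" ", 1)
    return int(value)

def fuzz(target, rounds: int = 10_000):
    crashes = []
    for _ in range(rounds):
        length = random.randint(0, 20)
        data = "".join(chr(random.randint(0, 255)) for _ in range(length))
        try:
            target(data)
        except Exception as exc:  # a real fuzzer would triage by crash type and stack
            crashes.append((data, repr(exc)))
    return crashes

print(f"found {len(fuzz(parse_quantity))} crashing inputs")
```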

Speaker 3:

It's very popular in the kind of functional programming.

Speaker 4:

So doing this at scale is really hard, and they basically provide a service for open source projects, enabling projects that don't have the resources to do this otherwise to have it done for them, for free as well. Now, with this kind of fuzz testing you still have to have some kind of entry point, a fuzz target that you put in the code, and that limits the coverage that the fuzz testing can have in the software. As I understand it, what they've done now is use Gemini as a code generator to add more fuzz targets and increase coverage. With that they can do fuzz testing in a much broader way and find new potential security vulnerabilities. And when they tried that on the OSS-Fuzz projects (I think they have around a thousand open source projects included already), using the Gemini approach, which of course is much better than the previous LLMs they have used, they found a huge number of additional bugs.

Speaker 4:

But they also found two really big security vulnerabilities in some really popular libraries. Let's see if I can find the names: it was the cJSON and libplist libraries, two really common libraries out there. And of course they did it the proper way and reported the vulnerabilities to the maintainers before they released this information. But having the ability to use LLMs, similar to what you're doing, to find bugs with this kind of fuzz testing in a much more general way, without humans having to write the fuzz targets, is really cool.

Speaker 4:

In addition to that, once they found potential bugs or security vulnerabilities, they of course also tried to fix them, so they used Gemini to try to fix the bugs as well. How did it go? Apparently they fixed 15 percent of the bugs they found, and I think they found thousands of bugs plus the two really severe security vulnerabilities. Fifteen percent could be fixed automatically, and for the rest I guess they at least provided some hints, with code that could be useful for the engineers fixing the bugs.

Speaker 1:

This is a big deal.

Speaker 3:

That's really cool.

Speaker 1:

It is cool. But if you think about it, this whole race around AI and security, and using AI for security, is sort of a virtuous circle here. Very interesting.

Speaker 4:

Well, this sounds awesome, right, it sounds like this is something very positive for our world. But you can look at the flip side on this and say this is a perfect tool for cyber attacks.

Speaker 1:

Why is that?

Speaker 4:

Now you have a way to use Gemini to find bugs and security vulnerabilities in projects and you can do that at scale without having the skills to do that manually.

Speaker 1:

So could you in theory use this on someone else's open source projects?

Speaker 4:

On an open source project, or whatever code you have available. If you get the source code of whatever product, you can put it into this tool, have Gemini add coverage to the code as much as you want, find the bugs and then abuse them. But that's always been the case, though.

Speaker 3:

Right, you always have this kind of white hat, black hat dynamic, where you have people that can exploit things, but at the same time it makes it easier to stop the exploits.

Speaker 1:

But does that mean we should not go for this, or do you think we need to go for it and just be aware of the flip side?

Speaker 4:

I don't know, and the question is here, that they do open source this. That makes it additionally questionable, potentially.

Speaker 1:

I know you're a proponent of open sourcing, and this is super positive, but there's a huge flip side that you're trying to highlight.

Speaker 4:

I mean, if they were, released in a controlled way so people couldn't abuse it, then it would be no problem.

Speaker 3:

But if they are releasing it as a free service that anyone can use, without any gatekeeping... But on the other side you could say that then everybody can use it to mitigate their vulnerabilities, right. Yes, it becomes an arms race.

Speaker 1:

This is an arms race.

Speaker 4:

If you have a black hat person, not the white hat (the black hat being someone deliberately looking for exploits), how do you prevent them from being empowered by AI, as they are here?

Speaker 1:

You can't anyway, right, I agree with that.

Speaker 4:

But if they didn't open source all the awesome research that they've done here, potentially it would be delayed a bit.

Speaker 3:

I don't know the answer here. It would be, but then at the same time, like this small startup that doesn't have the resources either, they can't use it to kind of fix the bugs.

Speaker 1:

Oh, it's a super hard dilemma.

Speaker 4:

I don't have the answer here it's a trolley dilemma.

Speaker 1:

It's just kind of speeding up more and more. I guess it's just speeding up everything.

Speaker 3:

But it's the same kind of argument like LeCun is making for this kind of misinformation and kind of being able to filter misinformation and stuff.

Speaker 1:

Awesome news. Do you have another one?

Speaker 4:

I know it's fine.

Speaker 1:

Do you want to go next, Goran, or I can go next.

Speaker 2:

I can do a filler, just do a filler.

Speaker 1:

Because you always do a couple of short ones. So do a short one in between, and then you can do a filler, so this is a filler.

Speaker 2:

So, just basically to point out, on Gemini: this is from three hours ago. Google has now permanently terminated Bard, at least as a term, and put all of its features under the banner of Gemini. So now if you go to the Bard user interface, it says Gemini.

Speaker 1:

They're cleaning up their brand portfolio.

Speaker 2:

You can see that they are ramping up for a huge fight here. So in the last couple of weeks they have been just like promoting new things almost daily. So I think they are gearing up for something.

Speaker 4:

I guess we can see a Watson moment coming here, you know, after IBM.

Speaker 1:

You mean Gemini is the new Watson, in terms of naming everything Gemini, the way they used the Watson brand for everything?

Speaker 4:

You know IBM in 2011.

Speaker 1:

They're still doing it. They're still doing it and it's like I see this IBM come on. That brand is not good anymore.

Speaker 2:

True, but continue now with the news. Okay, I have another one, and then I will come back with the last one I haven't used, which is different.

Speaker 1:

So, headline: Norway to establish four to six new research centres for artificial intelligence and national coordination. This is an article you can find on Medium, by Alex Moltzau, that we want to have here on the board.

Speaker 1:

We're going to have him at some of the conferences, but he's been doing an excellent job of objectively reporting on the AI strategy in Norway. He's been following the whole process very, very diligently, taking excerpts of what was discussed in the Riksdagen, or, you know, I don't know the Norwegian word, and basically extracting the core topics. And the bottom line now is that a couple of months back, in the Norwegian budget, they highlighted that they're going to earmark one billion Norwegian kroner for AI. At that point it was very fluffy, right: what do we mean with one billion? Is it all hidden in the different departments' budgets? What is this billion all about? And here now he's actually quoting the actual board decisions, so if you find the article, you can see what was written in the meeting minutes.

Speaker 1:

And basically, it's quite clearly stated that they will focus on four to six research centres. What that means exactly is a story still to unfold, but I just want to highlight that Norway is going for AI. Norway is doing a good job, and Alex is doing a fantastic job commenting on it and making it transparent.

Speaker 2:

But we're also doing a good job. There was a new report this week that came out that we are missing 18,000 people per year.

Speaker 1:

Yeah, this is not a new, so I was thinking about that one and that one.

Speaker 2:

Should we continue that way? No, no, no. But we are pointing out on a problem.

Speaker 1:

We are pointing out problems, and here in Norway they're putting money where their mouth is. And not only money where their mouth is, but also saying what the money is spent on. You need to get concrete and stepwise, and I think they're doing an excellent job over there.

Speaker 2:

We will catch up.

Speaker 1:

We'll catch up. Okay, that was my news. Any comments on that and reflections? Congrats.

Speaker 3:

Congrats. Yeah, it's good that they're spending money on AI.

Speaker 1:

Yeah, but the point is that they are trying to be concrete about it, and they are, and someone is then reporting on it in a very professional way.

Speaker 2:

Concrete is the wrong word. Actionable, actionable is a better word.

Speaker 1:

Okay, that's it. Do you have any? Do you have any?

Speaker 2:

top of your head.

Speaker 1:

Because we know we sort of stumped you on this one, yeah.

Speaker 3:

I read this paper about meta-prompting recently, which was released... probably not this week, though. It's an interesting concept. When you're prompting a model, it's generally hard for the model to do everything at the same time, because usually you have a lot of different asks and a lot of different instructions in the same prompt. So essentially the idea is that you can have a framework to break that up and make the models more performant.

Speaker 3:

So it's like prompting techniques and stuff? Yeah, exactly, it's a similar kind of idea. It's a way to break up the prompt into several parts by meaning. Say, for example, I'm asking GPT to answer some specific questions and extract information in different ways. Then it can isolate the different parts, execute those, and then combine the results in a nice way.
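A rough sketch of that decompose-execute-combine idea, not the exact method from the paper; the model name, prompts, and line parsing are placeholders.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def meta_prompt(task: str) -> str:
    # 1. A "conductor" call breaks the task into smaller, single-purpose asks.
    plan = ask(f"Break this task into 2-4 independent sub-tasks, one per line:\n{task}")
    sub_tasks = [line.strip("-• ").strip() for line in plan.splitlines() if line.strip()]
    # 2. Each sub-task is executed in isolation, so the model handles one ask at a time.
    partial_answers = [ask(sub) for sub in sub_tasks]
    # 3. A final call combines the partial results into a single answer.
    combined = "\n".join(f"- {sub}: {ans}" for sub, ans in zip(sub_tasks, partial_answers))
    return ask(f"Combine these partial results into one coherent answer to '{task}':\n{combined}")

print(meta_prompt("Summarise this release note and list any breaking API changes."))
```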

Speaker 1:

And that increased the performance of GPT-4 on a lot of different kinds of tasks? Some sort of orchestration approach with the meta prompt and smaller prompts.

Speaker 3:

Yeah.

Speaker 1:

And is that just a way to prompt or is it something when you say meta prompt?

Speaker 3:

is it just a way to do it? They called it meta prompting in this paper. Yeah, I guess it's a way to prompt. Like everything is a way to prompt.

Speaker 1:

Right Interesting. Have you heard about that before?

Speaker 4:

Yeah, I heard about it. I haven't heard that.

Speaker 3:

I don't think it's like this week, it's like that specific paper was, like a few weeks ago maybe. Yeah.

Speaker 2:

But we have not covered it, so it's fine. Exactly, it's good. All right, so I will finish it off. This week has been very boring for some reason, and you know it is boring when the biggest news from last Thursday to this one is that the European Parliament agreed on the AI Act. So now this is going to the internal market and civil liberties committees of the European Parliament on the 13th of February.

Speaker 2:

So when that's the top news, it's a boring week, yes, exactly. That will be followed by a plenary vote, provisionally scheduled for the 10th and 11th of April, and if it is agreed, the AI Act will enter into force 20 days after publication, so exactly when the Data Innovation Summit is taking place. So I think this is going to overrun all the discussions about AI and generative AI; everybody will be talking about this. And once it's in place, the prohibited practices will start applying after six months, the obligations for AI models after one year, and so on. So, as I said many times, if you're investing in consulting companies, good for you, man. You're going to make so much money.

Speaker 4:

So this is the year for legal compliance.

Speaker 1:

Can we see this?

Speaker 2:

There are so many experts on LinkedIn these days, it's a beautiful thing. And I have a quiz. Well, a little bit, it's not a quiz actually. I was reading this article and thought it was funny. It's not funny, actually, it's very insightful. It says: if you have to choose, who will be winning the generative AI patent battle? Is it OpenAI, is it Microsoft, or Google? Who do you think? So we're talking about patents regarding generative AI. I'll say Google.

Speaker 3:

Right, Vilhelm? Oh my God. Feels like Microsoft.

Speaker 2:

All right.

Speaker 4:

Well, OpenAI and Microsoft are basically the same. But I have a really hard time figuring out, or even understanding, that they are going into battle on this, because both of them are not really using patents for offensive purposes.

Speaker 3:

So I don't think so. They rather have this kind of opponent, which would be Facebook in this, right? They're the ones open sourcing everything.

Speaker 1:

So what are we talking about? A patent battle in relation to patenting the techniques? What is being patented?

Speaker 4:

I mean, Google at least is openly saying they don't patent to be offensive, or to take the offense in winning or conducting patent trials. They do it just for defensive purposes, if they get attacked, and to my knowledge Microsoft is doing the same.

Speaker 2:

So that's why I'm a bit surprised. So, the answer, the right answer, is IBM.

Speaker 4:

Yeah, that's true, that makes sense. It's correct.

Speaker 2:

Actually, OpenAI has less than five patents applied for, and I think that IBM has over... let's see where it was.

Speaker 3:

They've also been going hard on patents for many, many years, right.

Speaker 1:

IBM and Ericsson have been churning out patents for 50 years.

Speaker 2:

But let me find the topic.

Speaker 1:

So it's interesting because it sort of reflects a little bit about you did where they are coming from. Yes, so.

Speaker 2:

IFI CLAIMS Patent Services found that IBM holds 1,591 patent applications related to generative AI, a third more than Google and more than double the number filed by the others.

Speaker 1:

And is that reflecting IBM being first, or just them being big on patenting?

Speaker 4:

They're not first on anything when it comes to generative AI.

Speaker 1:

No, but their patenting culture.

Speaker 2:

That's what it reflects. If you go a little further, this is the most important part: basically, what they are patenting on. As you can see, the biggest category is "computing arrangements based on biological models". This kind of research is usually used to show the trends that are hot and upcoming, that can potentially be a gain, winnings in the future.

Speaker 3:

So what artificial neural networks can go into the big bubble then?

Speaker 4:

Yeah, so perhaps neuromorphic computing.

Speaker 2:

Yes, of time thinking. Yeah, probably.

Speaker 1:

Neuromorphic is something IBM is kind of... are they behind that, or are they more into quantum?

Speaker 4:

They are in quantum. Yes, unfortunately for them.

Speaker 1:

But if you talk to IBM, they would rather talk about quantum than neuromorphic. It's the other players that want to talk about that. Exactly.

Speaker 2:

In any case, let's just list the top five. The first is IBM, second is Google, third is Microsoft, fourth is Samsung and then Intel.

Speaker 1:

How do together with Adobe and etc.

Speaker 2:

And, as we said, OpenAI is not even in the top 35 or something. They are actually truly open, sitting there with less than five.

Speaker 4:

So it's the first time they are yes, they're open for the first time.

Speaker 2:

And with that we conclude this boring news week.

Speaker 4:

Okay, should we move back to some questions? Perhaps shift the topic a bit and ask a bit more about the general journey of a startup, for people that want to take the same journey as you have just taken.

Speaker 4:

there are a number of steps I guess one should take. You have a bigger advantage because you've been working in the investor space and have a lot of connections that the people that could help you on that journey. But if you were to just say for someone that is, you know they have the awesome idea of using generative AI for whatever, what should they do?

Speaker 3:

Well, first of all, I think it's good to just talk to potential future customers, to get a little bit of an idea of the market and understand if there is a problem you're actually solving, or if it's just something that you think is super cool. I think that's a good place to start. Then, of course, having some kind of proof of concept in place is good, because usually you need to demo something and not just talk about it. Before you have funding, I guess? Yeah, I think so; that would be good, at least something, right.

Speaker 3:

And then, of course, like, pull on your contacts and try like you probably know somebody who knows somebody who's who's at VC, and like if you, if you want to take VC money right.

Speaker 4:

And then continue, because the alternatives here are: either you fund it yourself and go really lean in the beginning until you have something and build your own revenue, or you take VC money, or you take a loan in some way. Can you elaborate a bit on the pros and cons?

Speaker 3:

It also very much depends on the problem, right? If you have some kind of idea in super deep tech, and it's going to require you to hire a big team and do a lot of research before you would even have one paying customer, then of course that requires you to take in a lot of money.

Speaker 4:

The mistral approach, taking hundreds of millions of dollars Exactly. I highly recommend that.

Speaker 3:

And then otherwise, I mean, if you, if you have, have an idea that is pretty, pretty lean and you think you can kind of start getting at least some idea of a customer and kind of build on this idea and then, and then, once that is in place, start scaling, then maybe that's, that's a better approach, like at least the best approach if you can take it.

Speaker 4:

If you can take it, if you can like, I think.

Speaker 1:

Jan Bosch said it really, really well about when and how to think about VC, and he summarized it in one sentence for a lot of startups (I think deep tech is an exception): nail it before you scale it. So I think it's a good thing.

Speaker 1:

You know, I think it's a super good saying because what are you going to use the money for? You know, are you even close to a product market fit? You should be able to do a lot of stuff to think about. What is it that you want to scale up with money?

Speaker 3:

Yeah, and also what's what's your, what's your bottleneck.

Speaker 1:

Like you're saying: what are you going to use the money for?

Speaker 3:

Don't just take in money because you think it's good. You need to have a pretty clear plan for what you need it for and why you need it.

Speaker 4:

At least for like, infrastructure needs or computing or storage needs. You can get so much free credits anyway.

Speaker 3:

Yeah, both Google and Azure and like everybody's offering you, like free credits as a especially AI startup these days, because they want to attract that.

Speaker 1:

They want to get you in there first. But take your own example now, because the way you went, or had to go, was more or less pre-seed, fairly early funding, even to get off the ground, right? How was that?

Speaker 3:

I think that was at the stage when we wanted to start kind of building a little bit of a team right.

Speaker 1:

Yeah so.

Speaker 3:

I think if you don't have any revenue, then employing people is expensive, right, so you need to get money in. Someone, like an angel investor, needs to put the money up for a team, and there are a lot of seed and pre-seed investors.

Speaker 1:

So this is a pre-seed investment in order to have the fundamental core team. And how big is your team now? Nine, right? Exactly. Because you can't really... you were three or four founders, but even to get going, you need a basic team, right?

Speaker 3:

The more people you employ, the more costly it is. Especially now that you have credits for all the computing stuff, people are the primary cost for a software-based startup, right.

Speaker 4:

And did you take this money mainly to speed up towards production? Yeah, we want to grow and take market share, and we feel a little bit like there is a first-mover advantage as well.

Speaker 3:

So we want to... So is it for some marketing purposes, or is it really for building the product?

Speaker 4:

Yeah, it's totally for building the product. Otherwise we wouldn't be able to have the team that we have now.

Speaker 1:

So your pre-seed money went to building the core engineering team, and then in later stages you'll have the huge money investment to ramp things up in marketing and all that. But right now, we do have a head of marketing that just started. But from a core problem point of view, it's about getting the first beta into real customers, and therefore having the real feature set that makes a real product. That's the number one goal right now. Yeah, exactly. Cool.

Speaker 3:

I mean, we need to get to the stage where we have paying customers; that is obviously the main goal.

Speaker 4:

And just go through the terms here, for people are not familiar with it. So pre seed money is one thing, then you get some seed money and then you get some kind of series.

Speaker 3:

A and whatnot series A, b, c and so forth, like there's give us a breakdown the way you think about it.

Speaker 3:

So I would say pre-seed is before you have pretty much anything, right? Maybe a proof of concept. Seed money is usually to start building out the team and the product and getting to the point where you have more of a paying customer base. Series A would be more when you have a business that is functioning, you have a lot of revenue already, and you want to scale it. The later rounds are more about scaling.

Speaker 1:

And then, if you flip a little bit and look at the loan side, I think there are different types of loans. There are some types of loans you can get very, very early, going to the bank, almost the equivalent of the pre-seed stage. And then there are other types of loans.

Speaker 3:

It's easier to get loans if you have revenue right.

Speaker 1:

Yes, and that's what I was alluding to. Then you have other types of loans where you need to have a critical mass: you need to have customers, you need to have some metrics, and then you can get a loan based on those metrics.

Speaker 4:

Yes, and perhaps you should mention your previous colleague as well, who has been on the podcast: Henrik Landgren, who has his new investment, or financing is perhaps a better term, company called Ark Capital. Not anymore.

Speaker 1:

They changed the name, very important. What's the new name? Gilion. Gilion.

Speaker 4:

Billion, gillion.

Speaker 1:

Billion, gillion, yes, gillion, right Our capital, no more Gillion.

Speaker 3:

Thank you. It's also my old colleague, colleague Eileen Becklin, as well.

Speaker 1:

So you worked with Eileen, and now it's Henrik and Eileen. Who else is in Gilion?

Speaker 3:

They're a pretty big team now.

Speaker 1:

There's a big team now. It's 80 people. I think, yeah, they scale fast.

Speaker 4:

So would you consider doing the financing approach instead of the VC approach?

Speaker 3:

I can speculate on that. I think I mean it kind of depends on the different strategies. I think at this stage we haven't really kind of focused on building revenue early, so we're kind of still building a lot of product in order to start to scale it right.

Speaker 1:

So your model, where it is right now, simply falls outside their lending model, because you need to have... I think so; you need to have predictable revenue, actually.

Speaker 4:

But you plan to take some seed or series A in the future. Yeah, yeah, totally.

Speaker 1:

Cool. And okay, what are your ideas on how to think about finding a good VC partnership? Right now you had friends and you managed to find a really nice partner, but how would you advise someone who wants to go down the VC route to think about finding the right VC partner?

Speaker 3:

I mean this is it's hard right.

Speaker 1:

I think, it's.

Speaker 3:

it boils down to finding somebody that believes in your kind of vision, right. So make it as easy and like little complications as possible, like no preference structures or anything in the beginning. Just kind of make it a collaboration that you kind of build together, right? So if you have a VC that believes in you and wants to help you but that are kind of very open minded, I think that is a very good kind of starting point.

Speaker 1:

Let me sharpen it. I mean, like there is another term in here, another type of VC. We talk about an angel investor.

Speaker 3:

So how would you define an angel investor? An angel investor is more like a person. It's a person who's putting up the capital, versus a fund. You could be an angel investor if you put your money into a startup.

Speaker 1:

And how would you differentiate the benefits, or how should one think about angel investors versus VC money from a fund?

Speaker 3:

Usually, angel investors are not able to write as large tickets as a fund, right, because they have to invest their own money.

Speaker 1:

Then maybe you go out and find 10 angels? Sure, that's one way.

Speaker 3:

That could definitely be a good way. I think what angels open up is their network right. They usually have interesting expertise. They've kind of maybe run similar startups before and then they also know a lot of people, especially in the VC kind of landscape.

Speaker 1:

So it sort of opens up one of the key tips and tricks around choosing your angel or VC. It's not only about the partnership per se, which is, I think, you mentioned and tick that box as the number one box, but it can also has something to do with like, what is their ecosystem all about? You know, and how does my product and my idea fit into their ecosystem? So I give the example maybe there's a strong e-commerce ecosystem or there's a strong B2B manufacturing ecosystem or green tech ecosystem.

Speaker 3:

Yeah, I mean if you, if you like, yeah, that matters a lot. Right, because I think that's part the same thing as kind of finding an investor that fits your kind of vision. Right, because if you're building something, if say, if you're building an e-com solution and you have some kind of you know, healthcare investor, they're not going to give you good advice. It's kind of.

Speaker 4:

Well, maybe they do, but the network to tap into is, I think, one of the key benefits. Okay, since you have a unique background in actually evaluating startups: if you were to make your startup now as attractive as possible for investors...

Speaker 3:

I hope we're scoring very high on the investor scoring, you know, the Motherbrain algorithm. So the algorithm is now: if QA.tech, then give 100 million.

Speaker 1:

No, I mean insights, you know what are the ways.

Speaker 4:

I mean either in terms of recruiting the team, the product, the market fit or whatever. What would you say are the key ingredients for making it attractive?

Speaker 3:

Yeah, I think it's slightly different depending on the stage, but in general, showing that you have product-market fit is super important. Having maybe quotes or reviews or something about your early product will definitely help. And then the team is super key. When we evaluated startups, the team was always very central: have they done similar journeys before, who do they know, anything that could potentially help them succeed in their mission. I think that's more important than the actual business idea; you usually have to pivot a little bit and all these things. But then, of course, the more mature you get, the more things like go-to-market strategy come in.

Speaker 1:

But if you were to rank this, would you say that the team shows up highest in the algorithm, so to speak, or very high?

Speaker 3:

I don't think I can speak about the activities.

Speaker 1:

I don't know, I was using that as a metaphor, but you don't.

Speaker 3:

Yeah, you know I think the team is super important yeah.

Speaker 4:

How do you make it seem attractive? What's what are you looking for?

Speaker 3:

I think I think you should focus on making it good, like that will make it attractive, it's complementary skills and the network they have.

Speaker 1:

You know what? What is the criteria for a high score on teams?

Speaker 3:

I mean, one thing that I think is very important is that you have some kind of similar experience, because if you're a set of people that have very little experience from the same kind of domain...

Speaker 3:

You know that could also be about experience from from similar sized companies and similar kind of growth trajectories, right? So if you, if you have somebody on the team that has done this before and successfully, right. Or even if they didn't succeed, it's still kind of you know a merit thing, because they know so much of those kind of traps that you don't have to fall into again, right?

Speaker 1:

So domain, but also problem understanding and problem solving, and the trajectory of your particular... and don't forget the tech, of course. We always do, on purpose, and everybody forgets the sales and marketing. Yeah, this is also Jan Bosch's famous point, that tech companies tend to underestimate, it's easy to underestimate, that facet of the pie, sort of thinking. What do you think about that comment?

Speaker 3:

The facet of the pie. I didn't understand it.

Speaker 1:

The facet of the pie: what are the ingredients to succeed as a startup, as a tech startup? "Oh, our product is so beautiful, so we don't need sales."

Speaker 3:

I think you need everything. It's easy to be arrogant in your specific domain. If you're a developer, you think that you only need developers to actually build something. But you also need sales to sell it, you need marketing to make...

Speaker 1:

So the whole meta go-to-market. The meta product is not only core technology; it's the whole thing that needs to fly.

Speaker 3:

Yeah, totally right. Yeah, it's also why it's successful with cross functional teams. That we talked about before. It's because you bring, like the diversity of different minds together and solve it together.

Speaker 4:

So I guess then you know the team is important, but that means also that recruitment, I guess, is important for startups going long term. Do you have any thoughts about how to make your startup as attractive as possible to be able to recruit and also keep and retain talents in the startup?

Speaker 3:

I think in general, in regards to kind of attracting and retaining talent, it's about like having an interesting opportunity for them to work on, and treating your team with respect and creating a kind of healthy environment where ideas are welcome and and you kind of have fun together. I think if you do those things, you're gonna retain your talent.

Speaker 1:

What do you need? Purpose, mastery, autonomy, fun. Exactly.

Speaker 3:

That's good. That's a good list. Wouldn't you sign up for that?

Speaker 1:

I would sign up for that.

Speaker 3:

I think. But then of course it kind of Depends on your business model. I think we are a little lucky in the sense that we're building agents in 2024, I think it's like it's a very interesting area.

Speaker 4:

Yeah, awesome. Time is flying away, and perhaps we should move a bit closer, not to the end, but to some more philosophical questions. Well, maybe not philosophical, but...

Speaker 1:

But, but, zooming out one step further, I mean, maybe that is philosophical. Which one were you thinking about?

Speaker 4:

I have a couple I was thinking like the future of software development.

Speaker 1:

Yes, I was thinking the same. It's philosophical because you're speculating, but I think it's super concrete in another way. Okay, let's go there.

Speaker 4:

Okay, let's see if we can phrase it as a question. Of course, having AI-driven QA is one step into the future of software development. And we have chatbots, or sorry, ChatGPTs and Bards, or sorry, Geminis.

Speaker 4:

As they should be called nowadays, not Bard. They are being extremely useful and basically 10x-ing every developer in producing code and building software. But I guess there will come a point where the AI becomes the majority, the key component, and the human becomes more of a reviewer, to the point where potentially the process of software development is completely automated, as we spoke about before. What do you think? How do you see the future of software development?

Speaker 3:

How will it unfold? I think it's very interesting, because the more stuff you offload onto AI, the less you understand as well, right? It's easy to forget about best practices and code understanding and all of these things.

Speaker 3:

So I think we're definitely heading in a direction where we become more and more advanced and automated, but I think people like to understand and challenge themselves. So I don't see us only lying on the beach.

Speaker 1:

I think the AI also wants to lie on the beach at some point. But I think you're hitting a nerve here, because some people have an "I'm a coder" identity, and that rests on mastery, purpose, autonomy: "I'm an engineer." Even with AI coming out of my ears and everything able to be automated, I bet you that people will still want to identify themselves as engineers. So what does that mean then?

Speaker 3:

because I don't think the profession is going away though.

Speaker 4:

No, that's what I mean, right? So, basically, but will the programming language of the future be English? I don't think so.

Speaker 3:

No, not only I think. I think it will always be A way to communicate, like, even if you have, like a team of agents that helps you code and build things together, I don't think you would sit there like only like chatting to your Engineering team on slack and hoping that they will build it for you.

Speaker 4:

I think you will like.

Speaker 3:

If you're interested in coding, I think you will be part of that.

Speaker 1:

So that opens up the next topic, then, of transparency and being able to interact with the code at all, when it's all super crazy AI-generated code that nobody understands.

Speaker 3:

Yeah, that will be a problem, but okay.

Speaker 4:

I mean, if you take the the qa tech kind of approach and and focus on on the interface and what the user sees, and if you were to add a prompt later in your service saying please Move this button in this way, add this kind of additional functionality to it, could the engineering practice move in that direction?

Speaker 3:

I think so too. And when you think about it, it's not moving buttons that is the most interesting kind of problem for an engineer anyway. Usually engineers like to tackle problems and think about how to solve different kinds of things; it's not only writing the exact JavaScript or a component, right? So I think that part will still be there. How you actually write code and how you do different things will definitely change, I think. But being part of the problem solving and, you know, the tactical puzzle... But let's test this a little bit.

Speaker 1:

I listened to, I can't remember which podcast it was, maybe Lex Fridman with Elon Musk, or whoever it was. I think it was Elon Musk. I want to test his view on this: the more AI we have, the more capability we have, the limits of what type of super complex problems we can try to tackle shift. No, it doesn't?

Speaker 3:

It shifts in some ways, but, like, people still love playing chess. It kind of depends on what you like to do, right. Okay, so...

Speaker 1:

I retract my question and frame it differently. As engineers, we can start tackling more grand challenges. Take the example of Jeff Bezos and Elon Musk's grand challenge: we're going to put people on Mars.

Speaker 1:

It's extremely complex and so multi-dimensional as an engineering quest, right? So in order to even scratch the surface of the real problem of going to Mars, we will need a shitload of AI to take us out of the mundane problems and let us think about engineering problems, but now at a much higher complexity level. What do you think about that framing?

Speaker 3:

It's the same kind of analogy as what we talked about before with running an engineering team: it's a different thing than writing code, right? And I think being part of the puzzle, not having to do all of those repetitive tasks, being faster and being able to push things and build new technology, that's definitely going to make us move faster, because we're going to get away from the mundane problems to the complex, interesting problems as engineers.

Speaker 1:

Yeah, it's one way of looking at it.

Speaker 3:

Yeah, and maybe debugging your code because it behaves in a slightly weirder way than you thought it would, having race conditions and all these kinds of strange things, that is maybe not the engineer's favorite task anyway, right? But then, what happens after that, when AI is more part of setting the goals and thinking about the more complicated things? Then we're talking more about superintelligence and those things.

Speaker 1:

Okay, if you. If you, because now we took it to a lot of AI. But then the next level of software, when we have super intelligence or, you know, when we are not even Okay, that's a different story, yet I guess.

Speaker 3:

Then it's harder to kind of hypothesize about what it's supposed to be right.

Speaker 4:

Yeah, it's just super intelligence, then you know.

Speaker 3:

They can go to Mars. That's a big, big, big yeah.

Speaker 4:

And I guess when one way to see it is basically that today Software development is limited to the few people that actually gone through the education of doing that and spend a lot of time Learning the the craft, so to speak, and perhaps with AI so many people can actually start to build software and that could be a really good thing. So many more people can build stuff and that can increase the number of products and services we have in our World, and perhaps that is the way forward to what Elon Musk calls the world of abundance, where we have so many products and services.

Speaker 1:

But this is Okay, you go first.

Speaker 3:

I think one thing that is super interesting in this is when you think about Some like some of the problems that you usually encounter, like like Integrating between different services and those kind of things. If you, if you kind of Envision what AI could potentially help out with, is like building integrations like this, so maybe you would rather say then kind of okay, I need this kind of integration to I don't know, probably not Salesforce at that point, but you, maybe that is the case, and then that kind of AI writes that kind of integration automatically and this kind of essentially contacts the other Firm and kind of writes the code together with their AI. Right, it's kind of interesting kind of perspective.

Speaker 1:

But let me take that and put it in a very concrete perspective from the news. There was this news article, I think this week, that was one of the AI news items we didn't take: there's been a new report on how many engineers we need in Sweden over the coming years, and the figure has gone up to 100,000.

Speaker 2:

But that was the the report that was.

Speaker 1:

You. You mentioned it briefly, but I want to take it into this conversation now.

Speaker 2:

TechSverige basically published a new report saying, in general, that by 2028, I believe it was, we will be in need of 100,000 more tech professionals in Sweden, or maybe it was 2030, which basically means that every year we need around 18,000 more, predominantly software engineers of different kinds, all engineers in tech.

Speaker 2:

The report says that we currently have a tech workforce of around 200,000 in Sweden, and that it needs to increase; there is demand for it to grow up to 300,000, which is not very different from what the earlier report said, around 70,000.

Speaker 1:

I think this yeah.

Speaker 2:

I think I believe it was 77,000 in 2020 or 2019. Maybe it was the report, yeah, but in general we are, yeah, we are in need of Okay, so so, that was, that was the context.

Speaker 1:

So now we take that context of 100,000 new engineers until 2028 and then we flip that into the conversation of what will software development of the future look like? So now let's go back from super macro into sort of what is the software development evolution to you?

Speaker 1:

You know, let's say 2030 or 2028, to use the same number. And then think about: what is the profile of those 100,000? Is it the same as we've had, or can we now expand the recruitment and talent pool in different ways? What should we then look for? What do we mean by an engineer, or a software development engineer, to reach those numbers? I think that is very interesting: we can't look backwards into this statistic, we need to look forward to understand it. Can we do that a little bit?

Speaker 3:

Yeah, super interesting, and it goes against the feeling that we would be replaced in any way, right?

Speaker 2:

Exactly but.

Speaker 3:

I think we talked about it slightly before as well. The need for different kinds of specialized engineers is changing a little bit. Already today, if you're working with AI, the skills needed are maybe not the same as the requirements a few years ago, because now it's much more about being able to call APIs, store things in databases and juggle the data around, right. And I think we will continuously see things shifting like this. But in order to stay relevant with the trend, I think it's very important to have this, I don't know if the right term would be the hacker mindset: you just like to attack different things and learn and explore and experiment, I think.

Speaker 3:

So I think it's super important and not be afraid to kind of try things, but rather like, just go head first and try to build whatever and use the AI tooling that is available to learn and kind of go deeper.

Speaker 1:

What do you think, Anders? If you were responsible for getting 100,000 people into tech in Sweden by 2028, 18,000 per year, knowing what you know and knowing the trajectory, what is it that we need? What is the engineer for the next five, six years?

Speaker 4:

I don't know. But I think a core grasp of the basics is getting increasingly important. Just understanding the basics of science, of physics, of math, I think that will continue to be super important.

Speaker 1:

That doesn't go away. Exactly.

Speaker 4:

Rather than knowing a specific tool. Even though I like TypeScript, and I actually love programming in different ways, that will change a lot in the future, but if you know the basics, you will stay current and be able to adapt.

Speaker 1:

So the whole CV talent recruitment. Do you know NET 1.3? How relevant is that in this?

Speaker 3:

I don't like to think of it. When I'm recruiting, I don't necessarily look for a specific programming skill. I look more for the mindset in general.

Speaker 1:

This is what I mean, right.

Speaker 3:

I think, being curious, trying different things, learning and adapting. That goes to the point you're making: if you know the basics of math and physics, which are a core of an engineering education, it's not that you're working with physics, but you understand problem solving and you have a curious mindset about understanding the world.

Speaker 1:

Recruiting for the engineering mindset, recruiting for the hacker mindset rather than the CV engineering of you know.

Speaker 4:

Basically what intelligence really is about.

Speaker 1:

I would say yes, that definition yes.

Speaker 4:

Problem solving Right, yeah, or skill acquisition efficiency. I think you know that is the proper way to.

Speaker 1:

Skill acquisition efficiency is a good metric.

Speaker 4:

Okay anyway, could be really a philosophical question then, assuming the super intelligence will arrive in AGI that is potentially so much more intelligent than all humans combined, what kind of world do you think we'll have? Will we have the dystopian nightmare of the terminator and the matrix and the machine trying to kill us all humans, or more of a utopian paradise where humans are free to not work 40 hours a week, but perhaps 10 hours a week or not at all if they don't want to, and still have an awesome time pursuing their happiness and creativity?

Speaker 3:

I'm more on the kind of positive side of things in general.

Speaker 4:

As is everyone on this podcast, surprisingly.

Speaker 1:

No doomers, not many doomers so far, but I think we need one here.

Speaker 3:

I think it kind of depends on what you mean. Superintelligence is such a fluffy, weird term, where we paint a very intelligent species in the picture of ourselves, and sci-fi has been doing this forever, right. And then painting this picture of AGI having this goal of destroying or enslaving humanity is very narrow-minded, I think.

Speaker 1:

I don't think it's good. It's good entertainment oh it's great, it's entertainment.

Speaker 3:

I love it.

Speaker 4:

It's also a media and everyone else is focusing so much on it. Yes, isn't it? Yes?

Speaker 3:

I guess that's why you have so many positive kind of guests here.

Speaker 4:

We should have more journalists on, then. But how do you see that?

Speaker 1:

Let me go harder. Do you believe in this sort of singularity point, where something happens and we sort of lose control, something like that?

Speaker 3:

I mean, it's very hard to say that you don't believe in it at all. There is definitely a possibility of machines improving themselves and this whole kind of thing. But that they would move towards a scenario where they're part of this evolutionary, ecosystem kind of approach is very improbable in my head. We as a species have been fostered by evolution, by death and life, forever, right; that's what we are as entities. Even if you give machines the possibility to improve themselves, why would they behave like that? I think that's very improbable, actually.

Speaker 4:

Yeah, read your snowman.

Speaker 1:

So why would they do that?

Speaker 4:

You know, we are more intelligent than so many other species on the on the earth. We don't kill them?

Speaker 2:

Do we?

Speaker 4:

What's the evidence?

Speaker 2:

I mean we, we killed most of the animals in the world.

Speaker 4:

It's not that we kill most of the animals. No, no, they die on their own. Yeah, very far from it, very far from it; they die because of us.

Speaker 2:

Yeah, they're extinct. Extinct by themselves, yes, but OK, let me tell you.

Speaker 1:

Take another super fun question. I'm very much inspired by Max Tegmark's book Life 3.0, where these questions come up. I think it's fun. If we believe in this trajectory towards superintelligence, do you believe it's going to be a very slow, sort of evolutionary progression, where we're going to look back and go, oh shit, we passed general intelligence a couple of years ago? Or do you think there will be some sort of extreme spike where something grows exponentially, rapidly, the super fast scenario sort of thing? So it's the evolutionary...

Speaker 4:

Takeoff.

Speaker 1:

Fast takeoff scenario. Will we? Do you believe in a fast takeoff scenario? Or how do you understand? I think, I think.

Speaker 3:

I mean that I think we're part of that journey, right, when you think.

Speaker 3:

When you think about it, everything we do and all the resources we put into this are about making ourselves more efficient, or reaching some kind of weird higher goal that we have as a species, right. And AI is kind of evolving along with this.

Speaker 3:

We're using it to solve our different kinds of problems. I think the really fast takeoff thing is more when they would start evolving themselves, right. Yeah, exactly, and then it becomes more a question about what kind of intelligence we are looking at, and what kind of goals or inherent trajectories they are aiming for. You could have a system that improves itself and only cares about the bytecode in some weird integration layer, which doesn't really affect you, right? Or you could have somebody that just wants to be on the beach and enslaves you, so you have to do all the QA testing in the future.

Speaker 4:

What do you think about the next GPT then? Do you believe in the Q* kind of algorithm adding reasoning to GPT-4, making it potentially an AGI candidate?

Speaker 3:

No, I don't. I also think the AGI concept is starting to get a little bit overloaded.

Speaker 1:

I think so too Right.

Speaker 3:

Or at least watered down, or whatever you say, right. It feels like it was the same with the AI term a couple of years ago, when everybody was talking about machine learning and then suddenly you slapped AI on it instead and talked about the same thing. But it's more like systems thinking, and so on.

Speaker 4:

I like what Sam Altman has started to speak more about in recent months, saying that sure, AGI will come, but it will not have the great impact on humanity that people are speculating about; we will adapt. He basically compared it to when they released GPT-3 or 4: it was a big turmoil for like two weeks, and then people said, well, this is part of our life now. But it very much boils down to where we put the definition at that point of what AGI is.

Speaker 1:

You know, what is artificial general intelligence? If you're referring to something which is human-like, then...

Speaker 3:

I mean, that's kind of weird, right? Like, why would it work that way? Yeah, it doesn't make sense.

Speaker 1:

It doesn't make any sense to even reflect on it. You know what I'm thinking already.

Speaker 3:

When you think about the capability that you see in models today, it's extraordinary, right? We're surpassing humans from a lot of different angles and in a lot of ways. And when you think about it from a general standpoint, it should be about whether it can do a lot of different kinds of things.

Speaker 1:

You know, ChatGPT can answer things about quantum mechanics, marketing and how to clean your house, right? There's a chance it's going to drive your car now. So you get back to the core topic of what is generality, what is autonomous, what is intelligent, what is intelligence? And then we get back to, I think, the JEPA paper, Yann LeCun, and the story we have. We've been trying to dissect the different components of what would make something generally intelligent, and I think that is a quite interesting topic, because you need to really broaden the definition beyond just what it knows or can answer.

Speaker 3:

But what I think definitely will happen is that there will be a model that is better at taking instructions and will work across more different kinds of modalities.

Speaker 1:

Yeah, that makes all the sense, but it's the whole planning and control and all those things that need to be added to this, of course.

Speaker 3:

But also, the models as they are today are very transactional, right? People, especially in the media, talk about them as if they were entities. They're not entities at all; they're very transactional. It's just one pass through a network, one task. In order to get them to behave more like an entity, you have to do a lot of fixing and wiring and saving state and a lot of these things, right? Yeah, adding memory to it.
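
To make that point concrete, here is a minimal sketch in Python of how a transactional, stateless model call can be wrapped in a loop that carries memory between tasks. The call_model function is a hypothetical placeholder for whatever single-pass model call you use; this is not QA.tech's actual implementation.

# Minimal sketch: wrapping a stateless, "transactional" model call in a loop
# that carries memory, so the whole looks more entity-like from the outside.

def call_model(prompt: str) -> str:
    # Hypothetical placeholder: one pass through a network, one task, no state.
    return f"<answer to: {prompt.splitlines()[-1]}>"

def run_with_memory(tasks: list[str]) -> list[str]:
    memory: list[str] = []   # saved context that persists between calls
    results: list[str] = []
    for task in tasks:
        # Each call is still transactional; continuity comes only from the
        # memory we feed back in, not from the model itself.
        prompt = "\n".join(memory + [f"Task: {task}"])
        answer = call_model(prompt)
        memory.append(f"Task: {task}\nAnswer: {answer}")
        results.append(answer)
    return results

print(run_with_memory(["log in", "create a project", "delete the project"]))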

Speaker 1:

Yeah, exactly. But could we go back a little bit?

Speaker 1:

I think this is an interesting one; we touched on it when we had Jesper here talking about this. How would we define an autonomous agent? Because now, in one way, QA Tech is building an agent. For me, the difference is performing one task versus having an objective, analyzing and planning a sequence of tasks, and then performing them. So how do we define an agent, an autonomous agent?

Speaker 3:

I think the agent definition is that it needs to be able to interact with its environment, right? And then, for it to be autonomous, it needs to be able to take decisions by itself.

Speaker 1:

So you could argue the simple definition there is much more narrow, but then of course you get to this whole planning and reasoning about what steps to take, and then being able to execute them.

Speaker 3:

Yeah, right. But, you know, Meta developed this system called Cicero that beat humanity, or at least most people, in Diplomacy, which is this very human game where it's all about talking to each other, and that's a composition of different kinds of models.

Speaker 3:

Right, you have one planning system, you have one language model, a lot of these different components working together. And I don't see us having something that just solves all of those different kinds of problems automatically; you need to compose different kinds of things. I think that will still be the case with GPT-5.
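
As a rough illustration of that composition idea (not how Cicero actually works), here is a sketch where a hypothetical planner and a hypothetical dialogue model are separate components that a small orchestration function wires together.

# Minimal sketch of composing separate components into one agent.
# Planner and DialogueModel are illustrative stand-ins, not real systems.

from dataclasses import dataclass

@dataclass
class Plan:
    steps: list[str]

class Planner:
    """Hypothetical planning component: turns an objective into ordered steps."""
    def plan(self, objective: str) -> Plan:
        return Plan(steps=[f"step 1 towards '{objective}'",
                           f"step 2 towards '{objective}'"])

class DialogueModel:
    """Hypothetical language component: turns a step plus history into a message."""
    def realize(self, step: str, history: list[str]) -> str:
        return f"<message for: {step}>"

def composed_agent(objective: str) -> list[str]:
    # Orchestration: each component solves one sub-problem; none of them
    # handles the whole task on its own.
    planner, dialogue = Planner(), DialogueModel()
    history: list[str] = []
    for step in planner.plan(objective).steps:
        history.append(dialogue.realize(step, history))
    return history

print(composed_agent("negotiate an alliance"))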

Speaker 4:

That's like Yann LeCun's paper.

Speaker 1:

Yeah, because is that one big model, or is it an orchestration of several things? There are so many fun things here.

Speaker 3:

Sorry.

Speaker 4:

Sorry, yeah no.

Speaker 3:

You also have this active inference setting, where they try to go back to more of a Bayesian modeling approach, where you try to predict your environment and then act on it. It's more about an animal-like entity with a function, right, where you try to interact with your environment and don't have one specific goal. Yeah, exactly.
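
For intuition only, here is a toy sketch of that predict-then-act loop: the agent keeps a simple belief about its environment, compares predictions with observations, updates the belief from the prediction error, and acts to keep future surprise low. The numbers and functions are illustrative, not a faithful active inference implementation.

# Toy sketch of a predict / observe / update / act loop in the spirit of
# active inference. Everything here is illustrative.

import random

def active_inference_loop(observe, act, steps: int = 10) -> float:
    belief = 0.0          # simple scalar belief about the environment state
    learning_rate = 0.3
    for _ in range(steps):
        prediction = belief                 # predict the next observation
        observation = observe()             # sense the environment
        error = observation - prediction    # prediction error ("surprise")
        belief += learning_rate * error     # update belief from the error
        act(-error)                         # act to reduce future surprise
    return belief

# Illustrative usage with a toy environment:
state = {"value": 1.0}
final_belief = active_inference_loop(
    observe=lambda: state["value"] + random.gauss(0, 0.1),
    act=lambda nudge: state.update(value=state["value"] + 0.1 * nudge),
)
print(final_belief)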

Speaker 4:

So many things to speak more about, and I hope we can continue after the cameras have turned off. But for now, I would love to thank you very much for coming here again. Let's see if I can say the name properly now: Vilhelm von Ehrenheim. It was a true pleasure to have you here again, and I wish you the very best of luck with QA Tech. I'm sure you will do very well with that going forward.

Speaker 1:

Thank you, guys. It was a pleasure to be here. Thank you.

Starting an AI Startup
Automated Web Testing and User Evaluation
Software Development Testing
Software Testing and QA Evolution
AI's Impact on Development Teams
Build Loop for Testing and Code Generation
Setting Clear Objectives for AI Agents
Tech Stack and Open Source Discussion
Generative AI Patent Battle
Starting a Startup and Choosing Funding
Angel Investors and Attracting Startup Talents
Future of Software Development With AI
Software Development Evolution and Superintelligence
Active Inference and Modeling Approach