Hybrid Society
Hybrid Society is a provocative podcast exploring the uneasy co‑evolution of humans and artificial intelligence.
Hosted by Joshiya Mitsunaga, with co‑hosts Prof. Catholijn Jonker (Professor of Interactive Intelligence at TU Delft) and Prof. Frank van Harmelen (Full Professor of Knowledge Representation & Reasoning, Vrije Universiteit Amsterdam), each episode unpacks the political, ethical, and philosophical tensions at the heart of smart technology.
From algorithmic injustice to value misalignment, we confront the hard questions Silicon Valley would rather you didn’t ask.
What if AI has a True Purpose?
We tell AI what to do: write a text, solve a problem, answer a question. But what if it developed a purpose of its own? In this episode, Selene Báez Santamaría from Vrije Universiteit Amsterdam joins us to explore AI systems that learn through dialogue, build structured memories, and even recognise what they don’t know. Together with Professor Catholijn Jonker, we ask: what happens when AI moves beyond obedience and starts to reflect, question, and grow?
Selene explains how her conversational agents learn not just from people, but with them, asking questions, spotting contradictions, and expanding knowledge responsibly. Yet this evolution raises profound ethical questions: if AI mirrors our thoughts and perspectives, are we ready for such an honest reflection of ourselves? And when machines begin to learn through connection, what responsibility do we carry for what they become?
We constantly give AI instructions: write a text, answer a question, solve a problem. But what if AI had a purpose of its own? Not just a cleverly embedded prompt, but an internal, evolving drive. A reason to learn, to reflect. Today we're joined by someone whose research takes that question seriously: Selene Báez Santamaría, a researcher at the Vrije Universiteit Amsterdam. Her systems don't just store facts; they recognise what they don't know and ask questions. Also with us is Professor Catholijn Jonker from the Hybrid Intelligence Centre. Together we'll explore what happens when AI is no longer just a tool, but a reasoning partner. Welcome to Hybrid Society. Catholijn, you've long advocated for hybrid intelligence. How does this idea of a purpose-driven AI relate to hybrid thinking?
SPEAKER_02: Well, the most interesting and fundamental part of this for me is that what Selene has been doing is making human intelligence and artificial intelligence work together. The essence is not that the machine tells you what to do. No, it listens to you and tries to model what it is that you know, what you like, and what you prefer. And in that way, certainly when you're recording conversations between more than one person, say a group of people discussing something, it becomes clear what everyone wants. For me that's essential, because in the end, if you want AI to help people, as Selene was explaining about these lifestyle coaches, you need to know what their preferences are, what they think they know, what they disagree about, what they agree about, and what they still want to debate. Being able to do that correctly means we need a representation of all that is said and all that they think they know, as discussed. And that also helps you find the gaps in what they know. Even more interesting than what they do know is what they don't know.
SPEAKER_01: Yes, and that's also the interesting part of your research, Selene: it explores how an AI can develop a kind of internal drive to learn. Can you describe what that means in practice?
SPEAKER_00: Yeah. I work on social conversational agents, and these agents are different from the current ones. Currently you have users asking questions or requesting a service from these machines, and that's how the dialogue goes: the interaction happens in only one direction. What I propose is bi-directional interaction, where the agents can also ask questions to the users and learn from them as well. You can think of it as a curious agent: that is its drive, that is what it wants to do. It wants to learn things from humans. So, as Catholijn was saying, I designed an agent that keeps track of and remembers everything people say and how they said it. It's not just about collecting pieces of information, but really the perspectives as well. In social dialogue you have different types of knowledge. You have things that are facts, like "Mexico City is the capital of Mexico", but you also have more opinion-based claims, like "The Matrix is the best movie in the world". With all of this, there will always be certain properties of that knowledge: how much certainty the user expressed it with, like "I'm super sure this is the capital" versus "I don't know, I don't remember". You have polarity, which is basically whether something is being confirmed or denied. You have emotions and sentiments: sentiment is positive or negative, and emotion is more detailed, like joy or sadness.
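The properties Selene lists (certainty, polarity, sentiment, emotion) can be pictured in a few lines of code. This is an editor's sketch, not code from the actual system; all field names and example values are illustrative.

```python
from dataclasses import dataclass

# Editor's sketch (not the real system's code): one claim from dialogue,
# stored together with the speaker's "perspective" on it.
@dataclass
class PerspectiveClaim:
    speaker: str       # who said it (the source)
    subject: str
    predicate: str
    value: str
    certainty: float   # 0.0 = "I don't remember" .. 1.0 = "I'm super sure"
    polarity: bool     # True = confirmed, False = denied
    sentiment: str     # "positive" / "negative" / "neutral"
    emotion: str       # finer-grained, e.g. "joy" or "sadness"

# A fact and an opinion share the same structure; only the perspective differs.
fact = PerspectiveClaim("Jos", "Mexico City", "capital-of", "Mexico",
                        certainty=1.0, polarity=True,
                        sentiment="neutral", emotion="none")
opinion = PerspectiveClaim("Jos", "The Matrix", "rated-as", "best movie",
                           certainty=0.6, polarity=True,
                           sentiment="positive", emotion="joy")
```

The point of the shared structure is exactly what Selene says next: the agent does not need separate machinery for facts and opinions, only different perspective values.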
SPEAKER_01: And how does it detect that? Is it all based on what is said, or also on how it's said?
SPEAKER_00: Yes, also how, such as the tone of voice. I'm mostly working with text-based chatbots, but you can take the same principle and make it multimodal. The point is that there's a long tradition in natural language processing of breaking this big problem into smaller tasks, so there are already systems that detect emotion in voice, emotion in text, sentiment, and so on. For multimodal, imagine something more like a robot: something with more sensors, something that can see, which is one mode, plus text or voice. These are different modes of interaction that you can combine. It's a strange concept for us, because we are just naturally multimodal: we see and talk and hear all at the same time. But for artificial intelligence, that's not a given. You can have a system that you just type into, like WhatsApp; you can have something you also call; you can have something that moves and has proximity sensors or other types of sensors. And obviously, the more sensors there are and the more modes of expression, the more complex the system is.
SPEAKER_01: And how many sensors is your model using, then?
SPEAKER_00: The robot we have is a Pepper robot, this SoftBank Robotics thing that you sometimes see in stores. They look very cute: big eyes, a camera in the mouth. It actually has two cameras, because that's better for 3D vision. It has a sensor on its chest to see when people approach. I don't know exactly how many sensors, but I would say fewer than ten, more than five.
SPEAKER_01: And you were also saying that the model, or the agent (I should call it an agent, then?), can also ask questions. Now, for me as a noob: I could also tell GPT to act curious and ask a lot of questions in return. So how is this different?
SPEAKER_00: Yeah. The main difference is not just that it asks questions, but that the types of questions this agent asks follow from the purpose it has. As it talks to people, it collects information and connects it, and then it starts identifying different properties, like I said. So there is knowledge that is uncertain, and one goal could be: make my knowledge more certain. If someone says "I don't know, I'm unsure", then the agent asks again, either the same person several times to improve consistency, or different sources. If there is missing information, another goal can be improving the completeness of knowledge, which means filling in those gaps. And if the knowledge is very biased, for example if it all comes from only one source, then the goal is to improve diversity.
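The goal selection Selene describes can be sketched in a few lines: uncertain knowledge triggers a certainty goal, missing values a completeness goal, and a single source a diversity goal. The threshold, field names, and priority order here are an editor's placeholders, not taken from the real agent.

```python
# Editor's illustration of purpose-driven question selection; the 0.5
# threshold and the dict fields are invented for the example.
def next_goal(claims):
    """Pick which kind of knowledge gap to pursue next.

    claims: list of dicts with 'source', 'value' (None if unknown),
    and 'certainty' in [0, 1].
    """
    if any(c["value"] is None for c in claims):
        return "completeness"            # fill in the missing information
    if any(c["certainty"] < 0.5 for c in claims):
        return "certainty"               # re-ask the speaker, or ask others
    if len({c["source"] for c in claims}) == 1:
        return "diversity"               # everything came from one person
    return "idle"

claims = [{"source": "Jos", "value": "blue", "certainty": 0.3}]
print(next_goal(claims))  # certainty: Jos sounded unsure
```

As the next exchange makes clear, which of these gap types the agent prioritizes is a design decision, not something it invents on its own.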
SPEAKER_01: And it makes up its own goals, basically.
SPEAKER_00: The overall goal is: improve knowledge. What we can decide is which type of knowledge. So you can tell the agent to focus on diversity because it's a chatbot in a digital democracy application. Or you can say this chatbot is going to focus on completeness, because it really needs to gather all the preferences from the user to provide advice for their diabetes condition. So it's a bit of both.
SPEAKER_01: And then it stores all these different perspectives. But still, a lot of people will say: well, ChatGPT also has this memory function, so it can learn from us, it will store everything, and it gets to know you better. But this is fundamentally different, right? Can you explain how?
SPEAKER_00: Yeah, they're both memory mechanisms, but the LLM-based one is a neural representation; it comes from neural networks. That means the words get translated into numbers, and then there are computations that integrate this information, and it's not very clear how. It's not very clear how much the new information influences the old information when it becomes relevant to the new conversation, or where exactly it comes from.
SPEAKER_01: That's what people call the black box in AI. Exactly.
SPEAKER_00: So there are all these questions we're not really sure about. We just know that we threw it into the box and it's there, but where and how exactly, we don't know. The memory that my agent has is explicit: it's a graph. The benefit is that it's more transparent. You can always see where a piece of information comes from; you can always query its source. You can also see when it becomes relevant and why. Say we were talking about shoes, and suddenly the memory brought up fashion shows: it's very easy to follow the path in the graph that says shoes are a piece of clothing, which is related to fashion. It's also more controllable, because we can really edit the memory of this agent. We can decide what it remembers and what it forgets. So if at any point you want to remove your information from this agent, we just query the database, delete it, and it's gone. It's really not going to be there influencing anything anymore. With a neural representation, it's much harder to edit the memory, to take anything out.
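The transparency and deletability Selene describes fall naturally out of an explicit store with provenance. The real agent uses an RDF-style knowledge graph; this editor's toy version uses plain Python tuples to make the same point.

```python
# Editor's toy version of an explicit, provenance-aware memory.
# Every statement carries its speaker, so it can be traced and deleted.
memory = set()

def remember(speaker, subject, predicate, value):
    memory.add((speaker, subject, predicate, value))

def who_said(subject, predicate, value):
    """Query the provenance of a statement."""
    return {spk for (spk, s, p, v) in memory
            if (s, p, v) == (subject, predicate, value)}

def forget_speaker(speaker):
    """Delete everything a speaker contributed; it stops influencing anything."""
    for triple in [t for t in memory if t[0] == speaker]:
        memory.discard(triple)

remember("Jos", "The Matrix", "is", "a good movie")
remember("Catholijn", "The Matrix", "is", "a good movie")
print(who_said("The Matrix", "is", "a good movie"))   # both speakers
forget_speaker("Jos")
print(who_said("The Matrix", "is", "a good movie"))   # only Catholijn left
```

Deletion here is a simple set operation; in a neural memory there is no comparable handle on where one person's contribution lives.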
SPEAKER_01: Catholijn, why do we need that? Why do we need a model to be this transparent, this explainable? I mean, if it can summarize my emails, and if it can draw, or write a good Facebook post...
SPEAKER_02: Then it's good to go, right? Of course, it depends on your purpose. Take debates, say, where it's really important who says what. You don't want any vagueness, no hallucinations in there. You really want to know who said what, and you would like to be able to reason about it and, for example, discover that there is a topic where you first said one thing. What was it? The capital of Mexico is, say, Brazil, to mention something weird. And some time later you say something else.
SPEAKER_03: Yeah.
SPEAKER_02: And other people say something else again. Then what is it? Can we go back to that? The agent can help, because it's explicitly registering all that is said, so you can also explicitly find inconsistencies, contradictions. It can say: just to be sure, Jos, what you said there, is that what you meant, or is the other thing you said a bit later the one you meant? You said The Matrix is a good film. Do you really think that, or did you mean something else?
SPEAKER_01: Yeah. So is that a fact or an opinion, indeed?
SPEAKER_02: Yeah. And of course, when it comes to natural language processing, the fact that people say something is a fact doesn't mean that what they say is actually factual.
SPEAKER_03: Yeah.
SPEAKER_02: It could be wrong, right? Or they could lie, or they could just be mistaken. It doesn't matter. But the agent does register that this is what you said, and that you said it in this particular way, and that gets a representation in the graph. If you start correcting it, that gets another representation in the graph. Or we agree that we want to update the representation and say: okay, the wrong part we want to take out, because now we have a common understanding of what we want to be remembered. And then it's factual of that conversation, of that joint shared memory.
SPEAKER_01: You also mentioned hallucinations in between. So does this model not hallucinate, or can it not hallucinate?
SPEAKER_00: It doesn't generate information, and it doesn't link information, that has not been said by someone.
unknown: No.
SPEAKER_00: So it can have wrong information, but it has it because someone said it. It's a subtle difference: if you define hallucination as information that is incorrect, then yes, it could hold incorrect information. But that doesn't arise from unknown sources because some numbers got computed in funny ways. It's there because that's what people believe, so it reflects the world.
SPEAKER_01: Got it. So if you say The Matrix is a good movie, which is debatable, then that could be a hallucination, because someone said it. But then it's a hallucination of that person, and not of the agent. Got it. Before we dive deeper: how did you come up with this idea? Was it something you missed in the current AIs like ChatGPT, since there's so much discussion about hallucinations, or what was it?
SPEAKER_00: No, it was not because of the current chatbots, because we started this in 2017, before LLMs. There were some neural systems, yes, but it was not as big as it is now; there was no Gemini, no Claude, no ChatGPT. The project came more from the idea that machines are imperfect, they make mistakes, and we all know that, or I hope we all know that. But still, whenever scientists designed these chatbots, they never built in a mechanism for the agent to also know that. So we knew the machine makes mistakes, but the machine didn't know it makes mistakes. It was always: the user asks for something, and the machine just does its best with the information it has, and that's it. The idea was that if you're really going to collaborate with a machine, you want it to know that it has limits in its knowledge and limits in its skills. There were some systems, like Siri, that would tell you "I cannot help you with that", but then it wouldn't say "teach me to do that". So there were things they didn't know, and even when they knew they didn't know, they didn't know how to proceed from there. Those are the two steps we wanted to take with this project: to communicate about imperfection. That was 2017, a long time ago. It was basically: let's try to make machines that really have bi-directional interactions. And then, in the middle of my research, ChatGPT came out, and we had a little scientific crisis of: what does this mean?
SPEAKER_01: Can you tell a little bit more about that crisis? What happened?
SPEAKER_00: Yeah. Because we are focusing on keeping track of knowledge, which is not always expressed very fluently. If you talk to this agent and ask "what do you know about Selene?", it will start saying "I know she likes A and B and C, and sometimes she says she likes D and E", and it becomes very long and sometimes artificial in the way it expresses itself. What ChatGPT has is that it's very natural. It's not strong in the knowledge-keeping part, but it is very strong in behaving and sounding like a human. So at the beginning, when it came out, it was really a comparison of my work with the work of OpenAI, which has a lot of people and many millions in funding. Okay, what are the strengths, what are the weaknesses, and how am I going to complement their technology rather than compete with it? Because I cannot compete.
SPEAKER_01: But how did that feel? Did it feel intimidating? At some point you need to find your own strength again and say: okay, let's get back up and kick some ass. How did you do that?
SPEAKER_00: Well, I had a very good PhD supervisor, and we had a few sessions on how to tackle this. The conclusion was: we have to ask harder questions, different questions, in our research. If the research question of how to make the agent express its lack of knowledge is tackled by OpenAI, then we ask how to make the agent ask the best question in a specific context, which is something they were not necessarily doing. Different questions, basically: smarter questions, harder problems.
SPEAKER_01: Yes, because that's what it's about, right? The agent creates its own questions based on the knowledge gaps it has, or detects.
SPEAKER_02: Well, maybe you can say that in the agent Selene has created, all knowledge is equal. In the knowledge graph, every node has the same worth as any other node, and maybe you can say a bit more on that later. Whereas if you look at GPT, at these neural nets, we're talking about likelihood: it starts producing the things that are more likely, that other people would have said or that it has read. So some nodes are more equal than others, to abuse the famous quote from Animal Farm, so to say. And that means that if you're really looking for things that are less likely to be said, but have been said and are interesting to consider, in Selene's agent you can find those. And you're sure that it will find those.
SPEAKER_01: One of the things with LLMs is that they tend to go for the middle ground, the thing that occurs most commonly, whereas your agent also looks at the edges of the data set. Ah, yeah.
SPEAKER_02: So you can really reason about what it knows and what it doesn't know.
SPEAKER_01: It's a really smart agent then, right?
SPEAKER_00: It tries, yeah.
SPEAKER_01: Yeah. Can we say it actually knows things, then?
SPEAKER_00: You can go philosophical with that, but yeah, it remembers everything as it happened, and it reasons over that. So it knows what has been said, and by inheritance it knows what it itself knows. Understanding is a different thing. Understanding is defined in linguistics as not just interpreting the sounds or the words that someone says, but really knowing the intent behind them. If someone says "I'm really thirsty" and you're next to a water bottle, you can interpret that they want some of the water you have, no? That gap between exactly what the words mean and what they imply is what we call understanding. That's not necessarily what this agent has. This agent has "this person is thirsty" in its memory. And until you say "I need water", it will just know that this person is thirsty.
SPEAKER_02: And after that, it will know both that you're thirsty and that you need water.
SPEAKER_01: Got it. And then it also knows that water can help.
SPEAKER_02: No, not necessarily.
SPEAKER_01: Okay.
SPEAKER_02: If it knows that, then someone told it.
SPEAKER_01: Oh wow. So it's basically like a little baby, right? It only knows what people have told it.
SPEAKER_02: So first you have the two things: an expression of "I'm thirsty" and an expression of "I need water". Later on, it learns that actually giving water to someone who says they're thirsty might relieve their thirst, which is another bit of knowledge. Then, once someone says "okay, I'm thirsty", you can ask the model, and it can start reasoning on that: do you know of any method to relieve thirst? And it says: well, I found here a connection that water might do that. So if you have a bottle of water, you might give it to them.
SPEAKER_01: Yeah, that does make sense.
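The two-step learning Catholijn walks through can be sketched as a tiny reasoning chain. This is an editor's illustration, not the project's code; the relation names are invented.

```python
# Editor's sketch of the reasoning chain: the agent can only suggest water
# because someone explicitly taught it the "relieves" link.
facts = {
    ("Jos", "state", "thirsty"),
    ("water", "relieves", "thirsty"),   # learned later, from another speaker
}

def suggest_remedy(person):
    """Chain 'person is in state X' with 'Y relieves X'."""
    states = {v for (s, p, v) in facts if s == person and p == "state"}
    return {s for (s, p, v) in facts if p == "relieves" and v in states}

print(suggest_remedy("Jos"))  # {'water'}
```

Remove the "relieves" triple and the function returns nothing, which is exactly the "little baby" point: no taught link, no inference.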
SPEAKER_01: Well, we had this talk last time, and I'm a bit afraid to ask the question, but I'm going to do it anyway, just to see if I can completely get it this time. How does it actually know what it knows? Last time you mentioned the umbrella and the rain example. If it listens in on a set of conversations, at some point it won't be sure whether something is an opinion or a fact. For instance, the Matrix example: if a lot of people say "yeah, The Matrix is a good movie", then at some point it can assume it's a fact, right? Or not. So how does it know what is an opinion and what is a fact?
SPEAKER_00: Yeah. I think that's a fundamental point about this chatbot: it doesn't need to make that distinction. It is not important for it. It's safer to treat everything as an opinion than to treat everything as a fact. If you treat everything as an opinion, you will have some opinions that are more consistent: most people say the sky is blue; some people sometimes say it's red, but maybe it's sunset or something. And some opinions are far more diverse, like which sports team people support; that has far more opinions attached to it. For this agent, the goal is not to find the truth. The goal is: learn as much as you can, collect information, and then, when it's relevant, give it to the user and just say, this is what I've learned.
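Treating everything as an opinion still lets the agent report how consistent the opinions are, without ever ruling on truth. A minimal editor's sketch, with invented data:

```python
from collections import Counter

# Editor's sketch: no truth judgment, just a consistency score over
# what different people said. The answers below are made up.
sky_answers = ["blue", "blue", "blue", "red", "blue"]

def consistency(values):
    """Fraction of answers agreeing with the most common one (1.0 = unanimous)."""
    top_count = Counter(values).most_common(1)[0][1]
    return top_count / len(values)

print(consistency(sky_answers))  # 0.8: consistent, but still reported as opinion
```

A highly consistent claim behaves like a fact and a contested one like an opinion, yet the agent stores both the same way and simply reports what it has heard.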
SPEAKER_02: Yeah, exactly. And that's the link to hybrid intelligence. In our human world, in the end, it's us who have to make our own decisions about our own lives and how we act in them, right? So by building the agent the way Selene did, making sure it does not hallucinate and does not come with its own opinion, but just gives you what it has recorded, it enables you to make your own decision. Whereas if you ask, say, a large language model, it will give you a kind of average opinion, an average interpretation. And it doesn't know facts from hallucinations either, which makes it difficult to understand which of the things it says are true. Whereas in this model it's completely clear: this agent tells you what it has recorded so far.
SPEAKER_03: Yeah.
SPEAKER_02: And it's clear where it got it from; it can tell you where it got it from, and it can give you the different sources. If there were different sources that say the same thing, it will tell you. That's it. Now it's up to you to draw your own conclusions from it.
SPEAKER_01: So it makes it clearer, more separated, what is your own interpretation of things versus what people actually said. Back to the moment when ChatGPT arrived halfway through your research and you realized you needed to start asking tough questions. You did, and your research continued. What was your journey from there?
SPEAKER_00: Yeah, it's been eight years already. I started as a research assistant on this project with my supervisor, Piek Vossen, at the Vrije Universiteit Amsterdam. That was 2017. We were a group of four master's students who had just finished; Piek had bought a robot, and he hired us one day a week to play with it. The goal was just: make it smarter. It was really fun, because during those years it was mostly asking questions like, what can I do? What if I do this? What if I do that? And for me, the main question was: what do I do with the memory? After those three years, Piek decided this could turn into a much deeper scientific piece of work with more structured experiments, and then it became my PhD project.
SPEAKER_01: Okay, so that has been quite a journey. Let's talk about whether it becomes aware. Because if an AI system knows what it doesn't know and can ask questions to close knowledge gaps, it almost becomes aware, or it gets some form of relational awareness. So within hybrid intelligence, what should the distribution of intelligence be? What belongs to humans and what belongs to machines? Ooh, that's a tough question.
SPEAKER_02: I don't know if you can say something belongs or not. For me, at least, a more fundamental principle is to understand whether information is something I can rely on, or whether it's interpreted information. I think that's an important point. So if I would ask Selene's agent about, say, our example of The Matrix being a good movie, it would look in its database and reply with what it has heard. Selene, maybe you can fill something in that it might have heard.
SPEAKER_00: That the agent heard that The Matrix is good, that The Matrix has bad special effects, that The Matrix has good acting.
SPEAKER_02: And if it has something about the ethics of that movie, it would give me that information, but without any opinion of its own on it. Right? So if we would like to know what potential moral judgments people might have about it, I might ask GPT or another LLM, and it would give me something. Then I could ask the agent Selene made: did someone express that opinion about it? And the agent would give me what people actually said, and that's it, without an opinion about what they said. So it makes it easier to draw a clear distinction. And if you're talking about relational awareness, it helps us create relational awareness amongst ourselves. Because if we had, say, a dispute amongst each other, it could help us and say: well, actually, Catholijn, you did say that you didn't like The Matrix. Oh, okay, I thought I didn't, right? And it could show me where I said so. And then that might be the way we resolve our controversy amongst ourselves.
SPEAKER_01: About The Matrix, yes. So is it an external part of your memory, then? Or is that too simplified?
SPEAKER_00: It could be. Another use case we had was a personal diary. Imagine elderly patients who maybe have dementia or some weaknesses in their memory. One use for this is that every time the patient talks to the agent, it's a new instance, and the agent can ask: oh, last time you said you were having breakfast with your cousin, did that happen? The gaps here are about the events that were planned, whether they happened, and the expectations about the next ones to come. So it does help keep people's diaries: did you remember to do this, did you remember to do that? Last time you did this, you were excited, because maybe they don't even remember they were excited about this lunch with their cousin. So yes, that's one use case. It's not the only thing it can be good for, but definitely as an extra memory, it helps.
SPEAKER_01: Your agents learn from people, but we're not always perfect teachers. So what if it learns the wrong thing? For instance, that pollution is acceptable. Will it copy those thoughts, or not necessarily?
SPEAKER_00: As we've been saying, the core thing this agent learns is that there are different perspectives on any topic. It doesn't have to make a judgment call on what is correct, what is incorrect, what is the truth. It simply collects this information. There is no preset rule that says the most popular view is the most beneficial for society, or that the most common action is the acceptable one. These preconceptions are not in the model. And there is no reason for the agent to adopt them, because it doesn't really need to summarize or to have an opinion. So in that sense, it's a little bit protected. But I do see what you mean: it could have skewed information. If this agent learns from what people say and it's only talking to one person, then it will only have the view of that one person. So there is that awareness: my whole knowledge comes from one person, so I need to be careful with generalizing, or even proposing generalizations. And second, we did test these different purposes of knowledge in our studies. Like I told you: if you want more correct knowledge, ask questions about conflicts. If you want more complete knowledge, ask questions about what you don't know. If you want more diverse knowledge, ask questions about the things you already know, but to different people, to get different perspectives. And we did see in our experiments that if, as the designer, you say "go for diversity, focus on diversity", then the agent does start asking maybe repetitive questions, but to different sources.
SPEAKER_01: And does it also say, at some point: can you bring me into contact with someone else? Can you start a conversation with someone of such-and-such an age, or such-and-such a profession?
SPEAKER_00: It doesn't necessarily give you a profile like, "I really need to talk to a millennial who works in tech right now, because that's the perspective I need". No. It only knows: most of these people agree, there's potentially someone else who disagrees, I need to talk to them. If it's talking to Catholijn and it doesn't know about me, it cannot say "let me talk to Selene", because it doesn't know about me. But if it has talked to Catholijn a hundred times and once to me, and that one time I disagreed, then it can ask Catholijn: oh, remember this person, Selene, that I talked to once? It would be interesting to talk to her again.
SPEAKER_02: Yeah, because she has a different opinion, so you might learn from her.
SPEAKER_01: Wow. So instead of all this polarization, this agent can actually bring people together, right?
SPEAKER_02: Depending on the goals.
SPEAKER_01: Facilitate a debate, or, yeah, well, depending on the goals you set it. Yeah.
SPEAKER_02: So now Selene talked about giving it a goal to diversify, but it might also be given a different goal of, say, looking for confirmation.
SPEAKER_01: Oh yeah. And then it goes more towards the same kind of people, more towards an echo chamber. Yeah.
SPEAKER_00: Yeah. If the task is "we need to agree on what plates to buy for the dinner", then you don't need a debate of three days and twenty different opinions, right? You just need: someone says plastic is the best, can you confirm with someone else? Yes, okay, we buy plastic. If you're talking about whether we should put more money into education or into immigration, that is a topic that requires a lot more opinions and a lot more discussion. So it really depends on the complexity of the task.
SPEAKER_01: Interesting.
SPEAKER_02: Maybe we should go back a little here, because I think you may also have the conception of the agent Selene has built as having general understanding, general knowledge. Then you fall into the pitfall of saying: okay, I'll take the agent as it is, give it to someone, and they start talking about whatever problems they have. But that has not been the purpose of that agent. So maybe it's good to go back a little and explain that if you would like to use the agent Selene built for, for example, people with dementia, it's not purely a matter of taking the agent as is and plugging it in. You'll have to do something more before it can take that role.
SPEAKER_01: And what do you need to do more, then?
SPEAKER_00You design some basic understanding of the use case. If you are talking about diabetes patients, you need to already put into the agent the rules that there has to be a clinician approving treatment, there has to be a certain frequency of exercise to be considered healthy, a certain number of calories has to be ingested. So something about the specific task is there already and not just learned from the user. For the personal diary, you need to have some temporal rules there: things that happened in the past happen before the things that happen in the future. So certain basic things need to be there depending on the use case. And then you can expose it to the user and see what the user knows and what the agent needs to know to improve.
SPEAKER_02Yeah, and so the starting point of the agent is really to support people when they're discussing things, and to help in that sense as an objective, say an extended memory, an extended mind that helps people understand each other better. That is a completely different purpose from having a general understanding or a professed opinion about just about anything you can throw at it. It doesn't have that. So, in order to keep the agent safe, the difference in format is again important. Remember that we talked about the difference between a neural net and a knowledge graph. The fundamental difference in what Selene has been doing is working with the knowledge graph. So if you wanted a different function, such as playing the role of a kind of extended mind or extended memory for a diabetes patient, you'd have to give it more knowledge about what diabetes means. Then it can fulfill its role, and you give it its purpose for the kind of questions and the kind of answers it would give. And that is again based on the knowledge that you put into it, which you get from, well, in that case, the experts on diabetes, or was it dementia.
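To make the idea of seeding concrete, here is a minimal, hypothetical sketch. This is not Selene's actual system; the class, the triples, and the simple "ask about the first missing required fact" strategy are all illustrative assumptions about how use-case rules might be loaded into a knowledge-graph agent before it ever meets a user:

```python
# Hypothetical sketch: seeding a knowledge-graph agent with domain rules
# (e.g. clinical rules for a diabetes use case) before exposing it to users.
# Triples are (subject, predicate, object); all names here are illustrative.

class DomainSeededAgent:
    def __init__(self):
        self.graph = set()  # triples, seeded up front or learned in dialogue

    def seed(self, triples):
        """Load use-case knowledge up front, before any conversation."""
        self.graph |= set(triples)

    def knows(self, subject, predicate):
        """Does the graph hold any value for this subject + predicate?"""
        return any(s == subject and p == predicate for s, p, _ in self.graph)

    def next_question(self, required):
        """Proactively ask about the first required fact still missing."""
        for subject, predicate in required:
            if not self.knows(subject, predicate):
                return f"Can you tell me about {predicate} for {subject}?"
        return None  # nothing missing, no question needed

# Diabetes use case: rules the agent must hold before talking to patients.
agent = DomainSeededAgent()
agent.seed([
    ("treatment", "requires_approval_by", "clinician"),
    ("patient", "recommended_exercise_per_week", "150min"),
])

required = [
    ("patient", "daily_calorie_intake"),
    ("patient", "recommended_exercise_per_week"),
]
print(agent.next_question(required))
# → Can you tell me about daily_calorie_intake for patient?
```

The seeded rules play the role of the clinician-approval and exercise-frequency constraints mentioned above: present from the start, not learned from the user, while gaps like the calorie intake are filled through dialogue.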
SPEAKER_01Yeah. Okay, so it's being equipped with a default data set that contains more context about what it's going to do.
unknownYeah.
SPEAKER_02And that is then also transparent knowledge. So you can ask it exactly what kind of instructions it has and see how it fulfills that role. It really is fundamentally different from the large language models and the generative AI principles based on neural networks.
SPEAKER_01That's good to keep in mind, yes, because nowadays we have the tendency to take everything that has AI in it and relate it directly to something like ChatGPT, right?
SPEAKER_02Yeah, yeah. So on the one hand, it probably appears less smart to people. On the other hand, it's way more secure and it is really transparent. So it really is a different type of intelligence than what you see in the large language models. Got it.
SPEAKER_00I wanted to mention that, yeah, the differences between the neural models and what I designed are big, and it really is a question of whether you need general knowledge, which is what these LLMs are good at. They collect a lot of information, do some statistics, and then generalize. So "people can't fly" is general knowledge; you don't need my agent to learn that. But sometimes you need specific knowledge about a specific situation that is context-dependent and personally relevant, and when general knowledge is insufficient, that's really the use case for this agent. So really relying on what you said yourself before, on what other people's opinions are in this specific context and this specific situation. It's a matter of the task: do you need general knowledge or specific knowledge? Then you can decide which of these two agents you're going to converse with. Or make a combination of them.
SPEAKER_01Yes. So it's more centered around this dynamic conversation than around a static set of knowledge, right? Or set of data, yeah. Your system learns by preserving multiple perspectives and forming structured knowledge based on real conversations. So if it starts to reflect us so well, better even than something like ChatGPT, could that also be too confronting? I mean, are we ready for a machine that holds up such an honest mirror to who we are?
SPEAKER_00Yeah, that's a good question. We do know, I guess from cognitive psychology, that there is a reason why humans don't have a perfect memory. There is a reason why we forget things and why we are selective with what we learn. I think it helps us perform socially, across the many different functions we need to fulfill. So having something additional, like a perfect memory, can be confronting. It can remind you: yesterday you said you didn't like this, and today you do, and tomorrow you won't. That can make you feel inconsistent with yourself, or it can make you aware that you have knowledge gaps yourself. The agent asks you a question, something simple like what's the capital of the US, and you realize: I should know this. So it can generate shame as well. And also, if the general opinion is one thing and you disagree and the agent points that out, that can make you feel isolated or confronted with that difference from others. So yes, I agree that it could bring some new problems and some new emotions to users by confronting them with who they are or what they say. I think we're not used to it, but I don't think it's something we cannot get used to. Technology comes and it changes society, and we are much better at adapting to technology than technology is at adapting to us, because that's how the human brain works. So I think eventually we could learn how to manage this information and accept that we are inconsistent and imperfect ourselves, and that that is normal.
SPEAKER_02Yeah, I think for me it's again the difference between going for an artificial intelligence as an isolated thing that does its own thing without instruction or steering from humans, versus the hybrid intelligence way, in which we say: hey, this is a different type of intelligence, so it can do some things well and other things not, so let's use it together as humans. Don't take the human intelligence out of this loop. If you give people a tool for, say, dementia, to help them remember things, then in itself having this type of intelligence is not enough to ensure that it goes well. You'll have to combine it with the normal surrounding of human care. So I would not put it into use without supervision, without having a human moderator in the loop who knows when to intervene and say: okay, stop for now, let's do something else with this.
SPEAKER_01Okay.
SPEAKER_02Meaningful human control, yes. Meaningful human control and human oversight are important for me in all these aspects. So if you have an agent that is supposed to give some kind of counseling or mental support to people in whatever way, then use it as a tool, not as a replacement for human caregivers. So you have to design it in a way that is safe.
SPEAKER_01Yeah.
SPEAKER_02So it's it's you don't give a knife to a small child, right?
SPEAKER_01No. No, so it's really co-intelligence, yeah, hybrid intelligence. But is that feasible when it's online? I mean, nowadays everything is just a piece of code, we were just talking about that. And pieces of code can go everywhere, right? So how do we make sure we keep it where it should be?
SPEAKER_00Well, I think we can go back to what Catholijn was saying: sometimes it's a tool. Technology is there to help humans perform their tasks better, faster, easier, more efficiently in some way. Then the line to artificial intelligence is whether it is really doing something that resembles what we humans can do. Not necessarily everything we are, but some part of it: maybe the mathematical thinking, the temporal reasoning, the calculation of space. Something we can do is now being done by an external piece of technology, and there you can see it as a tool that helps you with a task, and that's fine. In some cases these tasks are more complex, so the person doesn't even know what kind of help they need, and so they don't know what tool to choose, or the task changes over time, so the tool you need changes over time. And I think in those cases you need AI to be more of an agent than just a tool, because it needs the agency to make certain decisions. We're not talking about big ethical decisions, like who's going to be the next president, but small decisions: what question am I going to ask next to really understand the task we're talking about, or what topic am I going to bring to this person, depending on what expertise I've identified they have in the past. So if the agent knows Catholijn is very knowledgeable about artificial intelligence, it's going to ask questions related to that.
SPEAKER_02Yeah, and I think, you know, if you buy a product, it says what the conditions are under which it's safe to use. It will give you choking hazards if it might be swallowed by small children, or whatever kind of extra instructions you need to understand: hey, this is the purpose of what you're buying. And the thing you're pointing out with the code is that if this code is out on the internet, then people can just download it and use it as they want, rip it apart and do something else with it. That goes beyond the code with its safety prescriptions on what the intended usage is. If you are the one who takes it apart and uses it for something else, yes, that's a problem.
unknownYeah.
SPEAKER_01And that's at that person's own risk, right? At the end of the day.
SPEAKER_02Maybe, but it's maybe also something where, of course, the EU AI Act helps us, setting new regulations on the conditions under which you're allowed to use things, whether code, AI, or ICT in general. So yes, regulation is important, and maybe there are some things that we as a society need to discuss in more depth: what are the assurances that its use is safe?
SPEAKER_01So, yes to regulation.
SPEAKER_02Uh, for me, yes, I think so.
SPEAKER_01Good to know. This one I always thought was really interesting, because it's a way of greeting each other, and I think it's very respectful to greet each other with the sentence: I am another you and you are another me. Somehow it feels really respectful. And then I read about this agent, and I thought: if it's mimicking my thoughts or my brain, then it feels kind of the same, like that is another me. But I would say that's fundamentally different, because in this case the agent that Selene is making is deliberately a different type of intelligence than human intelligence.
SPEAKER_02So it's not another me, it's a different me.
SPEAKER_01Is it an extension of me?
SPEAKER_02Well, maybe.
SPEAKER_00Oh, curious to know. You could think it's an extension of us. Again, it depends on the use case. Is it your only agent that you talk to? Is it out there talking to many people? Is it talking about many different things or only one thing? But yeah, overall it learns from people. And because it learns from people, I actually like the explicit memory model more than the implicit one, because the fact that it can trace back the provenance of everything being said means it can hold people accountable for what they said. So, as we said before, it can be confronting to realize that you say different things and change your mind, but it can also be a tool for more responsible users, people who are accountable: yesterday you said the capital was something else. Was it a joke? Was it a political statement? What was it? So in a sense, I think this type of agent can improve the general responsibility that we all have towards artificial intelligence and how much it learns from us. Because nowadays we can say, yeah, ChatGPT learned things from the internet, and the internet is a very interesting space, but no specific person is accountable for anything specific that ChatGPT says.
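The provenance idea can be sketched in a few lines. This is a hypothetical illustration, not the actual memory model: the claim structure, and the simple rule that two claims conflict when they share a subject and predicate but differ in value, are assumptions made for the example:

```python
# Hypothetical sketch: explicit memory with provenance, so the agent can
# trace who said what, and when, and surface contradictions for accountability.
from dataclasses import dataclass

@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    obj: str
    speaker: str  # provenance: who said it
    day: int      # provenance: simplified timestamp

class ProvenanceMemory:
    def __init__(self):
        self.claims = []

    def store(self, claim):
        """Store a claim; return earlier claims it contradicts
        (same subject and predicate, different value)."""
        conflicts = [c for c in self.claims
                     if c.subject == claim.subject
                     and c.predicate == claim.predicate
                     and c.obj != claim.obj]
        self.claims.append(claim)
        return conflicts

mem = ProvenanceMemory()
mem.store(Claim("US", "capital", "Washington", speaker="Jos", day=1))
conflicts = mem.store(Claim("US", "capital", "New York", speaker="Jos", day=2))
for c in conflicts:
    print(f"On day {c.day}, {c.speaker} said the {c.predicate} of {c.subject} was {c.obj}.")
# → On day 1, Jos said the capital of US was Washington.
```

Because every claim keeps its speaker and timestamp, the agent can confront a user with their own earlier statement instead of an anonymous, internet-averaged answer, which is the accountability contrast with an LLM drawn above.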
SPEAKER_02Yeah, so I think this exemplifies what we often say about AI: that AI is an amplifier. If you have people sitting together in a room who want to solve a complex problem together, where lots of different perspectives and ideas are relevant to making a good decision, and they use it with that common intent to find a solution together, it can amplify our powers of getting a grip on the situation, amplify what we understand of the situation before we make a decision. If, on the other hand, the same technology is used in a situation where people are almost at a fight, then it can also amplify those feelings of distrust, because then what you said yesterday is different from what you say today, and I can prove it with this tool and say: see, you're not trustworthy. That would amplify the tension between people. And that brings me back to the point: you cannot just use it without understanding what it might do. You cannot use it without people understanding how to use it in a way that actually brings value to people.
SPEAKER_01Got it.
SPEAKER_02I like this expression, this Mayan philosophy. I still don't know how to say it.
SPEAKER_01Yeah, I think it was In Lak'ech Ala K'in, I don't know. It's a Mayan expression, and it means: I am another you, you are another me. It's a way to greet each other, and it expresses the idea that we're all deeply interconnected: what we do to others, we also do to ourselves. So it's a kind of karma rule.
SPEAKER_02Now let's take that into this setting, where we have three people in this podcast talking about something complex, namely the agent that Selene has made. If you put that agent into this conversation, then, hey, you could say the three of us are each another you, right? But that doesn't hold for the agent. And why not? Because it's something really different. So it brings something, and what it brings is like having a microphone in front of me now: it does something to us, and because all three of us have one, it does the same thing to all three of us. Right? So if we understand the connection that this Mayan saying expresses, you should realize that this agent, if you put it in, might bring controversies, it might bring union, it might do anything, but it does something to all three of us. And in that sense, I think it's very useful to have a saying like this that makes you realize: okay, we have here a very interesting, beautiful piece of artificial intelligence, but let's make sure we use it in a wise way between the three of us, or whoever is at the table.
SPEAKER_01But is it then just something you put on the table, and it records data and connects the dots?
SPEAKER_00Mm-hmm. Yeah, it does that. But we also have to remember this is intended to be a long-term agent. So it's not just about the conversation we're having right now, because maybe in this conversation it's just collecting data and connecting the dots, since it doesn't need to intervene; it's just remembering. But maybe for your next podcast, the fact that it recorded this interaction becomes useful, because it can tell you: you've already asked this question to a different person you invited to the podcast. Or it can remind you: you already mentioned this concept to another person, maybe don't say it again, it's repetitive. So over the next interactions, we don't really know what will be needed, and the fact that this agent collected the information and connected it will hopefully become useful.
SPEAKER_01So yes, it is simply connecting things, but given that we don't know what we might need later on, that might be the beauty of it. It's strange that you can phrase it so simply, like it's something that just connects the dots, while at the same time it's also very abstract. Or maybe it reflects too closely how I think my own brain works, and that messes with me.
SPEAKER_02Yeah, but I think this feeling that you have, Jos, is what many of our listeners will have as well.
SPEAKER_01Yeah.
SPEAKER_02And you tend to start thinking almost in a human way about these large language models. So this whole idea of anthropomorphization comes in again and again and again. And if you start doing that with Selene's agent, you will either call it stupid, which it's not, or find it scary because it starts amplifying misunderstandings between each other, or scary because it actually helps bring people together where you were hoping you could keep them apart, right? Yeah, this anthropomorphization is a difficult thing, and you cannot just do that.
SPEAKER_01Anthropomorphization is assigning human behavior to something that is not human, right?
SPEAKER_02Yeah.
SPEAKER_01Yeah. And that is something that we tend to do.
SPEAKER_02It's easier to do with a large language model than it was with the old gadgets we had. Although I've seen my fellow students trying to strangle the old displays they had, because they had a kind of neck you could grab your hands around and squeeze: it doesn't want to do what I want it to do! Old computer stuff, right? And you might still feel that way.
SPEAKER_03Yeah.
SPEAKER_02Well, in the old days these computers didn't have anything like a will. They were just computing, right? So how come you would start saying it doesn't want to do something for me? There's no will in that. That's a form of anthropomorphization.
SPEAKER_01And is saying thank you to ChatGPT after receiving an answer also the same thing?
SPEAKER_02Yes, it's attributing a property to it where you assume that it wants to be thanked for what it did, because you feel it's a polite way of closing the session together, right?
SPEAKER_01Yeah.
SPEAKER_02Yeah.
SPEAKER_01Then I need to stop doing that.
SPEAKER_02Yeah, some people are quite adamant about it. You might talk with Frank about it; we had a nice discussion, Frank van Harmelen. He's quite adamant about not saying thank you, or using any kind of politeness, when using an LLM.
SPEAKER_01Oh, and why is that?
SPEAKER_02Oh, you ask him.
SPEAKER_01Oh, I will, I will. I see myself doing that very often when I brainstorm with GPT about something, or when I look things up: I get the output and then I say, okay, thank you.
SPEAKER_02But you don't say thank you to your pocket calculator, right?
SPEAKER_01No. No, why not? Because it hasn't got that really good voice.
SPEAKER_02Yeah. So the big thing we're talking about is how comparable these things are to the behavior you're used to from other humans. Because you can use natural language so nicely with these LLMs, it feels as if they're human. It feels as if there's another human on the line, even though you know it's not. And that means all your habits start kicking in immediately. You'll start doing it. It doesn't mean that there's a human on the other side.
SPEAKER_01I think we're getting a little bit off topic. Selene, this isn't just about technology, right? It's about relationship. Well, we're basically talking about relationships already. So when AI grows through interaction, when it remembers your doubts, your shifts, your stories, it becomes more than a tool. So far we have framed most AI as a tool, but this clearly goes beyond that. So how should we frame it, as an agent?
SPEAKER_02Or uh there we are.
SPEAKER_00And the same question.
SPEAKER_01There we are, yes.
SPEAKER_00I frame it as an agent because I see it as having the drive to improve its knowledge. For me, that is different from a tool: a tool doesn't really have an internal motivation. Then we also have to make the distinction: is it an autonomous agent, completely independent, that can make any decision it wants in the world? No. This agent cannot suddenly decide to take over the world or get in contact with aliens or anything like that. So it is an artificial agent, I would say. We need to keep it very separate, connecting to what we were saying before, from human agents; we have our own actual autonomy and complexity. But it is an agent that makes some decisions that go beyond tool level. And compared to the conversational agents we used to have, it is more proactive. Before, we had more reactive agents: you tell it something and it just creates a response to what you said. Usually, as I said before, we ask a question, we make a request, we get that reaction, and it ends. This agent is more proactive because it initiates conversations with a question. It has its own ways of guiding the conversation. So if it really needs to clarify a conflict and you're about to say something else, it could interrupt you and say: wait, before you continue, can you clarify this? So it is more independent than service-oriented agents, but it's nowhere close to human agents.
SPEAKER_01Two last questions, sorry. I can't get rid of this thought of seeing this more as an external memory. And if I go a little bit more to the futuristic side of this: could this one day be implemented as a chip in my head?
SPEAKER_00I mean, it's a database. You could have a database, but it's not different from any other database you had before. Any SQL database, or even an Excel sheet, could be put on a chip in your head if we manage to do that. So I can say yes, but it sounds bigger than what I think it actually is.
SPEAKER_01Okay. So yes, in the sense that everything could be. Yeah.
SPEAKER_00A chip just saves files.
SPEAKER_01No, but just to, mm. Ah, yes. So hardware-wise, technically, it can't yet. Or, yeah.
SPEAKER_00The only thing we haven't figured out is how to connect the hardware to our brains. But if we figure that out, then you can put movies in there, you can put your finance sheet in there, and you can also put this graph database in there. It's all the same.
SPEAKER_01Wow. That's uh maybe fuel for another episode. Uh what are you currently working on?
SPEAKER_00I'm a postdoc at the University of Zurich, in the Dynamic and Distributed Information Systems lab. I'm still working with natural language and knowledge graphs, but now I'm focusing more on applications, like medical or digital democracy domains. And it connects really nicely to this research, because in medical domains what you need is to be very certain, to look for consistency and correctness of knowledge, so you need to constantly improve that property of the knowledge. Whereas in digital democracy you care more about diversity. So, in a way, it's the same kind of looking at knowledge, but not only collecting it: really analyzing how good it is and how you can improve it.
SPEAKER_01All the way in Zurich then. Catholijn, closing words on this fascinating agent of Selene's?
SPEAKER_02I think it's so interesting and useful. A different type of intelligence, again. And that's what I'm looking for in hybrid intelligence.
SPEAKER_01Thank you both.
unknownThank you.
SPEAKER_02Thank you. You're welcome.
SPEAKER_01Today's conversation revealed how AI might evolve beyond instruction toward purpose. It challenged our assumptions about purpose, agency, and partnership. So we leave you with the thought: what if AI isn't just meant to serve us, but to grow with us? If purpose arises not from code but from connection, then perhaps the real power of AI lies not in what it does, but in how it relates. And if we approach that relationship with care, curiosity, and responsibility, hybrid intelligence might not only expand what machines can do, but what we as humans can become. Until next time, welcome to Hybrid Society.