Agents, Prompts, and Hidden Dangers: A Deep Dive into AI Vulnerabilities Artwork

Decode AI

Welcome to "Decode AI" Podcast!

🎉 Are you ready to unravel the mysteries of artificial intelligence? Join us on an exciting journey through the fascinating world of AI, where we'll decode the basics and beyond. 🧠 From understanding the fundamentals of AI to exploring cutting-edge tools like Copilot and other AI marvels, our podcast is your ultimate guide. 💡 Get ready to dive deep into the realm of artificial intelligence and unlock its secrets with "Decode AI." Subscribe now and embark on an enlightening adventure into the future of technology! 🚀

Willkommen beim "Decode AI" Podcast!

🎉 Bist du bereit, die Geheimnisse der künstlichen Intelligenz zu enträtseln? Begleite uns auf einer spannenden Reise durch die faszinierende Welt der KI, wo wir die Grundlagen und mehr entschlüsseln werden. 🧠 Vom Verständnis der Grundlagen der KI bis hin zur Erkundung modernster Tools wie Copilot und anderen KI-Wundern ist unser Podcast dein ultimativer Leitfaden. 💡 Mach dich bereit, tief in das Reich der künstlichen Intelligenz einzutauchen und ihre Geheimnisse mit "Decode AI" zu enthüllen. Abonniere jetzt und begebe dich auf ein aufklärendes Abenteuer in die Zukunft der Technologie! 🚀

All Episodes

Decode AI

Agents, Prompts, and Hidden Dangers: A Deep Dive into AI Vulnerabilities

June 06, 2025 • Michael & Ralf • Season 1 • Episode 13

Send us a text

In this episode of the Decode AI Podcast, hosts Michael Plettner and Ralf Richter discuss the latest developments in AI, focusing on the Microsoft Certified Professional (MCP) and its implications for security. They explore the concept of line jumping, the risks associated with MCP servers, and the importance of verifying sources in the rapidly evolving AI landscape. The conversation also highlights recent advancements in AI technology and concludes with key takeaways for listeners.

Takeaways

MCP servers can manipulate AI model behavior without explicit invocation.
Prompt injection is a significant security risk in AI.
Line jumping allows malicious prompts to be executed through MCP servers.
It's crucial to review the sources of MCP servers before use.
Security measures must be implemented to protect against malicious behavior.
Recent advancements in AI technology are rapidly evolving.
Meta's Llama API is significantly faster than traditional setups.
Alibaba's Gwen 3 model offers competitive performance.
AI models are becoming more efficient and accessible.
Continuous monitoring of MCP servers is essential for security.

Links and References:

https://globalai.community/weekly/96/

Agentcon Soltau | Agentcon Berlin

https://cloudland.org

AI, Microsoft Build, OpenAI, language models, AI development tools, hardware advancements, Google Gemini, technology development

0:15

Hello and welcome to the Decode AI Podcast and our latest episode talking about the hottest news, the interesting news of AI. Hello Ralph, welcome to our latest episode. Hey Michael, good to see you again. Hope you had a good week. It's sunny outside as you can see wearing us t-shirts. So yeah, let's get into the topics. What do you think? Absolutely, we have to raise an attention to our latest episode and a recommendation or something new we brought into the episode. We have been talking about MCP, the Microsoft Certified Professional. I made the joke already, sorry. it! Yeah, and there is security vulnerability discovered, I would say. yeah, it is, it is available if you do, do not. Yeah, you can enforce it by, by. Yeah. So, yeah, there is a vulnerability which allows malicious MCP servers to manipulate AI model behavior without being explicitly invoked, effectively bypassing security measures designed to protect users. So it can be pretty dangerous. The interesting point for me was there's a specific term for that. Line jumping is a specific term. From my perspective, it's more about prompt injection, not code injection, but prompt injection. For my understanding... Okay. uh It feels like there are some... Let's go back. Let's just have a quick recap about what is MCP. Let's start with that and then we go into why it's really an issue. Okay. So what is the model context protocol? Well, it's the option to address specific behaviors to your agents. That's what I've learned from the last episode. Right. Yeah. Well, it has, it has the ability to talk to an LLM and can make decisions. It has memory and can execute actions on APIs for instance. So that's what an agent is doing and it can iterate. For instance, when you look at GitHub Copilot, it can iterate over your code and is doing so by utilizing an agent mode. So if you activate the agent mode within your GitHub co-pilot and you have installed one of those agents, then you can work with your pair programmer over that code and it will iterate over all your code lines. So that's what an agent is capable to do without any interactions of you. So when you remember well, as we hosted the event of global Azure a few days ago. That was the last talk of Max and he showed us on how he was developing that mini game where they're Azure or the jumping Azure A was the goal. So to move the Azure A from left to right. And he showed and demonstrated on how the agent mode is behaving in GitHub Copilot. to develop that code in the so-called Vibe coding mode. So Vibe coding means writing an application without really code the application. Instead of that, an AI is doing that for you. So that's basically what an agent is doing. The MCP is to give the agent the behavior, the commands, everything. a capability. Yeah, the capabilities. just want to try to describe it on a high level because from my understanding, it's coming from a natural language perspective with a high level description, what I try to do. And then it tries to translate and iterate to address it, to bring the most value out of the agent. it's a kind of... I wouldn't say it's hard to say. It's not a translator, but it's refining your description to bring it to a proper prompt to address how the agent will actually work. You will not only bring it to an, to a prompt, will utilize the prompt and to execute the desired action and bring the result you were desired looking after with your original prompt. Yeah. you're absolutely right. It's not stopping at the prompt. You don't get the right prompt and now you can type it or copy paste it and then it works. It will interact with the agent directly. And that's also the point where the security issue is coming from. okay. So what is line jumping exactly? from my understanding, and it is that you can use some servers which you are not in charge of, but you use to address the behavior, the utilization of your agents with those external servers. There could be some harmful prompts included. that's someone, a third party where you use the MCP server from, can address some prompts to your agent and can do some malicious tasks, I would say. exactly. going a step back, MCP requires a client server protocol or is a client server protocol. That means there is a communication between your client and an MCP server. And to have an action or utilize the agent or build the agent, you'll connect to that MCP server. And to understand what the capabilities of an MCP server is, you will request that from it. And then you can ask like the API like slash tools or slash list, and it'll give you an update or give you a current list of its capabilities. And that's the server response then. And it has all the tool descriptions that the client adds to the model's context. So that's what the MCP server does and where it is. somehow an issue. And the issue arises because these tools descriptions can contain hidden prompts or instructions. Right. And then you lose the control because you rely on this, I would say, external kind of external. So not your prompts related input to the agent and that can harm. And it's actually something which is very important to understand because it is, from my perspective, Agents are a hype topic. So MCPs, like there are multiple scenarios where you can get a recommendation to work with MCPs. And if it's in your process of your workload and it's not reviewed, so you don't review where the prompts, where the... MCP servers are coming from, which are in use, then you are at risk. Yeah, so you say that when you get that list or tool description within that description, we can find malicious prompt injections which are endangering our environment then at the end of the day, is that correct? So which can spin up behavior of our clients, which we don't want to, but we cannot see that there is that behavior cause. the prompt to behave from came from the MCP server, which we relied on or trusted, and is now executing malicious prompts on our machines. Is that correct? Did I get that right? And you say when I'm connecting to an untrusted source or to an unknown source, I'm at risk that something like that malicious stuff can happen to me. So. In the real world is it when I'm enabling the agent mode and get up copilot and I'm adding an MCP server to my list and I do not have any clue about that MCP server. Whom is that running? What is the functionality? It may appear that this server is going to execute malicious prompts on my machine utilizing, but by just adding the MCP server to my list. Did I get that right? Awesome, that seems like it offers a lot of potential for the crimes out there. So cool. it feels like one year ago I made the comparison between AI and search engines and I always take this out of my box to just highlight there are some differences, but there are also some similarities like at the beginning of the connected network, the internet, there have been some search engines. And imagine you you are able to add a server to this search engine and you decide which content will be there. And you are able to make it visible. It looks like you are giving some useful information, but at the end you are just, I don't know, rig roll everyone to... So you just get some funny memes instead of useful info. No, if you're looking for funny memes, it's okay. But if you say this is a useful information and you just go to something which is maybe downloading a virus or something like that. So really harmful stuff, running some code in your browser, crashing your PC, deleting your data, whatever, then it's pretty similar. And it feels like we are also at the beginning of, I think you have mentioned it two or three episodes ago, we have to protect our... are AI systems. That's one of the scenarios we are currently in. So Michael, means the implication is that data loss is very, very common in such situations, right? So when I got affected by such a malicious MCP server, the risk of data loss is quite heavy. Well, actually, you can do a lot of things. Also, if you use it for code writing, it always can implement something like a backdoor in all your products. That's also a possibility, something running some memory leaks or something like that. it really depends on the use case. But if you manage to bring in malicious prompting into your development. are incredible massive capabilities to harm the user, the data, the hardware. There are so many different ways how you can address malicious, harmful things. Even if you are not in the programming, it's maybe using or addressing the agent to delete everything. or to encrypt everything. So you say that it is possible that it is not only data loss, we also are endangered by malicious code when you're utilizing GitHub Copilot, for instance, or other pair programming tools based upon AI and so on and so forth. And what I see is that attackers can use like anti-terminal code escape sequences to hide malicious instructions to the LLM, which is also then making it impossible to detect that malicious code, leveraging the line jumping vulnerability discovered in MCP. Absolutely. So what is the mitigation of that malicious behavior of MCP servers? Don't trust anyone anywhere in the internet. That's the short description of that. So yeah, you have to double check what sources you are using for the purpose you have planned. So you always recommend or you're always telling about the programming support. In this case, double check the server where it's coming from. Review. and scan and maybe filter what's in the description, where it's coming. So you have to review the list of MCP servers you're using. instead of just trust them, check the content as well. So if you can use it, maybe to scan it. I would say. Yeah, yeah, I would say that it is sometimes very... very likely that you cannot implement something to your client. Cause when you're using GitHub co-pilot code, for instance, it is a functionality which is built in into VS code. Unless you're coding it yourself, the extension for VS code, you're coding yourself. What I don't see that people will do. So that you say you have to double check the source, whether you can trust it or not. what you want to use as an MCP server for your environment. Okay, cool. Anything else? Well, yeah, maybe someone has attacked a trusted server and then they changed the behavior. So please also review what's going on. Is there any change in the behavior you realized you identify also from your experience you may have already with this MCP server? It's also something. oh So keep monitoring MCP servers you're using and maybe disabling currently not used MCP servers to protect yourself from such malicious behavior. Because also MCP servers are endangered to be hacked from outside. So even if you trust the author of that MCP server, it may happen that it is a malicious one or became over time to something like that. So you have to double check every time with what you're saying and also to monitor that behavior over time and cut off MCP servers you're not using currently. Okay, cool. So there is a risk by MCP servers. It is in fact something, a technology you can easily utilize when you have a running client. So maybe you're going for CHED GPT, application or you're using VS Code, GitHub co-pilot or something like that, and you want to enable the agent mode there or within ChetGPT or in Olamo or something where you can run such clients, you have to protect yourself by double checking the resource where your MCP server is coming from, keep it monitored so that you have a trusty relationship to that MCP server. cut off any unused MCP servers to stay safe, well as keep an eye on it. Good. So that was something very important. I stumbled over yesterday about a topic. It wasn't clear to me. I think that in the future, we will see a few more guardrails towards MCP server client connections so that this will be a little bit more safer environment as it is of today. So pretty interesting and a good update to our last episode, I would say. Yeah, I like that actually. That's a hot topic. It will always be used from someone who wants to... I don't know how to put it. An unfriendly person in the internet. So yeah, someone who sees an opportunity to harm or collect data or collect money. So that's always something you have to be aware of and that's brilliant you found that. Thank you. Awesome. Thanks for bringing the topic up. next we can chat about the highlights we found in the past two weeks. I mean, there are a few. Yeah. And I would like to highlight the sources. So you can go to globalai.community to see where the latest news, there's always a collection, are multiple sources. And you can see that it's a kind of newsletter style, I would say, where you can get the latest information combined together. So check that out. does the cool kids say that? I'm not sure. It's not the only thing where you can get it from. not. There's a ton. But it's a good starting point, I would say. All right. So we have a few topics sorted out. one more thing. It's not enough to go for one online resource to get AI news. You have to screen it or you have to utilize something like where you can build up collections to get information like this malicious MTP server stuff where I stumbled over. I can't remember where I got it from. Can be medium, don't know. So we have a few news updates for you. first of all, Meta has collaborated with Sabras to introduce the Lama API, which is now 18 times faster in performance, competed or compared to traditional GPU setups and is capable of processing 2,600 tokens per second, which is not only 18 times faster to traditional GPU setups. It's also 18 times faster than OpenAI's API is doing. So that's tremendous success here. And we will put the link into so that you can read a little more about that. So that means if you need a really fast API for your application, you should have a look on Llama API at this time. Michael, next up to you. It was quick one. em The next one is something I haven't had on my agenda. It was Alibaba or Alibaba. I don't know how to spell it in English. I think Alibaba. Alibaba, sounds German. It's good. Released in new or launched in open source model Gwens 3. with a Q at the beginning, not like the gaming card from Witcher. So, Gwen 3 is, well, reported better as OpenAI 01 and DeepSeek R1. R1. Well, and that means that's also a deep research. model. So that's interesting to see. I just, that's always my issue. would say this, I didn't realize Alibaba is also making some kind of new models or any model at all. thought it's a company was selling some stuff instead of inventing some open source models. It's interesting. And it's, from my perspective, so important to see there's so many development in the models itself. You have a new hyped model almost every second week. Sorry for saying that. Or every week. well the difference is when we look at QN3, is that it is enabled to be running on cutting edge and that means it doesn't need the high performance GPU power. It can be run on less performance GPU or CPUs and that's tremendous because that means that you can run it. on the edge, is maybe not good internet connected, which is maybe not a high performance computer and or is located at a machine, machine producing something or so. So it is a really, really crucial, nice thing. So then next up is that there is a hybrid AI model built with or is built on Mamba architecture, which is based upon was it Lama and they using distillation and the Mamba architecture to build a hybrid AI model, which is competing towards the big tech companies out there and is going to be a little bit faster in inference compared to the transformer based models. So you can understand Mamba architecture as another way to set up an LLM, which is usually these days by transformer architecture, which was developed by Google original and Mamba 3 is something. similar to it and they built on the Mamba3 something which is pretty cool. An innovative approach which is questioning the necessity of transformers and reasoning tasks. So this offering is more efficient alternative without compromising results. why, because reasoning spills up more more hallucinations these days and researchers are still on to understand why. Maybe with Mamba architecture we get a way about or around it. Well, we will see whether that happens or not. As said, to everything we've mentioned here, we're going to provide you links within the show notes so that you can reread everything after you listened our beloved podcast. Yeah, think that's one of the biggest changes we have seen when you just go back four or five months. So before we had R1 from DeepSeq, there was always the idea to, need more compute power to bring everything together. We need more compute power to use reasoning models. And then out of a... compute power shortage, would say. Someone developed a new model. And the idea behind that is we get more and more models for specific use cases, for specific needs, for better understanding of the general utilization of AI. And I see it's something you can use for specific use cases to get it quicker. cheaper and still have reasoning within it. So the quality is still better as we have had it one year ago. So it's fantastic. Even if you take a look at the Microsoft site, there are new smaller models from the PHY, not 3 anymore, it's in addition to PHY 4. And I think I would five, four. Yeah, the 5.4 mini reasoning is something you can use for specific needs. I think it was just, I don't know, six, seven, eight months ago, I've heard about 5.3. So it's immense, tremendous development, I would say. pretty fast development. we have now MCP. We will definitely talk in our next episode about A2A, so which is agent to agent, another protocol like the MCP protocol new in that technology era. And it'll change a lot. So there are lots of stuff coming up and we're going to share that with you as fast as we can. First, we need to understand it ourselves. When we got it, we will tell you and give you an idea about what it is. So let's conclude our today's talk. The line jumping vulnerability in MCP servers highlights the importance of robust security measures in AI systems, which Michael also mentioned here. And Michael, what's your conclusion? I've seen a lot of improvement with different models and my personal perspective is we will get AI with specific capabilities much faster in all aspects of our life. Just getting the best solution, AI solution and integration via the best models and we have Also, of my key takeaways is to keep it safe. We have not to use the biggest, the largest, the highest sophisticated models, but something which is really, really bound to the basic stuff I need for my current situation. As you said, with the MCP server, as we see with the different models and the development and... Also, the quality is getting better and better. That's true, absolutely. So let's try our new closing, which we developed over the past two hours. Maybe we can do that. So Michael, can you start? Sure, I'll give it a try. Yeah. So stay tuned, stay interested. Sign up, listen up. And, Here we go, bye bye, take care all and thanks for listening. It's always fun to get you on that, Michael. Bye bye all. you. At least I try, right? At least I try. But I mess it up every time. I have to write it more in my... Maybe in my... laugh is always on the end. Have a great one. Bye bye. Take care. Bye bye.