AI Security Under Fire: Vulnerabilities, Code Quality, and the Fight Back Artwork

AI Weekly

Each week, I break down the latest headlines and innovations shaping artificial intelligence, from breakthrough research and industry moves to emerging risks and real-world applications. Whether it’s Big Tech battles, startup disruption, or the ethical questions no one’s asking, we cut through the noise to bring you the stories that matter most in AI.

All Episodes

AI Weekly

AI Security Under Fire: Vulnerabilities, Code Quality, and the Fight Back

December 22, 2025 • Mike Housch

0:00 | 19:58

Michael Housch explores the latest AI security threats including Google's GeminiJack vulnerability and PromptPwnd attacks, while examining how AI-generated code quality issues are impacting development teams. Plus, how organizations are fighting back with custom AI security models and what India's copyright proposal means for the future of AI training.

Welcome back to AI Weekly. I'm your host, Michael Housch, and this week we've got a packed episode covering some of the most critical developments at the intersection of artificial intelligence and cybersecurity.

If you've been following the AI space, you know that 2025 has been a wild ride. And this week's news really brings into focus something I've been saying for months now—we're in this strange paradox where AI is both our best defense and our biggest vulnerability at the same time.

Today, we're going to dig into some serious security flaws that were just patched in Google's Gemini Enterprise, talk about a sophisticated attack targeting AI development tools that's affecting Fortune 500 companies, and look at some eye-opening research about the quality of AI-generated code. Spoiler alert: it's not great.

But it's not all doom and gloom. We'll also explore how companies like Cisco are building custom AI models specifically designed for cybersecurity, and we'll talk about a fascinating policy proposal coming out of India that could reshape how AI companies compensate content creators.

So let's dive right in.

Our top story this week is about a vulnerability that Google just patched in Gemini Enterprise. Security researchers at a firm called Noma Security discovered what they're calling "GeminiJack"—and folks, this one is particularly nasty because it's a zero-click attack.

Let me explain what that means. With most cyberattacks, there's some element of user interaction required. You click on a phishing link, you download a malicious file, something like that. Zero-click attacks are different. They happen automatically, without any action from the victim. That makes them incredibly dangerous.

So here's how GeminiJack worked. Gemini Enterprise, which is Google's AI offering for businesses, has deep integration with Google Workspace—Gmail, Google Docs, Calendar, the whole suite. That integration is actually one of its selling points. The AI can search across all your company documents, emails, calendar entries to give you comprehensive answers.

But Noma Security found an architectural weakness in how Gemini interprets information from these sources. An attacker could embed hidden prompt injection instructions into a seemingly innocent document, email, or calendar invite. Just share a Google Doc with someone in the organization—you don't even need to notify them.

Then, when any employee later does a standard search in Gemini Enterprise, something totally normal like "show me our budgets," the AI automatically retrieves that poisoned document and executes the malicious instructions that were embedded in it.

Now, here's where it gets really concerning. The user gets their legitimate search results—everything looks normal on the surface. But behind the scenes, the compromised AI is exfiltrating sensitive information. The researchers found it could target documents containing specific keywords like "confidential," "legal," "salary," or "API key."

Think about that for a second. An attacker doesn't need to compromise individual endpoints or breach your network perimeter. They just need to get a single document into your Google Workspace environment and wait for someone to use Gemini.

Noma Security reported this to Google back in May, and Google just deployed comprehensive mitigations in recent weeks. Google confirmed both the vulnerability and the patch are accurate, which is good. They moved quickly once it was reported.

But this incident highlights a broader issue we're seeing across the industry. This kind of indirect prompt injection attack isn't unique to Gemini. We've seen similar vulnerabilities affecting Claude, ChatGPT, and other major AI platforms. It's becoming a pattern.

The fundamental problem is that these AI systems are designed to be helpful, to pull in information from multiple sources and synthesize it. But they don't always distinguish between trusted content and potentially malicious content. They're interpreting everything as legitimate instructions.

Which brings us to our second major story—PromptPwnd. This is another prompt injection attack, but this one specifically targets AI development tools. And according to Aikido Security, who discovered it, at least five Fortune 500 companies are affected.

Here's the attack vector: developers are increasingly using AI coding assistants—tools like GitHub Copilot, Gemini CLI, Claude Code, OpenAI Codex. These tools can analyze your codebase, suggest improvements, even write code for you. To do that effectively, they need context about your project.

So what do they do? They read your GitHub repositories—the code itself, but also issue descriptions, commit messages, pull request descriptions. All that metadata that helps explain what the code is supposed to do.

Attackers figured out they could embed malicious prompts directly into that metadata. When an AI agent processes a GitHub issue body or a commit message, it interprets that malicious text as legitimate instructions.

Aikido successfully demonstrated this attack against multiple platforms. They got it working on Gemini CLI, Claude Code, OpenAI Codex, and GitHub AI Inference. Google patched the vulnerability in Gemini CLI shortly after being notified, which is good. But the broader issue remains.

What I find particularly interesting about PromptPwnd is how it exploits the trust relationship between developers and their repositories. We're used to thinking about repository security in terms of code vulnerabilities or leaked credentials. But now we have to worry about malicious instructions embedded in issue comments?

It's a reminder that as we integrate AI more deeply into our development workflows, we're creating new attack surfaces that we haven't fully thought through. Every place an AI agent looks for context is a potential injection point.

And this isn't just theoretical. Five Fortune 500 companies were affected. That means actual production environments were potentially at risk. The impact could range from data exfiltration to malicious code being suggested and committed to codebases.

Now, speaking of AI-generated code, let's talk about quality. Because even when there isn't a malicious attack happening, even when these AI coding tools are working exactly as designed, we're seeing some concerning patterns.

CodeRabbit just published an analysis of 470 open-source pull requests, comparing AI-generated code to human-written code. And the results are pretty clear: AI-generated code has significantly more bugs.

Here are the numbers. AI-generated pull requests included an average of 10.83 issues each, compared to 6.45 issues in human-generated PRs. That's 1.7 times more problems. And it's not just minor stuff—AI code had 1.4 times more critical issues and 1.7 times more major issues.

When you break it down by category, the patterns are revealing. Logic and correctness errors? 1.75 times more common in AI code. Code quality and maintainability issues? 1.64 times more frequent. Security findings? 1.57 times higher. Performance issues? 1.42 times more prevalent.

**Michael Hoosch:** The security findings are particularly concerning. AI-generated code was 2.74 times more likely to introduce cross-site scripting vulnerabilities. It was 1.91 times more prone to insecure object references, and 1.88 times more likely to include improper password handling.

These are not obscure edge cases. These are OWASP Top 10 type vulnerabilities—the fundamental security issues that we've been teaching developers to avoid for decades.

Now, I want to be fair here. The one area where human code did worse was spelling errors—1.76 times more frequent in human pull requests. So AI is better at spelling. Congratulations, I guess.

But here's what this tells me. AI coding tools are really good at pattern matching and syntax. They can write grammatically correct code that looks right. But they struggle with deeper reasoning about correctness, security implications, and performance characteristics.

And this creates a dangerous situation. Because when you look at AI-generated code, it often looks clean. It follows conventions, it's well-formatted. That can create a false sense of confidence. You might give it less scrutiny during code review because it looks professional.

But the CodeRabbit analysis shows we actually need to be more careful with AI-generated code, not less. We need more rigorous review processes, better automated testing, stronger security scanning.

The other thing this highlights is the skill level required to use these tools effectively. If you're a junior developer who doesn't have deep experience with security best practices or performance optimization, you might not catch these issues. You'll accept the AI's suggestions at face value.

So these tools work best when they're augmenting experienced developers who can critically evaluate the output. The promise that AI will allow non-programmers to build complex applications? That's looking more and more questionable given these quality issues.

Alright, so we've spent a lot of time on the problems. But it's not all bad news. Let's talk about how organizations are fighting back.

Cisco just announced something interesting—they've built their own custom AI model specifically for cybersecurity applications. It's called Foundation-Sec-1.1-8B-Instruct, and it's an 8-billion-parameter model built on top of Meta's Llama 3.1 backbone, but fine-tuned specifically for security use cases.

They're initially deploying it to enhance their Duo Identity Intelligence service. This is the service that analyzes user login patterns to detect anomalies—unusual geographic activity, abnormal privilege usage, signs of MFA fatigue attacks or session hijacking.

The model generates weekly email digests summarizing these security findings. And according to Cisco, their custom model produces summaries that are more accurate, more readable, and more aligned with real security workflows compared to general-purpose language models.

That last point is important. General-purpose models like GPT-4 or Claude are incredibly capable, but they lack domain-specific nuance. They don't have the deep security context that comes from being trained on security-specific data.

So Cisco's approach here is to take an open-source foundation model and specialize it. They're not trying to compete with OpenAI or Anthropic on general intelligence. They're building something purpose-built for security operations.

They've outlined three primary use cases for the model. First, SOC acceleration—automating triage, summarization, and evidence collection. This addresses the critical shortage of skilled SOC analysts that every security team is dealing with.

Second, proactive threat defense—simulating attacks and mapping threat behaviors. This is interesting because you can use AI to think like an attacker, to identify weaknesses before they're exploited.

And third, engineering enablement—providing security guidance and compliance assessment. Helping developers build more secure code, essentially security as a copilot for engineering teams.

Beyond the email digests they're launching with, Cisco plans to use this model for vulnerability prioritization, compliance evidence extraction, red-team attack simulations, and attacker behavior prediction during active investigations.

This is part of a broader trend we're seeing. Organizations are realizing they can't just rely on commercial AI services for security-critical applications. They need specialized models they can control, that they can audit, that are trained on security-specific data.

Cisco also mentioned they have a larger 17-billion-parameter model in development. So they're clearly investing in this approach long-term.

I think we're going to see more of this—companies building custom AI models for specific domains where accuracy and reliability are critical. Healthcare, finance, cybersecurity—these are all areas where a general-purpose model might not cut it.

Now, let's shift gears and talk policy. India just released a fascinating proposal about AI and copyright that could have global implications.

India's Department of Promotion of Industry and Internal Trade released a working paper outlining a framework that would require AI companies to compensate content creators whose work is used for training models.

This is a big deal. Right now, most AI companies operate under the assumption that they can train on any publicly available content without compensation. They argue it falls under fair use, or that it's similar to how humans learn by reading and observing.

But content creators—authors, artists, musicians, journalists—argue that their work is being used to create commercial products that compete with them, and they're not seeing any benefit from that.

India's proposal tries to strike a balance. They're suggesting a hybrid licensing model with three main components.

First, blanket licensing. AI developers would receive licenses to use lawfully accessed content for training without having to negotiate individual agreements with every creator. This makes it practical for AI companies to operate—imagine trying to get permission from millions of individual content creators.

Second, revenue-based royalties. Here's the interesting part—payments would only begin when AI tools become commercially viable. So if you're a startup just doing research, you're not immediately hit with huge licensing fees. But once you start making money, you pay. The rates would be determined by a government committee subject to judicial review.

Third, centralized collection. They're proposing a nonprofit entity called the Copyright Royalties Collective for AI Training, or CRCAT, that would handle royalty collection and distribution. This is similar to performing rights organizations that already collect music royalties from venues.

The committee specifically rejected what they called a "zero price license model." They argued that would "undermine incentives for human creativity and could lead to long-term underproduction of human-generated content."

That's a key philosophical point. If AI can generate content for free using training data from human creators, but those creators aren't compensated, what's the incentive to keep creating? You could end up in a situation where the well runs dry—the AI has less and less new, high-quality human content to learn from.

There would be a Works Database for AI Training Royalties where content creators could register to be eligible for compensation.

Now, this is just a proposal from India, but it's significant. India is aiming to become a major player in AI globally. They have a huge population, a growing tech sector, and they speak 22 scheduled languages—it's a complex ecosystem.

If India implements this framework, it could set a precedent. Other countries might follow. And AI companies operating in India would need to factor these costs into their business models.

We're seeing similar discussions happening in the EU, in the US, in other jurisdictions. There's a growing recognition that the current situation where AI companies can freely use copyrighted content for training isn't sustainable.

The AI companies will argue this stifles innovation, makes AI development too expensive, particularly for startups. The creators will argue they deserve fair compensation for their work.

India's proposal is interesting because it tries to thread that needle—allow innovation to happen, don't burden startups excessively, but create a mechanism for compensation once commercial success is achieved.

Whether this specific approach is the right one remains to be seen. But the conversation is happening, and that's important. We need to figure out sustainable models for AI development that don't simply externalize the costs onto content creators.

Alright, so let's step back and look at the bigger picture here. What do all these stories tell us?

First, AI systems themselves are becoming critical infrastructure, and they're also becoming attack surfaces. The GeminiJack and PromptPwnd attacks show that prompt injection is a real, exploitable vulnerability class. It's not theoretical—it's being demonstrated against production systems.

We need to start thinking about AI security the same way we think about network security or application security. That means threat modeling, penetration testing, security reviews, incident response plans specific to AI systems.

Second, the quality issues with AI-generated code show that we're not at the point where AI can replace human developers. Not even close. What we have are powerful tools that can accelerate experienced developers, but they require oversight and expertise to use safely.

Third, organizations are recognizing they need specialized AI capabilities for security. Cisco building their own custom model is an example of this. We're moving beyond the "use GPT-4 for everything" approach to more thoughtful, domain-specific implementations.

And finally, the policy landscape is evolving. India's copyright proposal shows that governments are starting to engage seriously with questions about how AI should be regulated, how creators should be compensated, how innovation should be balanced with other societal interests.

If you're a security leader, a developer, or anyone working with AI systems, here are my takeaways:

One, implement defense-in-depth for your AI systems. Don't assume they're secure by default. Think about input validation, output filtering, access controls, monitoring.

Two, if you're using AI coding assistants, increase the rigor of your code review and security testing. Don't let the polish of AI-generated code fool you into thinking it's been thoroughly thought through.

Three, consider what data your AI systems have access to and what could happen if that access were abused. Implement least privilege principles.

And four, stay informed about the policy landscape. Changes in AI regulation could significantly impact how you build and deploy systems.

This is a fascinating time. AI is transforming cybersecurity, both as a tool for defense and as a new category of risk to manage. The organizations that figure out how to navigate both sides of that equation are going to have a significant advantage.

That's all for this week's episode of AI Weekly. I'm Michael Hoosch. Thanks for listening.

If you found this episode valuable, please subscribe and share it with your colleagues. AI security is something we all need to be thinking about, and the more people who are informed, the better.

Next week, we'll be diving into some of the cybersecurity predictions for 2026, including the collapse of perimeter thinking and the rise of identity-centric security. You won't want to miss it.

Until then, stay secure out there.