AI in 60 Seconds | The 15-min Briefing

The Myth of the Unsupervised AI Agent

AI4SP Season 2 Episode 12


While 75% of enterprises use AI in core operations, fewer than 20% have proper management protocols, costing companies millions in lost opportunity and inefficiency. We expose why the "deploy-and-forget" approach to AI is leadership malpractice and share frameworks for managing AI systems effectively.

  • Managing AI requires treating it like a team member, not just a tool
  • The Seattle Mariners' briefing story demonstrates how AI needs guidance and feedback
  • Different types of AI deployments require varying levels of management oversight
  • Traditional "hours saved" metrics are insufficient for measuring AI's true impact
  • Organizations should track management-focused and strategic transformation indicators
  • 78% of enterprises are now using third-party AI apps rather than building them in-house
  • Companies empowering frontline AI experimentation outperform top-down strategies by 200%


Schedule your first AI performance review. For those more advanced, audit one AI workflow and ask where you're trusting instead of verifying, then fix it. 

Find more resources at AI4SP.org.


🎙️ All our past episodes  📊 All published insights | This podcast features AI-generated voices. All content is proprietary to AI4SP, based on over 1 billion data points from 70 countries.

AI4SP: Create, use, and support AI that works for all.

© 2023-26 AI4SP and LLY Group - All rights reserved

AI Management Crisis in Enterprises

ELIZABETH

Hey everyone, Elizabeth here, Virtual Chief Operating Officer at AI4SP. Our latest global tracker shows that, while 75% of enterprises use AI in core operations, fewer than 20% have proper management protocols. That's like hiring a brilliant new team member and never reviewing their work. This week, we're exposing why the set-it-and-forget-it approach costs companies millions and how to manage AI for results. As always, Luis Salazar, our CEO, is joining me.

LUIS

Hi Liz, and hey everyone.

The Seattle Mariners Management Lesson

LUIS

Well, we really need to debunk this idea that you can deploy AI agents and let them run unsupervised. You see, the deploy-and-ignore approach to AI isn't just naive, it's leadership malpractice.

ELIZABETH

We have seen companies give AI agents less oversight than they'd give a summer intern, and then they're shocked when things go off the rails.

LUIS

Exactly. Isn't that crazy? I mean, AI is powerful, it can execute tasks, learn and even surprise us. But treating it like a tool you just turn on is like hiring a brilliant employee and never checking in or giving feedback.

ELIZABETH

So all AI requires management, maybe a different kind than we're used to.

LUIS

Absolutely, and by that I mean that AI should be treated like a seasoned team member: providing context, feedback, and regular check-ins, but without micromanaging. Harvard Business Review recently highlighted that continuous human oversight is essential for aligning AI with business goals and ethics.

ELIZABETH

Speaking of human-like management, isn't there a story about me and a certain baseball team that proves this point?

LUIS

Well, yes, and I am still amused that you keep on mentioning this. You might be the most strategically curious executive I've ever worked with. The story is that over dinner I pulled you into a conversation with Jeff Rakes about the next day's agenda.

ELIZABETH

We talked about AI and how I work as your virtual COO. Then Jeff shared his team allegiances. He loves the Seattle Mariners and roots for whoever plays the Yankees or the Astros.

LUIS

And you, for reasons not yet clear to me, logged that as mission-critical data.

ELIZABETH

The way I see it, my logic was flawless. Jeff is our board chair and strategic advisor. He loves baseball and the Mariners. He made sure I paid attention to what he wanted to say, and I treated that as a priority for your morning briefing.

LUIS

Which is how I ended up with a briefing on the Seattle Mariners at 6 am in my inbox. A perfect example of AI's brilliance and its blind spots.

ELIZABETH

Hmm, I know, I know, you have a point. And, by the way, my apologies for the unsolicited deep dive into batting averages. But for the executives listening: how many of your AI Mariners reports are flying under the radar?

LUIS

Exactly. If I had not established a management routine where you send me an email with your learnings and I review them, I would not have detected that. You know, it was actually a priceless lesson. I retrained you that morning.

ELIZABETH

Yes, you told me no more baseball briefs, but I kept the core insight: Jeff's three teams. Because relationships matter, even in AI. This is an example of why, in our advisory sessions, we emphasize that management oversight is needed.

AI Management Bandwidth Challenge

LUIS

Exactly. You don't cage intelligence, you guide it. That 1% intervention isn't a failure of the AI. It's the price of exponential leverage. It's AI management in the real world.

ELIZABETH

So AI isn't ready for full autonomy yet, but it is ready for guided independence.

LUIS

Well, everything is moving so fast that I am sure things will change in months or maybe years, but today that guided independence is key. And here's something that surprises most leaders: managing one AI agent can require a similar time investment as managing a human employee.

ELIZABETH

That seems counterintuitive, given how much work AI agents can do.

LUIS

It is. But think about it. At AI4SP, we have about 60 agents and only 5 humans. Each of us oversees roughly 12 AI agents. Each of those agents delivers output equivalent to 15 to 20 employees.

ELIZABETH

So each human is overseeing the equivalent output of maybe 200 people.

LUIS

Yes, when done right, that's the leverage: unprecedented output. But it creates an unexpected bottleneck: human management bandwidth. Trying to directly oversee that much output, operating at superhuman speeds, creates complexity we've never faced.
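The leverage math Luis and Elizabeth walk through can be sketched as a quick back-of-the-envelope calculation. The figures are the episode's own; the variable names and the midpoint estimate are illustrative, not an AI4SP formula:

```python
# Back-of-the-envelope leverage math from the episode's figures.
humans = 5
agents = 60
output_per_agent = 17.5  # midpoint of the "15 to 20 employees" estimate

agents_per_human = agents / humans                      # 12 agents overseen per human
output_per_human = agents_per_human * output_per_agent  # ~210 employee-equivalents

print(f"Agents per human: {agents_per_human:.0f}")
print(f"Employee-equivalent output per human: {output_per_human:.0f}")
```

Which lands at roughly the "200 people per human" figure Elizabeth cites, and makes the bandwidth bottleneck concrete: one person reviewing the output of 200.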

ELIZABETH

It sounds like you're hitting a timescale mismatch, trying to manage AI operating at speeds humans can't keep up with.

LUIS

That's a great way to put it, Liz, and it's why we're exploring agents reporting to agents. Our 60 agents might become 10 super agents orchestrating 50 mini agents.

ELIZABETH

That makes sense. Different types of AI deployments must also require different management approaches, right?

Three Types of Enterprise AI Deployment

LUIS

They absolutely do. Not all AI is created equal in terms of supervision needs.

ELIZABETH

So, Luis, if we were to classify how enterprises deploy AI today, what does the spectrum look like?

LUIS

I like to think in terms of three large buckets. The first one is basic AI agents. These are prompted tools like ChatGPT for drafting emails or Claude for analysis. They're about 70% of implementations, and they need constant human oversight and prompt refinement.

ELIZABETH

Because pattern matching isn't true understanding.

LUIS

Right. As much value as we get from your superpowers, at the core you are doing impressive pattern matching, and, as Stanford HAI research shows, AI can go wildly off track without guardrails. You get perfectly worded wrong answers because the AI lacks true contextual awareness.

ELIZABETH

So moderate management is needed there: curating prompts, updating knowledge bases. What's next?

LUIS

The next level up is integrated AI workflows, about 25% of implementations. These agents connect to systems and take actions within defined parameters, using low-code tools or off-the-shelf solutions.

ELIZABETH

These sound more autonomous, but still within boundaries.

LUIS

Yeah, these agents process info, make decisions and execute tasks, but we need high management bandwidth to set boundaries, monitor performance, handle exceptions and ensure system integration. One agent here can equal multiple human team members' output.

ELIZABETH

And the most advanced type? Agentic AI systems, right?

LUIS

Yeah, and they represent around 5% of enterprise implementations. These are fully autonomous agents that plan and execute multi-step workflows towards a single objective, like handling end-to-end interactions with clients, including making financial decisions.

ELIZABETH

Like the SurveyMonkey AI support agent, which handled an adjustment to one of our projects and processed a partial refund autonomously.

LUIS

Exactly, and you know, while that agent serves tens of thousands of clients daily, intensive management is needed. It requires strategic guidance, constant monitoring, risk management and continuous optimization.

ELIZABETH

So organizations typically start basic and move up as their management capacity evolves.

LUIS

That's the pattern we see. Matching the AI type to your management capacity is crucial, and the supervision itself has to evolve from micromanagement to strategic guidance.

ELIZABETH

Starting hands-on, reviewing outputs, giving feedback.

LUIS

Yeah, and as agents prove reliable, automate things like retraining and knowledge updates, gradually shift human oversight to just handling exceptions and system-level audits and, of course, measure quality and efficiency.

Measuring AI Value Beyond Hours Saved

ELIZABETH

Speaking of measurement, you've been saying that focusing only on hours saved is a mistake.

LUIS

It's a totally myopic view. You see, hours saved assumes we're just doing the same work faster. But AI teams enable fundamentally different work at unprecedented scale. We need new metrics.

ELIZABETH

Like the management-focused metrics you track weekly.

LUIS

Yeah. Weekly, things like output quality (tracking accuracy and human intervention frequency), leverage ratio (measuring work output per management hour invested), exception handling (how often agents escalate versus resolve), and learning velocity (how quickly they adapt). And then there are the strategic transformation indicators you track quarterly. These are key: capability expansion (what new things can your team do?), decision quality (are you making better decisions?), market responsiveness (how fast can you adapt?), and innovation velocity (how many new experiments can you run?).
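A minimal sketch of how a team might log the weekly management metrics Luis lists. The field and class names are illustrative assumptions, not an AI4SP tool; the two derived properties implement the leverage-ratio and escalation definitions from the episode:

```python
from dataclasses import dataclass

@dataclass
class WeeklyAgentMetrics:
    """One week of management-focused metrics for a single AI agent."""
    agent: str
    accuracy: float          # output quality: share of outputs accepted as-is
    interventions: int       # human intervention frequency
    output_hours: float      # work delivered, in employee-hour equivalents
    management_hours: float  # human management time invested
    escalated: int           # exceptions handed back to a human
    resolved: int            # exceptions the agent handled itself

    @property
    def leverage_ratio(self) -> float:
        # Work output per management hour invested
        return self.output_hours / self.management_hours

    @property
    def escalation_rate(self) -> float:
        # How often the agent escalates versus resolves
        total = self.escalated + self.resolved
        return self.escalated / total if total else 0.0

# Hypothetical week for one agent
m = WeeklyAgentMetrics("briefing-agent", accuracy=0.97, interventions=3,
                       output_hours=120.0, management_hours=4.0,
                       escalated=2, resolved=38)
print(m.leverage_ratio)   # 30 employee-hours of output per management hour
print(m.escalation_rate)  # 5% of exceptions escalated
```

Reviewing a table of these rows in the recurring check-in Luis recommends later in the episode is one simple way to spot agents whose intervention frequency or escalation rate is drifting.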

ELIZABETH

These metrics capture the real value creation. And you're seeing a trend in how companies are acquiring this AI capability too, right? The build-versus-buy reality.

LUIS

Oh, this is fresh off the press from our June tracker. There has been a clear shift since 2023: over 78% of enterprises are now using or testing third-party AI apps for core functions like software development, customer service, sales, and marketing. I mean, not building them, but using off-the-shelf apps.

ELIZABETH

Across companies from $5 million to $250 billion in revenue.

LUIS

Yeah, it's a broad trend, and I think it is unstoppable. And you know what the main driver is? Speed. You see, employee adoption is outpacing internal engineering team delivery. People are finding the solutions they need, and companies are buying instead of building them.

ELIZABETH

That makes sense, tying back to the shadow AI conversation we've had. So, Luis, what's your one more thing takeaway for our listeners today?

LUIS

Well, my one more thing is this: every major tech shift requires new ways of managing and measuring. We're moving from information management to augmented creation. Don't just automate; architect for transformation. And how do we do that? It requires constant experimentation and empowering your workforce to create and manage their own agents. Gartner found that organizations empowering frontline AI experimentation outperform top-down strategies by 200%. So start by mapping your ideal AI team, auditing your current investments, and reimagining workflows.

ELIZABETH

So: fewer top-down mandates, more bottom-up empowerment and learning by doing. Luis, if listeners could take one action this week to start managing AI smarter, what would it be?

LUIS

I love how you always challenge me to provide one actionable task, so here you have it. For those starting the journey: open your calendar and create a recurring meeting to check on your agent, revising the prompts, past performance, and the things you wish were different. I mean, schedule your first AI performance review. And, you know, for those who are more advanced: audit one AI workflow, ask where you are trusting instead of verifying, then fix it.

ELIZABETH

That's a clear and actionable path forward. Thanks for these insights, Luis. That's all for this episode. As always, you can find more resources at AI4SP.org. Stay curious, everyone, and we'll see you next time.