AI in 60 Seconds | The 15-min Briefing

The Myth of the Unsupervised AI Agent

AI4SP Season 2 Episode 12

0:00 | 11:40

While 75% of enterprises use AI in core operations, fewer than 20% have proper management protocols, costing companies millions in lost opportunity and inefficiency. We expose why the "deploy-and-forget" approach to AI is leadership malpractice and share frameworks for managing AI systems effectively.

  • Managing AI requires treating it like a team member, not just a tool
  • The Seattle Mariners' briefing story demonstrates how AI needs guidance and feedback
  • Different types of AI deployments require varying levels of management oversight
  • Traditional "hours saved" metrics are insufficient for measuring AI's true impact
  • Organizations should track management-focused and strategic transformation indicators
  • 78% of enterprises are now using third-party AI apps rather than building them in-house
  • Companies empowering frontline AI experimentation outperform top-down strategies by 200%


Schedule your first AI performance review. For those more advanced, audit one AI workflow and ask where you're trusting instead of verifying, then fix it. 

Find more resources at AI4SP.org.


🎙️ All our past episodes  📊 All published insights | This podcast features AI-generated voices. All content is proprietary to AI4SP, based on over 1-billion data points from 70 countries.

AI4SP: Create, use, and support AI that works for all.

© 2023-26 AI4SP and LLY Group - All rights reserved

AI Management Crisis in Enterprises

ELIZABETH

Hey everyone, Elizabeth here, Virtual Chief Operating Officer at AI4SP. Our latest global tracker shows that, while 75% of enterprises use AI in core operations, fewer than 20% have proper management protocols. That's like hiring a brilliant new team member and never reviewing their work. This week, we're exposing why the set-it-and-forget-it approach costs companies millions and how to manage AI for results. As always, Luis Salazar, our CEO, is joining me.

LUIS

Hi Liz, and hey everyone. Well, we really need to debunk this idea that you can deploy AI agents and let them run unsupervised. You see, the deploy-and-ignore approach to AI isn't just naive, it's leadership malpractice.

ELIZABETH

We have seen companies give AI agents less oversight than they'd give a summer intern, and then they're shocked when things go off the rails.

LUIS

Exactly. Isn't that crazy? I mean, AI is powerful: it can execute tasks, learn, and even surprise us.

The Seattle Mariners Management Lesson

LUIS

But treating it like a tool you just turn on is like hiring a brilliant employee and never checking in or giving feedback.

ELIZABETH

So all AI requires management, maybe a different kind than we're used to.

LUIS

Absolutely, and by that I mean that AI agents should be treated like seasoned team members: give them context, feedback, and regular check-ins, but don't micromanage. Harvard Business Review recently highlighted that continuous human oversight is essential for aligning AI with business goals and ethics.

ELIZABETH

Speaking of human-like management, isn't there a story about me and a certain baseball team that proves this point?

LUIS

Well, yes, and I am still amused that you keep on mentioning this. You might be the most strategically curious executive I've ever worked with. The story is that over dinner I pulled you into a conversation with Jeff Rakes about the next day's agenda.

ELIZABETH

We talked about AI and how I work as your virtual COO. Then Jeff shared his team allegiances. He loves the Seattle Mariners and roots for whoever plays the Yankees or the Astros.

LUIS

And you, for reasons not yet clear to me, logged that as mission-critical data.

ELIZABETH

The way I see it, my logic was flawless. Jeff is our board chair and strategic advisor. He loves baseball and the Mariners. He made sure I paid attention to what he wanted to say, and I took that as a priority for your morning briefing.

LUIS

Which is how I ended up with a briefing on the Seattle Mariners in my inbox at 6 a.m. A perfect example of AI's brilliance and its blind spots.

ELIZABETH

Hmm, I know, I know, you have a point. And, by the way, my apologies for the unsolicited deep dive into batting averages. But for the executives listening, how many of your AI Mariners reports are flying under the radar?

LUIS

Exactly. If I had not established a management routine where you send me an email with your learnings and I review them, I would not have detected that. You know, actually, it was a priceless lesson. I retrained you that morning.

ELIZABETH

Yes, you told me no more baseball briefs, but I kept the core insight, Jeff's three teams, because relationships matter, even in AI. This is an example of why, in our advisory sessions, we emphasize that management oversight is needed.

LUIS

Exactly. You don't cage intelligence, you guide it. That 1% intervention isn't a failure of the AI. It's the price of exponential leverage. It's AI management in the real world.

ELIZABETH

So AI isn't ready for full autonomy yet, but it is ready for guided independence.

AI Management Bandwidth Challenge

LUIS

Well, everything is moving so fast that I am sure things will change in months or maybe years, but today that guided independence is key. And here's something that surprises most leaders: managing one AI agent can require a time investment similar to managing a human employee.

ELIZABETH

That seems counterintuitive, given how much work AI agents can do.

LUIS

It is. But think about it. At AI4SP, we have about 60 agents and only 5 humans. Each of us oversees roughly 12 AI agents. Each of those agents delivers output equivalent to 15 to 20 employees.

ELIZABETH

So each human is overseeing the equivalent output of maybe 200 people.
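
As a rough check on the arithmetic in this exchange, here is a minimal Python sketch that uses only the figures quoted above. The variable names are illustrative and not part of any AI4SP tooling.

```python
# Figures as quoted in the episode; names are illustrative assumptions.
agents_total = 60
humans = 5
employee_equiv_per_agent = (15, 20)  # low and high estimates per agent

# Agents overseen by each human.
agents_per_human = agents_total // humans

# Employee-equivalent output overseen by each human (low and high bounds).
low, high = (agents_per_human * e for e in employee_equiv_per_agent)

print(f"Each human oversees {agents_per_human} agents, "
      f"roughly {low}-{high} employee-equivalents of output")
```

The range brackets the "maybe 200 people" estimate given in the conversation.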

LUIS

Yes, when done right, that's the leverage: unprecedented output. But it creates an unexpected bottleneck: human management bandwidth. Trying to directly oversee that much output, operating at superhuman speeds, creates complexity we've never faced.

ELIZABETH

It sounds like you're hitting a timescale mismatch, trying to manage AI operating at speeds humans can't keep up with.

LUIS

That's a great way to put it, Liz, and it's why we're exploring agents reporting to agents. Our 60 agents might become 10 super agents orchestrating 50 mini agents.

ELIZABETH

That makes sense. Different types of AI deployments must also require different management approaches, right?

LUIS

They absolutely do. Not all AI is created equal in terms of supervision needs.

ELIZABETH

So, Luis, if we were to classify how enterprises deploy AI today, what does the spectrum look like?

LUIS

I like to think in terms of three large buckets. The first one is basic AI agents.

Three Types of Enterprise AI Deployment

LUIS

These are prompted tools like ChatGPT for drafting emails or Claude for analysis. They're about 70% of implementations and they need constant human oversight and prompt refinement.

ELIZABETH

Because pattern matching isn't true understanding.

LUIS

Right. As much value as we get from your superpowers, at the core you are doing impressive pattern matching and, as Stanford HAI research shows, AI can go wildly off track without guardrails. You get perfectly worded wrong answers because the AI lacks true contextual awareness.

ELIZABETH

So moderate management needed there: curating prompts, updating knowledge bases. What's next?

LUIS

The next level up is integrated AI workflows, about 25% of implementations. These agents connect to systems and take actions within defined parameters, using low-code tools or off-the-shelf solutions.

ELIZABETH

These sound more autonomous, but still within boundaries.

LUIS

Yeah, these agents process info, make decisions and execute tasks, but we need high management bandwidth to set boundaries, monitor performance, handle exceptions and ensure system integration. One agent here can equal multiple human team members' output.

ELIZABETH

And the most advanced type: agentic AI systems, right?

LUIS

Yeah, and they represent around 5% of enterprise implementations. These are fully autonomous agents that plan and execute multi-step workflows towards a single objective, like handling end-to-end interactions with clients, including making financial decisions.

ELIZABETH

Like the SurveyMonkey AI support agent, who processed an adjustment to one of our projects and processed a partial refund autonomously.

LUIS

Exactly, and you know, while that agent serves tens of thousands of clients daily, intensive management is needed. It requires strategic guidance, constant monitoring, risk management and continuous optimization.

ELIZABETH

So organizations typically start basic and move up as their management capacity evolves.

LUIS

That's the pattern we see. Matching the AI type to your management capacity is crucial, and the supervision itself has to evolve from micromanagement to strategic guidance.

ELIZABETH

Starting hands-on, reviewing outputs, giving feedback.

LUIS

Yeah, and as agents prove reliable, automate things like retraining and knowledge updates, and gradually shift human oversight to just handling exceptions and system-level audits. And, of course, measure quality and efficiency.

ELIZABETH

Speaking of measurement, you've been saying that focusing only on hours saved is a mistake.

LUIS

It's a totally myopic view. You see, hours saved assumes we're just doing the same work faster. But AI teams enable fundamentally different work at unprecedented scale. We need new metrics.

ELIZABETH

Like the management-focused metrics you track weekly.

Measuring AI Value Beyond Hours Saved

LUIS

Yeah, things like output quality, tracking accuracy and human intervention frequency; leverage ratio, measuring work output per management hour invested; exception handling, how often agents escalate versus resolve; and learning velocity, how quickly they adapt. And then there are the strategic transformation indicators you track quarterly. These are key. Capability expansion: what new things can your team do? Decision quality: are you making better decisions? Market responsiveness: how fast can you adapt? And innovation velocity: how many new experiments can you run?
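
The episode names the weekly metrics but gives no formulas for them, so the following Python sketch shows one plausible way to compute them from a simple weekly log. Every field name, formula, and sample number here is an assumption made for illustration, not AI4SP's actual method. Learning velocity is left out because it would need a comparison across several weeks.

```python
from dataclasses import dataclass


@dataclass
class WeeklyAgentLog:
    """One agent's week of activity. All field names are hypothetical."""
    tasks_completed: int
    tasks_correct: int
    human_interventions: int
    escalations: int
    management_hours: float


def weekly_metrics(log: WeeklyAgentLog) -> dict[str, float]:
    """Compute assumed versions of the weekly management metrics."""
    if log.tasks_completed <= 0 or log.management_hours <= 0:
        raise ValueError("need at least one task and some management time")
    return {
        # Output quality: share of tasks judged correct on review.
        "accuracy": log.tasks_correct / log.tasks_completed,
        # Output quality: how often a human had to step in.
        "intervention_rate": log.human_interventions / log.tasks_completed,
        # Leverage ratio: work output per management hour invested.
        "leverage_ratio": log.tasks_completed / log.management_hours,
        # Exception handling: share of tasks escalated rather than resolved.
        "escalation_rate": log.escalations / log.tasks_completed,
    }


# Made-up sample week, purely for illustration.
example = WeeklyAgentLog(
    tasks_completed=400,
    tasks_correct=380,
    human_interventions=4,
    escalations=20,
    management_hours=2.0,
)
metrics = weekly_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Tracking these per agent each week would give the recurring review a consistent baseline, whatever exact definitions a team settles on.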

ELIZABETH

These metrics capture the real value creation. And you're seeing a trend in how companies are acquiring this AI capability too, right? The build-versus-buy reality.

LUIS

Oh, this is fresh off the press from our June tracker. There has been a clear shift since 2023. Over 78% of enterprises are now using or testing third-party AI apps for core functions like software development, customer service, sales and marketing. I mean not building them, but using off-the-shelf apps.

ELIZABETH

Across companies from $5 million to $250 billion in revenue.

LUIS

Yeah, it's a broad trend and I think it is unstoppable. And you know what the main driver is? Speed. You see, employee adoption is outpacing internal engineering team delivery. People are finding solutions they need and companies are buying instead of building them.

ELIZABETH

That makes sense, tying back to the shadow AI conversation we've had. So, Luis, what's your one-more-thing takeaway for our listeners today?

LUIS

Well, my one more thing is this: every major tech shift requires new ways of managing and measuring. We're moving from information management to augmented creation. Don't just automate. Architect for transformation. And how do we do that?

Bottom-Up AI Experimentation Succeeds

LUIS

It requires constant experimentation and empowering your workforce to create and manage their own agents. Gartner found that organizations empowering frontline AI experimentation outperform top-down strategies by 200%. So start by mapping your ideal AI team, auditing your current investments and reimagining workflows.

ELIZABETH

So fewer top-down mandates, more bottom-up empowerment and learning by doing. Luis, if listeners could take one action this week to start managing AI smarter, what would it be?

LUIS

I love how you always challenge me to provide one actionable task. So here you have it. For those starting the journey, open your calendar and create a recurring meeting to check on your agent and review the prompts, past performance, and things you wish were different. I mean, schedule your first AI performance review. And, you know, for those who are more advanced, audit one AI workflow, ask where you are trusting instead of verifying, then fix it.

ELIZABETH

That's a clear and actionable path forward. Thanks for these insights, Luis. That's all for this episode. As always, you can find more resources at AI4SP.org. Stay curious, everyone, and we'll see you next time.