The Digital Transformation Playbook

ROI That Boards Can Believe

Kieran Gilmurray


Budgets are climbing, slides are shiny, yet boards still ask the same hard question: where is the ROI? We dig into the paradox of aggressive AI investment with thin or invisible returns and lay out a concrete path to results that show up on the income statement. 

Our focus is practical and board-ready: what to measure, how to attribute, and how to avoid pilot purgatory by fixing data, integration, and sponsorship first.

At A Glance / TLDR:

  • The AI ROI paradox and why it persists
  • Data quality, ownership, and sponsorship as limiters
  • Minimum viable data stack and integration pathways
  • Three-tier readiness model with timelines and targets
  • Four-pillar ROI framework: efficiency, revenue, risk, agility
  • Board-ready one-page business case and scenarios
  • Metrics baseline, dashboard cadence, and attribution
  • Size-specific guidance for small, mid-market, and enterprise
  • Real-world benchmarks and examples
  • Common pitfalls: vanity metrics, no baseline, hidden costs

We unpack a minimum viable data stack—auditable governance, clear lineage, and API access to systems of record—so agents can read, act, and write back. Then we map a three-tier readiness approach to plan timelines, budgets, and expected payback without hype. 

High-readiness teams often move from pilot to production in about 16 weeks; foundation-builders invest in plumbing but still reach solid first-year ROI once adoption stabilises. 

Throughout, we translate activity into outcomes using a four-pillar ROI framework: efficiency gains across end-to-end workflows, revenue generation through higher conversion and reduced churn, risk mitigation with quantified avoided costs, and business agility measured by decision speed and time to market.

To help you win support, we share a one-page business case format your CFO can audit, with scenario modelling, conservative attribution, and a metrics dashboard that tracks response times, CSAT, unit costs, and churn over time. 

We also highlight real benchmarks and examples—from large-scale service operations to sales enablement—showing how integrated data and human-in-the-loop design compress cycle times and unlock capacity. If you’re ready to move from proofs of concept to production value, this playbook shows how to measure what matters, fund what works, and expand across adjacencies with credibility. 

Subscribe, share with a teammate, and leave a review telling us which pillar you’re tackling first.

Like some free book chapters? Then go here: How to build an agent - Kieran Gilmurray

Want to buy the complete book? Then go to Amazon or Audible today.

Support the show


𝗖𝗼𝗻𝘁𝗮𝗰𝘁 my team and me to get business results, not excuses.

☎️ https://calendly.com/kierangilmurray/results-not-excuses
✉️ kieran@gilmurray.co.uk
🌍 www.KieranGilmurray.com
📘 Kieran Gilmurray | LinkedIn
🦉 X / Twitter: https://twitter.com/KieranGilmurray
📽 YouTube: https://www.youtube.com/@KieranGilmurray

📕 Want to learn more about agentic AI? Then read my new book on Agentic AI and the Future of Work: https://tinyurl.com/MyBooksOnAmazonUK


The AI ROI Paradox

SPEAKER_00

Chapter 10: Measuring Success. ROI that matters to the board. Boards are increasing AI budgets while simultaneously complaining about thin or invisible returns. That is the paradox you must resolve. Most large organizations claim that AI is now in production somewhere. McKinsey research reports that 78% of them use AI in at least one function, yet many have not translated activity into bottom-line impact. Google Cloud's latest enterprise study reports that 74% of executives achieve ROI within the first year, and 52% say their organizations have deployed AI agents. At the same time, sober analyses show the gap between AI spending and value. IBM's 2025 CEO study reports that over the past three years, only 25% of AI initiatives delivered the expected ROI, and only 16% of organizations have scaled enterprise-wide. IDC research likewise highlights scale-up as a significant business problem: 88% of AI pilots fail to reach production, and most organizations report skill, governance, and data hurdles that stall the value that could be captured from AI.

What explains the contradiction? It is down to measurement. Traditional IT performance metrics track delivery and cost. Agentic AI creates value in several places at once: time back for teams, happier customers who stay, fewer errors and risks, and faster testing of ideas. If we measure only one dimension, say labor savings in one team, we undercount the return. Yet the upside is real. IDC estimates that every new $1 spent on AI solutions could add about $4.60 to the global economy by 2030.

Setting your organization up to win: why do AI programs stall? Failure patterns repeat. Data quality is the first limiter of any successful AI program. 85% of organizations cite data quality as the biggest challenge in their AI strategies. Without clean, accessible volumes of data, agents cannot resolve full workflows or make correct write-backs. Ownership and coordination are the second limiter.
Additionally, IDC and others have found that many AI initiatives lack clear executive sponsorship and cross-functional alignment. ROI expectations are often unclear at the outset, which undermines prioritization and post-implementation credibility. In one recent enterprise survey, 86% of organizations cited a lack of well-defined ROI expectations as a mistake they made in past AI initiatives. Other common blockers include talent gaps and regulatory compliance hurdles, especially in regulated sectors.

Minimum viable data and integration. Before forecasting ROI, put in place a minimum viable data stack. Establish a simple, auditable data governance schema. Document data lineage for the key data sources that agents will use to read and write. Provide API access to systems of record, allowing agents to close the loop by reading a record, taking an action, and writing the outcome back to the source of truth. Teams that skip this fundamental groundwork rarely reach scaled production. Gartner forecasts that at least 30% of Gen AI projects will be abandoned after proof of concept by the end of 2025, while S&P Global 451 Research reports a 46% median abandonment rate pre-production, with data, integration, and unclear value among the top causes.

Three-tier readiness assessments. Timelines and ROI for AI agents vary depending on organizational readiness. Typical delivery windows for well-run AI projects are 3 to 6 months from pilot to production, with some projects taking up to a year for more extensive integration. High-ROI outliers exist in specific, well-scoped deployments. Treat company size thresholds as guidelines, not gatekeepers. Readiness equals data plus integration plus sponsorship. This is a fast triage for business planning, not a maturity score. Use it to scope investment and timelines. It helps to decide what to tackle first, how long it will take, and the likely investment-to-return ratio based on three key ingredients: data foundation, integration pathways, and executive sponsorship.
Ranges are indicative, not promises; regulated contexts or heavy tech debt trend longer and higher. Rule of thumb: if you match two of the three ingredients for a tier, plan with that tier. When in doubt, plan for the lower tier.

Tier 1, high readiness: clean, accessible data, executive sponsorship at the VP-plus level, prior automation wins, and a cross-functional team in place. A four-week pilot and twelve-week implementation are typical. Target outcome: 150% to 250% ROI in the first year from a well-scoped use case. Investment: $75K to $250K. Often $50 million-plus revenue or 500-plus employees.

Tier 2, medium readiness: partial data readiness with gaps and limited AI expertise, but a willingness to partner. Expect a two-month preparation period, followed by a pilot and then a 16-week build. Target outcome: 100% to 200% ROI within 12 months. Investment: $100K to $350K. $10 million to $50 million revenue or 100 to 500 employees.

Tier 3, foundation building: scattered data, minimal prior automation, weak documentation, legacy constraints. Expect a 6-month foundation phase before the pilot. Target outcome: 50% to 150% first-year ROI once the first use case is in production and adoption stabilizes. Investment: $150K to $500K, including data plumbing. Often less than $10 million revenue or fewer than 100 employees, or larger firms with heavy tech debt.

Critical success factors to avoid pilot purgatory, for any tier: executive sponsorship with a committed budget; a single supportive business owner with P&L accountability; measurable current-state metrics and a clearly defined digital process; confirmed data access of sufficient quality and quantity to proceed; and a willingness to redesign workflows, not just overlay a digital assistant on broken workflows.

If any of the following are true, hit pause and fix foundations. There is no quantifiable business problem. Nobody can explain the current process costs or flow.
Critical data exists, but only in spreadsheets or mailboxes. Prior IT projects routinely stall. Or leaders have unrealistic delivery expectations and demand a two-week implementation for a multisystem workflow that takes months to build, test, and implement. Those flags correlate with the high abandonment rates we mentioned earlier.

The four-pillar ROI framework. Boards care about outcomes that hit the income statement and protect enterprise value. Use a four-pillar approach that encompasses tangible strategic returns. Keep the math simple and conservative; your CFO will add the complexity to the spreadsheet.

1. Efficiency gains: time, labor, and process improvements.
2. Revenue generation: conversion, retention, capacity to serve.
3. Risk mitigation: compliance, security, and quality costs avoided.
4. Business agility: speed, adaptation, and option value.

Pillar one, efficiency gains. What to measure? Measure end-to-end workflows rather than isolated tasks. Track labor hours saved, fully loaded. Include cycle-time improvements and error-rate reductions. Count software licenses or infrastructure you can retire. Benchmarks: support agents using AI tools handle 13.8% more inquiries per hour, and controlled studies show knowledge workers are 33% more productive during hours when they use Gen AI for suitable tasks. When scoped effectively, service operations typically achieve double-digit cost reductions.

Pillar two, revenue generation. What to measure? Track conversion rates, lead quality, and sales cycle velocity. For subscription businesses, quantify churn reduction and upsell. Attribute conservatively: I recommend allocating 50 to 70% of the attribution to the agentic system when multiple initiatives run concurrently. Benchmarks: personalization at scale remains one of the most reliable levers for driving growth. McKinsey reports that companies successfully delivering personalization can expect a 10-15% revenue uplift. According to Salesforce, teams that use AI for sales prospecting and content generation can streamline campaign development and scale personalized engagement more efficiently.

Pillar three, risk mitigation. What to measure? Quantify avoided costs: fines, penalties, breach remediation, SLA penalties, fraud losses, and the rework costs of quality failures. Express each as a probability-weighted expectation. The cost of failure here is not immaterial: we have already noted that the global average cost of a data breach is $4.45 million.

Pillar four, business agility. What to measure? Decision speed, experimentation velocity, time to market, the number of initiatives the same team can support, and the option value created by reusable capabilities. Why it matters: AI leaders report 20-30% improvements in productivity, speed to market, and revenue from systematic deployment at scale. Google Cloud's 2025 study shows that firms investing in agentic capabilities are more likely to report a first-year ROI and to quickly expand to adjacent use cases. How to discuss this with the board: use a before-and-after narrative with timestamps and headcounts. For example: after the agentic content-ops pipeline, launches require six weeks with two marketers, plus an agent that personalizes and tests copy across channels. Tie agility to share gain, quicker cash inflows, and optionality.

Building the business case. Your business case should fit on one page for the executive summary, backed by a spreadsheet that a finance partner can audit. The one-page executive summary covers: the problem (quantify the status quo; for example, average response time is 4.2 hours and cost per interaction is $12.50); the solution (specific agentic capabilities tied to one workflow); the investment (year-by-year cost, one-time versus recurring); the return (expected ROI across the four pillars with conservative assumptions); the timeline (milestones and payback period); and the risk (cost of inaction and the risk mitigation plan).
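The one-page summary ultimately reduces to two numbers a finance partner will recompute: first-year ROI and payback period. A minimal sketch of that arithmetic, where the function names and dollar figures are hypothetical placeholders rather than figures from the episode:

```python
# Hypothetical sketch of the business-case arithmetic; the figures below
# are illustrative placeholders, not benchmarks from the episode.

def first_year_roi(total_benefit: float, total_investment: float) -> float:
    """ROI as a percentage: (benefit - investment) / investment * 100."""
    return (total_benefit - total_investment) / total_investment * 100

def payback_months(total_investment: float, monthly_net_benefit: float) -> float:
    """Months until cumulative net benefit covers the investment."""
    return total_investment / monthly_net_benefit

# Example: a $200K year-one investment against $500K of conservatively
# attributed benefit across the four pillars.
investment = 200_000
benefit = 500_000
print(first_year_roi(benefit, investment))       # 150.0 (percent)
print(payback_months(investment, benefit / 12))  # 4.8 (months)
```

Keeping the formula this bare makes the conservative-attribution debate visible: any haircut to the claimed benefit shows up directly in the numerator, and the board can see exactly where the number came from.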
Investment breakdown: illustrative planning ranges.

One-time costs:
  • Discovery and design: Year 1, $15-25K.
  • Development and integration: Year 1, $50-150K.
  • Data infrastructure: Year 1, $20-50K; Year 3, upgrades as needed.
  • Training and change management: Year 1, $15-30K; Year 2, $5-10K; Year 3, $5-10K.

Recurring costs:
  • Platform and infrastructure: $24-60K per year in Years 1 to 3.
  • Model API costs: $12-36K per year in Years 1 to 3.
  • Maintenance and optimization: 20% of development cost per year in Years 1 to 3.

Total investment: Year 1, $136-351K; Year 2, $41-106K; Year 3, $41-106K. These ranges align with current market evidence on pilot-to-platform efforts and reflect the three-to-six-month implementation windows seen in industry programs.

Scenario modeling: present three scenarios with explicit probabilities.
  • Conservative (60% probability): 15% efficiency gains plus 5% churn reduction; year-one ROI 85-120%; payback in 8 to 11 months.
  • Moderate (30% probability): 25% efficiency gains plus 15% churn reduction plus 5% upsell; year-one ROI 150-200%; payback in 5 to 7 months.
  • Aggressive (10% probability): 35% or more efficiency gains plus 25% churn reduction plus 10% upsell; year-one ROI 250-333%; payback in 3 to 4 months.

Document every assumption: volumes, rates, attribution share, and loaded costs. For sensitivity analysis, vary adoption, volume, and deflection by plus or minus ten points to identify the breakpoints that matter to finance.

Metrics dashboard and tracking. Pre-implementation baseline: treat baseline data as your before photo. Collect at least three months of pre-implementation metrics.
Six to twelve months is better for smoothing seasonality. Capture median and P90 response and resolution times, CSAT or NPS, agent productivity, escalation rate, cost per interaction, and error or rework rates.

Implementation tracking, weekly: uptime and availability, agent accuracy and human escalation rates, adoption, and spend versus budget.

Ongoing performance dashboard, monthly:
  • Average response time: baseline 4.2 hours; current 1.8 hours; target less than 2.0 hours; trend down 57%.
  • CSAT: baseline 3.8/5.0; current 4.3/5.0; target greater than 4.2; trend up 13%.
  • Cost per interaction: baseline $12.50; current $8.75; target less than $9.00; trend down 30%.
  • Churn rate: baseline 12% annual; current 9.5% annual; target less than 10%; trend down 21%.

Quarterly board summary: present a single cohesive view of progress and impact: how many interactions the AI system handled versus human teams, the revenue retained or generated with supporting methodology, the costs avoided with documented methodology, and the cumulative ROI to date. Close with a brief view of key risks, mitigations, and next-phase recommendations to keep oversight focused on outcomes rather than activity.

Company-size-specific guidance. Small business, $1 million to $10 million revenue, fewer than 100 employees: focus on quick wins in customer support triage and lead qualification. Plan $25K to $75K for a pilot and $75K to $150K to scale. Expect positive ROI in 6 to 12 months. Choose turnkey tools and avoid heavy custom builds.

Mid-market, $10 million to $100 million revenue, 100 to 1,000 employees: target department-level transformation, such as customer success swarms, sales enablement, or marketing automation. Plan $100K to $350K for a comprehensive solution.
ROI often turns positive within 3 to 9 months if you have clean data and executive sponsorship.

Enterprise, $100 million-plus revenue, 1,000-plus employees: pursue platform plays across multiple functions. Budget $250K to $1 million-plus for an enterprise capability with shared APIs, identity, and observability. ROI typically takes 6 to 18 months but scales to a very large absolute value. Build an internal center of excellence to standardize patterns and avoid one-off rebuilds. These ranges reflect current enterprise patterns and the platform investments described in scale-up casework.

Real-world ROI examples. Customer service transformation: in 2025, Verizon reported a nearly 40% sales increase attributed to an AI assistant that supports 28,000 service representatives with faster answers from 15,000 internal documents. The system reduced call times and unlocked more time for value-added sales conversations.

Sales enablement: Lumen Technologies sales teams used Microsoft Copilot to compress seller research from 4 hours to 15 minutes, resulting in a projected $50 million in annual time-savings value. The program combined Copilot with connectors to enterprise data, allowing the agent to summarize interactions, harvest insights, and propose next steps within the seller workflow.

Marketing automation: a recent Forrester TEI study of an agentic platform found a 333% ROI and a $12.02 million NPV over three years, with under-six-month payback, an 85% reduction in review times, and 65% faster onboarding for the composite enterprise studied.

Common ROI pitfalls to avoid. Vanity metrics: "Agent answered 10,000 queries" is not a result. "Agent resolved 35% of Tier 1 inquiries with a 4.25 CSAT, saving 280 hours weekly" is a result. Boards fund outcomes, not the volume of AI activity. Report throughput only insofar as it drives outcomes, such as resolution rate, cycle time, unit cost, revenue, or risk.
Tie each metric back to the four pillars (efficiency, revenue, risk, and agility) so finance can book the impact.

No baseline: if you do not capture pre-implementation performance for 3 to 6 months, you cannot prove a delta. Treat baseline capture as phase zero.

Attribution errors: overclaiming AI results when multiple initiatives drive the same KPIs can undermine credibility. Attribute 50 to 70% of the ROI to AI when multiple initiatives contribute (the exact figure depends on project specifics), and document your method.

Ignoring hidden costs: software is not the total cost. Budget 20-30% for change management and 20% annually for ongoing optimization. These are common ranges in enterprise programs and are frequently underestimated in early plans. Also call out common omissions, including security review, procurement cycles, data labeling, model risk management for regulated entities, and integration hardening.

Short-term thinking: AI agents learn and improve with use and time. Evaluate their performance over 12 to 24 months and show learning curves. At 30-, 60-, and 90-day checkpoints, plot adoption, human-in-the-loop rate, resolution rate, and unit cost. Tie the trend line to the ongoing performance dashboard cadence so improvements flow into the ROI roll-up.

Strategic missteps: don't try to boil the ocean. Start with a narrow, high-pain business use case tied to a P&L owner. Technology follows strategy, not the reverse. Treat the program as a product, with weekly optimization and monthly performance analysis. Build the governance, security, and integration rails early to avoid the pilot traps.

Conclusion: from measurement to action. Make the mindset shift: AI agents should move you from a cost center to a profit center. With a scoped use case, clean data, and change management, six to twelve months to payback is a reasonable target. Build once, then expand across adjacencies. Pilots have value only when they lead to production at scale.
This is where the four pillars show up on financial statements. Three actions for immediate impact:

1. Run the three-tier readiness assessment and close obvious gaps.
2. Build your business case. Gather baseline metrics now, even if the build doesn't start until next quarter. Model conservative, moderate, and aggressive scenarios, and stress-test them with finance.
3. Start with one high-pain, measurable workflow. Budget for integration and change management, and instrument from day one.

The companies that achieve 200-333% ROI have clear problem statements, invest in readiness, and measure rigorously. They also treat AI as a strategic transformation, not a tooling project. The companies that join the failure statistics follow a script too: they skip the basics, underestimate data and integration, ignore change management, and expect magic results instead of redesigning their processes to achieve optimal business outcomes. The choice is yours.
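As a closing worked example, the dashboard trend percentages quoted earlier (response time down 57%, CSAT up 13%, cost per interaction down 30%, churn down 21%) are plain baseline-versus-current arithmetic. A minimal sketch using the episode's example figures; the helper name is ours, and the transcript rounds the outputs to whole percentages:

```python
# Recomputing the episode's dashboard trend percentages from the quoted
# baseline and current values. Negative means a reduction from baseline.

def pct_change(baseline: float, current: float) -> float:
    """Signed percent change from baseline."""
    return (current - baseline) / baseline * 100

dashboard = {
    "avg_response_hours":   (4.2, 1.8),
    "csat_out_of_5":        (3.8, 4.3),
    "cost_per_interaction": (12.50, 8.75),
    "annual_churn_pct":     (12.0, 9.5),
}
for metric, (baseline, current) in dashboard.items():
    print(f"{metric}: {pct_change(baseline, current):+.1f}%")
# avg_response_hours: -57.1%, csat_out_of_5: +13.2%,
# cost_per_interaction: -30.0%, annual_churn_pct: -20.8%
```

Publishing the raw baseline and current values alongside each trend lets finance reproduce every delta, which is the whole point of the "before photo" discipline.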