The Digital Transformation Playbook

Turning AI Failure Rates Into A Real Strategy

Kieran Gilmurray


Most AI programmes don’t fail for lack of ideas; they fail because teams choose the wrong path and underestimate the system around the model. We dig into a candid build vs buy vs hybrid framework backed by current research, hard numbers, and battle‑tested operating patterns so you can make a decision that ships value fast and scales safely.

TLDR / At A Glance

  • market failure rates and momentum reversal
  • deterministic versus probabilistic architecture reality
  • DIY failure patterns across data, integration, and governance
  • talent scarcity, wage premiums, and TCO impact
  • real costs, timelines, and hidden line items
  • when building makes sense and when it does not
  • partner advantages for time to value and risk control
  • three‑year TCO ranges for DIY and partner paths
  • executive decision matrix and rules of thumb
  • hybrid roadmap with phased capability building

We start by grounding the stakes: high abandonment after proof of concept, scarce senior AI talent, and governance gaps that surface at scale. From there, we contrast deterministic software with probabilistic, agentic systems and explain why early architectural choices compound through data pipelines, retrieval strategies, identity, permissions, observability, and compliance. 

You’ll hear why DIY efforts often stall on brittle data, weak integration, and unclear goals, and how professional partners change the odds with established patterns, evaluation harnesses, and safety from day zero.

Then we get practical. We map real costs and timelines, separating model hype from the true TCO drivers: data plumbing, security, monitoring, and continuous testing. 

We walk a three‑year comparison of an internal build versus a partner‑led implementation running on a commercial platform, highlighting time to first value, risk profiles, and success probabilities. 

Our decision framework for executives distils five dimensions—core competency, urgency, capability, budget, and long‑term vision—into plain rules of thumb, plus a scenario matrix that points to build, buy, or hybrid recommendations.

Finally, we outline a phased hybrid plan that most teams can execute: partner‑led delivery in months one to six to prove value; collaborative ownership by month twelve to codify playbooks; and internal‑led innovation thereafter to invest in defensible differentiators on top of a proven substrate. 

If you’re serious about moving from POCs to production, reducing risk, and protecting your moat, this guide will help you choose with your eyes open. If the conversation helps you, subscribe, share with your team, and leave a quick review so others can find it.

Like some free book chapters? Then go here: How to build an agent - Kieran Gilmurray

Want to buy the complete book? Then go to Amazon or Audible today.

Support the show


𝗖𝗼𝗻𝘁𝗮𝗰𝘁 my team and I to get business results, not excuses.

☎️ https://calendly.com/kierangilmurray/results-not-excuses
✉️ kieran@gilmurray.co.uk
🌍 www.KieranGilmurray.com
📘 Kieran Gilmurray | LinkedIn
🦉 X / Twitter: https://twitter.com/KieranGilmurray
📽 YouTube: https://www.youtube.com/@KieranGilmurray

📕 Want to learn more about agentic AI? Then read my new book, Agentic AI and the Future of Work: https://tinyurl.com/MyBooksOnAmazonUK


Setting The Stakes With Hard Data

SPEAKER_00

Chapter 11. The Build vs. Buy Decision Framework.

Over 80% of AI projects fail, which is roughly twice the failure rate of traditional IT projects. That is not internet folklore. It's a sobering finding echoed in a 2024 study from the RAND Corporation, which analyzed the patterns that derail AI initiatives. In 2025, S&P Global reported a sharp reversal in AI momentum: the share of companies that abandoned most of their AI initiatives jumped from 17% to 42% in a single year. Gartner's independent forecast aligns with this direction. At least 30% of Gen AI projects will be abandoned after proof of concept by the end of 2025, largely due to poor data, weak risk controls, and unclear business value. And MIT's Project NANDA adds a more stinging coda: only about 5% of generative AI pilots deliver rapid revenue acceleration. The rest stall with little measurable P&L impact whatsoever.

That is the backdrop for your build versus buy decision on agentic AI. You're not choosing between two tech strategies. You're selecting a path with explicit consequences for time to value, cost, risk, competitive advantage, and, most critically, your probability of shipping an AI solution or strategy that has an achievable and measurable business impact. In today's market, almost every company is investing in AI, yet just 1% say they have reached AI maturity. This chapter provides a candid view to help you decide what to build, what to buy, and when a hybrid build-and-buy approach is a winning move.

Understanding the high stakes.

Traditional software is like following a recipe: if you use the same ingredients and steps, you'll always get the same result. That's deterministic. Agentic AI is probabilistic: results change with data, context, and feedback. Early architectural choices therefore echo for years through integration, operating costs, and compliance. The wrong early decision amplifies later risks, e.g., fragile integrations, hidden operating costs, and compliance gaps that only surface at scale.

This chapter addresses that choice: build in-house or buy with a partner. While McKinsey highlights large productivity potential from AI, and reports that organizations are beginning to take steps that drive bottom-line impact by redesigning workflows, only 1% of organizations say they have achieved AI maturity. A related reality is that a significant portion of lifecycle costs accrues post-deployment, including maintenance, integration, hardening, monitoring, and governance. Early architectural choices compound in impact, both positive and negative, over the years you operate your AI solutions.

The AI DIY reality check.

Internal builds often fail for predictable reasons. Most problems can be traced back to brittle data, weak application integration, poorly defined goals, or a lack of governance. To manage this, we must understand each failure pattern, why it blocks the path to production, and how to fix it. The goal here is not to argue against building an application. It is to give you a clear risk map and checklist to manage against, if you choose the DIY path, so that you don't fail.

Common failure patterns in internal AI development.

Data quality traps. Machine learning systems do not figure it out later; they encode your current data reality. RAND's research highlights data limitations as a primary driver of AI program failure. Why? Data leakage, biased data samples, or insufficient volume of quality data are the repeated sources of AI program failure. Enterprises systematically underestimate the effort required to clean, tag, govern, and serve data to AI agents reliably, without latency.
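To make the trap concrete, here is a minimal sketch of a pre-flight data gate covering the three failure sources just named: leakage, bias, and volume. It is illustrative only; the thresholds, field names, and function name are assumptions for the example, not from the chapter.

```python
# A minimal, hypothetical data-quality gate run before any training or
# retrieval work begins. Thresholds and field names are illustrative.

def data_quality_gate(train_rows, eval_rows, label_field="label",
                      min_rows=1_000, max_class_share=0.8):
    issues = []

    # Volume: too little quality data is a repeated source of failure.
    if len(train_rows) < min_rows:
        issues.append(f"only {len(train_rows)} training rows, need {min_rows}")

    # Leakage: identical records in train and eval inflate every metric.
    overlap = ({frozenset(r.items()) for r in train_rows}
               & {frozenset(r.items()) for r in eval_rows})
    if overlap:
        issues.append(f"{len(overlap)} rows leak from train into eval")

    # Bias: one dominant class suggests a skewed sample.
    counts = {}
    for r in train_rows:
        counts[r[label_field]] = counts.get(r[label_field], 0) + 1
    top_share = max(counts.values()) / len(train_rows)
    if top_share > max_class_share:
        issues.append(f"dominant class holds {top_share:.0%} of the data")

    return issues  # an empty list means the gate passes

# Example: a tiny, skewed sample fails on volume, leakage, and bias.
rows = [{"text": "hi", "label": "a"}] * 9 + [{"text": "bye", "label": "b"}]
print(data_quality_gate(rows, rows[:2]))
```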
Integration nightmares. Organizations must not treat the ML code as the entire AI ecosystem. An AI ecosystem encompasses data pipelines, retrieval layers, user access, identity, and permissions, as well as a messaging bus and a change management cadence.

Governance afterthoughts. Teams that bolt on compliance late in the project lifecycle get burnt. Effective AI programs implement risk controls, logging, and human-in-the-loop checkpoints from day zero. We argued this in chapter 9, where AI program guardrails were mapped to NIST and ISO standards for auditability.
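What "from day zero" can look like in code: a minimal sketch in which every agent call is audit-logged and low-confidence answers are routed to a human checkpoint rather than returned silently. The agent is a stub and every name here is hypothetical; this is a pattern illustration, not the chapter's implementation.

```python
# A minimal guardrail wrapper: audit logging plus a human-in-the-loop
# checkpoint. The agent itself is stubbed out; names are hypothetical.
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent-audit")

def agent(question: str) -> tuple[str, float]:
    """Stub for a probabilistic agent: returns (answer, confidence)."""
    return ("42", 0.55)  # placeholder output

def answer_with_guardrails(question: str, confidence_floor: float = 0.7):
    answer, confidence = agent(question)
    # Audit log: the raw record needed for NIST/ISO-style traceability.
    log.info(json.dumps({"q": question, "a": answer, "conf": confidence}))
    if confidence < confidence_floor:
        # Route to a human checkpoint instead of best-effort output.
        return {"status": "needs_review", "draft": answer}
    return {"status": "auto", "answer": answer}

print(answer_with_guardrails("What is our refund policy?"))
```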
The Talent Gap Challenge.

Talent is the choke point in AI. Scarcity and skill gaps slow teams more than tools or models. Most failures trace back to missing people, limited experience, and slow hiring. Scarcity raises costs. Conventional developers struggle with agents. Wage premiums push the total cost of ownership higher.

The scarcity of expertise. Investment in AI has outpaced the labor market's ability to supply experienced talent. Bain projects a US AI hiring gap of approximately 50%, with demand for skilled AI talent potentially exceeding supply by 700,000 workers over the next two years. Bain further reports that 44% of executives cite a lack of in-house AI expertise as a primary barrier to the implementation of Gen AI. Randstad's 2024 global survey likewise shows a steep rise in job postings requiring AI skills and persistent scarcity across markets. Traditional software developers may struggle to understand and implement AI agents, as agentic systems differ from conventional apps: they are probabilistic, depend on retrieval strategies and prompt orchestration, and require human-in-the-loop patterns that most teams have never shipped. These gaps show up in maturity data. McKinsey notes that organizations invest broadly in AI while remaining early in maturity; capabilities and operating models lag ambition. Without specialists in prompt design, retrieval, evaluation, safety, and monitoring, teams stack up proofs of concept and stall before production.

The cost of competing for talent. Wages for AI-skilled roles have moved faster than general tech compensation, and employers currently pay distinct premiums for AI skills. For example, the Oxford Internet Institute's 2025 analysis shows AI skills carry a 23% average wage premium across roles, and a higher differential in science and engineering, outpacing many other degree premiums. Even if you hire AI generalists, upskilling costs and time to competence are non-trivial, and your freshly trained team is immediately on competitors' recruiting radar. IBM highlights a widening training gap and a 50% talent shortfall. Needed skills include Gen AI and LLM basics, prompt tuning, security and privacy, plus C-suite literacy.

Real DIY costs and timelines.

Budgets fail when leaders price only the model and ignore the system. Real cost sits in data plumbing, integration, security, and ongoing operations. This section outlines likely ranges for spending and timeline, making plausible estimates based on the cited sources. It flags hidden line items that trigger overruns and explains why underestimating testing adds months. Use it to sketch a credible three-year TCO and to choose a fast path to production.

Financial Reality. Published market ranges for custom enterprise AI builds are consistently six-figure to low seven-figure when accounting for integration, data work, and security. $100,000 to $500,000-plus for enterprise-grade implementations is a common corridor. More ambitious, cross-system builds can cost between $2.5 million and $4.8 million all in. Hidden costs include building the data plumbing, model monitoring, compliance reviews, the cost of creating incident playbooks, and code rework as requirements evolve. Organizations that run under budget and disregard these costs are more likely to fail.

Timeline Reality. Professional implementations typically reach production in 4 to 6 months. DIY software builds can take between 12 and 18 months before a stable production environment is created, if it even gets there. When companies say an agent application will take between 10 and 12 weeks to go live, they are usually describing a minimum timeline for a targeted deployment with a very controlled scope. Credible guides recommend allocating around 30% of that time to testing. We have already observed that nearly half of POCs are scrapped before production. Yet every month you delay, a competitor can lock in channel advantages and data feedback loops ahead of you.
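As back-of-envelope arithmetic on those figures, here is a tiny sketch applying the roughly 30% testing allocation to both timelines. The function name and the assumption that testing scales linearly with schedule are illustrative, not from the chapter.

```python
# Timeline arithmetic from the figures above: partner builds at 4-6 months,
# DIY at 12-18, with ~30% of either schedule reserved for testing.
def schedule(months_low, months_high, testing_share=0.30):
    return {
        "total_months": (months_low, months_high),
        "testing_months": (round(months_low * testing_share, 1),
                           round(months_high * testing_share, 1)),
    }

print("partner:", schedule(4, 6))    # ~1.2-1.8 months of testing
print("diy:    ", schedule(12, 18))  # ~3.6-5.4 months of testing
```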
Risk factors. The statistical picture converges: 30-50% of initiatives never make it past the prototype stage, and 42% of firms scrapped most of their AI efforts in 2025. Scope creep, knowledge loss from attrition, and governance retrofits are the usual suspects.

When building in-house makes sense.

There are legitimate reasons for your internal team to build agentic AI systems. AI is a core differentiator: if your competitive moat hinges on proprietary data and workflow nuance, off-the-shelf, one-size-fits-all templated AI products won't fit. You already have a seasoned team: real ML engineers, data scientists, and architects with production credits. You have time and budget: 12 to 14 months to learn, iterate, and harden, and a tolerance for several failed experiments along the way. Your requirements are truly unique: market tools don't address your constraints or controls. For everyone else, choose the hybrid path: buy a proven base to deliver value now, and build internal capability in parallel. For planning ROI and proving value early, use the measurement approach in chapter 10.

The Professional Implementation Advantage.

Professional implementation changes the odds. Experienced teams arrive with established patterns, controls, and the necessary skills on day one. They shorten the path to production, reduce rework, and surface risks early. The goal is not outsourcing for its own sake; it is faster value, clearer costs, and a system your team can operate and extend.

Battle-tested architecture. Experienced partners have seen your movie before. They bring patterns, accelerators, governance templates, and program experience refined across dozens of deployments. They also know where projects crack apart: data contracts, identity boundaries, and evaluation harnesses, and they engineer those from day one.

Speed to value. Reaching first production for a scoped agentic AI use case typically takes 4 to 6 months with an experienced partner; DIY efforts often need 12 to 18 months. Partners arrive with cross-functional teams across data engineering, machine learning, application integration, security, and QA, which avoids a hiring ramp for those skills. Work runs in parallel on data, integration, and safety, which shortens the critical path to production. Faster cycles create faster feedback loops, so you start collecting real user, cost, and quality signals while competitors are still in the staffing phase.

Risk mitigation. Vendors amortize R&D across many clients and ship with established governance. Professional teams front-load risk identification, testing, and observability, which increases the likelihood that your pilot will reach production. And because the scope and deliverables are clear, the cost variance narrows compared with DIY teams discovering complexity mid-build.

A practical TCO comparison.

Over a three-year horizon, a DIY program commonly includes:

Year 1: hiring a core team, e.g., senior ML engineer, data engineer, backend integration engineer, product QA, 300K to 500K total cash comp; data and infrastructure setup, 100K; application development, 200K to 400K; experiments and rework contingency, 100K to 200K, often needed and often underfunded.

Years two to three, per year: team salaries for the internal build team noted above, 300K to 500K; maintenance and monitoring, 100K; tools and platform licenses, 50K; training and upskilling, 50K.

Total, 3 years: 1.5 million to 2.5 million. Time to production: 12 to 18 months. Success probability: roughly one third.

Third-party implementation path, typical three-year view. This means hiring a professional third-party implementation partner and running on a commercial platform. The aim is faster time to value, lower execution risk, and structured knowledge transfer to your team.

Year one: partner services for discovery, solution design, data work, integration, security, QA, and knowledge transfer, 150K to 300K; platform subscription, 60K; targeted customization and integration, 50K to 100K.

Years two to three, per year: platform subscription, 60K; partner support and maintenance, 30K; internal training and center of excellence, 20K.

Total, 3 years: up to 800K. Time to production: 4 to 6 months. Success rates are closer to two thirds, according to MIT, which reports that purchased solutions paired with the right partners dramatically outperform in-house builds in terms of time to impact.

These are planning ranges, not guarantees, but they reflect what the 2024 to 2025 body of evidence says about time, risk, and cost drivers. Actuals vary depending on scope, security requirements, and usage volumes. See the ROI model scaffolding in chapter 10 for how to present this to boards.
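To sanity-check those corridors, here is the same comparison as simple arithmetic: sum each path's year-one line items plus two further years of the annual items. The figures are the chapter's planning ranges in $K; the function is an illustrative sketch, not a budgeting tool.

```python
# Summing the quoted line items into a three-year TCO corridor ($K).
def three_year_tco(year1_items, annual_items):
    lo = sum(a for a, _ in year1_items) + 2 * sum(a for a, _ in annual_items)
    hi = sum(b for _, b in year1_items) + 2 * sum(b for _, b in annual_items)
    return lo, hi

diy_year1 = [(300, 500),   # core team total cash comp
             (100, 100),   # data and infrastructure setup
             (200, 400),   # application development
             (100, 200)]   # experiments and rework contingency
diy_annual = [(300, 500),  # team salaries
              (100, 100),  # maintenance and monitoring
              (50, 50),    # tools and platform licenses
              (50, 50)]    # training and upskilling

partner_year1 = [(150, 300),  # partner services end to end
                 (60, 60),    # platform subscription
                 (50, 100)]   # targeted customization and integration
partner_annual = [(60, 60),   # platform subscription
                  (30, 30),   # partner support and maintenance
                  (20, 20)]   # internal training / center of excellence

print("DIY 3yr ($K):    ", three_year_tco(diy_year1, diy_annual))
print("Partner 3yr ($K):", three_year_tco(partner_year1, partner_annual))
# DIY sums to roughly 1,700-2,600K, in line with the chapter's 1.5M-2.5M
# headline; the partner path sums to roughly 480-680K, inside its corridor.
```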
A decision framework for executives.

The five assessment dimensions.

1. Core competency analysis. Is AI central to your differentiation? Do you possess proprietary data or workflows that a market product cannot capture? Will your agentic AI capability define your market position? Decision rule: if core, bias toward build with targeted partner support; if enabling, buy and configure.

2. Urgency and time to value. What is the real cost of waiting? Revenue? Share? Cost structure? Customer trust? Can you afford 12 to 18 months to ship and learn? Decision rule: need results in less than 6 months, buy; can invest 12 to 18 months, consider building.

3. Current technical capabilities. Do you currently have production-grade ML talent and data engineers? Is your data clean, tagged, and accessible through governed APIs? Do you have evals, monitoring, and incident playbooks defined? Decision rule: missing two or more of the above, buy; strong foundation, consider build.

4. Budget reality. What can you fund over three years, including maintenance, governance, and iteration? Market evidence suggests that deployments of six figures or more are common. However, DIY typically carries higher variance and a longer payback period. Decision rule: constrained or uncertain budget, buy; flexible multi-year funding, consider build.

5. Long-term strategic vision. Is building internal AI capability a multi-year priority? Will you reuse the platform across multiple systems? Are you willing to learn through controlled failure? Decision rule: one-off initiative, buy; journey to an AI center of excellence, build, phased.

A practical build versus buy decision matrix.

Scenario: AI is core, plus strong internal capability, plus flexible timeline. Build score high. Recommendation: build, with professional consultation.

Scenario: AI enables core functions, plus limited capability, plus less than six months' urgency. Build score low, buy score high. Recommendation: buy.

Scenario: AI differentiator, plus some capability, plus moderate urgency. Build score medium, buy score medium. Recommendation: hybrid, buy and customize.

Scenario: build score low, buy score high. Recommendation: buy.

Scenario: strategic investment, plus building capability, plus long timeline. Build score high, buy score medium. Recommendation: phased, buy then build.

Red flags for DIY. Proceed with extreme caution if you have zero in-house AI/ML expertise, your data governance is weak, your timeline is less than six months, your budget is volatile, or executive sponsorship is tentative.
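Folding the five decision rules above into a single function makes the logic easy to argue about in a review. This is a sketch of the chapter's rules of thumb under my own ordering assumptions (urgency first, then capability and budget), not a formal model.

```python
# The five assessment dimensions as one illustrative decision function.
def build_vs_buy(ai_is_core, months_available, strong_team,
                 flexible_budget, multi_year_vision):
    if months_available < 6:
        return "buy"                       # urgency overrides everything
    if not strong_team or not flexible_budget:
        return "buy or hybrid"             # capability/budget red flags
    if ai_is_core and multi_year_vision:
        return "build, with targeted partner support"
    return "hybrid: buy the base, build differentiators on top"

print(build_vs_buy(ai_is_core=True, months_available=9, strong_team=True,
                   flexible_budget=True, multi_year_vision=True))
```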
What to ask professional partners.

To evaluate a professional implementation partner, you need to confirm that they can deliver production-ready solutions in your environment, on time and within scope, and hand over operations to your team. Ask each partner the same questions and request proof.

Experience and expertise. How many agentic deployments to production in our industry? What is your production success rate and your average time to value? How do you handle quality gaps?

Methodology. Show us your implementation playbook: requirements, governance checkpoints, evaluation harness, and knowledge transfer. How do you manage scope changes and run post-deployment optimization?

Cost and timeline. What are typical ranges for projects like ours? What's included versus not? Fixed-price or value-based options? Who is on our squad, how senior are they, and how much of our team's time is required?

Answers here are more predictive of success than any slide on AI vision.

The hybrid path forward.

Hybrid is the practical default for most companies. You buy a stable platform and core components. You bring in a professional partner to ship the first production system. You grow a small internal team while the system goes live. The aim is speed now and control later.

Why most organizations choose hybrid. The MIT reporting is clear: programs that purchase core tools and partner for implementation materially outperform internal greenfield builds in terms of success rate and speed to measurable impact. Taking the hybrid approach is not some compromise or fence-sitting. In most cases, the appropriate strategy is to buy the proven substrate and then build your differentiators on top.

A phased capability build.

Phase one, partner-led, months one to six. Ship the first production system with a professional partner. Make value proof the goal, not model cleverness. Establish governance, logging, and an evaluation loop.

Phase two, collaborative, months seven to twelve. Co-develop the second wave; your team assumes more ownership. Start your internal center of excellence and codify playbooks.

Phase three, internal-led, month thirteen plus. Your team leads net-new builds. Partners supply specialist help, benchmarking, and architectural reviews. Your differentiators live here.

What to build internally. Invest in AI literacy across roles. Hire for keystone skills: ML engineers, data engineers, evaluation and QA for AI, prompt and retrieval specialists. Partner for cutting-edge domains: safety, compliance, emerging frameworks. Write and maintain internal playbooks, because your process knowledge will become compounding IP over time.

Making your decision: a 10-point self-assessment.

Use the checklist that follows as a quick gate review with your CFO and head of engineering. Answer each line yes or no, then apply the rules of thumb below; a sketch of these rules in code follows the list. The goal is clarity you can act on today.

How to use the checklist, rules of thumb. If any of the first three are no (objectives; capability and data quality; three-year TCO), pause the build and choose buy or hybrid for now. If you have around four to six yes answers and you need results inside six months, lean buy. If you have around six to eight yes answers and AI is not your core differentiator, buy with customization or go hybrid. If you have around eight to ten yes answers and AI is core to your edge, plan a build, ideally with targeted partner support.

Now work through the 10 points below.

1. Objectives are clear and quantified.
2. Technical capability and data quality are honestly assessed.
3. Three-year TCO is modeled for both paths.
4. Time-to-value needs are explicit.
5. AI's role in competitive advantage is agreed.
6. Budget and risk tolerance are aligned.
7. Candidate partners are evaluated.
8. Executive sponsorship is secured.
9. Success metrics and ROI method are defined.
10. Governance, safety, compliance, and observability are in scope from day one.
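As promised above, here are those rules of thumb as a tiny scorer. `answers` is the ten yes/no points in the order just listed, with the first three acting as hard gates; the overlapping bands and tie-breaking order are my illustrative assumptions.

```python
# The checklist's rules of thumb as an illustrative gate-review scorer.
def gate_review(answers, need_results_in_6_months, ai_is_core):
    assert len(answers) == 10
    if not all(answers[:3]):          # objectives, capability/data, 3yr TCO
        return "pause the build; choose buy or hybrid for now"
    yes = sum(answers)
    if 4 <= yes <= 6 and need_results_in_6_months:
        return "lean buy"
    if 6 <= yes <= 8 and not ai_is_core:
        return "buy with customization, or go hybrid"
    if yes >= 8 and ai_is_core:
        return "plan a build, with targeted partner support"
    return "hybrid"

print(gate_review([True] * 9 + [False], need_results_in_6_months=False,
                  ai_is_core=True))
```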
Recommended build, buy, or hybrid development path by organization type.

Organization type: startups and SMBs. Recommended path: buy. Rationale: you need speed, not a research lab. Focus scarce resources on product and customers. Add hybrid later, after repeatable value.

Organization type: mid-market. Recommended path: buy with customization, or hybrid. Rationale: some engineering strength but limited specialized AI depth. Optimize for time to value while building internal capability alongside partners.

Organization type: enterprises. Recommended path: hybrid or selective build. Rationale: mixed portfolio. Buy for utility and integration-heavy workflows; build only truly differentiating capabilities. Many start with partners to accelerate first wins and de-risk governance.

Your next steps.

If you're choosing to buy or partner: one, document requirements and outcome metrics in business terms; specify use case, users, data sources, systems to integrate, constraints (security, privacy, compliance), and success metrics. Two, shortlist three to five partners with live customer deployments. Three, request proposals with scope, timeline, price, and an AI governance plan. Four, check references of current and former clients; confirm the time to first value, scope, timeline, integrations, security, and support, and ask if they would rehire. Five, run a four-to-six-week pilot with a clear success bar and exit criteria, and include knowledge transfer and internal capability ramp-up in the statement of work.

If you're choosing to build: one, run a readiness check on the team, data, access, and integrations, and record any gaps with owners and dates. Two, draft an 18-to-24-month hiring and training plan anchored to the roadmap. Three, fund data plumbing and governance first. Four, stand up evaluation and monitoring early. Five, ship the smallest viable, production-ready AI version of your application, not a demo, and instrument it. Six, commission architecture reviews from external experts at key project milestones: before the pilot, before production, and before scale. Seven, expect and plan for timeline slips and budget variances, and manage scope effectively.

If you're choosing hybrid: one, buy proven components and safety/observability layers. Two, build your differentiators. Three, select a partner for the first wave to compress time to value. Four, run a structured knowledge-transfer plan. Five, hire and train for sustainability. Six, transition ownership over twelve to eighteen months.

Conclusion: the decision that shapes your AI future.

The weight of current evidence suggests that most internal-only AI application building approaches stall, not because people aren't smart, but because AI systems are sociotechnical and the integration and governance burden is heavy. The build versus buy call should follow a plain rule: buy the proven base and partner to ship now; build in-house later, only where it strengthens your moat. Executives who win combine pragmatism and ambition. They buy a proven base platform to get working systems into production quickly. They partner so their team learns and adopts the workflows and controls to run the system in production. And they build only where they can defend a moat with proprietary data, process, or brand. Done this way, the hybrid path becomes your on-ramp to a genuine AI capability. Your board will back it, your teams can operate it, and your customers will notice the change. Choose whether you build, buy, or adopt a hybrid approach to AI with your eyes wide open. Programs that use partners reach production faster and more often. Internal-only efforts often stall or get cut. In AI, success follows the teams that execute, measure, and learn, not the teams that romanticize a build for AI's sake.