Enterprise Artificial Intelligence: Building Trusted AI in the Sovereign Cloud

Chapter 5: Data Governance—The Foundation of Trusted Enterprise AI

OpenText

Before AI can think, it must trust the information it’s built on—and governance is what makes that possible. Explore the core pillars that form the backbone of trusted, compliant, and secure intelligence. 

SPEAKER_00

Chapter 5. Data Governance: The Foundation of Trusted Enterprise AI.
Before AI can think, it must trust the information it's built on. Governance is what makes that possible. It's the discipline that turns scattered content into a coherent, compliant, and usable asset, one that can safely feed intelligent systems without compromising security or integrity. Enterprise information management treats governance as an operating principle, not a checklist. It rests on four interlocking pillars: metadata, permissions and access control, retention and lifecycle management, and auditability. Each one defines how data behaves across its lifespan, and together they form the backbone of trusted intelligence. In this chapter, we'll examine each of these pillars and their relationship to optimized, compliant, and secure AI.
Forrester forecasts that AI governance software spending will quadruple by 2030. Good governance is good business.
Information governance is the practice of implementing policies, processes, and controls to manage information in support of regulatory, legal, risk, environmental, and operational requirements. As volumes of enterprise information increase, so too does the need for digital governance to ensure that this information is managed, secured, and searchable. From a technology perspective, governance relies on the effective management of information throughout its life cycle, from creation or capture and classification to long-term archival or deletion. Successful information governance programs demand that companies balance the need to mitigate legal and business risk against the cost of managing both unstructured and structured information. For an information governance strategy to be effective, key resources and stakeholders need to be identified, empowered, and supported. Policies must be incorporated into relevant processes.
Education and training should be provided to all employees, technology infrastructure optimized, and the appropriate solutions implemented to support secure and reliable operations. In the following feature, ASR Nederland demonstrates how good governance is good for business, enabling the company to comply with regulations and providing a strategic advantage through improved customer service.
Case study: ASR Nederland. One of ASR's core business processes is disability income insurance. Previously, this claims process was paper-driven. Medical and technical information were kept in one folder and were accessible to unqualified personnel, which put ASR out of conformance with Dutch privacy law. In addition, ASR required a significant amount of storage space for the continuously growing folders. ASR recognized the need for a solution that would improve business processes, enable collaboration between departments, reduce costs across the organization, and permit only authorized access to information to comply with regulations. Using a combination of business process modeling and operational improvement solutions, ASR has been able to modernize existing processes while responding to legislative change. For example, medical and technical information related to disability claims is now separated and only accessible to qualified personnel, helping ASR comply with privacy legislation. In addition, the entire claims management process can be measured to give management visibility into processes. The flexible environment supports a new way to benchmark so that business activities can be monitored across numerous divisions. The solution gives ASR an enterprise-wide standard claims processing system that has dramatically improved internal efficiencies and increased productivity.
Employees now process 80% of the claims on time, which has led to a 25% reduction in the claims processing team, and service costs and indemnity have been significantly reduced, all of which enable ASR to deliver new products faster, comply with regulations, and provide better customer service.
Now that we've seen how data governance benefits the enterprise, let's consider the four key pillars of strong data governance introduced at the beginning of this chapter.
Pillar one, metadata, the context behind every decision. Metadata is the DNA of digital information, the hidden context that tells systems what something is, where it came from, and how it should be used. It links content to business purpose and transforms raw data into something searchable, governable, and meaningful. Information enters the enterprise through many doors. Some is born digital, created by people using word processors, spreadsheets, CAD software, or email clients. Some originates in business systems, generated by enterprise resource planning (ERP) systems, customer relationship management (CRM) systems, or databases with defined schemas and relational structures. Other content is captured from analog sources, for example through scanners. Then there's machine data from sensors, logs, and telemetry, and web and social content from intranets, collaboration tools, and portals. Organizations also produce multimedia: training videos, marketing assets, and recorded meetings, all of which carry operational and legal value. Managing this diversity requires a control layer. Metadata (classification, retention tags, provenance, and sensitivity) serves as that control plane. It allows automated permissioning, targeted search, version control, and records enforcement. It's also the foundation for responsible enterprise AI. Without metadata, an AI model can't distinguish between a draft and a final version, or between a public brochure and a privileged client file.
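To make the draft-versus-final and brochure-versus-privileged distinction concrete, here is a minimal sketch in Python. It is an illustration only: the `ContentMetadata` fields, the `Sensitivity` levels, and the `usable_by_ai` rule are hypothetical names chosen for this example, not the schema of any particular EIM product.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    # Illustrative sensitivity levels; real schemes vary by organization.
    PUBLIC = 1
    INTERNAL = 2
    PRIVILEGED = 3


@dataclass(frozen=True)
class ContentMetadata:
    """Hypothetical metadata record; field names are assumptions."""
    doc_id: str
    classification: str       # e.g. "brochure", "client-file"
    sensitivity: Sensitivity
    provenance: str           # originating system or department
    retention_tag: str        # e.g. "keep-2y"
    is_final: bool            # False for drafts


def usable_by_ai(meta: ContentMetadata, max_sensitivity: Sensitivity) -> bool:
    """Allow only final versions at or below the permitted sensitivity."""
    return meta.is_final and meta.sensitivity.value <= max_sensitivity.value


corpus = [
    ContentMetadata("brochure-v3", "brochure", Sensitivity.PUBLIC, "marketing", "keep-2y", True),
    ContentMetadata("brochure-v4-draft", "brochure", Sensitivity.PUBLIC, "marketing", "keep-2y", False),
    ContentMetadata("client-memo", "client-file", Sensitivity.PRIVILEGED, "legal", "keep-10y", True),
]

# Only the final public brochure passes; the draft and the privileged file are excluded.
allowed = [m.doc_id for m in corpus if usable_by_ai(m, Sensitivity.INTERNAL)]
print(allowed)  # ['brochure-v3']
```

The point of the sketch is that the filter never inspects the content itself. Every decision comes from the metadata attached to it.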
Metadata provides the essential context: who created something, when it was changed, where it came from, and how sensitive it is. These details give meaning to raw content and help both people and intelligent systems understand what can be trusted, what can be shared, and what should be protected. Without unified metadata, AI training is unreliable or unsafe. EIM provides the framework to map governance directly to information models, ensuring that automation and AI respect the same business, legal, and ethical boundaries that already define trusted data. Metadata isn't a static label; it's a living framework for policy enforcement and machine reasoning. As AI matures, metadata becomes the connective tissue between governed data and intelligent action. Discover how HBO relies on metadata to consolidate and manage assets throughout their life cycle.
Case study: HBO. HBO is America's most successful premium television network, offering rich digital media content, blockbuster movies, innovative original programming, provocative documentaries, concert events, and championship boxing. HBO sought a solution that would allow them to easily access and share digital content both within HBO and across the larger Time Warner family. The system had to handle large volumes of content and accommodate disparate databases, workflows, and use cases for each of the organizations. HBO's media management implementation encompassed all of HBO's digital photographs, supporting such areas as marketing, promotions, advertising, and sales. These assets can range from location shots from films to a gallery of quality professional photos of HBO celebrities. Part of their overall strategy was to ensure careful management of metadata. Assets are tagged with corresponding metadata, such as contractual information, as early as possible to ensure that metadata travels with the asset throughout its life cycle.
This metadata tagging process is enforced with an embedded workflow component. The HBO Digital Asset Management System is accessed by all of the regional offices and currently holds more than 325,000 assets.
If governance defines the rules, permissions enforce them, one decision at a time.
Pillar two, permissions and access control. Who can see what, when, and why? Permissions define the boundaries of trust. They determine who can view, edit, or share information, and under what conditions. For decades, these principles have protected corporate and personal data. In the AI era, they take on new urgency. Every decision an intelligent system makes depends on access: what data it can see, what it can learn from, and what actions it's authorized to perform. In an enterprise information environment, permissions are not simple IT switches. They are the enforcement layer of governance. Version control, workflow approvals, records holds, and selective publication all depend on them. Effective permission models regulate not only what users can do, but also when and why. A document that can be edited today might be locked tomorrow under a regulated process or legal hold. This dynamic control ensures that information remains traceable and trustworthy, even as it moves through complex life cycles and collaborative environments. Modern EIM systems achieve this precision through granular permissions. Every object (document, folder, workflow, or image) carries its own security profile, defining access for each user and group across the system. At enterprise scale, where users may number in the hundreds of thousands, these models can translate into billions of unique permission combinations. Yet this complexity is necessary. Without the ability to assign security at the most granular level, an information system cannot truly be considered secure. It's this flexibility that allows organizations to emulate the physical controls of a secure workspace, digitally and at scale.
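The idea of a per-object security profile, plus a legal hold that locks a document that was editable yesterday, can be sketched as follows. This is a minimal illustration under stated assumptions: `SecuredObject`, `is_allowed`, the action names, and the rule that a hold blocks both edits and deletes are hypothetical choices for this example, not a description of how any specific platform implements permissions.

```python
from dataclasses import dataclass, field


@dataclass
class SecuredObject:
    """Hypothetical governed object that carries its own security profile."""
    name: str
    # Maps a principal (user name or group name) to the actions it may perform.
    acl: dict[str, set[str]] = field(default_factory=dict)
    legal_hold: bool = False


def is_allowed(obj: SecuredObject, user: str, groups: set[str], action: str) -> bool:
    """Check one action for one user against the object's own profile."""
    # Assumption: a legal hold freezes content, so edits and deletes are
    # refused regardless of what the access control list grants.
    if obj.legal_hold and action in {"edit", "delete"}:
        return False
    principals = {user} | groups
    return any(action in obj.acl.get(p, set()) for p in principals)


contract = SecuredObject(
    "contract.docx",
    acl={"legal-team": {"view", "edit"}, "alice": {"view"}},
)

print(is_allowed(contract, "bob", {"legal-team"}, "edit"))  # True
contract.legal_hold = True                                  # the hold arrives
print(is_allowed(contract, "bob", {"legal-team"}, "edit"))  # False
print(is_allowed(contract, "bob", {"legal-team"}, "view"))  # True
```

Because the decision is made per object and per request, the same check could be applied to a human user or to a software agent acting under a named identity.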
As AI becomes another user in the system, those same permission structures must extend to intelligent agents. If a document is confidential, the AI must know it too. Permissions are no longer just about control; they're about confidence, ensuring that every person and every system interacts with information purposefully, within defined boundaries, and under full accountability. This is how organizations protect privacy, preserve competitive advantage, and ensure that AI operates safely inside the zones it's authorized to learn from. The following case study, featuring a European investment bank, illustrates the effective use of permissions in classifying documents across locations to achieve operational and governance objectives.
Case study: a European bank. An investment bank in Europe finances capital investments aligned with European Union policy objectives, the literal infrastructure of a more integrated Europe. With operations spanning roughly 150 countries outside the EU, secure and efficient remote access to documents isn't a convenience. It's mission critical. To achieve this, the bank implemented an EIM system as part of a broader IT modernization effort to transform every major process in the bank: borrowing, lending, and administration. The system was fully embedded into the bank's IT ecosystem so that content, data, and workflows could move seamlessly between systems, ensuring consistency and compliance across all operations. Governance lies at the heart of this system. The investment bank developed a bank-wide taxonomy that defines not only how content is categorized, but how it aligns with business processes and regulatory frameworks. Based on international best practices, including the DIRKS methodology and the ISO 15489 standard, the taxonomy is paired with a sophisticated access control model that applies at the highest level of classification. Together, these models form a living governance framework.
The taxonomy maps what information exists and where it belongs, while the permissions model governs who can access it, under what conditions, and for what purpose. The result is a digital knowledge map that reflects the institution's structure, accountability, and decision rights. This disciplined information architecture now provides the foundation for AI enablement. With a consistent taxonomy and granular permissioning, the bank can train AI tools to retrieve, summarize, and classify documents safely, knowing that every action taken by an intelligent agent adheres to the same access and compliance rules as a human user. Governance ensures that AI doesn't just automate tasks, but operates within the same boundaries of trust that define the bank's human workflows. The results validate the approach. Two months after launch, user adoption rates were 20% higher than projected, with 100% of vital records from new lending and borrowing operations supported in the system. Within weeks, the repository contained over 600,000 documents, growing by roughly 100,000 per week. This success demonstrated that when governance, taxonomy, and access control work together, they don't slow innovation; they make it scalable.
Pillar three, retention and lifecycle management, knowing when to keep and when to let go. Information governance isn't just about storage; it's about stewardship. Every piece of content has a life: creation, use, revision, retention, and eventual disposal. Managing that life cycle is how organizations stay compliant, efficient, and sustainable. Regulated information or records come from every corner of the enterprise: ERP and CRM systems, email, documents, scanned paper, telemetry, medical devices, even aviation maintenance systems. The first step in governance is capture, bringing this information in through controlled channels such as digital mailrooms, system connectors, or APIs.
Each record must arrive with its metadata, timestamps, and provenance intact to ensure authenticity and legal defensibility. Governance begins not when information is stored, but at the moment it enters the system, when the trail of trust is first created. Once captured, records are distributed across layered storage environments that reflect their purpose and risk profile. Operational systems handle active records. Content management repositories enforce versioning, classification, and retention, and immutable archives preserve communications and evidence for legal or regulatory use. In analytics environments, regulated data may be tokenized or masked to protect privacy while still enabling insight. Long-term archives on tape, in object storage, or in sovereign clouds provide non-erasable retention where required by law. An EIM platform embeds retention policies directly into the systems where content lives. This ensures that the same document that supports a business decision today can be archived tomorrow or automatically deleted when its legal or operational value expires. The same principles that govern enterprise records now extend to artificial intelligence. As AI systems generate, consume, and learn from enterprise content, their inputs and outputs must be treated with the same rigor as regulated data. Each model interaction becomes its own record, subject to capture, classification, retention, and auditability. Governance ensures that AI learns from trusted sources, acts within defined boundaries, and produces outcomes that are explainable, defensible, and aligned with enterprise policy. In this way, the disciplines that built confidence in data governance become the guardrails for responsible intelligence. Pillar four, auditability. Proof that governance works. In traditional records management, auditability meant logs, version histories, and paper trails. In the AI era, it also means model transparency, understanding what information shaped an outcome. 
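The statement that each model interaction becomes its own record, subject to capture, retention, and auditability, can be sketched as a small data structure. This is a hedged illustration only: `InteractionRecord`, its fields, the sample identifiers, and the 365-day retention period are assumptions invented for the example, not legal guidance or a product schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class InteractionRecord:
    """Hypothetical record of a single model interaction."""
    prompt: str
    response: str
    source_doc_ids: tuple[str, ...]  # which governed content shaped the output
    captured_on: date
    retention_days: int              # illustrative policy value only

    def dispose_after(self) -> date:
        """Last day the record must be kept under its retention policy."""
        return self.captured_on + timedelta(days=self.retention_days)

    def is_due_for_disposal(self, today: date) -> bool:
        """True once the retention period has fully elapsed."""
        return today > self.dispose_after()


# Sample values are made up for illustration.
rec = InteractionRecord(
    prompt="Summarize the claim file",
    response="(model output)",
    source_doc_ids=("claim-file-a", "policy-doc-b"),
    captured_on=date(2025, 1, 1),
    retention_days=365,
)

print(rec.dispose_after())                        # 2026-01-01
print(rec.is_due_for_disposal(date(2026, 1, 1)))  # False
print(rec.is_due_for_disposal(date(2026, 1, 2)))  # True
```

Keeping `source_doc_ids` on the record is what lets an auditor later answer which information shaped a given outcome. The frozen dataclass stands in, loosely, for the immutability expected of a captured record.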
Auditability should be incorporated into the content lifecycle. Every document, transaction, and system event carries a traceable history of changes and approvals. When extended to AI, the same principle provides explainability, showing not just what a model decided, but which data contributed to that decision. This visibility is what turns governance into trust. Auditability assures regulators, executives, and the public that automation operates within defined boundaries. It transforms compliance from a reactive process into a verifiable standard for responsible intelligence. These governance capabilities are bundled in an EIM platform in the cloud. With the evolution of AI, governance has to come first because it defines the rules of engagement between people, data, and intelligent systems. Metadata provides the map. Permissions define access. Life cycle management ensures balance, and auditability proves accountability. Without these foundations, AI is guessing. With them, it becomes part of a disciplined information ecosystem, one that learns, reasons, and acts within the boundaries that make intelligence safe, lawful, and human-aligned.
More than 100,000 rules and regulations, and growing.
North America: Dodd-Frank, PCI DSS, PIPEDA, SEC Rule 17a-4, Sarbanes-Oxley.
Europe and Asia: Basel III with Basel II, Capital Accord, Financial Services Authority, UK Bribery Act, BSI, PD5000, Mobile Payments Security in Europe, UAE wallet, PSD2, Financial Inclusion, SEPA/e-SEPA, SEPA for Cards, NPCI.
Global: FATCA, Basel III, Capital Norms, Basel and Intraday Liquidity Norms, Real-Time Retail Payments, Anti-Money Laundering (AML), Anti-Terrorism Financing (ATF), ISO 20022 Standards and Payments, CPSS-IOSCO.
A complex governance landscape. In today's global marketplace, the regulatory landscape is complex, especially for global firms. Organizations are subject to industry-specific regulations and standards as well as regional or national regulations.
According to these regulations, they are held accountable for their actions and must be able to access years of historical data in response to requests for information at any given time. The relationship between compliance and governance is reciprocal. Compliance serves as a driver for information governance, and information governance, in turn, can simplify compliance. In the face of growing volumes of data, there is a strong need for governance programs that help organizations benefit from better management of their information. Enterprises that implement EIM as a governance platform are realizing the opportunities it gives them to drive business transformation efficiently and successfully through optimized intelligence and AI.
Compliance, sovereignty, and the shape of modern governance. Data sovereignty has moved from a compliance footnote to a design principle. It now defines where data lives, who can reach it, and under what jurisdiction those actions fall. In an AI-driven world, this matters deeply. Models trained or hosted in one region may still be governed by the laws of another. The result is that sovereignty is no longer a legal abstraction. It's an architectural constraint. Every storage choice, API, and training data set must now account for the regulatory geography it touches.
Privacy laws and regional regulation. Europe set the global pace with the General Data Protection Regulation (GDPR). It transformed cross-border data flows from a technical assumption into a legal engineering challenge. GDPR codified principles of lawfulness, purpose limitation, minimization, and accountability, mandating clear consent, impact assessments, and data subject rights. Its enforcement has reshaped how enterprises design systems: data inventories, metadata-driven workflows, and automated deletion policies are now baseline governance features, not optional controls. The EU Data Act extends those ideas to cloud mobility.
Providers must now support interoperability and enable customers to exit cloud environments freely. No lock-ins, no excessive transfer fees. For architects, that means building with portability in mind: open standards, reversible formats, and cloud independence by design. Sovereignty in Europe is being legislated into existence. Across the Atlantic, the United States is assembling a de facto federal standard one state at a time. Over twenty states now enforce their own comprehensive privacy laws, each defining consent, sensitive data, and user rights differently. This patchwork demands policy as code: automated rules that adapt to jurisdictional nuance and ensure the right law applies to the right record, user, or transaction. The CLOUD Act further complicates matters by granting U.S. authorities access to data held by U.S.-based providers, even when stored abroad, forcing global enterprises to think carefully about contractual control and storage sovereignty.
Canada's layered model. Canada approaches sovereignty through layered accountability. Federally, PIPEDA, the Personal Information Protection and Electronic Documents Act, sets the baseline for responsible handling of personal information, permitting cross-border transfers but maintaining that organizations remain accountable for protections end to end. Its proposed replacement, the Digital Charter Implementation Act (Bill C-27), introduces the Consumer Privacy Protection Act (CPPA), a data protection tribunal, and the Artificial Intelligence and Data Act, designed to govern responsible AI development and deployment. At the provincial level, compliance becomes more granular. Reforms to BC's FOIPPA, the Freedom of Information and Protection of Privacy Act, have relaxed residency restrictions for public data, while Ontario's health privacy law, the Personal Health Information Protection Act (PHIPA), continues to enforce strict standards for health information.
Financial regulators such as OSFI, the Office of the Superintendent of Financial Institutions, through Guideline B-10, have elevated data location and cloud oversight to board-level responsibilities. The message is consistent. Sovereignty in Canada is both practical and provincial, demanding careful mapping of where data resides and who can access it.
Global compliance and industry-specific rules. Beyond North America and Europe, data sovereignty pressures are global. China's Personal Information Protection Law requires explicit security assessments for cross-border transfers of important data. India's Digital Personal Data Protection Act adds localization-style provisions and new transfer conditions that could influence where and how AI workloads operate. Each regulation tightens expectations for lawful transfer, explicit consent, and retention discipline. Industry regulations amplify these demands. The U.S. HIPAA (Health Insurance Portability and Accountability Act) framework governs electronic health records with strict access control, encryption, and breach notification requirements. The FDA (Food and Drug Administration) regulation 21 CFR Part 11 sets standards for trustworthy electronic records and signatures in regulated manufacturing and clinical environments. The FTC, or Federal Trade Commission, enforces reasonable care over consumer data security, while the FAA, or Federal Aviation Administration, defines authenticity and traceability requirements for digital maintenance records in aviation. Together they reinforce a common truth. Regulated data must be captured, retained, and auditable throughout its life cycle.
Sovereignty and AI. For artificial intelligence, these laws translate into operational limits and design choices. Enterprise AI cannot learn from what it cannot lawfully see. Sovereign clouds, built to localize storage, processing, and access, are becoming the architectural answer to regulatory fragmentation.
They allow organizations to deploy AI where the data lives, respecting jurisdictional boundaries while maintaining the control needed for compliance and trust. As enterprise AI models grow more agentic, sovereignty will define their perimeter: what they're allowed to read, what they can retain, and how their actions are logged and explained. Compliance is no longer about static records. It's about dynamic systems that think, learn, and act under legal supervision. An enterprise information management platform with embedded governance is now the operating system for intelligence itself.
Governance as the operating system of trust. In a world where information moves across borders, clouds, and algorithms, governance defines the rules of engagement. It ensures that data remains accurate, traceable, and defensible, no matter where it travels or how it's used. Effective governance requires more than documentation. It demands executive sponsorship, streamlined processes, automated policy enforcement, and identity-centric security. It requires that every action on data, from capture to disposal, be visible, auditable, and aligned with legal and ethical standards. Modern information management frameworks now embed these controls directly into daily operations. Metadata, permissions, and retention aren't afterthoughts; they're embedded logic that keeps systems honest and AI accountable. By treating information as a managed asset, one with provenance, purpose, and lifecycle, organizations transform governance from a cost of doing business into a source of competitive advantage. Ultimately, governance is what allows enterprises to trust their intelligence. It connects the ethics of how we manage information with the mechanics of how AI learns it. When done right, governance doesn't slow innovation; it makes it sustainable. We discuss the governance of AI in more detail in the following chapter.
The FAST Five Download.
One, make governance an enterprise mandate.
Establish a cross-functional governance council that includes business, IT, legal, and compliance leaders. Give it authority to set enterprise-wide data policies, approve AI use cases, and monitor adherence. Governance isn't an IT project; it's a management discipline.
Two, operationalize permissions and access control. Move from static role-based permissions to dynamic policy-driven access. Map who can see or use specific information and extend those same controls to AI systems. Treat every AI interaction as a governed event with audit trails, expiry dates, and explicit accountability.
Three, map and classify critical data assets. Conduct enterprise data inventories to locate high-value, regulated, and sensitive information. Use EIM tools to tag content with metadata (ownership, sensitivity, and retention) so it can be used safely for AI training, automation, and analytics.
Four, embed compliance and sovereignty into architecture. Design for jurisdictional complexity from the start. Choose sovereign or regional cloud configurations where data residency matters. Automate compliance through metadata and policy as code, so rules about where data can live or move are enforced by design, not by audit.
Five, govern AI like you govern data. Treat models as managed assets with the same expectations as data: documented provenance, life cycle control, and retraining governance. Require that every AI initiative demonstrate lawful data use, explainable decisions, and measurable ROI before it scales.
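As an illustration of the fourth point, policy as code for data residency can be as small as a lookup that is consulted before any record moves. The sketch below is a simplified assumption: the residency tags, the region names, and the `can_move` function are hypothetical and do not encode the actual requirements of any law mentioned in this chapter.

```python
# Hypothetical rules: which storage regions each residency tag permits.
RESIDENCY_RULES: dict[str, set[str]] = {
    "eu-only": {"eu-west", "eu-central"},
    "canada-only": {"ca-central"},
    "unrestricted": {"eu-west", "eu-central", "ca-central", "us-east"},
}


def can_move(residency_tag: str, target_region: str) -> bool:
    """Return True only if the record's residency tag permits the target region."""
    # Unknown or missing tags are denied by default, so an untagged
    # record cannot leave its current location by accident.
    return target_region in RESIDENCY_RULES.get(residency_tag, set())


print(can_move("eu-only", "eu-central"))  # True
print(can_move("eu-only", "us-east"))     # False
print(can_move("untagged", "eu-west"))    # False
```

The deny-by-default branch is the design choice that matters here: the rule is enforced at the moment of movement through the record's metadata, rather than discovered afterwards in an audit.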