Overview
CIOs should treat artificial intelligence (AI) models as lifecycle-managed dependencies, not permanent infrastructure. Sora is the warning shot: OpenAI says Sora web and app experiences were discontinued on April 26, 2026, and the Sora API is scheduled for discontinuation on September 24, 2026.1 The practical takeaway is not to avoid managed AI platforms; it is to ensure that no production workflow depends on a model you cannot replace.
Executive Decision: For production AI systems in insurance, financial services, and government/public sectors, approve new deployments only when the model dependency is inventoried, observable, regression-tested, and replaceable within a defined outage tolerance.
What Is Happening
AI models now behave like volatile platform dependencies. OpenAI’s deprecation page lists multiple retired or scheduled-for-retirement models, APIs, and snapshots, including the Videos API and Sora 2 model aliases scheduled for API removal on September 24, 2026.2 That makes Sora memorable, but not unusual.
Companion Tools
Download the AI Model Dependency Register and AI Model Migration Readiness Scorecard companion tools from the resource banner. These tools will show which AI workflows can proceed, which need remediation, and where executive risk acceptance is being used as a substitute for operational resilience.
The pattern is cross-vendor. Anthropic describes model states as active, legacy, deprecated, and retired. Requests to retired Claude models fail.3 Google says that once a Gemini API model is shut down, it is completely turned off and the endpoint is no longer available.4 Amazon Bedrock uses active, legacy, and end-of-life states, with transition planning, potential higher pricing during extended access, and failed requests after end-of-life unless a private access arrangement exists.5 Microsoft Foundry Models move through a lifecycle from preview to general availability to eventual retirement so customers can evaluate replacements and migrate workloads.6
While this is normal product hygiene for providers, it is a significant operating risk for enterprises that treat model names as if they were infrastructure primitives.
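These provider vocabularies differ, but they can be normalized internally. A minimal sketch follows, assuming a single internal taxonomy; the state names and the fail-closed behavior are illustrative assumptions, not any vendor's API.

```python
from enum import Enum

class ModelLifecycleState(Enum):
    """Illustrative internal taxonomy; each provider uses its own
    names and semantics for these stages."""
    ACTIVE = "active"          # supported for new workloads
    LEGACY = "legacy"          # still served, successor announced
    DEPRECATED = "deprecated"  # retirement date published
    RETIRED = "retired"        # requests fail, endpoint removed

def assert_model_usable(model_id: str, state: ModelLifecycleState) -> None:
    """Fail closed: treat a retired model as an outage, not a warning."""
    if state is ModelLifecycleState.RETIRED:
        raise RuntimeError(f"{model_id} is retired; invoke fallback route")
    if state is ModelLifecycleState.DEPRECATED:
        # Surface the deadline in monitoring instead of waiting for
        # the endpoint to disappear.
        print(f"WARNING: {model_id} is deprecated; schedule migration")
```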
The Implications
The CIO must think about two risks: outages and behavioral drift.
A retired endpoint is easy to explain: the call fails. Explaining a replacement model is harder. Outputs may shift in tone, format, refusal behavior, reasoning style, latency, cost, citation discipline, or tolerance for ambiguous prompts. For a developer demo, that is manageable. For a claims triage workflow, credit policy assistant, citizen-service chatbot, fraud investigation aid, or contact-center knowledge tool, “mostly similar” can still fail the control intent.
Anthropic recommends testing applications with newer models before retirement dates.3 Microsoft similarly frames lifecycle management around giving customers time to evaluate replacements and migrate workloads.6 CIOs should read that as a governance requirement. Model migration is not a patch. It is a business-behavior change that needs evidence.
The Tactive View
The current market question, “Which model is best?” is too narrow. The better operating question is: “How much of our business process have we quietly coupled to a model we do not control?”
The failure mode is not model retirement. The failure mode is treating model retirement as a vendor notice instead of a resilience event. Treat model replaceability as an operational resilience control, not an AI architecture preference.
Model retirement should trigger the same executive reflex as a material supplier change. CIOs should ask: what changes, who is exposed, what evidence proves continuity, and where do we need risk acceptance?
The model may be technical, but the exposure is operational: changed outputs, changed controls, changed costs, changed evidence, and changed accountability. NIST’s AI Risk Management Framework is designed to help organizations manage AI risks to individuals, organizations, and society, and its generative AI profile emphasizes trustworthiness considerations across design, development, use, and evaluation.7,8
Do not suddenly ban managed models or insist every workload move to self-hosted open-weight models. That simply trades provider lifecycle risk for operating, security, infrastructure, talent, and cost risk. The board-safe posture is narrower: use managed models where they fit, but design operational AI so the model can change without breaking the service promise.
What Changes Operationally
Production AI needs a service-management treatment, not a lab treatment.
Teams must be able to trace each business service to the specific model it uses—including the provider, region, endpoint, contract path, data boundary, and fallback route. Architecture must separate business logic from model-specific prompts, request formats, model identifiers, and response parsing. Risk and compliance teams need to know when a model change affects explainability, recordkeeping, customer communications, or regulated decisions. Procurement must stop treating AI model access as a generic cloud consumption line and start asking what happens at deprecation, emergency retirement, price change, and regional unavailability.
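A minimal sketch of that trace, assuming one record per service-to-model link; every field name here is illustrative rather than a standard schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelDependencyRecord:
    """One row of the business-service-to-model trace described above."""
    business_service: str   # e.g., "claims-intake"
    workflow: str           # e.g., "first-notice-of-loss triage"
    provider: str           # managed platform serving the model
    model_id: str           # exact model or snapshot identifier
    region: str
    endpoint: str
    contract_path: str      # which agreement governs this access
    data_boundary: str      # what data may cross this endpoint
    fallback_route: str     # alternate model, manual queue, or "none"
    service_owner: str      # accountable person, not just a team
```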
The subtle operational mindset change is ownership. A model used by a production service should have an accountable service owner, instead of just an application team that “knows the prompt.”
For this article, a material AI workflow is one where model behavior can affect money, rights, obligations, regulated records, security outcomes, customer commitments, or public trust. The goal is not to impose enterprise architecture ceremony on every AI experiment; it is to apply controls that are proportional to the risk of the workflow and the cost of enforcing them.
Minimum Viable Controls for First Release
Before a material AI workflow goes live, require six controls (a minimal gate-check sketch follows the list):
- model registry entry;
- named lifecycle owner;
- representative prompt/output baseline;
- vendor deprecation watch source;
- fallback or degraded operating mode; and
- telemetry for model, provider, token usage, latency, error rate, and request volume.
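As referenced above, a minimal gate-check sketch, assuming each workflow is represented as a simple dictionary of control flags; the field names are hypothetical.

```python
# The six minimum viable controls listed above, as a go-live gate.
REQUIRED_CONTROLS = [
    "registry_entry",
    "lifecycle_owner",
    "output_baseline",
    "deprecation_watch_source",
    "fallback_mode",
    "telemetry",
]

def missing_controls(workflow: dict) -> list[str]:
    """Return the controls still missing; an empty list means the
    workflow may proceed to production review."""
    return [c for c in REQUIRED_CONTROLS if not workflow.get(c)]

# Example: an experiment promoted to production without an owner.
gaps = missing_controls({"registry_entry": True, "telemetry": True})
# gaps -> ["lifecycle_owner", "output_baseline",
#          "deprecation_watch_source", "fallback_mode"]
```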
Table 1 sets the control posture by workflow materiality so low-risk experiments are not burdened with production-grade review. For organizations with mature telemetry, OpenTelemetry’s generative AI semantic conventions provide a useful emerging vocabulary for GenAI events, metrics, model spans, provider-specific attributes, and client operations (see the sketch after Table 1).9
| Workflow type | Required posture |
|---|---|
| Experiment or internal productivity | Inventory and acceptable-use guardrails |
| Non-critical operational support | Inventory, owner, telemetry, and lifecycle watch |
| Material workflow | Full approval gate, output baseline, fallback mode, and remediation path |
| Regulated or public-trust workflow | Full approval gate plus audit evidence and human-review design |
Table 1. AI workflow materiality tiers and required control posture
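Where telemetry already runs on OpenTelemetry, the GenAI semantic conventions noted above give model calls a shared vocabulary. A minimal sketch, assuming a configured Python SDK; the gen_ai.* attribute names follow the published conventions but are still marked experimental and may change.

```python
from opentelemetry import trace

tracer = trace.get_tracer("claims-intake-service")

def record_model_call(model_id: str, input_tokens: int,
                      output_tokens: int) -> None:
    # Span and attribute names follow the draft GenAI semantic
    # conventions; treat them as evolving, not settled.
    with tracer.start_as_current_span(f"chat {model_id}") as span:
        span.set_attribute("gen_ai.operation.name", "chat")
        span.set_attribute("gen_ai.request.model", model_id)
        span.set_attribute("gen_ai.usage.input_tokens", input_tokens)
        span.set_attribute("gen_ai.usage.output_tokens", output_tokens)
```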
Decision Aid: AI Model Dependency Approval Gate
Table 2 converts model lifecycle risk into approval outcomes that architecture, risk, procurement, and service owners can apply during production review.
| Control question | Minimum evidence | Approval outcome |
|---|---|---|
| Where is the model used? | Service, workflow, model ID, provider, region, endpoint, API version, owner | Unknown dependency = Hold |
| What breaks if it retires? | Business impact, user group, outage horizon, manual workaround | High impact without workaround = Executive risk acceptance |
| Can we swap it? | Provider-specific code isolated; prompts and response schema documented | Hard-coded dependency = Approve with remediation |
| Will behavior change be detected? | Prompt/output baseline, refusal examples, formatting expectations, latency and cost thresholds | No baseline = Hold for material workflows |
| Are lifecycle signals monitored? | Vendor deprecation calendar, alert owner, quarterly review | No watch owner = Approve only for non-critical use |
| Is telemetry sufficient? | Model, provider, token use, latency, error rate, request volume | Missing telemetry = No scale |
| What is the degraded mode? | Manual queue, alternate model, feature disablement, customer messaging | No degraded mode = Executive risk acceptance |
Table 2. AI model dependency approval gate for production use
Use Table 2 as the approval shortcut: non-critical use cases may proceed with remediation; material workflows affecting money, rights, obligations, or public trust should not scale until all “Hold” items are closed.
- Green: proceed when ownership, telemetry, lifecycle watch, and fallback expectations are clear for the workflow type.
- Amber: proceed with remediation when the workflow is non-critical and gaps are owned, dated, and tracked.
- Red: hold or require executive risk acceptance when a material workflow lacks a behavioral baseline, fallback mode, lifecycle owner, or auditable evidence path.
Execution reality: most organizations can implement the inventory and lifecycle-watch controls quickly. The heavier lift is creating behavioral baselines, fallback modes, and model abstraction for workflows where AI affects money, rights, obligations, security, or public trust.
Where to Focus First
Start with workflows where model behavior affects money, rights, obligations, or public trust.
- Insurance: focus first on claims triage, fraud indicators, underwriting assistance, and service-agent summarization. The first control is the output baseline: insurers need evidence that replacement models preserve structured fields, escalation triggers, and explanation quality.
- Financial Services: focus first on customer communications, investigation support, policy interpretation, software development assistants touching regulated systems, and operational resilience reporting. The first control is auditability: log the model, prompt version, output version, human review, and final decision path.
- Government/public sector: focus first on citizen-facing service assistants, caseworker copilots, grant or benefits intake, and security operations. The first control is degraded mode, where public agencies would need a human-review or service-continuity path when a model is withdrawn, unavailable, or behaviorally unsuitable. NASCIO’s 2026 Top Ten puts AI in the number-one position for state CIOs, while budget and cost control moved upward. Its 2026 cybersecurity study says CISOs are taking on AI governance and whole-of-state cybersecurity responsibilities amid tightening budgets and workforce challenges.10,11
Scenario: Insurance Claims Triage Under Model Retirement
This scenario shows how model retirement moves from a technical lifecycle notice into a business-process decision. Claims triage is a useful example because a model change can simultaneously affect structured data extraction, fraud escalation, manual workload, customer response time, and auditability. Not every AI use case carries this level of risk. The point is that material workflows need a migration plan before the vendor’s retirement date becomes the organization’s operating deadline.
An insurer uses a managed large language model to summarize incoming property claims, extract key fields, and route suspicious cases to special investigations. The affected workflow is first-notice-of-loss triage. The failed dependency is a deprecated model snapshot embedded in the claims intake service, with response parsing tuned to that model’s output format.
The outage horizon is 30 days before retirement, when testing shows the replacement model returns better prose but less consistent structured fields. The degraded operating mode is a partial manual review queue for claims over a fraud-risk threshold, with automated summarization disabled for complex cases. The executive decision is to migrate the low-risk claims path first, hold high-risk fraud triage on the older model until validation is complete, and fund a prompt/output regression harness before the retirement date.
The tradeoff is speed versus control. Migration could be faster if all claims moved at once, but the insurer would risk routing errors in the workflow where mistakes are most expensive. The defensible decision is segmented migration: keep throughput for routine claims while protecting the fraud-control path.
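A minimal sketch of that segmented routing decision; the model names, fraud-risk threshold, and queue name are hypothetical.

```python
OLD_MODEL = "legacy-snapshot"    # validated for fraud triage
NEW_MODEL = "replacement-model"  # validated for routine claims only
FRAUD_RISK_THRESHOLD = 0.7       # illustrative escalation cutoff

def route_claim(fraud_risk_score: float, complex_case: bool) -> str:
    """Keep the expensive-to-get-wrong path on the validated model
    until the regression harness clears the replacement."""
    if complex_case:
        return "manual-review-queue"  # summarization disabled
    if fraud_risk_score >= FRAUD_RISK_THRESHOLD:
        return OLD_MODEL              # hold until validation completes
    return NEW_MODEL                  # migrate the low-risk path first
```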
Recommended Actions
Next 30 days: create the AI dependency inventory for production and near-production use cases. Find model IDs, endpoints, providers, regions, service owners, business workflows, data categories, and known deprecation dates. Add a simple policy: no production AI workflow without an owner and a retirement watch.
Next 60 days: build a model lifecycle calendar and connect it to change management. Vendor deprecation pages should not live in a developer’s browser bookmarks. Assign architecture or platform engineering to monitor notices, open change records, and maintain a 90-to-180-day migration view.
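A minimal sketch of the 90-to-180-day migration view, assuming deprecation dates are already captured in a simple calendar; opening the change record itself is organization-specific and omitted here.

```python
from datetime import date, timedelta

# Hypothetical calendar: model ID -> published retirement date.
DEPRECATION_CALENDAR = {
    "legacy-snapshot": date(2026, 9, 24),
}

def models_in_migration_window(today: date,
                               window_days: int = 180) -> list[str]:
    """Return models retiring within the migration view so a change
    record can be opened before the deadline becomes an outage."""
    horizon = today + timedelta(days=window_days)
    return [model for model, retires in DEPRECATION_CALENDAR.items()
            if retires <= horizon]
```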
Next 90 days: preserve behavioral baselines. Keep representative prompts, expected output structures, refusal examples, exception cases, latency thresholds, token-cost baselines, and human-review triggers. This is not a benchmarking contest; it is a regression set for business behavior.
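A minimal sketch of one such regression check, assuming a claims-triage baseline; the required fields and latency budget are illustrative, not a standard.

```python
# Business-behavior baseline for one workflow; values are assumptions.
REQUIRED_FIELDS = {"claim_id", "loss_type", "fraud_indicators", "escalate"}
MAX_LATENCY_SECONDS = 5.0

def check_against_baseline(output: dict,
                           latency_seconds: float) -> list[str]:
    """Compare a candidate model's output to the baseline; any
    failure blocks migration rather than merely warning."""
    failures = []
    missing = REQUIRED_FIELDS - output.keys()
    if missing:
        failures.append(f"missing structured fields: {sorted(missing)}")
    if latency_seconds > MAX_LATENCY_SECONDS:
        failures.append(f"latency {latency_seconds:.1f}s exceeds budget")
    return failures
```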
Contracting checkpoint: for material AI workflows, require procurement and legal to review model lifecycle terms before renewal or production scale. The minimum negotiation points are retirement notice period, emergency discontinuation rights, legacy-access pricing, migration support, data retention or deletion at end-of-life, regional availability commitments, audit evidence, and whether service-level agreements apply to the specific model or only to the broader platform. Do not assume an enterprise cloud agreement automatically gives sufficient protection for model retirement.
Next two quarters: introduce minimum viable model abstraction for material workflows. Separate provider-specific calls from business rules, prompt templates, response validation, audit logging, and fallback logic. Avoid building a grand universal abstraction layer unless multiple critical services actually need it.
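A minimal sketch of that separation, assuming a claims-summarization workflow; the interface and fallback behavior are illustrative.

```python
from typing import Protocol

class ClaimSummarizer(Protocol):
    """The business-facing contract; callers never see a model ID."""
    def summarize(self, claim_text: str) -> dict: ...

def triage(claim_text: str, primary: ClaimSummarizer,
           fallback: ClaimSummarizer) -> dict:
    """Business rules depend on the contract; fallback logic lives
    here, not inside each provider integration."""
    try:
        return primary.summarize(claim_text)
    except Exception:
        # Audit logging would record which route served the request.
        return fallback.summarize(claim_text)
```

Each provider integration then implements summarize behind the contract, so swapping Model A for Model B changes one adapter rather than the business rules.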
Within 12 months: perform one model evacuation drill for a high-value but bounded workflow. The point is to prove the organization can move from Model A to Model B without discovering ownership, testing, logging, procurement, and legal issues during a forced retirement.
Recommended Owner Mapping
The action plan in Table 3 assigns directional ownership so the CIO can move from recommendation to delegation without creating a full responsibility assignment matrix.
| Action | Primary owner | Supporting owners |
|---|---|---|
| 30-day dependency inventory | Enterprise Architecture or AI Governance | Platform Engineering, Service Owners, Security |
| 60-day lifecycle calendar | Platform Engineering or Architecture Governance | Vendor Management, Change Management |
| 90-day behavioral baselines | Product/Application Owners | Quality Engineering, Risk, Business Process Owners |
| Contracting checkpoint | Procurement / Vendor Management | Legal, Security, Enterprise Architecture |
| Two-quarter abstraction | Application Architecture / Platform Engineering | Security, Service Owners, Procurement |
| 12-month evacuation drill | Service Owner | Platform Engineering, Risk, Business Continuity, Business Owner |
Table 3. Directional ownership for recommended AI model lifecycle actions
Tradeoffs and Execution Burden
- AI dependency inventory: Light to moderate. Likely owners: enterprise architecture, platform engineering, AI governance, and service owners. Bottlenecks: shadow AI, unclear application ownership, and missing cost-center tagging. Friction point: teams may resist exposing experimental dependencies that have quietly become production.
- Lifecycle calendar and change trigger: Light. Likely owners: architecture governance or platform operations. Bottlenecks: inconsistent vendor notifications and fragmented provider usage. Friction point: this looks administrative until a retirement notice arrives.
- Prompt/output regression baseline: Moderate. Likely owners: product teams, quality engineering, risk, and business process owners. Bottlenecks: lack of historical prompt logs, privacy constraints, and disagreement over “acceptable” output. Friction point: business owners must define quality, not just approve technology.
- Model abstraction layer: Moderate to heavy. Likely owners: application architecture, platform engineering, security, and vendor management. Bottlenecks: hard-coded model assumptions, provider-specific tooling, and response-schema fragility. Friction point: abstraction can become over-engineering unless limited to material workflows.
- Open-weight or self-hosted model option: Heavy. Likely owners: architecture, security, infrastructure, data science, legal, and finance. Bottlenecks: graphics processing unit capacity, patching, model security, evaluation skill, and operational support. This is justified only where control, sovereignty, cost predictability, or continuity outweighs the burden.
Bottom Line
Act now, but narrowly. AI model retirement is not an edge case; it is a predictable feature of managed AI platforms. Model replaceability is an operational resilience control, not an AI architecture preference. CIOs should require production AI systems to be inventoried, monitored, tested, abstracted where the dependency is material, and replaceable.
The board-safe message is simple: “We are not betting the business on one model. We are using AI as a managed service dependency with lifecycle controls.” That is less exciting than chasing the best model of the month, but it is also more likely to survive the quarter.
Evidence and Sources
- OpenAI Help Center. 2026. “What to Know about the Sora Discontinuation.” Updated April 2026.
- OpenAI Developers. 2026. “Deprecations.” OpenAI API Documentation.
- Anthropic. 2026. “Model Deprecations.” Claude API Docs.
- Google AI for Developers. 2026. “Gemini Deprecations.” Gemini API Documentation.
- Amazon Web Services. 2026. “Model Lifecycle.” Amazon Bedrock User Guide.
- Microsoft Learn. 2026. “Foundry Models Lifecycle and Support Policy.” Updated April 24, 2026.
- National Institute of Standards and Technology. 2023. “AI Risk Management Framework.” NIST.
- Autio, Chloe, Reva Schwartz, Jesse Dunietz, Shomik Jain, Martin Stanley, Elham Tabassi, Patrick Hall, and Kamie Roberts. 2024. “Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile.” National Institute of Standards and Technology, July 26, 2024.
- OpenTelemetry. 2026. “Semantic Conventions for Generative AI Systems.” OpenTelemetry Documentation.
- NASCIO. 2025. “State CIO Top Ten Policy and Technology Priorities for 2026.” National Association of State Chief Information Officers, December 15, 2025.
- NASCIO and Deloitte. 2026. “2026 NASCIO-Deloitte Cybersecurity Study.” National Association of State Chief Information Officers, April 27, 2026.