Most carriers are still pricing AI like SaaS — fixed seats,…
Observation
Enterprises are blowing through AI budgets in months, not years. Uber's engineers burned their entire 2026 token allocation in four months. 71% of companies exceeded AI budgets in 2025.
Angle
Most carriers are still pricing AI like SaaS — fixed seats, predictable renewals. Agentic workloads don't work that way. The cost unit is consumption, not license, and it scales with usage in ways finance teams have no model for yet.
Implication for P&C carriers
P&C carriers running claims triage, subrogation, or underwriting agents need consumption-based cost models before they scale. A single complex claims workflow can consume 55x the tokens of a simple query. Without architectural guardrails, budget overruns aren't a risk; they're the default outcome.
Most insurers are building AI business cases using SaaS math. That's the wrong model entirely.
Uber's 5,000 engineers burned through their entire 2026 AI token budget in four months. ServiceNow did the same. 71% of enterprises exceeded their AI budgets last year, and that was before agentic adoption really took hold.
Here's what's different about agents: a coding or claims-processing agent running 10 turns doesn't just use 10x the tokens of a single query. It re-reads its full context on every turn, and that context grows with each turn, so total consumption climbs much faster than the turn count. The actual reasoning you care about is maybe 15-20% of total token consumption. The rest is invisible overhead your finance team never modelled.
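A back-of-envelope sketch makes the compounding visible. It assumes, purely for illustration, that every turn adds the same amount of new material (2,000 tokens here, a made-up number) and that the agent re-reads everything accumulated so far, with no caching or context trimming. The function name is hypothetical too.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when every turn re-reads the whole context so far.

    Simplifying assumptions (not from any vendor's billing model):
    each turn adds the same number of tokens, nothing is cached,
    and nothing is trimmed from the context.
    """
    total = 0
    context = 0
    for _ in range(turns):
        context += tokens_per_turn  # this turn's new material joins the context
        total += context            # the model reads the full context again
    return total


single_query = cumulative_input_tokens(turns=1, tokens_per_turn=2_000)
ten_turns = cumulative_input_tokens(turns=10, tokens_per_turn=2_000)
amplification = ten_turns / single_query

print(single_query)   # 2000
print(ten_turns)      # 110000
print(amplification)  # 55.0
```

Under these assumptions ten turns cost 1 + 2 + ... + 10 = 55 units, not 10. That happens to line up with the 55x figure quoted above, though the brief doesn't say how that number was derived. Real workloads will differ once caching, summarization, or uneven turn sizes come into play.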
For P&C carriers, this matters more than in most industries. Claims workflows are long, multi-step, and document-heavy. An adjuster-assist agent working a complex liability claim isn't a chatbot. It's a multi-hour autonomous process touching policy data, medical records, legal precedent, and reserve calculations. Each step burns tokens you didn't budget for.
The fix isn't a bigger budget line. It's a different architecture conversation.
Before you scale any agentic workflow in a carrier environment, you need context window management built into the design. You need cost ceilings per transaction type. You need visibility into ghost tokens — the tool calls, retries, and context reloads that never show up in the output but show up very clearly on the invoice.
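Here is a minimal sketch of what a per-transaction ceiling with ghost-token visibility could look like. Everything in it is hypothetical: the `TokenLedger` class, the transaction types, the category names, and the ceiling values are placeholders, and in a real system the token counts would come from whatever usage data your model provider reports.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when recording usage would cross the transaction's ceiling."""


# Hypothetical token ceilings per transaction type (illustrative values only).
CEILINGS = {"fnol_triage": 50_000, "complex_liability": 400_000}

# "reasoning" is the output you care about; the rest are ghost tokens.
CATEGORIES = ("reasoning", "tool_call", "retry", "context_reload")


@dataclass
class TokenLedger:
    transaction_type: str
    usage: dict = field(default_factory=lambda: {c: 0 for c in CATEGORIES})

    @property
    def total(self) -> int:
        return sum(self.usage.values())

    @property
    def ghost_share(self) -> float:
        """Fraction of tokens spent on anything other than reasoning."""
        if self.total == 0:
            return 0.0
        return (self.total - self.usage["reasoning"]) / self.total

    def record(self, category: str, tokens: int) -> None:
        """Add usage, refusing any step that would breach the ceiling."""
        if category not in self.usage:
            raise ValueError(f"unknown category: {category}")
        ceiling = CEILINGS[self.transaction_type]
        if self.total + tokens > ceiling:
            raise BudgetExceeded(
                f"{self.transaction_type}: {self.total + tokens} > {ceiling}"
            )
        self.usage[category] += tokens


ledger = TokenLedger("fnol_triage")
ledger.record("reasoning", 8_000)
ledger.record("context_reload", 30_000)
ledger.record("tool_call", 2_000)

print(ledger.total)        # 40000
print(ledger.ghost_share)  # 0.8
```

In this made-up run, 80% of the spend is overhead, and a further 15,000-token retry would raise `BudgetExceeded` rather than quietly landing on the invoice. Whether to hard-stop, escalate to a human adjuster, or just alert at that point is a design decision this sketch leaves open.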
The carriers who get this right early will have a genuine cost advantage over those who discover it at renewal time. The ones who don't will go back to their boards with AI ROI stories that don't survive contact with the actual bill.