What AI Agent ROI Really Looks Like Inside Real Companies
It’s one thing to say AI agents are autonomous software that plans, acts, and finishes a task without constant supervision. It’s another to watch an IT operations team’s mean-time-to-resolution drop by as much as 40% and have a named company, a named workflow, and a checkable number attached to that claim. 2026 is the first year enterprise AI agent reporting has shifted from “we expect” to “here’s what happened,” and the gap between those two sentences is where the real story lives.
This piece goes past the headline names already covered in our complete beginner-friendly guide to look specifically at what companies are reporting once an agent leaves the pilot stage and starts touching real customers, real contracts, and real money — including the deployments that didn’t go as planned, because that part of the story matters just as much.
The Customer Service Numbers That Started the Conversation
Klarna’s customer service agent is the case everyone cites first, and it’s worth being precise about why. The agent took over roughly two-thirds of all customer service chats, did the work equivalent of 853 full-time employees, and cut average resolution time from about 11 minutes to under 2. The company has publicly attributed close to $60 million in savings to the deployment. Those are the kinds of specific, checkable figures that separate a real case study from a marketing claim.
What gets less attention is the second half of that story. Once Klarna had the system running at scale, it found that complex, emotionally charged billing disputes and account issues didn’t resolve well with a fully automated handoff, and the company deliberately brought human agents back into the loop for that category of conversation. The lesson companies are drawing from this in 2026 isn’t “automate less” — it’s “automate the parts that are genuinely repetitive, and build a clean handoff for the parts that aren’t,” which is a more useful takeaway than either the success number or the correction alone.
The economics behind that decision are easier to see when you look at the per-conversation cost. Conversational AI company Crisp has published figures showing a human support agent handling a routine query costs an enterprise somewhere between $20 and $25, while an AI agent handling the same routine query costs roughly 50 to 70 cents. That cost gap is the entire reason customer service became the first place agents got deployed at scale. It is also why the categories that don’t resolve cleanly (the ones Klarna pulled back from) need a different cost-benefit calculation than the routine ones.
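To make that cost-benefit calculation concrete, here is a minimal sketch in Python. It uses the midpoints of the per-query ranges quoted above ($22.50 for a human, $0.60 for an AI agent). Everything else is an assumption made for illustration: the function names are hypothetical, the escalation rates are invented, and the model assumes every conversation starts with the AI agent and that an escalated conversation incurs both the AI cost and the full human cost.

```python
def blended_cost_per_conversation(human_cost, ai_cost, escalation_rate):
    """Average cost per conversation under an AI-first routing model.

    Assumption: every conversation is first handled by the AI agent, and a
    fraction (escalation_rate) is then handed to a human at full human cost.
    """
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    return ai_cost + escalation_rate * human_cost


def break_even_escalation_rate(human_cost, ai_cost):
    """Escalation rate at which AI-first routing costs the same as human-only."""
    if human_cost <= 0:
        raise ValueError("human_cost must be positive")
    return (human_cost - ai_cost) / human_cost


# Midpoints of the ranges quoted in the text; escalation rates are hypothetical.
HUMAN_COST = 22.50
AI_COST = 0.60

for rate in (0.05, 0.20, 0.50):
    cost = blended_cost_per_conversation(HUMAN_COST, AI_COST, rate)
    print(f"escalation {rate:.0%}: ${cost:.2f} per conversation")

break_even = break_even_escalation_rate(HUMAN_COST, AI_COST)
print(f"break-even escalation rate: {break_even:.1%}")
```

Under these assumptions the routine categories stay far cheaper than human-only handling, while a category where most conversations end up escalated gives back most of the saving. That is the arithmetic behind treating the two kinds of conversation separately.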
Beyond Customer Service: Contracts, Tickets, and Compliance
JPMorgan’s footprint is the clearest sign that this isn’t confined to one department. The bank runs more than 450 agentic AI use cases in production across fraud detection, internal operations, and document-heavy compliance work, as day-to-day operations rather than occasional pilots. Salesforce has reported a more specific figure on one slice of similar work: its contract-review agents have cut roughly $5 million in legal costs by flagging risk clauses before a document ever reaches a lawyer’s desk, a number precise enough to model against a legal team’s existing budget rather than just gesture at “efficiency.”
IT operations is the other place this shows up with hard numbers attached. DXC Technology deployed agentic AI across its incident triage and resolution workflows and reported a 30% to 40% reduction in mean-time-to-resolution for Tier 1 and Tier 2 support tickets. That is the kind of metric an operations leader can put directly into a board deck, because it ties to an existing SLA. Rimini Street took a similar approach to enterprise software licensing, using agents to automate multi-step contract review, and reported cutting cycle times by 45% to 50% on that workflow. Industry analysis comparing these results with older rule-based automation on the same processes found the agentic versions delivering roughly two to three times the improvement.
What’s Happening Outside the United States
DBS Bank in Singapore offers a useful contrast because the scale is reported in a way few Western companies disclose: the bank says it generated close to S$1 billion (about US$740 million) in economic value from AI in its most recent fiscal year, running more than 1,500 models across its operations. Singapore’s broader customer service sector backs that up — industry tracking puts the country’s AI customer service adoption rate at roughly 94%, the highest reported anywhere, which suggests the Klarna-style economics are landing even faster in markets that started from a more digital-first banking baseline.
Agentic commerce is the sharpest example of where this is heading next. Alipay reported processing 120 million AI-agent-initiated transactions in a single week in February 2026, and DBS Bank and Mastercard completed what’s being described as the first fully agentic payment transaction in Singapore, where an AI agent booked a ride and paid for it without a human tapping to confirm. Whether that becomes the norm for consumer payments elsewhere is genuinely unresolved, but it’s a concrete preview of agents acting as the payer, not just the assistant drafting the request.
The Honest Counterweight
None of this is universal success, and treating it that way would misrepresent the data. Gartner’s widely cited forecast predicts that more than 40% of agentic AI projects will be cancelled by the end of 2027, driven by unclear ROI, escalating costs, and risk controls that weren’t built in from the start. Deloitte’s 2026 survey on global governance maturity found that only about a fifth of organizations have what it considers a mature governance model for their AI agents, which means a lot of the deployments generating these headline numbers are running ahead of the oversight that’s supposed to catch them if something goes wrong.
The companies whose numbers hold up under scrutiny — Klarna, JPMorgan, DXC, Rimini Street — share a pattern worth naming directly: they picked one well-defined workflow, set a baseline metric before deployment, and measured the same metric afterward rather than reporting a vague efficiency story. The companies whose AI projects get quietly cancelled tend to skip that step and try to automate too much, too generally, before they’ve proven the narrower case. That difference in discipline, more than any difference in the underlying technology, is what separates this year’s reported wins from next year’s cancellation statistics.
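The baseline-then-measure discipline described above comes down to simple arithmetic. The sketch below shows one way to express it in Python. The ticket resolution times are invented for illustration and are not data from any company named here, and `percent_reduction` is a hypothetical helper name.

```python
from statistics import mean


def percent_reduction(baseline_value, after_value):
    """Percentage drop in a metric relative to its pre-deployment baseline."""
    if baseline_value <= 0:
        raise ValueError("baseline_value must be positive")
    return (baseline_value - after_value) * 100.0 / baseline_value


# Hypothetical resolution times in hours for the SAME workflow,
# sampled before and after the agent was deployed.
baseline_hours = [10.0, 12.0, 8.0, 10.0]
after_hours = [6.0, 7.0, 5.0, 6.0]

baseline_mttr = mean(baseline_hours)
after_mttr = mean(after_hours)
reduction = percent_reduction(baseline_mttr, after_mttr)

print(f"baseline MTTR: {baseline_mttr:.1f} h")
print(f"post-deployment MTTR: {after_mttr:.1f} h")
print(f"reduction: {reduction:.0f}%")
```

The point of the sketch is the shape of the comparison: one workflow, one metric, and a baseline captured before deployment. Without the `baseline_hours` sample there is nothing to divide by, which is why a percentage reported with no baseline cannot be checked.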
Myths Worth Retiring
“If one company’s AI agent saved millions, mine will too.” Cost savings scale with how repetitive and well-defined the underlying task already was. Klarna’s gains came from an extremely high-volume, pattern-heavy workflow; a business with lower ticket volume or more idiosyncratic cases will see a different, usually smaller, return on the same kind of agent.
“ROI numbers from AI agent vendors can be taken at face value.” The most reliable figures in this piece — DXC’s MTTR reduction, Rimini Street’s cycle-time cut, Salesforce’s legal cost figure — come from the companies measuring their own pre-existing metrics, not from a vendor’s marketing page. A number with a named baseline and a named workflow is worth more than a percentage with no comparison point attached.
“Once an agent is deployed, the savings are permanent.” Klarna’s own pullback on complex cases shows that the right scope for automation can shift after deployment, once real usage data shows where the agent is actually struggling. Treating the initial rollout as the finished state, rather than the first data point, is how some of the governance gaps Deloitte flagged tend to start.
Frequently Asked Questions
Are these AI agent ROI numbers independently verified, or just self-reported?
Most of the figures here, including Klarna’s savings estimate and DXC’s resolution-time improvement, are self-reported by the companies involved, often in earnings calls, press releases, or industry interviews. They’re checkable in the sense that the companies are naming specific metrics and baselines, but they aren’t third-party audited the way a financial statement would be.
Why did Klarna add human agents back after automating so heavily?
The company found that complex, emotionally charged customer issues — disputes, account problems, anything outside a routine billing question — didn’t resolve as well with a fully automated handoff. Rather than treat that as a failure of the whole project, Klarna scoped the agent back to the routine cases where it was already performing well and reintroduced people for the rest.
How long does it typically take to see ROI from an AI agent deployment?
Industry surveys point to a median payback period of roughly five months across functions, with simpler customer-facing agents often paying back faster — closer to three to four months — and more complex finance or operations agents taking closer to nine months to show a clear return.
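For readers who want to estimate their own payback period, a simple undiscounted version is the upfront cost divided by the net monthly saving. The Python sketch below assumes that model. The function name and all dollar figures are hypothetical and chosen only so the arithmetic is easy to follow. They are not drawn from the surveys mentioned above.

```python
import math


def payback_months(upfront_cost, monthly_savings, monthly_running_cost):
    """Months until cumulative net savings cover the upfront cost.

    Simple undiscounted payback. Returns math.inf when the agent's running
    cost eats the entire monthly saving, so the project never pays back.
    """
    net_monthly = monthly_savings - monthly_running_cost
    if net_monthly <= 0:
        return math.inf
    return upfront_cost / net_monthly


# Hypothetical deployment: $100,000 to build and integrate, $25,000 saved per
# month on the baseline workflow, $5,000 per month to run the agent.
months = payback_months(100_000, 25_000, 5_000)
print(f"payback period: {months:.1f} months")
```

The `math.inf` branch matters in practice: a deployment whose running costs rise to match its savings never reaches payback, which is the "escalating costs" failure mode in numeric form.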
What’s the biggest reason agentic AI projects get cancelled despite these success stories?
Gartner’s research points to unclear ROI, escalating costs, and weak risk controls as the leading causes, more often than the underlying technology simply failing at its assigned task. Projects that start with a vague, broad goal rather than one measurable workflow are the ones most likely to end up in that cancelled bucket.
Do AI agent cost savings come mostly from replacing jobs?
The companies with the clearest numbers, like Klarna, generally frame their savings as workload-equivalent rather than direct headcount reduction, and several explicitly kept or expanded human roles for the cases agents couldn’t handle well. The honest framing is that agents absorb volume on repetitive work, which changes what the remaining human roles look like more than it eliminates them outright.
The takeaway across every case study here is the same one Klarna learned the hard way: the agents that produce real, defensible ROI numbers are the ones scoped to a specific, measurable workflow with a baseline already in place — not the ones deployed on a general promise to “make things more efficient.” For the architecture that makes coordinating several of these narrow, specialized agents possible, our piece on multi-agent systems in 2026 covers how companies like Genentech and Walmart structure that coordination, and the simple explanation of what AI agents are is the right place to start if any of the terminology here needs unpacking.