The Subsidy
One user pays $200 per month for an AI subscription. The studio running that subscription consumes $35,000 in compute to serve them. That is not a business model. It is a subsidy relationship of 175:1.
OpenAI's burn rate, per leaked internal documents: $1.35 spent per $1 earned (The Information, October 2024), rising to $1.69 per $1 earned in 2025 (Wall Street Journal, November 2025). Projected losses for 2026: $14 billion. Cumulative projected losses vary significantly across document versions: $44 billion through 2028 per the October 2024 investor materials, rising to $115 billion in cumulative cash burn through 2029 per separate WSJ-reported materials, with a projected $74 billion operational loss for 2028 alone in the later documents. OpenAI is not publicly listed and has published no audited figures; all projections derive from leaked internal materials and vary across document generations. A developer on a $20-per-month Plus subscription who uses the coding assistant intensively burns $150 to $450 in actual compute, a subsidy of roughly 7.5 to 22.5 times on that single account.
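The subsidy ratios in this section are simple divisions of compute cost by subscription price. As a minimal sketch, using only the figures quoted above (the function name is illustrative):

```python
def subsidy_ratio(compute_cost_usd: float, price_usd: float) -> float:
    """Dollars of compute consumed per dollar of subscription revenue."""
    return compute_cost_usd / price_usd


# Figures quoted in the text above.
top_tier = subsidy_ratio(35_000, 200)   # $200 plan, $35,000 in compute
plus_low = subsidy_ratio(150, 20)       # $20 plan, low end of heavy coding use
plus_high = subsidy_ratio(450, 20)      # $20 plan, high end of heavy coding use

print(f"$200 tier: {top_tier:.0f}:1")
print(f"$20 tier:  {plus_low:.1f}:1 to {plus_high:.1f}:1")
```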
Meanwhile, every major studio has converged on the same pricing ladder: Free, then $20, then $100, then $200, then usage overflow. OpenAI's Head of ChatGPT has described the prior pricing model as "accidental." The structure was not designed around unit economics. It was designed around growth.
This is the foundational fact that the rest of this analysis rests on: current AI pricing signals are artificial. Every adoption decision, every build-vs-buy choice, every vendor selection made under these conditions is being made against a price that does not reflect actual cost.
Five Models, Four Providers, One Conclusion
To stress-test the analysis, five foundation models were queried as structured expert panelists, each assigned a distinct analytical role. The models span four providers, three geopolitical origins, and five very different analytical traditions. They arrived at the same destination through entirely different paths.
When five models trained on different data, with different RLHF, different architectures, and different geopolitical origins converge on the same structural conclusion via five different analytical pathways, that convergence constitutes a signal worth paying attention to.
AI Is Becoming Electricity, Not Software
The most useful reframing came from GPT-5.4 in the Technology Economist role: AI is ceasing to look like SaaS and starting to look like a metered utility, the way electricity is priced not by appliance access but by kilowatt-hour consumed.
Under that frame, the current pricing structure makes a specific error that any economist recognises: flat-rate access combined with wildly variable consumption creates adverse selection. Heavy users, who generate the highest costs, are overrepresented among subscribers. Light users, who would be profitable, are underrepresented. It is adverse selection with a GPU attached.
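A toy example makes the mechanism concrete. The eight per-subscriber compute costs below are hypothetical, chosen only to show a heavy-tailed usage pattern under a $20 flat price; they are not measured data.

```python
FLAT_PRICE_USD = 20.0

# Hypothetical monthly compute cost per subscriber, in USD (illustrative only).
compute_costs = [2, 3, 5, 8, 12, 40, 150, 450]

revenue = FLAT_PRICE_USD * len(compute_costs)
total_cost = sum(compute_costs)
net_margin = revenue - total_cost

# Subscribers whose compute cost is covered by the flat price.
profitable = [c for c in compute_costs if c < FLAT_PRICE_USD]
heavy = [c for c in compute_costs if c >= FLAT_PRICE_USD]
heavy_cost_share = sum(heavy) / total_cost

print(f"revenue ${revenue:.0f}, compute ${total_cost}, margin ${net_margin:.0f}")
print(f"{len(profitable)} of {len(compute_costs)} subscribers are profitable")
print(f"heavy users account for {heavy_cost_share:.0%} of compute cost")
```

In this made-up population five of eight accounts are individually profitable, yet the plan as a whole loses money because three heavy accounts generate almost all of the cost.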
All five panelists confirm this structural diagnosis, though they disagree on the speed and mechanism of correction. What they agree on: the current price is not the real price. And adoption decisions built on the current price require an explicit contingency plan.
The Token Cost Illusion
Price per token has fallen approximately 75 per cent in recent years. This is the number cited most often in AI procurement discussions. It is also the least useful number in those discussions.
Token consumption per task has moved in the opposite direction at a much larger magnitude. Simple, single-turn interactions consumed around 2,000 tokens. Agentic workflows (which are now the dominant use case in enterprise AI) consume 200,000 to 500,000 tokens per completed task. That is a 100 to 250 times increase in consumption, against a 75 per cent price reduction. The net cost of getting a task done has not fallen; it has risen substantially for anyone running agentic workloads.
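The net effect can be checked by multiplying the two factors. This is a rough sketch built only from the figures in this section; it assumes, as a simplification, that one agentic task is compared against one single-turn interaction at the old price.

```python
PRICE_REDUCTION = 0.75               # per-token price fell about 75 per cent
SINGLE_TURN_TOKENS = 2_000           # tokens per simple interaction
AGENTIC_TOKENS = (200_000, 500_000)  # tokens per completed agentic task


def net_cost_multiplier(tokens_per_task: int) -> float:
    """Cost of one task today relative to one single-turn task at the old price."""
    price_factor = 1 - PRICE_REDUCTION
    consumption_factor = tokens_per_task / SINGLE_TURN_TOKENS
    return price_factor * consumption_factor


low, high = (net_cost_multiplier(t) for t in AGENTIC_TOKENS)
print(f"net cost per task: {low:g}x to {high:g}x the old single-turn cost")
```

Even after the price cut, the completed task costs 25 to 62.5 times what a single-turn interaction used to cost.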
There is a second complication that makes token-level pricing even less useful as a guide. Gemini 3.1 Pro makes the hardware case: LLM inference is rarely compute-bound; it is memory-bandwidth-bound. When agentic workflows trigger KV-cache thrashing, a lower price per token does not reduce cost in the way procurement teams assume.
A third complication: Chen et al. (2026) document the Price Reversal Phenomenon. In 21.8 per cent of model comparisons, the apparently cheaper model ends up costing up to 28 times more when task completion is measured rather than token count. The same prompt, run multiple times on the same reasoning model, produces up to 9.7 times variance in thinking-token consumption. There is no stable unit in "price per token" when the token count itself is unpredictable.
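How a model that is cheaper per token can cost more per finished task is easiest to see with numbers. The two profiles below are hypothetical and are not taken from Chen et al.; the sketch also assumes failed attempts are retried independently, so the expected number of attempts per success is 1 divided by the success rate.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    name: str
    usd_per_million_tokens: float
    tokens_per_attempt: int
    success_rate: float  # fraction of attempts that complete the task

    def cost_per_completed_task(self) -> float:
        cost_per_attempt = (
            self.usd_per_million_tokens * self.tokens_per_attempt / 1_000_000
        )
        # With independent retries, expected attempts per success is 1 / success_rate.
        return cost_per_attempt / self.success_rate


# Hypothetical profiles, for illustration only.
cheap = ModelProfile("cheap-per-token", 1.00, 400_000, 0.50)
premium = ModelProfile("premium-per-token", 4.00, 60_000, 0.90)

for m in (cheap, premium):
    print(f"{m.name}: ${m.cost_per_completed_task():.2f} per completed task")
```

With these assumed inputs the model priced at a quarter of the per-token rate ends up roughly three times more expensive per completed task, which is the shape of the reversal described above.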
GPT-5.4's conclusion from this analysis: the competitive axis of the next phase is not price per token. It is predictable cost per completed task. The winning AI deployment will not be the most capable one. It will be the one with the highest and most legible gross margin per workflow.
Adoption Theater
Fifty per cent of enterprises report using AI (Ramp, March 2026). This sounds like a breakthrough. The composition of that number is where the analysis gets uncomfortable.
VC-backed companies show an 80 per cent AI adoption rate. Non-VC-backed companies show 45 per cent. That gap is not explained by superior technology access among VC-backed firms. It is explained by incentive structures: VC-backed companies are rewarded for appearing AI-native regardless of unit economics. Twenty-eight per cent of enterprise OpenAI spending runs through personal consumer plans. This is classic shadow IT: the enterprise has not actually procured or evaluated the tool; individual employees have paid for it themselves.
GPT-5.4 in the IS Scholar role describes this as Adoption Theater: ceremonial adoption driven by mimetic pressure rather than measured value. In Rogers' diffusion framework, the subsidy has artificially inflated trialability (anyone can try it cheaply) while suppressing perceived cost (no one sees the real price). Trial is not routinisation. And routinisation built on subsidised prices is not sustainable adoption.
DeepSeek-R1 translates this into venture capital arithmetic: VCs are funding startups that use other VC-funded startups' products. Remove the subsidy, and the artificial demand that links those layers together disappears simultaneously across the stack.
The Cascade
The structural danger is not just the pricing of any single layer. It is that the subsidy runs through three layers simultaneously, each one amplifying the distortion of the one above it.
The panel's sharpest debate is about what happens to Layer 2, the wrapper and application startups built on studio APIs, when the correction hits. DeepSeek-R1 is the most direct: "90 per cent of AI startups will be exposed as capital-burning theater." GPT-5.4 disagrees with the framing but not the direction: thin wrappers die, but workflow intermediaries that provide aggregation, integration, and switching-cost control survive, as SaaS survived the rise of AWS. "The right analogy is not extinction; it is rebundling."
Gemini 3.1 Pro adds the hardware dimension: wrappers do not die because they lack moats; they die because they cannot reach the GPU utilisation rates required when upstream subsidies disappear. Foundation model studios survive because they aggregate massive global demand. Layer 2 has no comparable demand aggregation engine.
The panel does not resolve this disagreement. That is the point. The price correction will determine where durable value actually sits: in the infrastructure layer or the orchestration layer. That question remains open.
Open Source: Price Ceiling or Architecture Foundation?
The panel's most interesting divergence concerns the role of open source models in the post-subsidy environment.
GPT-5.4 frames open source as a price ceiling: it limits what proprietary providers can charge, because any price above a certain threshold makes the economics of self-hosting viable. The ceiling constrains upside for closed providers.
Qwen3 Thinking goes further: open source enables architectural decoupling. Enterprises can route around studios entirely if they choose. "Studios cannot pass their costs down to enterprises when enterprises can route around them." This is a stronger claim than a price ceiling; it is a structural exit option.
Gemini 3.1 Pro provides the hardware reality check: running a 400-billion-parameter open-source model requires an eight-GPU H100 cluster at approximately $300,000 just to hold the weights in VRAM. "The 'free' open-source model ends up costing three times the API." For the average enterprise, the exit option that architectural decoupling theoretically offers is financially inaccessible.
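The memory side of that claim can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes the 80 GB H100 variant and counts weights only, ignoring KV cache, activations, and framework overhead; the constant names are illustrative.

```python
H100_VRAM_GB = 80        # assumption: 80 GB H100 variant
GPUS_PER_NODE = 8
PARAMETERS = 400e9       # 400-billion-parameter model

BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def weights_gb(parameters: float, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return parameters * bytes_per_param / 1e9


cluster_gb = H100_VRAM_GB * GPUS_PER_NODE
for precision, nbytes in BYTES_PER_PARAM.items():
    need = weights_gb(PARAMETERS, nbytes)
    verdict = "fits" if need <= cluster_gb else "does not fit"
    print(f"{precision}: {need:.0f} GB of weights, {verdict} in {cluster_gb} GB")
```

Under these assumptions the eight-GPU figure holds for 8-bit weights; at 16-bit precision the weights alone exceed a single eight-GPU node before any working memory is counted, which only sharpens the cost argument.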
The synthesis is not a contradiction: open source is a genuine price ceiling for large enterprises with their own infrastructure, and an irrelevant option for smaller organisations that cannot bear the investment. The correct answer depends entirely on who is asking the question.
What the Panel Missed
Five foundation models from four providers — and still three significant blind spots in the aggregate analysis:
- The consumer market. 200 million users have been trained to expect AI at zero marginal cost. The consumer pricing transition is a separate problem from the enterprise one, and arguably the harder one. The panel treated enterprise dynamics exclusively.
- Regulation. If all five models agree that AI is becoming a utility, where is the analysis of utility regulation? Price transparency requirements, competition law, and the political economy of mandatory access are conspicuously absent from every response.
- Labour markets. If AI-augmented work becomes uneconomic at real prices for tasks that are currently automated, some of that work returns to human labour. The models did not engage with what reversal looks like at the task level.
The shared blind spots are themselves informative. They represent the edges of the training corpus, where the models' epistemic density thins out.
Counter-Signal: Anthropic's First Profitable Quarter (May 2026)
In May 2026, the Wall Street Journal reported that Anthropic expects $10.9 billion in revenue for Q2 2026, a 130% year-on-year surge, and its first operating profit of approximately $559 million. Compute as a share of revenue is projected to fall from 71% in Q1 2026 to 56% in Q2.
This is the strongest counter-signal to the analysis above to have appeared since original publication. It deserves to be treated as one, not absorbed without comment.
What the Anthropic number does not say
It does not say that the 175:1 consumer subsidy ratio from the opening of this article is wrong. Anthropic's revenue mix skews heavily toward API and enterprise contracts, where pricing has always been closer to cost. The studio that the opening describes (one user paying $200, the studio burning $35,000 in compute) is the consumer tier. The structural diagnosis there is unchanged. What the Anthropic number does say is that the blended picture across the studio sector is no longer uniform. Some studios bend their cost curve faster than others, and the dispersion between them is widening.
The 5% margin is thin, and it is projected
$559 million on $10.9 billion is a single-digit operating margin, in a quarter that has not yet closed. The same WSJ chart shows that compute and infrastructure are no longer the dominant cost line. Sales, marketing, and partnership costs are. These are the costs that grow with enterprise distribution and are slower to compound away than compute. The risk is not that the subsidy ends. The risk is that the correction is non-monotonic: compute falls, distribution costs rise, and the net path to durable profit is longer than the headline operating-income number suggests.
The compute drop is the real story
A fall from 71% to 56% of revenue in one quarter is a drop of 15 percentage points, or a 21% relative reduction in compute intensity. If that trajectory holds for two more quarters, the gap between API price and API cost closes inside 2026 rather than 2028. The structural correction does not disappear; its mechanism shifts. Instead of arriving as a price increase passed through the cascade, it arrives as a margin expansion absorbed inside the studio. The wrapper economy survives longer under that scenario than this article previously assumed, because upstream margins absorb the shock instead of passing it down. That is a meaningfully different end-state.
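The arithmetic behind these figures, and one possible reading of "that trajectory holds", can be sketched as follows. The extrapolation assumes the same relative reduction repeats each quarter; that is an assumption of this sketch, not a reported projection.

```python
Q1_COMPUTE_SHARE = 0.71
Q2_COMPUTE_SHARE = 0.56
Q2_REVENUE_M = 10_900          # projected, USD millions
Q2_OPERATING_PROFIT_M = 559    # projected, USD millions

point_drop = Q1_COMPUTE_SHARE - Q2_COMPUTE_SHARE
relative_drop = point_drop / Q1_COMPUTE_SHARE
operating_margin = Q2_OPERATING_PROFIT_M / Q2_REVENUE_M

# Assumption: "trajectory holds" means the same relative drop repeats each quarter.
projected = [Q2_COMPUTE_SHARE]
for _ in range(2):
    projected.append(projected[-1] * (1 - relative_drop))

print(f"drop: {point_drop * 100:.0f} points, {relative_drop:.0%} relative")
print(f"operating margin: {operating_margin:.1%}")
print("compute share, Q2 onward:", [f"{s:.0%}" for s in projected])
```

On that reading, compute would fall to roughly 44% and then 35% of revenue over the following two quarters. A constant 15-point drop per quarter would give a steeper path, so the choice of reading matters.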
What this revises in the five theses below
- Thesis 03 (wrapper attrition) softens. If studios reach durable operating profit at current API rates, the cost shock that ends 60% to 80% of wrappers may be smaller, later, or replaced by a different mechanism altogether (consolidation, distribution capture, platform integration).
- Thesis 04 (IPO as forcing function) strengthens. Anthropic's projected trajectory is precisely the disclosure profile that makes public markets viable for the sector. The next AI-native S-1 either matches this trajectory or repricing follows. The bar moves from "show a path to profit" to "show Anthropic-class compute efficiency."
- The cascade frame still applies cleanly to anything built on OpenAI's consumer subsidy. It applies less tightly to enterprises and startups buying from studios that are bending their own curves. The relevant procurement question becomes more specific: not "is AI subsidised" but "which studio's cost curve is your stack actually exposed to."
Five Theses the Panel Agrees On
- 01 From flat-rate to airline pricing. Not "AI gets more expensive" as a single event, but a structural shift to quotas, committed-use contracts, priority tiers, and outcome-based pricing. AI budgets, not AI vibes. The planning question for any institution: what is our committed-use exposure, and what happens to our workflows if usage-overflow pricing doubles?
- 02 Enterprise bifurcation. High-value deployments absorb real costs because the ROI justifies them. Commoditised tasks migrate to open source or are dropped entirely. The mass market churns. This bifurcation is not a risk to manage; it is a strategic sorting that rewards institutions that know which workloads are genuinely high-value before the correction forces the answer.
- 03 60 to 80 per cent of the wrapper economy disappears. Survivors need proprietary data moats and gross margins that can absorb a 5 to 10 times API cost increase. The failure mechanism is not missing features; it is insufficient GPU utilisation to remain viable when upstream subsidies evaporate. The correction does not remove wrappers through active displacement by competitors; it removes them through their own cost structure.
- 04 IPO as forcing function. S-1 filings require disclosure of actual unit economics. When public markets see the true cost per user, the subsidy narrative becomes indefensible. The first wave of AI-native IPOs will either demonstrate viable unit economics or catalyse a repricing event across the sector. There is no middle path once the numbers are public.
- 05 From access to affordability. The transition that matters is not "who has access to AI" but "who can afford the AI that actually makes a difference." Access to a subsidised general model is universal and cheap. Access to a well-architected, domain-specific, sovereign system remains expensive and institutional. That gap is the politically and economically significant shift.
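Thesis 03's survival condition can be made explicit. The sketch below assumes API spend is the wrapper's only cost of revenue and that its customer prices and volumes stay fixed during the shock; both are simplifying assumptions, and the 30% example is hypothetical.

```python
def gross_margin_after_shock(api_cost_share: float, multiplier: float) -> float:
    """Gross margin as a fraction of revenue after API costs rise by `multiplier`.

    api_cost_share is today's API spend divided by revenue. Customer prices
    and volume are held fixed, which is a simplifying assumption.
    """
    return 1 - api_cost_share * multiplier


def max_survivable_share(multiplier: float) -> float:
    """Largest current API cost share that still breaks even after the shock."""
    return 1 / multiplier


for k in (5, 10):
    share = max_survivable_share(k)
    print(f"{k}x shock: API spend must be at most {share:.0%} of revenue today")

# Hypothetical wrapper spending 30% of revenue on API calls.
print(f"30% share after 5x: {gross_margin_after_shock(0.30, 5):.0%} gross margin")
```

Read this way, absorbing a 5 to 10 times increase means API spend can be no more than 10% to 20% of revenue today, which is a far higher gross margin than a thin wrapper typically carries.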
The Responsible Position
This analysis is not a case against AI adoption. It is a case for adoption decisions that include an explicit contingency plan for the pricing correction. Every organisation that has built workflows, staffing models, or competitive assumptions on current AI prices has implicitly assumed that those prices are durable. They are not.
Four panelists. Four closing positions. The same underlying claim:
When the biases of the panelists are different but the conclusion is the same, that conclusion warrants attention.
Sources & References
Every numerical claim in this article traces to one of the sources below. Figures from leaked internal documents are marked as such. Where document versions disagree, the article discloses the range.
- The Information (October 9, 2024). Leaked internal investor materials. Source for: $1.35 spent per $1 earned; $14B projected losses 2026; $44B cumulative projected losses through 2028; profitability 2029 scenario.
- Wall Street Journal (November 2025). Later leaked documents shown to investors. Source for: $1.69 spent per $1 earned in 2025; $74B operational loss in 2028; $115B cumulative cash burn through 2029.
- Wall Street Journal (May 20, 2026). Source for: $10.9B Q2 2026 projected revenue (+130% YoY); $559M projected operating profit; 71% (Q1) to 56% (Q2) compute as share of revenue; Anthropic operating-income chart by segment.
- CNBC (May 20, 2026). Secondary coverage of the WSJ exclusive. Source for: cross-confirmation of Q2 2026 revenue and operating profit projections.
- Chen et al. (2026). Source for: 21.8% of model comparisons where cheaper model ends up 28× more expensive per task; 9.7× variance in thinking-token consumption on repeated prompts.
- Shojaee et al. (2025). Source for: reasoning-model compute behaviour; variability in thinking-token allocation.
- Artefact. Source for: −75% price per token vs. +25,000% consumption per agentic task framing; 2k → 500k token consumption range; 40–60% hidden costs (RAG, embeddings, retries).
- Ramp AI Index (March 2026 data release). Source for: 50% enterprise AI adoption; 80% at VC-backed firms vs. 45% at non-VC-backed; 28% of enterprise OpenAI spending via personal consumer plans.
- Cursor — $2B ARR / $3B+ VC raised. Public reporting as of early 2026.
- Harvey — $190M ARR / $800M+ VC raised. Public reporting as of early 2026.
- Lovable — $400M ARR / $545M+ VC raised. Public reporting as of early 2026.
Building for the Post-Subsidy Environment
Sovereign architecture, proprietary knowledge encoding, and predictable cost-per-task structure are not aspirational. They are the minimum viable configuration for the pricing environment that is coming. If your institution is thinking about what that means for your AI stack, we should talk.