Daily AI Briefing · Thursday, 28 May 2026

01 / 04 · Markets & Sentiment

7 min read

Altman and Amodei walk back the AI jobs apocalypse — just in time for IPO season

Two CEOs who spent a year warning of a white-collar bloodbath now say they were wrong. Trillion-dollar listings are weeks away.

·01 Primer

For most of the last year, the two loudest voices in artificial intelligence told the world that office work was about to be gutted. Anthropic’s Dario Amodei warned of a white-collar bloodbath that could eliminate half of all entry-level desk jobs within five years. OpenAI’s Sam Altman said entire categories of junior roles would vanish. This week, both men changed their tune. Altman, sitting on stage with the chief executive of Australia’s largest bank, said he had been pretty wrong. Amodei, briefing financial-services clients, reached for an old economic idea, the Jevons Paradox, to argue that automation creates work rather than destroys it. The pivot lands in the same month that both companies are reportedly preparing public listings at valuations near one trillion dollars each. For German boards and HR leaders, the question is no longer what the CEOs say. It is what their customers and their own data show.

·02 What Happened

On a stage in Sydney on 26 May, Sam Altman leaned back next to Matt Comyn, the chief executive of Commonwealth Bank of Australia, and offered something rare from a Silicon Valley founder: a confession of error. CBA had paid for the venue, the audience and, in effect, the apology. Altman told the room that he and OpenAI had been roughly right about the technology since launching ChatGPT in late 2022, but pretty wrong on the social and economic implications. Then came the line that travelled around the world inside an hour. “I’m delighted to be wrong about this,” he said. “I thought there would have been more impact on entry-level white-collar jobs being eliminated by now than has actually happened.” It was a striking reversal from June 2025, when Altman had told interviewers that whole classes of junior knowledge work were headed for the exit. Three weeks earlier, in Lower Manhattan, Dario Amodei had performed a quieter version of the same manoeuvre. At Anthropic’s first dedicated briefing for financial-services customers, the Anthropic CEO shared a stage with JPMorgan Chase’s Jamie Dimon and reached past the labour data for a nineteenth-century coal economist. William Stanley Jevons had observed in 1865 that more efficient steam engines led Britain to burn more coal, not less, because cheaper energy unlocked new uses. Apply that to knowledge work, Amodei argued, and the picture flips. “If you automate 90% of the job,” he told the audience, “then everyone does the 10% of the job.” The remaining sliver, he suggested, expands to fill the day and multiplies a worker’s output tenfold. This was the same Amodei who, in May 2025, had told Axios that AI could erase 50% of entry-level white-collar roles and push unemployment to 20%. The shift was unmistakeable, and the timing was conspicuous. Both companies are racing toward public listings. OpenAI is reportedly targeting a September IPO at a valuation between $852 billion and one trillion dollars. 
Anthropic, whose secondary-market shares already imply a one-trillion-dollar price tag, is eyeing an October offering that could raise more than $60 billion. Run-rate revenue at Anthropic reportedly crossed $44 billion annualised in May, and the company is on track for its first operating profit. Pension funds and sovereign-wealth allocators do not, as a rule, buy stories about destroying the economy that hosts their liabilities. The catch is obvious: a narrative of orderly augmentation sells better in a roadshow than one of mass redundancy. As Fortune’s Jason Del Rey put it, the men who spent a year warning of catastrophe now have a strong financial reason to describe a softer landing.

·03 The Numbers

The macroeconomic picture sits awkwardly between the two stories. On one side, the headline labour data refuses to confirm the apocalypse. The US unemployment rate held at 4.3% in April, the Yale Budget Lab has found no significant change in occupational mix or unemployment duration for high-AI-exposure roles since ChatGPT shipped, and broad measures of productivity growth remain close to the post-war average of around 1.5%. On the other side, the white-collar segment in isolation looks recessionary. White-collar payrolls in the United States have now contracted for 29 consecutive months, a stretch that, going back seven decades, has only ever occurred during full recessions. Employment in the core knowledge sectors of finance, insurance, information and professional services peaked in November 2022 and is down 1.9%, while jobs outside those industries are up 4.1%. Tech-sector layoffs through May 2026 have already passed 115,000, closing in on the full-year 2025 total of 124,000, and roughly one-fifth of recent cuts have been explicitly attributed to AI by the companies announcing them. Germany illustrates the same split screen. On 8 May, Commerzbank announced an additional 3,000 job cuts under its Momentum 2030 strategy, taking cumulative reductions to 6,900. CEO Bettina Orlopp’s statement was unusually blunt about the rationale: the bank intends to leverage AI even more, plans to invest roughly six hundred million euros into AI initiatives between now and 2030, and expects five hundred million euros in annual cost savings by decade’s end. The announcement landed alongside record first-quarter operating profit of 1.4 billion euros and raised 2026 and 2028 targets. In other words, the layoffs are not crisis layoffs. They are growth-era restructuring, justified to shareholders by an AI thesis that Amodei has just publicly softened. 
Boards across the DAX40 face the same internal contradiction: vendor pitches that promise tenfold productivity, HR plans that quietly assume headcount reductions, and a labour-market backdrop in which graduate hiring has cooled noticeably even as overall employment holds. More remarkable still is the gap between micro and macro evidence. Azeem Azhar, the analyst behind Exponential View, has spent the last month chronicling what he calls the productivity paradox. Anthropic’s own modelling suggests existing AI tools, frozen at today’s capability, could raise US labour productivity by about 1.8% a year for a decade. Recent measured productivity is closer to 1.5%, near the historical average. Azhar attributes the gap to implementation drag: supervision overhead, error correction, and what he calls productivity displacement, where gains in one team are absorbed by new bottlenecks elsewhere. Translated for a board: the AI investment may show up in cost lines long before it shows up in revenue.

·04 Timeline & Context

The reversal is best understood as the closing chapter of a roughly twelve-month rhetorical arc that began with maximum alarm and ends in maximum reassurance. May 2025: Amodei tells Axios that AI is heading for a white-collar bloodbath and that producers of the technology have a duty to be honest about it. June 2025: Altman, in a series of interviews, says entry-level office jobs will be eliminated faster than the consensus expects. August through December 2025: tech layoffs accelerate; Microsoft, Meta, Salesforce and IBM cite AI in restructuring statements. February 2026: Axios reports that the white-collar jobs market was already weakening before AI, raising the question of attribution. 5 May 2026: Amodei takes the Anthropic stage with Jamie Dimon, introduces Jevons, and softens the framing from elimination to transformation. 8 May 2026: Commerzbank announces its AI-justified cuts. 23 May 2026: news breaks of Anthropic’s roughly nine-hundred-billion-dollar funding round. 26 May 2026: Altman tells Matt Comyn he was pretty wrong. Roughly the same week, OpenAI’s reported IPO timeline crystallises around September, with Anthropic’s around October. The pattern fits a recognisable Silicon Valley rhythm. The cycle resembles the social-media era around 2010 to 2012, when Facebook and Twitter executives oscillated between warning that the open web would destroy traditional media and reassuring advertisers and policymakers that they were partners in journalism’s survival. Then, as now, the most aggressive predictions came when the technology was novel and the founders were chasing attention. The softer framing arrived when the businesses needed institutional capital and regulatory tolerance. Critics including Ed Zitron and Gary Marcus have already drawn that parallel publicly, arguing that the new narrative is roadshow choreography rather than a genuine update on the evidence. 
Daron Acemoglu, the Nobel laureate, sits in the patient camp that always doubted the rapid-displacement thesis, and his stance has not changed. For European enterprises the practical lesson is to separate the two clocks. The CEO clock runs in news cycles and is now tuned for IPO season. The adoption clock runs in quarters and years, governed by integration cost, data quality, regulation under the EU AI Act, and the slow grind of process redesign. A DAX40 HR strategy built on Amodei’s May 2025 warnings would have over-rotated toward attrition. A strategy built on his May 2026 reassurance risks under-investing in the workforce transition that even the optimistic Jevons reading requires. The honest reading is that both CEOs were probably exaggerating in opposite directions for adjacent reasons, and the labour data is finally arriving in enough detail to call the question. Boards should treat this week’s headlines as a marketing event, not a forecast revision.

Three Perspectives: what this story means for different readers

For Accenture, BCG, Deloitte and the German Mittelstand alike, the practical takeaway is that vendor narratives have shifted faster than internal plans can follow. CHROs who spent 2025 modelling double-digit reductions in junior analyst headcount on the strength of Amodei’s bloodbath warning now have permission from the same CEO to model augmentation instead. The risk is whiplash. Commerzbank’s 3,000-role cut shows that the cost-out case still pays for itself even if the apocalypse does not arrive. The smarter posture is to fund parallel scenarios, hold headcount targets loose, and invest aggressively in the supervision and orchestration skills Azhar identifies as the binding constraint on actual productivity gains.

European policymakers were among the most receptive audiences for Amodei’s original bloodbath warning, which fed directly into AI Act enforcement timelines and the German coalition’s debate over Kurzarbeit-style transition support. The reversal complicates that picture. If the producers of the technology now publicly doubt their own displacement forecasts, the political case for emergency intervention weakens, but the case for steady workforce-transition funding and graduate-hiring monitoring strengthens. Brussels will read the IPO-timing critique sceptically. Expect the Commission to lean harder on independent labour-market data from Eurostat and national statistical offices, and to treat self-serving CEO testimony as exactly that — one input among many, weighted against actual payroll and graduate-hiring numbers.

For investors, the reframing matters less than the valuations attached to it. A combined roughly two-trillion-dollar pricing for OpenAI and Anthropic at IPO would absorb an enormous slice of the institutional appetite for AI exposure. Founders raising in their wake should expect LPs to ask sharper questions about defensible revenue, gross margin and customer concentration, rather than narrative. The Jevons framing is also a gift to application-layer startups: if knowledge work expands rather than contracts under automation, the addressable market for vertical agents in legal, finance and healthcare grows rather than collapses. Expect a wave of pitch decks rewritten in the next quarter to cite Amodei’s coal-economist analogy verbatim.


02 / 04 · Law & Governance

7 min read

The Algorithmic Blackball: Stanford Lifts the Hood on AI Hiring

A 4-million-application audit of a single hiring vendor turns into the first empirical foundation for AI hiring liability, and a template Brussels is already studying.

·01 Primer

An AI hiring tool can look fair when you average everything together and look biased the moment you stop averaging. That is the core of a Stanford-led study released Tuesday, May 26, covering four million job applications screened by one vendor, Pymetrics (now owned by Harver), across 156 large employers. When the researchers applied the U.S. Equal Employment Opportunity Commission’s four-fifths rule position by position rather than in aggregate, more than one in four applications submitted by Black candidates landed in roles where the algorithm produced legally adverse outcomes. The paper, to be presented at ACM FAccT in Montreal in June, gives regulators and plaintiffs’ lawyers the first large-scale, position-level evidence base for AI hiring liability, and it lands about ten weeks before the EU AI Act’s high-risk obligations apply to employment systems.
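How an aggregate audit can pass while a single posting fails is easiest to see with numbers. The sketch below is illustrative only: the group labels, posting names and counts are invented for the example and do not come from the study, and the function names are hypothetical. It applies the four-fifths test (a group's selection rate divided by the highest group's rate must be at least 0.8) once on pooled data and once per posting.

```python
from collections import defaultdict

FOUR_FIFTHS = 0.8  # EEOC rule-of-thumb threshold


def impact_ratio(counts, group):
    """counts maps group -> (applicants, selected).

    Returns the group's selection rate divided by the highest group's rate.
    """
    rates = {g: sel / apps for g, (apps, sel) in counts.items() if apps > 0}
    best = max(rates.values())
    return rates[group] / best if best > 0 else 1.0


def pooled(postings):
    """Sum applicants and selections per group across all postings."""
    totals = defaultdict(lambda: [0, 0])
    for counts in postings.values():
        for g, (apps, sel) in counts.items():
            totals[g][0] += apps
            totals[g][1] += sel
    return {g: tuple(v) for g, v in totals.items()}


# HYPOTHETICAL data: two postings, two applicant groups.
postings = {
    "posting_A": {"group_x": (100, 50), "group_y": (100, 30)},
    "posting_B": {"group_x": (100, 20), "group_y": (100, 30)},
}

aggregate_ratio = impact_ratio(pooled(postings), "group_y")
per_posting = {name: impact_ratio(c, "group_y") for name, c in postings.items()}
failing = [name for name, r in per_posting.items() if r < FOUR_FIFTHS]
```

In this made-up case the pooled ratio for group_y is about 0.86 and clears the threshold, while posting_A alone comes in at 0.6 and fails it. Pooling lets a favourable posting offset an unfavourable one.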

·02 What Happened

Kathleen Creel, a Northeastern philosophy-of-computing professor and one of the paper’s five authors, framed the problem in unusually direct terms to the Financial Times: “As a single vendor comes to dominate decision-making in a space, their quirks or shortfalls can be present across that entire sector in a way that wasn’t possible before.” Her co-authors — Rishi Bommasani, Sarah H. Bana, Dan Jurafsky and Percy Liang, all anchored at Stanford’s Digital Economy Lab and HAI — spent four years working a single dataset that no academic group had ever obtained: every recommendation Pymetrics issued between December 2018 and December 2022. That is 4,197,168 applications from 3,372,132 people, sent to 1,746 specific job postings at 156 employers whose combined annual revenue reaches roughly 225 billion dollars. The industries span finance, manufacturing, warehousing and consumer goods — exactly the Fortune 500 and DAX40 employers most likely to default to an off-the-shelf screening vendor. Pymetrics’ pitch had always been that its game-based assessments were more objective than résumés because they measured cognitive traits — risk tolerance, processing speed, altruism — instead of names or schools. In a 2022 peer-reviewed paper, Pymetrics’ own scientists argued the platform cleared the EEOC bar. The Stanford team did not contest that math; they contested the question. Pymetrics had pooled outcomes across all employers and positions, then asked whether any racial group was selected at less than 80% of the most-selected group. The Stanford-led team did the calculation the way Title VII enforcement actually works in court: position by position. The numbers flipped. Of the 1,746 postings, 10.62% failed the four-fifths rule for Black applicants. Roughly 30% of Black job seekers in the dataset applied to at least one such position. In raw counts, the disparate-impact bucket covered nearly 40,000 application events. Then came the pivot that gives the paper its name. 
Because Pymetrics scores are deterministic and cached for up to 330 days, two employers using the platform are not running two independent screens — they are running the same screen twice. The authors call the result “systemic rejection,” and the empirical signature is unambiguous: 4% of applicants who tried 10 Pymetrics-screened jobs were rejected by every one. That rate is statistically incompatible with the assumption — embedded in U.S. labor-market models for half a century — that employers make independent decisions. The team formally named the pattern an “algorithmic blackball.” To drive home that this is not a quirk of who applied where, they convinced Pymetrics to re-run its production models against 1,000 applicants and every applicable posting. Result: to push systemic-rejection probability below 0.1%, an applicant would have to apply to at least 25 separate jobs — more than double the 10 that suffice when employers screen independently. Harver, asked twice by Fortune and once by The Register, declined to comment.
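The independence arithmetic behind that comparison can be sketched in a few lines. Assume, purely for illustration, that each posting rejects a given applicant with probability 0.5. That rate is not a figure from the paper; it is chosen because it reproduces the 10-application threshold quoted above. Independent screens would then reject the same person n times in a row with probability 0.5 to the power n. The function names are hypothetical.

```python
def p_all_rejected(p_reject: float, n_jobs: int) -> float:
    """Probability that n independent screens all reject the same applicant."""
    return p_reject ** n_jobs


def min_applications(p_reject: float, target: float) -> int:
    """Smallest n for which p_reject**n falls below target (independent screens)."""
    if not 0.0 < p_reject < 1.0:
        raise ValueError("p_reject must be strictly between 0 and 1")
    n = 1
    while p_all_rejected(p_reject, n) >= target:
        n += 1
    return n


# ASSUMPTION: 0.5 is an illustrative per-posting rejection rate, not from the study.
P_REJECT = 0.5
needed = min_applications(P_REJECT, 0.001)   # 0.1% target quoted in the article
baseline = p_all_rejected(P_REJECT, 10)      # ten straight rejections if independent
observed = 0.04                              # 4% share reported in the article
excess = observed / baseline                 # observed rate relative to independence
```

Under that assumed rate, ten applications are enough to push the all-rejected probability below 0.1%, and the reported 4% is roughly forty times what independent screening would produce. A shared, cached score breaks the independence that makes repeated applications a safety net.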

·03 Timeline & Context

The intellectual lineage matters. In 1971, Griggs v. Duke Power gave U.S. courts the disparate-impact doctrine; the EEOC’s four-fifths rule, codified in 1978’s Uniform Guidelines on Employee Selection Procedures, is the operational shorthand. For 48 years it has been applied to one employer at a time because, until cloud-delivered HR tech, no third party touched enough hiring decisions to make a cross-employer test meaningful. The Stanford paper is the first to argue, with data, that the doctrine’s unit of analysis is now obsolete. Pymetrics-style consolidation, the authors write, “impacts collective adverse impact rates and patterns of systemic rejection” in ways the 1978 guidance never anticipated. They note that, as of May 2023, over 60% of the Fortune 100 and eight of the ten largest U.S. federal agencies were using HireVue, a separate but structurally similar vendor. Concentration is the rule, not the exception. Regulatory timing is the second story. New York City’s Local Law 144, the first bias-audit mandate for automated employment decision tools, took effect July 2023; a December 2025 New York State Comptroller audit found enforcement weak and noted that existing guidance instructs auditors to pool data across positions, precisely the aggregation method the Stanford team argues hides discrimination. The HireVue EEOC complaint of 2019, brought by EPIC, was the first federal challenge to algorithmic hiring; it was settled quietly in 2021 with HireVue retiring its facial-analysis component. Until this week, plaintiffs had anecdotes. Now they have a 4-million-row dataset and a peer-reviewed methodology. The European clock is louder. The EU AI Act classifies hiring algorithms as high-risk under Annex III item 4. Article 6 obligations (risk management, data governance, human oversight, post-market monitoring) apply from August 2, 2026, ten weeks after publication of this study. 
German employers and works councils now have an empirical document, in English, with a citable per-position adverse-impact rate, to wave at any DAX40 HR department still using a U.S.-headquartered screening vendor. The Bundesarbeitsgericht has not ruled on algorithmic hiring; § 87 BetrVG co-determination rights over performance-monitoring systems remain the most likely doctrinal hook. The four policy recommendations in the paper — measure adverse impact at the position level, build cross-employer market surveillance, monitor algorithmic concentration, and create legal researcher access modeled on Article 40 of the EU Digital Services Act — read less like academic suggestions than like a draft for a Bundestag-level Beirat.

Three Perspectives: what this story means for different readers

For DAX40 and Fortune 500 HR organizations, the immediate exposure is contractual, not philosophical. Most enterprise contracts with Pymetrics, HireVue, Eightfold and similar vendors include vendor-supplied bias-audit certificates that report aggregate selection rates. The Stanford paper makes those certificates evidentiarily weak: a plaintiff can now point to a peer-reviewed methodology showing aggregate audits mask the very disparities the EEOC rule was written to catch. CHROs should expect three near-term asks from general counsel — re-paper vendor contracts with position-level audit clauses, log every algorithmic recommendation for litigation hold, and obtain indemnification language that survives the vendor’s acquisition (Pymetrics was acquired by Harver in 2022; Harver itself has changed PE ownership twice). The cheapest defensive move is to demand from any incumbent vendor a per-posting four-fifths-rule report covering the past 24 months.

Brussels gets a usable enforcement template ten weeks before high-risk obligations bite. The paper’s call for DSA-style researcher access is the operative detail: Article 40 of the DSA already obliges very large platforms to share data with vetted academics, and the same logic extends naturally to Annex III item 4 systems under the AI Act. Expect the AI Office, the Bundesnetzagentur and France’s CNIL to cite this paper when drafting Article 16 provider guidance. In Washington, the EEOC’s 2023 AI guidance assumed audits would be vendor-supplied; Commissioner Keith Sonderling’s framework now looks under-specified. NYC’s Department of Consumer and Worker Protection will face pressure to revise Local Law 144 implementation rules, given that the paper explicitly identifies the city’s pooling guidance as the masking mechanism. Class-action plaintiffs’ bar — Outten & Golden, Lieff Cabraser — will read this paper as a litigation playbook.

The HR-tech investment thesis since 2019 has rested on the claim that algorithmic screening is fairer than human screening. Pymetrics raised 56 million dollars on that pitch before selling to Harver; Eightfold reached a 2-billion-dollar valuation; Paradox AI just raised at 1.5 billion. The paper does not retire that thesis — it relocates it. Vendors that can credibly provide per-position audits, deterministic-replay capability for regulators, and consortium-level cross-employer monitoring become defensible; vendors that cannot will be re-priced as legal risk. Watch for a wave of seed-stage entrants pitching audit-grade or DSA-compliant hiring infrastructure, and for the incumbents to acquire compliance toolchains the way they acquired sourcing tools in 2021. The European AI Forum and KI Bundesverband will likely push a German alternative; Personio, which has so far stayed out of scored screening, is the obvious consolidator.


03 / 04 · Frontier Labs & Capex

7 min read

Hassabis timestamps AGI to 2030 — and names the four missing pieces

At Google I/O, DeepMind's CEO narrowed his AGI window and listed the gaps still blocking it: world physics, memory, consistency, continual learning.

·01 Primer

Artificial general intelligence — a system that matches human reasoning across most cognitive tasks — has been the industry's long-promised horizon. Timelines from frontier-lab CEOs range from 2027 to never. On May 20 at Google I/O 2026, DeepMind CEO Demis Hassabis tightened his own forecast to “2030, plus or minus a year,” with Sergey Brin appearing as a surprise guest. More usefully for planners, Hassabis named the four capabilities still missing: world physics (intuitive understanding of how the physical world behaves), memory (persistent recall across sessions), consistency (the same answer on the same question twice), and continual learning (improving from experience without retraining). For boards drafting multi-year AI roadmaps, the question is no longer whether AGI is plausible by decade-end, but which capability gap each business case quietly assumes is already solved.

·02 What Happened

The Shoreline Amphitheatre in Mountain View was already two hours into a keynote stuffed with product announcements when Alex Kantrowitz of Big Technology sat down with Hassabis for what was billed as a fireside chat. Sergey Brin walked on unannounced, sat down beside the DeepMind CEO and crashed the interview. The audio circulated through podcast feeds across the following week, with the full transcript hitting Singju Post on May 26 and Axios pulling the most quotable line: Hassabis is “more confident now” that AGI arrives within his stated window. The stated window is precise enough to be operationally useful. “2030 is when I expect it to arrive, either plus or minus a year,” Hassabis told Fast Company's Harry McCracken the same week. In the on-stage conversation, he and Brin played a small game with Kantrowitz on which side of 2030 they would each bet — Brin took just before, Hassabis just after. Both put genuine money on a date inside the planning horizon of any Fortune 500 strategic plan still being drafted in 2026. The substance, though, was Hassabis's list. Asked what stands between Gemini's current capabilities and AGI, he refused the usual hand-wave about “a few more breakthroughs” and instead enumerated four. World physics — the ability to predict how light, gravity, fluids and friction behave, the kind of intuitive model a toddler builds before language. Memory — not the brittle context window of today's chatbots, but persistent recall that survives sessions and integrates new facts without forgetting old ones. Consistency — the property he illustrated with a striking benchmark: an AGI-worthy system should take “a couple of months for a team of experts to find a hole in it,” whereas today's models break in minutes for any motivated user. And continual learning — the ability to improve from experience after deployment rather than freezing at training time. The historical comparison Hassabis reached for was his own work. 
He recalled AlphaGo and AlphaZero, where switching the “thinking” module off dropped the engines from world-champion level to mere master level — “a 600+ ELO difference,” he said, the chess-rating equivalent of the gulf between a grandmaster and a club player. The implication: scale alone delivered the master; an architectural idea delivered the champion. He expects something analogous to bridge the AGI gap, and he believes one or two such ideas are still undiscovered. Brin, for his part, used the appearance to plant Google's flag. “We fully intend that Gemini will be the very first AGI,” he said, in a sentence that doubled as a recruiting pitch and a competitive shot at OpenAI and Anthropic. He added that any computer scientist still in retirement “should not be retired right now,” framing the current moment as the most exciting in his career — more so, he said, than the launch of the web or mobile. The pivot was telling: Brin was talking like an operator, not a co-founder emeritus, and he was doing so on the same stage where Sundar Pichai had just walked through the Gemini app roadmap. The succession question at Alphabet is no longer abstract.
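To put the 600-point figure in context, the standard Elo expected-score formula converts a rating gap into the stronger side's expected result per game. This is a minimal sketch assuming the conventional 400-point scale used in chess ratings. The function name is illustrative, and the quote does not say which rating scale Hassabis had in mind.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) for the higher-rated side."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


gap_600 = elo_expected_score(600)  # the gap quoted by Hassabis
even = elo_expected_score(0)       # equal ratings, for comparison
```

On that scale, a 600-point gap corresponds to an expected score of about 0.97 per game for the stronger side, against 0.5 between equals.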

·03 The Four Gaps

Take each gap on its own terms, because the engineering and the capex implications differ sharply. World physics is the gap closest to product reality. Hassabis cited Veo 3, Google's video model, as evidence the problem is tractable: the system intuits lighting, occlusion and ballistic motion from data, where his own early game-development career required hand-coded shaders. “We turn sand into thinking machines,” he said, in a line worth pausing over. The DeepMind bet is that scaled video and Project Astra-style embodied data will let foundation models acquire physical intuition the way mammals do: through observation. Yann LeCun, who left Meta in March 2026 to raise $1.03 billion for AMI Labs, disagrees fundamentally: he argues language-trained models cannot bootstrap a physics model from text, and that an architectural break is required. The split matters because every robotics and autonomous-systems business case in 2026 depends on which view turns out correct. Memory is the most boring of the four and possibly the most important. Today's models forget at the end of every session. Hassabis wants something closer to human episodic memory: persistent, queryable, capable of integrating yesterday's correction into tomorrow's answer without catastrophic forgetting. This is the gap most directly visible to enterprise users. A finance team that watches Gemini produce a perfect 10-K summary on Monday and then re-explain its own caveats on Tuesday knows exactly what the missing piece feels like. Solve it, and the cost curve for vertical agents collapses, because there is no need to re-onboard the model every morning. Consistency is the gap Hassabis used to define AGI itself. His benchmark, a team of experts taking months, not minutes, to find a flaw, is a stricter test than most academic AGI definitions. It also explains why DeepMind invested in AlphaProof, AlphaGeometry and the Mythos-class verification work that has filtered into Gemini Deep Think. 
Verifiable reasoning is the only known route to consistency at scale. The historical analogue is the transition from analogue radio to digital error correction: the trick was not to make the signal louder but to add the redundancy that lets you detect and repair errors deterministically. Continual learning is the gap with the longest research tail. Stanford and DeepMind have published incrementally on meta-learning and elastic weight consolidation since 2017, with limited transfer to production-scale models. The barrier is partly architectural — back-propagation through a frozen pretrained model is hard to do without destabilising what is already learned — and partly economic, because every continual-learning system multiplies training cost. Hassabis flagged AlphaEvolve, DeepMind's self-improving algorithm-design system, as an early proof point. He was careful, though, not to claim an intelligence explosion. “No, not an uncontrolled one,” he said, when Kantrowitz asked directly. The pivot worth flagging for enterprise readers: Hassabis did not list reasoning. He treats reasoning as essentially solved at the architectural level, with remaining work being scale and refinement. That is a notable shift from his 2024 framing and the strongest tell that DeepMind's internal benchmarks on Deep Think, AlphaProof and the Gemini 2.5 Pro reasoning stack have cleared a private bar.

·04 From Lab to Mainstream

Translating Hassabis's calendar into corporate planning requires one mental adjustment: the 2030 date is not when AGI ships as a product, it is when DeepMind believes the research bar is cleared. Diffusion into enterprise workflows takes at least another two to three years, and longer in regulated industries: AlphaFold 2 was published in 2020, and the first AI-designed cancer drug from Isomorphic Labs entered clinical trials only in early 2026. That six-year gap is the realistic floor for AGI-class capabilities reaching regulated industries. The capex question therefore looks different than the consumer-press framing suggests. A DAX 40 board approving a five-year AI transformation in 2026 is not betting on AGI itself; it is betting on the capability flow before AGI: better memory architectures from 2027, world-model-grounded robotics from 2028, continual-learning agents from 2029. Each of those is a discrete procurement decision with its own ROI profile. The mistake would be to treat AGI as a binary switch and either over-invest in speculative architecture today or under-invest in the boring memory and consistency wins that arrive earlier. Isomorphic Labs is the case study. The Alphabet spin-out raised $2.1 billion in April 2026 at a valuation that priced in a specific bet: that AlphaFold 3 plus continual learning over molecular dynamics data will compress drug-discovery timelines by an order of magnitude. Hassabis told Fortune the result is a “new renaissance” in 10 to 15 years. That is not an AGI claim. It is a claim that one of the four gaps, continual learning over a constrained domain, gets solved well before the general case, and that the economic value released is large enough to fund the rest of the research path. Enterprise CFOs should read the Isomorphic raise as a template: narrow-domain continual learning is investable now, on a horizon shorter than Hassabis's AGI window.

Three Perspectives: what this story means for different readers

For DAX 40 and Fortune 500 planners, the practical takeaway is not the 2030 date but the gap list. Each of the four gaps maps to a procurement question already on the table in 2026. Memory: are vendor agents built on retrieval-augmented patterns, or do they require persistent-memory architectures still in research preview? Consistency: which workflows can tolerate model variance and which need verifier-backed determinism (legal, audit, regulatory filings)? World physics: does the robotics pilot assume current models can intuit dynamics, or is it scoped tightly enough to survive if they cannot? Continual learning: is the operating model premised on retraining cycles the vendor controls, or on in-place adaptation the customer owns? Boards that translate Hassabis's list into a capability checklist will procure differently than boards that read the headline and budget for an AGI line item that does not yet exist.

The EU AI Act's general-purpose-AI tier and the UK AI Safety Institute's evaluation regime were both drafted assuming AGI was a fuzzy, distant question. Hassabis just made it concrete and put a date on it. Regulators in Brussels and London now face a sequencing problem: their current evaluation toolkits measure benchmarks and capability evaluations, not the four properties Hassabis named. Persistent memory in particular collides with GDPR's data-minimisation principle — an agent that remembers a user across sessions is, by definition, retaining personal data the user may have forgotten consenting to. Continual learning collides with the AI Act's requirement that high-risk system behaviour be documented and reproducible at conformity-assessment time. Expect a 2026-2027 push from European regulators to require memory- and learning-state disclosures, modelled on the systemic-risk reporting that already applies to frontier models above the 10^25 FLOP threshold.

The investment thesis split is now sharper than at any point since 2023. Yann LeCun's $1.03 billion AMI Labs raise at a $3.5 billion pre-money is a vote that LLMs hit a wall and world-model architectures inherit the next wave; Hassabis's framing says LLMs plus the four fixes are enough, and Brin's “Gemini will be the very first AGI” line is a recruiting weapon aimed squarely at LeCun's hiring pipeline. For founders, the most investable gap is memory — the smallest research distance to production, the clearest enterprise pain point, and the lowest capex requirement. World-physics startups remain capital-intensive and crowded with well-funded incumbents (DeepMind's Genie, Wayve, AMI Labs, Nvidia's Cosmos). Consistency-as-a-service — verifier layers, deterministic-routing middleware, audit-grade reasoning traces — is the under-priced category, with few funded entrants and obvious enterprise demand from regulated sectors.


04 / 04 · Enterprise & Architecture

8 min read

OpenCode Hits 8M Users After Anthropic Tried to Lock It Out

An open coding agent rode a server-side OAuth block into 1M daily active users, and handed European CIOs the lock-in escape they had been drafting on whiteboards for two years.

·01Primer

OpenCode is an open-source coding agent — the harness around a model that reads a repository, writes patches and runs tests. On January 9, 2026, Anthropic quietly switched on server-side OAuth checks that stopped OpenCode, Cline and RooCode from using Claude Pro and Max subscription tokens. Anthropic argued the third-party tools were spoofing the official Claude Code client to run cheap autonomous loops on a flat $200 monthly plan. OpenCode co-founder Dax Raad responded not by negotiating but by widening the harness to OpenAI’s GPT-5, xAI’s Grok and a curated roster routed through a new pay-as-you-go service called OpenCode Zen. Within months, OpenCode went from roughly 650,000 monthly active users to nearly 8 million, with around 1 million daily — making it the fastest-growing open coding agent and a live case study in model-portable developer tooling.

·02What Happened

Raad was on a Texas porch on January 9 when his phone began vibrating in a way that, by then, he had learned to read. The OpenCode Discord was tipping over. Across the world, engineers running Claude inside the OpenCode terminal were getting the same opaque error: “This credential is only authorized for use with Claude Code and cannot be used for other API requests.” Anthropic had silently deployed client fingerprinting. Overnight, the most popular model inside OpenCode had become unreachable through the cheapest authentication path. The Pragmatic Engineer’s Gergely Orosz, who interviewed Raad on May 27, framed the moment plainly: “OpenCode turned Anthropic’s blocking of integration with Claude Code into a massive growth lever.” Raad’s own summary, delivered in his Texas drawl on the podcast: “Get positioning right and the world just keeps handing you wins you didn’t expect.” He had not designed the harness as an anti-Anthropic statement. He had designed it as the open category leader. The block created a category that needed a leader. Then came the pivot. Within forty-eight hours, OpenCode published guidance for routing through other providers. Within six weeks, OpenAI made what The New Stack would later call a strategic counter-move — officially partnering with OpenCode so Codex and ChatGPT subscribers could use their existing entitlements inside OpenCode, OpenHands and RooCode. xAI’s Grok Code Fast 1 was bundled in free during its launch window. Google’s Gemini and a clutch of open-weight models came in through OpenCode Zen, a thin pay-as-you-go billing layer that charges per request, automatically tops up at five-dollar thresholds, and passes credit-card fees through at cost. Anthropic formalised its position on February 19, adding an Authentication and credential use section to the Consumer Terms that explicitly prohibited subscription OAuth tokens in any third-party tool, including its own Agent SDK. 
OpenCode complied; on the same day it stripped all remaining Claude OAuth code, citing Anthropic legal requests. The historical parallel is hard to miss. When Microsoft tried to stall Netscape in the late 1990s by bundling Internet Explorer, the practical effect was to turn the open web into the default surface for the next two decades of software. When Twitter throttled its API in 2012, it set a generation of client developers on the path toward Mastodon, Bluesky and the fediverse plumbing now used by many newsrooms. Anthropic’s January 9 switch did not kill OpenCode. It killed OpenCode’s dependence on Anthropic, and handed Raad a story arc to sell to every CTO who had ever signed a sole-source AI procurement clause with a queasy feeling.

·03The Numbers

The growth curve does most of the arguing. Pre-block, OpenCode was a respectable open-source project: roughly 650,000 monthly active users, a vibrant Discord, and a place near the top of GitHub’s trending list. Five months later, on the day Orosz published the podcast, the figure was approaching 8 million MAU, with daily active users near 1 million: a roughly 12x jump in a category Raad had previously argued nobody was winning. Janakiram MSV at The New Stack put a specific number on the early surge: 157,000 developers signed up to OpenCode in the weeks immediately after January 9 as a hedge against Anthropic. The GitHub repository has crossed 161,000 stars with 864 contributors, numbers that look more like a Linux distribution than a startup. The economics underneath are equally instructive. A $200 monthly Claude Max subscription, used via the official Claude Code CLI, brings Anthropic the same revenue whether the customer is a hobbyist or a small team running autonomous loops overnight, while the compute it consumes varies enormously. Third-party harnesses tipped that asymmetry: estimates circulated in February suggested that the same workload, billed at API rates, would have run more than $1,000 per month per heavy user. Anthropic’s switch was less a moral crusade than a FinOps decision. OpenCode’s response was to make the FinOps decision the customer’s. OpenCode Zen routes per-request to a curated set of models (GPT-5.2 at $1.75 per million input tokens and $14 per million output; Grok Code Fast 1; Gemini variants; the Claude family where directly contracted), with workspace-level monthly caps and per-seat limits. Raad’s own framing on X, cited in the Pragmatic Engineer notes, is that inference is very profitable once the GPU is amortised: “There’s potential for 90% margin depending on the model.” Translation: the closed labs are not subsidising developer subscriptions out of charity, and an aggregator can plausibly run the same workload at a positive margin while charging less.
For OpenCode the strategic ledger looks like this: one revenue line (Zen pay-as-you-go), no single-vendor dependency, an air-gappable on-prem deployment for regulated buyers, and a community of around a million daily users producing the kind of feedback loop that only Linux and Kubernetes have previously achieved in developer tooling. The pivot point was not technology. It was that the block forced OpenCode to become what every CIO procurement deck for the last 24 months has been asking for: model-agnostic, source-available, audit-friendly, and priced by the unit of work rather than by the seat.
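
The flat-rate versus metered arithmetic in this section can be made concrete with a short sketch. It uses only the GPT-5.2 prices and the $200 plan price quoted above; the two monthly token volumes are assumptions chosen for illustration, not reported usage, and the calculation is not a reconstruction of the February estimates for Claude workloads.

```python
# Prices quoted in this section for GPT-5.2 via OpenCode Zen (USD per million tokens).
INPUT_PRICE_PER_M = 1.75
OUTPUT_PRICE_PER_M = 14.00
FLAT_PLAN_USD = 200.00  # the flat monthly subscription price cited above


def metered_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the pay-as-you-go cost in USD for a given token volume."""
    input_cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical monthly workloads: (input tokens, output tokens). Assumed values.
workloads = {
    "hobbyist": (20_000_000, 2_000_000),
    "overnight_agent_loops": (400_000_000, 30_000_000),
}

for name, (tokens_in, tokens_out) in workloads.items():
    cost = metered_cost(tokens_in, tokens_out)
    verdict = "below" if cost < FLAT_PLAN_USD else "above"
    print(f"{name}: ${cost:,.2f} metered, {verdict} the ${FLAT_PLAN_USD:,.0f} flat plan")
```

Under these assumed volumes the hobbyist comes to $63.00 and the overnight loop to $1,120.00. The same flat price therefore sits well above one customer's metered cost and well below the other's, which is the asymmetry the section describes.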

·04The Governance Plumbing

For DAX 40 CIOs the interesting question is not whether OpenCode is the best harness (Cursor and Claude Code remain credible inside the Anthropic walled garden) but whether OpenCode is now the cleanest answer to a compliance brief. The honest answer in May 2026 is: closer than anything else. OpenCode ships an enterprise deployment mode that runs fully air-gapped: no outbound calls, models hosted on the customer’s own GPUs or via a chosen provider, full session logs persisted to the customer’s SIEM. That is a different conversation from “we have a SOC 2 report”. It is the conversation a Lufthansa, an Allianz or a Volkswagen IT auditor actually wants: which model handled which file, under which user’s identity, against which branch, with which tool calls, written to which immutable store. The Anthropic episode also writes a procurement clause for the next five years. A CIO can now point to January 9 as the canonical case of a closed lab unilaterally revoking a working integration with sixteen hours’ notice and no SLA on reversal. That moves model portability from a slide in an architecture deck to a board-level risk control. Expect the next wave of European framework agreements, particularly under the EU AI Act’s general-purpose AI obligations from August 2026, to require demonstrable model substitution within a working day, with attestation logs written to an MDM-controlled endpoint. OpenCode, almost by accident, is the reference implementation. Anthropic’s response, fairly read, is not unreasonable: a flat-rate consumer subscription was never priced to underwrite headless agent loops, and the company has a legitimate interest in defending the unit economics that pay for the next model. But that defence reads in the boardroom as exactly the risk the CIO is being paid to neutralise.
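
The auditor's question above (which model, which file, which identity, which branch, which tool calls, which immutable store) maps naturally onto a tamper-evident log. What follows is a minimal sketch of that idea using a hash chain. The `AuditRecord` and `AuditLog` names and every field value are hypothetical, and nothing here describes OpenCode's actual log format or any SIEM's ingestion API.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class AuditRecord:
    """One agent action, carrying the fields an auditor would ask about."""
    user_id: str
    model: str
    file_path: str
    branch: str
    tool_calls: Tuple[str, ...]
    prev_hash: str  # digest of the previous record, forming the chain

    def digest(self) -> str:
        # Serialise with sorted keys so the hash is stable across runs.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class AuditLog:
    GENESIS = "0" * 64  # placeholder hash for the first record

    def __init__(self) -> None:
        self._records: List[AuditRecord] = []

    def append(self, user_id: str, model: str, file_path: str,
               branch: str, tool_calls: Sequence[str]) -> AuditRecord:
        prev = self._records[-1].digest() if self._records else self.GENESIS
        record = AuditRecord(user_id, model, file_path, branch,
                             tuple(tool_calls), prev)
        self._records.append(record)
        return record

    def verify(self) -> bool:
        """Return True if every record still points at its predecessor's hash."""
        expected = self.GENESIS
        for record in self._records:
            if record.prev_hash != expected:
                return False
            expected = record.digest()
        return True


log = AuditLog()
log.append("alice", "model-a", "src/app.py", "main", ["read_file", "apply_patch"])
log.append("bob", "model-b", "src/db.py", "feature/x", ["run_tests"])
print("chain intact:", log.verify())
```

A chain like this detects edits to any record that has a successor. Detecting a change to the newest record would also require storing the latest digest somewhere the writer cannot alter, which is the role the "immutable store" plays in the auditor's question.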

Three Perspectives: What this story means for different readers

For a Frankfurt or Munich CIO, OpenCode is now the cleanest counter-bid in any 2026 developer-tooling RFP. The procurement logic writes itself: same harness for every engineer, choice of Claude, GPT-5, Gemini or an internally hosted Mistral; one billing relationship through OpenCode Zen or direct contracts; full audit trail; air-gapped option for regulated workloads; no single point of vendor failure. Anthropic’s January 9 block, in a procurement sense, was a free reference call. The remaining work is the integration: connecting OpenCode session logs to the SIEM, attesting binaries through the MDM, and binding agent identities to the corporate IdP. None of that is novel — it is the same plumbing that took Kubernetes from hobby project to default platform between 2017 and 2020.
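
The "no single point of vendor failure" requirement in that procurement logic reduces to a simple pattern: try providers in an agreed order and move on when one refuses the call. The sketch below illustrates the pattern only. The provider names, the stub functions and the `ProviderUnavailable` exception are all hypothetical stand-ins, not OpenCode's routing code or any vendor's client library.

```python
from typing import Callable, Dict, List, Tuple


class ProviderUnavailable(Exception):
    """Raised by a provider stub when its credential or endpoint is rejected."""


def complete_with_fallback(
    prompt: str,
    providers: Dict[str, Callable[[str], str]],
    preference: List[str],
) -> Tuple[str, str]:
    """Try providers in preference order; return (provider_name, response)."""
    errors = []
    for name in preference:
        try:
            return name, providers[name](prompt)
        except ProviderUnavailable as exc:
            # Record the failure and fall through to the next provider.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical stubs standing in for real vendor clients.
def blocked_provider(prompt: str) -> str:
    raise ProviderUnavailable("credential not authorized for this client")


def self_hosted_provider(prompt: str) -> str:
    return f"[self-hosted] patch for: {prompt}"


providers = {"vendor_a": blocked_provider, "self_hosted": self_hosted_provider}
name, reply = complete_with_fallback(
    "fix failing test", providers, ["vendor_a", "self_hosted"]
)
print(name, "->", reply)
```

In a real deployment the preference order would itself be a governed setting, so that a substitution of the kind a framework agreement might demand is a configuration change rather than a code change.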

The European AI Act’s general-purpose AI obligations begin biting from August 2026, and BaFin, BSI and the Bundesnetzagentur are already circulating draft guidance on what meaningful human oversight looks like inside an autonomous coding loop. A model-portable, source-available harness with deterministic logging is materially easier to defend than a closed binary calling a single foreign provider. Anthropic’s terms-of-service change on February 19 also raises a quieter question: under the Digital Markets Act, can a designated gatekeeper unilaterally revoke a working integration without notice? OpenCode is not a DMA case, but the legal community is already using January 9 as the worked example in conference panels on AI interoperability obligations. Expect a Commission consultation by year-end.

The funding implication is sharper than the headline. Pre-block, the consensus VC bet was that the coding-agent category would consolidate around two or three closed harnesses (Cursor, Claude Code, GitHub Copilot Workspace), with open-source projects relegated to hobbyists. Post-block, OpenCode is a million-daily-user proof point that the open category exists, monetises, and grows fastest when the closed players overreach. Sequoia-style positioning on Cursor’s $9bn valuation now has to price in a credible open alternative with comparable scale. For European founders, the more interesting signal is what OpenCode did not need: it has no announced institutional round, runs on a small team out of the SST/Serverless Stack ecosystem, and turned a hostile platform decision into five months of organic growth. That is the model, literally and figuratively, for the next cohort of European developer-tooling startups.


State of the software engineering job market in 2026 (Gergely Orosz, The Pragmatic Engineer, May 26, 2026)

Drawing on exclusive TrueUp and Workforce.ai data, Orosz finds software-engineering job postings up in the US and UK over the past twelve months, flat in Canada, and visibly declining in Germany and France — even as top-tier US tech firms list 20% more engineering roles than a year ago. Meta’s two-year hiring sprint ended in last week’s 10% layoffs, while Apple, Google, Stripe and Datadog keep adding headcount, and AI-engineering listings at large firms are up 50–100% year-on-year. Why this matters: the DACH softening is not a cyclical blip but a structural divergence — US-headquartered employers are absorbing the AI-engineering boom while European HQs stay cautious, which directly shapes salary benchmarks, attrition risk, and the build-versus-buy calculus for DAX40 tech functions and the consultancies staffing them.


Avoiding Death on the Yellow Brick Road (Joe Schmidt, a16z, May 27, 2026)

Schmidt argues the labs — OpenAI, Anthropic — will own what he calls the Yellow Brick Road of horizontal work that improves with raw model capability: code generation, writing, image creation, generic copilots. Everything else in Oz — vertical workflows with messy data, multi-step approvals, regulatory weight, and tribal domain knowledge — is defensible for focused application companies that own the system of work, route across models, absorb migrations, and act as the compliance control plane. He offers three concrete tests (tools-and-steps, system-versus-tool, customer-P&L) plus operator notes from 11x and FurtherAI. Why this matters: for enterprise buyers and the consultancies advising them, the essay is a useable framework for triaging the AI vendor stack — which incumbents are about to be hollowed out by a lab seat, and which vertical agentic players are worth a multi-year integration commitment in legal, insurance, underwriting or GTM.
