Apple Concedes the Model Layer: Gemini-Powered Siri at Cook’s Last WWDC
Today’s keynote is set to confirm a $1B-a-year Gemini deal and an Extensions framework that turns the iPhone into model-agnostic distribution.
At 10am Pacific today, Tim Cook will walk onstage at Apple Park to open WWDC 2026 for the last time as CEO. According to reporting by Bloomberg’s Mark Gurman and others, the keynote is expected to confirm two decisions that reverse a decade of Apple AI orthodoxy. First, Siri’s new brain will not be an Apple model: it will be a custom, roughly 1.2-trillion-parameter Google Gemini variant, running inside Apple’s own Private Cloud Compute enclaves, paid for at around $1 billion a year. Second, an “Extensions” framework in iOS 27 will let users route Apple Intelligence requests to ChatGPT, Claude or Gemini, ending ChatGPT’s de facto exclusivity. Apple, in short, is set to concede that it cannot win the model layer and will instead compete where it has always competed: distribution, integration, and the user interface.
Cook is expected to take the stage for a brief opening, hand the keynote to software chief Craig Federighi, and let the engineering bench carry the news. The choreography is the message: this is no longer Cook’s product to ship. On September 1, hardware engineering chief John Ternus formally takes over as Apple CEO, and the Siri overhaul is the inheritance Cook wants to hand over settled rather than unresolved. Federighi, according to Gurman and AppleInsider’s account of a “fateful” 2025 leadership meeting, scrapped the hybrid “legacy Siri plus LLM” architecture and put Mike Rockwell in charge of a full rebuild. John Giannandrea, the former Google executive Apple hired in 2018 to lead AI, had his Siri remit stripped in March 2025 and quietly left in April 2026 once his stock vested.
The core announcement, telegraphed by Gurman in Bloomberg on November 5 and reconfirmed in his June 7 pre-WWDC newsletter, is that the rebuilt Siri runs on a bespoke 1.2-trillion-parameter mixture-of-experts Gemini model, trained by Google to Apple’s specification and served inside Apple’s Private Cloud Compute. Apple’s own server-side foundation model tops out around 150 billion parameters; its on-device model is roughly 3 billion. The model behind the new Siri is therefore about eight times the size of Apple’s largest server model and about 400 times the size of its on-device model. Gurman reports Apple ran a “bake-off” between Anthropic and Google. Anthropic’s Claude was judged technically superior; Google won on price and on the leverage of the existing Safari search relationship, worth around $20 billion a year to Apple. Federighi is expected to demonstrate a system-wide “Search or Ask” gesture that pulls up a chatbot-style Siri with personal context across mail, photos, files and on-screen content, plus multi-step actions across apps.
More remarkable still is the second shoe: Apple Intelligence Extensions. The framework, previewed in MacRumors and 9to5Mac reporting in May, lets any qualifying AI provider register through the App Store and be selected as the default model behind Siri, Writing Tools and Image Playground. Users will reportedly be able to assign distinct voices to different providers, so they can hear whether a given answer came from Apple, Google, Anthropic or OpenAI. John Gruber, writing on Daring Fireball, has called the strategic logic clean: if Google’s model is better, Apple should use it. The catch is that “should” is the word of a company that has accepted what it once denied: that the model layer is not where it will differentiate.
The architecture being unveiled is best read as three layers stacked uneasily on top of each other. At the bottom sits Apple’s existing on-device 3B foundation model, handling local summarization, Writing Tools, and anything that can be done locally on an iPhone or M-series Mac. Above that is the 1.2T Gemini variant, running on Apple silicon servers behind Private Cloud Compute, providing the cloud brain for the new Siri: a black box whose weights Apple licenses but does not own. Above that sits the Extensions layer: a router that can hand a query to ChatGPT, Claude or Gemini’s own consumer endpoint, with the user picking the destination. Apple controls the gesture, the context-passing, the privacy boundary, and the App Store gate. It does not control the intelligence.
This is a structural concession. From the 2018 hire of Giannandrea through the June 2024 “Apple Intelligence” launch, Apple’s stated position was that on-device, Apple-trained models plus a small server tier would suffice for the vast majority of user needs. The Gemini deal is an acknowledgment that the gap between a 3B on-device model and a frontier 1T+ MoE is not closable on Apple’s silicon and timetable, certainly not before iOS 27 ships in September.
The historical parallel is not Microsoft-OpenAI, where Microsoft holds equity and IP rights. The closer analogue is the original Google-Safari default search deal, struck in 2002 and now worth roughly $20 billion a year. There, Apple decided that running its own search engine was not worth the capital or the distraction, took the rent, and concentrated on the device. The Gemini deal is the same logic inverted: this time Apple is the one paying, because the relevant rent (data, training compute, talent) accrues to the model provider, not the device maker.
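The three-layer stack described in this section can be pictured as a simple routing decision. The sketch below is purely illustrative and rests on assumptions: the names `Tier`, `Request`, `route`, the two boolean flags and the handler strings are hypothetical, and nothing here reflects an actual Apple API or how Apple decides which tier answers a request. It only shows the ordering the reporting implies: local model first, the user's chosen provider when one is requested, and the Gemini variant in Private Cloud Compute as the cloud default.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Tier(Enum):
    ON_DEVICE = "on-device 3B model"
    PRIVATE_CLOUD = "Gemini variant in Private Cloud Compute"
    EXTENSION = "third-party provider via Extensions"


@dataclass
class Request:
    text: str
    # Hypothetical flag: the task is simple enough for the local model.
    fits_on_device: bool = False
    # Hypothetical flag: the user asked for their chosen outside provider.
    wants_extension: bool = False


def route(request: Request, chosen_provider: Optional[str] = None) -> Tuple[Tier, str]:
    """Return (tier, handler name) for a request. Illustrative only."""
    if request.fits_on_device:
        return Tier.ON_DEVICE, "apple-on-device"
    if request.wants_extension and chosen_provider is not None:
        # The user, not Apple, picks which provider sits behind this tier.
        return Tier.EXTENSION, chosen_provider
    # Everything else falls through to the licensed cloud model.
    return Tier.PRIVATE_CLOUD, "gemini-pcc"
```

In this toy version, Apple's leverage is visible in what the function owns: the order of the checks and the fallback. The providers only appear as interchangeable strings.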
Ben Thompson at Stratechery has argued for two years that Apple’s AI play is aggregation, not foundation: Apple’s edge is the interaction layer, the personal context, the consent boundary, the 2.5 billion-device install base. Extensions push that logic further. Once a user can install Claude or Gemini as their default assistant, the iPhone becomes for AI what the EU-forced browser choice screen made it for the web: a neutral host. The European Commission’s review of the Digital Markets Act has already named AI assistants as a 2026 priority enforcement area. By volunteering choice before being forced into it, Apple gets ahead of a regulatory wave, weakens OpenAI’s privileged 2024 carve-out, and quietly turns model providers into supplicants competing for App Store placement. The model becomes the commodity. The funnel, Siri’s invocation gesture sitting one swipe away on every active Apple device, stays Apple’s.
Not everyone reads this as a clean win. Gary Marcus, who has spent two years calling the current LLM paradigm a dead end for genuine reasoning, points out that Apple’s own ML research team published the “Illusion of Thinking” paper in 2025 showing that frontier models collapse on novel logic puzzles. Marcus’s reading: Apple knows the technology does not yet do what it is being asked to do, and is paying $1 billion a year to outsource the disappointment. Ed Zitron, on Better Offline, frames the deal in starker terms: Apple is renting a capability it could not build, from a competitor whose Android shipped a credible Gemini-powered assistant 18 months earlier.
There are also questions Apple will not answer onstage today. Who eats the cost when Gemini hallucinates a calendar entry or misroutes an email? How does Apple’s privacy story, marketed for a decade as the differentiator, survive a world in which the assistant’s brain is a Google artifact, even if the weights run inside Apple enclaves? And how durable is a $1 billion annual payment when Google’s own incentives, post-antitrust ruling, increasingly favor running Gemini as a destination rather than a supplier? Gurman has already reported that OpenAI is “unhappy” with the deal, which is to say: the supplier market for frontier models is small, concentrated, and politically charged. Apple’s neutral-host story works only as long as the host can credibly threaten to switch. With Anthropic having lost the bake-off and Google holding both the Safari contract and the Siri contract, the threat has narrowed.
For DAX40 CIOs, the practical question is whether Apple’s Extensions framework finally gives them a defensible mobile AI posture. Today, employees use ChatGPT, Claude or Gemini through personal accounts on managed iPhones, with little MDM visibility. If iOS 27 lets IT specify a default Apple Intelligence provider — ideally an enterprise tenant of Claude or Gemini — and route Writing Tools and Siri queries through that endpoint with DLP controls, the model question shifts from “which app” to “which policy.” Expect Microsoft to push hard to add Copilot to the Extensions list. Procurement teams should also revisit the assumption that Apple Intelligence equals a privacy ceiling: with a Gemini brain in the loop, even via Private Cloud Compute, regulatory and data-residency questions reopen, particularly under the EU AI Act high-risk obligations that begin biting in August.
The deal lands on regulators’ desks at a heated moment. The EU Commission’s April 2026 DMA two-year review named AI assistants and cloud as priority enforcement areas. Apple’s voluntary opening of Extensions, letting Claude or Gemini be the default, is best understood as DMA pre-compliance, modeled on the browser-choice screen that arrived in Europe alongside the end of Apple’s WebKit-only rule. In the United States, the Justice Department is still litigating the Google search remedy and will read a fresh $1 billion-a-year payment from Apple to Google as further evidence of the same default-rent dynamic. Antitrust authorities in Berlin and Brussels will watch closely whether Apple’s App Store gating of AI Extensions becomes the new chokepoint, effectively a 30 percent tax on model competition.
For the European model layer, the message is brutal: even Apple, with $200 billion in cash, has decided not to build a frontier model. Mistral, Aleph Alpha and the long tail of EU foundation-model startups now face an investor question that has been quietly forming since DeepSeek — if the hyperscalers and Apple are all renting from the same two or three labs, where is the venture math? The opportunity, instead, sits at the application and orchestration layer Apple is exposing. Extensions create a route for a German or French AI startup to land directly on 2.5 billion Apple devices without a billion-dollar training run, provided it can pass App Store review and offer a credible enterprise tenant. Expect a wave of Series B raises pitched as “the Claude wrapper for regulated industries.” The model-layer thesis, for European VCs, just got harder to defend.
Sources
- [1] Apple Plans to Use 1.2 Trillion Parameter Google Gemini Model to Power New Siri
- [2] What to Expect From WWDC 2026: Gemini-Powered Siri, iOS 27, macOS 27 and More
- [3] WWDC 2026: Apple’s Secret Meeting That Led It to Take AI Seriously
- [4] Apple AI’s Platform Pivot Potential (Stratechery)
- [5] Apple and Gemini, Foundation vs. Aggregation (Stratechery)
- [6] iOS 27 Will Let You Pick Claude or Gemini Instead of ChatGPT for Apple Intelligence
- [7] Apple nears $1 billion Google deal for custom Gemini model to power Siri
- [8] Daring Fireball: Apple and Google, Sitting in a Tree
- [9] Tim Cook expected to head WWDC 2026 keynote, for the last time
- [10] Apple’s WWDC: Tim Cook’s AI legacy at stake in his final developer conference as CEO
- [11] Former AI boss John Giannandrea officially leaving Apple this week
- [12] Apple Intelligence Foundation Language Models Tech Report 2025
- [13] EU’s Digital Markets Act Two-Year Review: AI and Cloud Are Now Priority Enforcement Areas
- [14] Marcus on AI (Archive)
- [15] Google Defends $20B Apple Search Deal in Major Antitrust Appeal
- [16] Apple reaches 2.5 billion active devices after record-breaking quarter
- [17] Daring Fireball: Gurman Reports that OpenAI Is Unhappy With Apple Deal