The CEOs Are Doing It Themselves Now

Saturday, February 21, 2026·☀️ Morning·30 min read·12 stories

Something shifted overnight that's worth sitting with. The Information reports that Silicon Valley CEOs are bypassing their own organizations to code, design, and analyze using AI tools directly, not as a novelty but as a new operating norm. Simultaneously, Atlassian froze hiring amid what it calls 'business software market turmoil,' Anthropic shipped a security tool that sent cybersecurity stocks into freefall, and Oracle is reportedly preparing to cut up to 30,000 jobs explicitly to fund AI data centers. The thread connecting all of this isn't 'AI is replacing workers.' It's that the organizational structures built to coordinate human effort are becoming the bottleneck, and the people at the top of those structures are starting to notice and are routing around them.

High Significance

Silicon Valley CEOs Find a New Ethos: I'll Do It Myself

↗The Information

The Information reports that Silicon Valley CEOs are increasingly using AI tools to bypass their own teams — coding, designing, and analyzing directly rather than delegating through management layers. This is not a productivity hack story. It's a structural signal about what happens when the person with the most context about what needs to happen discovers they can execute faster alone with AI than by routing work through the coordination machinery their company was built around.

Read this alongside three other data points from the same news cycle: Atlassian has frozen hiring amid what's described as turmoil in the business software market. Oracle is preparing to cut 20,000-30,000 jobs — its largest layoff ever — explicitly to fund AI data center construction for its $300 billion OpenAI partnership. And Anthropic shipped a security tool that reportedly sent cybersecurity stocks reeling, with the market actively pricing in AI agents as a threat to established enterprise software moats.

The pattern is consistent and accelerating. The firm's entire structure (management layers, coordination overhead, standardized processes) exists to solve the old scarcity problem: implementation was hard, so you needed lots of people organized in hierarchies to get things done. When the CEO discovers that AI collapses the implementation cost to near-zero, the hierarchy doesn't just become unnecessary. It becomes the thing standing between the decision-maker and the outcome. The interesting question isn't whether this trend continues. It's what happens to the 20,000-30,000 Oracle employees, the Atlassian hiring pipeline, and the cybersecurity vendors when the people running these institutions stop pretending the old structure still works.

High Significance

GPT-5.3-Codex-Spark Hits 1,200 Tokens Per Second

↗Simon Willison's Blog

OpenAI's Thibault Sottiaux announced that GPT-5.3-Codex-Spark is now serving at over 1,200 tokens per second — a 30% speed improvement. For a coding-focused model, this isn't an abstract benchmark. It's the difference between an agentic workflow that feels like waiting and one that feels like thinking.

At 1,200 tok/s, the model generates faster than most developers can read. This pushes real-time agentic coding from 'impressive demo' to 'default workflow' territory. The tight feedback loops that make tools like Cursor and Claude Code productive — generate, review, iterate, ship — get meaningfully tighter when the generation step approaches zero latency. Combined with the looped reasoning research showing 2.6B models outperforming 7B-8B models through recursive inference, the inference cost curve is being attacked from both ends: make the big models faster, make the small models smarter.
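To put the speed figure in concrete terms, here is a rough back-of-the-envelope sketch. It assumes the '30% speed improvement' means throughput rose by 30% (so the earlier rate is implied, not reported), and the 3,000-token response size is a hypothetical chosen only for illustration.

```python
# Figures from the story: 1,200 tok/s after a 30% improvement.
NEW_TOKENS_PER_SEC = 1200
SPEEDUP = 0.30  # assumption: 30% higher throughput than before


def implied_previous_rate(new_rate, speedup):
    """Back out the earlier throughput from the new rate and the speedup."""
    return new_rate / (1 + speedup)


def generation_seconds(num_tokens, rate):
    """Seconds to stream num_tokens at a steady rate (ignores startup latency)."""
    return num_tokens / rate


old_rate = implied_previous_rate(NEW_TOKENS_PER_SEC, SPEEDUP)  # about 923 tok/s
hypothetical_response = 3000  # tokens; an illustrative size, not a measured one

print(f"before: {generation_seconds(hypothetical_response, old_rate):.2f} s")
print(f"after:  {generation_seconds(hypothetical_response, NEW_TOKENS_PER_SEC):.2f} s")
```

Under these assumptions a 3,000-token response drops from 3.25 seconds to 2.5 seconds. The saving per step is small, but an agentic loop repeats that step many times.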

High Significance

Karpathy Names the Next Layer: 'Claws' Are Here to Stay

↗Simon Willison's Blog

Andrej Karpathy bought a Mac Mini — 'the apple store person told me they are selling like hotcakes and everyone is confused' — to tinker with the emerging category of persistent AI agent systems he's calling 'Claws.' Simon Willison thinks the term will stick, and given Karpathy's track record with 'vibe coding' and 'agentic engineering,' he's probably right.

The definition is crystallizing: AI agents that run on personal hardware, communicate via messaging protocols, handle both direct instructions and scheduled tasks, and persist across sessions. OpenClaw is the flagship (180K+ GitHub stars), but NanoClaw (~4,000 lines), zeroclaw, ironclaw, and picoclaw are proliferating. Karpathy specifically calls out NanoClaw's small footprint as attractive — 'fits into both my head and that of AI agents, so it feels manageable, auditable, flexible.'

This matters because it names a layer of the stack that was forming without a label. LLMs are the foundation. Agents are the interaction pattern. Claws are the persistent, schedulable, locally-running orchestration layer above agents — the thing that turns a coding assistant into something closer to a personal operating system. The emoji is already locked in: 🦞.
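The shape of that layer is easy to sketch. The toy below is not taken from OpenClaw, NanoClaw, or any real project. TinyClaw, its JSON state file, and the placeholder handle_message method are all hypothetical names invented here to show the three properties in the definition: direct instructions, scheduled tasks, and state that survives restarts.

```python
import json
import time
from pathlib import Path


class TinyClaw:
    """Toy persistent agent loop (hypothetical, for illustration only)."""

    def __init__(self, state_path):
        self.state_path = Path(state_path)
        # Reload scheduled tasks from disk so they persist across sessions.
        if self.state_path.exists():
            self.tasks = json.loads(self.state_path.read_text())
        else:
            self.tasks = []

    def _save(self):
        self.state_path.write_text(json.dumps(self.tasks))

    def schedule(self, prompt, run_at):
        """Queue a prompt to run at a Unix timestamp."""
        self.tasks.append({"prompt": prompt, "run_at": run_at})
        self._save()

    def handle_message(self, text):
        """Direct instruction path; a real system would call an LLM agent here."""
        return f"(agent would act on: {text})"

    def tick(self, now=None):
        """Run every task that is due, keep the rest, and persist the result."""
        now = time.time() if now is None else now
        due = [t for t in self.tasks if t["run_at"] <= now]
        self.tasks = [t for t in self.tasks if t["run_at"] > now]
        self._save()
        return [self.handle_message(t["prompt"]) for t in due]
```

A real Claw would receive messages over a chat protocol and call tick from a long-running process. The point of the sketch is only that the orchestration layer itself can be very small.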

Simon Willison Ships Five Integrations in a Morning With Claude Code

↗Simon Willison's Blog

Simon Willison details how he used Claude Code and Claude Artifacts to rapidly prototype and deploy a complex multi-source feed integration for his blog. The feature — called 'beats' — pulls in five different content types (GitHub releases, TIL posts, museum blog entries, vibe-coded tools, and AI research projects) from five different sources, each requiring custom integration work.

The workflow is instructive. He started by prototyping the concept in regular Claude, cloning his public repo and asking it to mock up the UI using his actual templates and CSS. Once the artifact mockup convinced him the concept had legs, he handed the implementation to Claude Code, which handled the 'potentially tedious UI integration work' of making new content types work across all his page types and faceted search engine. One integration needed a parser for a Markdown README — Claude Code 'spun up a parser regex' on the spot.

The meta-point: this is the kind of personal infrastructure project that 'sits on the shelf forever' without AI assistance. Not because it's technically hard, but because the friction of wiring together five different data sources with proper UI integration is exactly the kind of tedious, low-glory work that never wins the priority queue. AI coding agents don't just make hard things possible — they make boring things worth doing.

A Developer Built His Wife's Bakery a Production Management System. It's Open Source.

↗Hacker News

A developer looked at production management software for his wife's micro-bakery and found everything was either too expensive or too generic. So he built Craftplan — a full production management system with versioned recipes (BOMs with cost rollups), inventory with lot traceability, demand forecasting, allergen tracking, batch planning, and purchasing. Built with Elixir, Ash Framework, Phoenix LiveView, and PostgreSQL. Open-sourced on GitHub. 577 HN points and 167 comments.

This is the thesis in miniature. The existing solutions were built by companies serving the 'average' small manufacturer — generic enough to be sold broadly, expensive enough to justify the sales team. A developer who actually understood the workflow of a small-batch bakery built exactly the right tool because the actual workflows 'aren't that complex.' The complexity was in the vendor's business model, not in the problem.

The 577-point response tells you something about resonance. Builders recognize this pattern instantly: the gap between what exists and what should exist, bridged by someone who cares enough to close it. The live demo is at craftplan.fly.dev.

High Significance

2.6B Parameters, 3x Reasoning: Looped Models Decouple Data From Compute

↗r/LocalLLaMA

New research on 'Looped Language Models' introduces Ouro, a 2.6B parameter model that shifts reasoning from vocabulary-space chain-of-thought into latent space through recursive looping. Instead of generating visible reasoning tokens, the model passes its internal representation back through the same layers repeatedly, and an exit gate stops the loop once a certainty threshold is met. The result: a 2.6B model outperforming traditional 7B and 8B models (Gemma-3, Qwen-3) on knowledge manipulation tasks.

The critical finding: looping does almost nothing for knowledge storage — if the model hasn't seen a fact, no amount of looping will conjure it. But for knowledge manipulation — tasks requiring the model to reason over stored facts — the gains are dramatic. This maps to the biological analogy: we don't grow new neurons to solve a hard problem; we think longer with the ones we have.
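The control flow behind 'think longer with the same weights' can be shown with a toy. This is not the paper's architecture: shared_block and exit_probability are made-up stand-ins (a fixed numeric update and a distance-based confidence score), and the threshold and loop cap are arbitrary. What it does show is that the parameters stay fixed while the number of passes varies with how hard the input is.

```python
def shared_block(h):
    """Stand-in for one pass through the shared-weight layers.

    The same function is reused on every loop, so no new parameters are added.
    """
    return [x + 0.5 * (1.0 - x) for x in h]


def exit_probability(h):
    """Toy exit gate: confidence rises as the state settles toward 1.0."""
    distance = sum(abs(1.0 - x) for x in h) / len(h)
    return 1.0 - min(distance, 1.0)


def looped_forward(h, threshold=0.9, max_loops=8):
    """Re-apply the shared block until the gate is confident or loops run out."""
    for step in range(1, max_loops + 1):
        h = shared_block(h)
        if exit_probability(h) >= threshold:
            break
    return h, step


_, hard_steps = looped_forward([0.0, 0.0])  # far from settled: needs 4 loops
_, easy_steps = looped_forward([0.9, 0.9])  # nearly settled: exits after 1 loop
print(hard_steps, easy_steps)
```

The 'hard' input spends four passes through the same block and the 'easy' one spends a single pass, which is the sense in which loops act as a compute axis separate from parameter count.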

The practical implication for local builders is significant. If this principle scales (and the researchers believe it does, though nobody has tested it at the 100B+ level yet), then 300-400B SoTA performance could theoretically be compressed into models that run on consumer hardware. The architectural insight — decouple data capacity (parameters) from compute capacity (loops) — opens a third scaling axis beyond bigger models and longer chains of thought. Combined with reports that Qwen3 Coder Next remains highly capable even at aggressive 2-bit quantization, the local inference story continues to get more interesting.

High Significance

OpenAI Projects $111 Billion in Cash Burn Through 2030

↗The Information ↗The Information

Two financial disclosures from OpenAI paint a stark picture of frontier AI economics. First: the company has raised its revenue forecasts but also projects $111 billion in cumulative cash burn through 2030, while falling short on margin targets. Second: OpenAI pays 20% of its total revenue to Microsoft through 2032 as part of their partnership arrangement.

That 20% figure is the structural detail that matters most for builders. Every dollar flowing through the OpenAI ecosystem has a permanent toll attached — not from a technical bottleneck, but from a contractual obligation negotiated when OpenAI needed Microsoft's cloud infrastructure to exist. This is the bridge toll thesis applied to the AI stack itself: Microsoft controls the infrastructure, OpenAI controls the models, and the 20% flows upward regardless of whether the underlying costs justify it.

The $111 billion burn projection raises a different question: at what point does a centralized frontier lab have to become an extractive toll collector itself to survive? When your cost structure demands that kind of capital, the pressure to maximize margins on API pricing becomes existential. Builders pricing against OpenAI's API costs should factor in that 20% Microsoft tax as a structural floor that won't compress.
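A quick way to see what a 20% revenue share does to a price floor: if a fifth of every dollar of revenue is owed before any costs are covered, breaking even requires charging 25% above cost. The sketch below treats the share as applying uniformly to every dollar, which is a simplifying assumption, and the unit cost is a hypothetical number, not an OpenAI figure.

```python
REVENUE_SHARE = 0.20  # share of revenue reported as owed to Microsoft


def breakeven_price(unit_cost, revenue_share=REVENUE_SHARE):
    """Lowest price at which revenue kept after the share still covers cost."""
    return unit_cost / (1 - revenue_share)


hypothetical_unit_cost = 1.00  # dollars; illustrative only
floor = breakeven_price(hypothetical_unit_cost)
print(f"break-even price: ${floor:.2f}")  # $1.25 for a $1.00 cost
```

A 20% toll on revenue therefore works like a 25% markup on cost, before any margin is earned.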

High Significance

Cursor Trains Its Own Coding Model, Ships Custom GPU Kernels

↗Cursor Blog ↗Cursor Blog ↗Cursor Blog

Two technical posts from Cursor reveal how far the company has moved beyond 'IDE wrapper.' First: Composer, a coding-specific model trained via reinforcement learning and purpose-built for software engineering agent tasks. A developer tools company training its own frontier model — rather than just wrapping someone else's — is a meaningful capability shift. Second: custom MXFP8 kernels built specifically for Blackwell GPUs, achieving a 3.5x speedup on Mixture of Experts layers and 1.5x faster overall MoE training.

A company that barely existed two years ago is now doing custom GPU kernel work and training its own models. This is not the behavior of a wrapper. It's the behavior of a company that has identified the specific bottleneck in its product (coding agent intelligence and speed) and is willing to go all the way down the stack to remove it. The RL approach specifically optimized for coding agent performance — not general chat, not benchmarks, but the actual task their users care about — is the kind of problem-intimate engineering that produces tools people can't stop using.

The University of Chicago study they cite (companies merged 39% more PRs after Cursor's agent became the default) is third-party academic data, not vendor self-reporting. The specific metric (merged PRs, not generated code) is a meaningful proxy for shipped work.

Oracle Cuts Up to 30,000 Jobs to Fund AI Data Centers

↗HrKatha

Oracle is reportedly preparing to cut 20,000-30,000 jobs — its largest layoff ever — explicitly to fund AI data center construction for its $300 billion OpenAI partnership. The company expects $8-10 billion in savings from the workforce reduction.
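Those two ranges imply a savings-per-role figure, which is worth working out. The sketch below only divides the reported numbers. The report as summarized here does not say whether the $8-10 billion is annual or cumulative, so the result should be read as an implied ratio, not a salary estimate.

```python
def savings_per_role(total_savings_usd, roles_cut):
    """Implied savings attributed to each eliminated role."""
    return total_savings_usd / roles_cut


# Pair the low savings figure with the high job count and vice versa
# to bracket the implied range.
low = savings_per_role(8e9, 30_000)
high = savings_per_role(10e9, 20_000)
print(f"${low:,.0f} to ${high:,.0f} per role")
```

That works out to roughly $267,000 to $500,000 per eliminated role.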

This is not a 'restructuring for efficiency' story. It's a company liquidating its existing human capital to finance a bet on a fundamentally different business model. Oracle's traditional enterprise software business requires large teams of salespeople, consultants, and support engineers to maintain. The AI data center business requires massive capital expenditure on hardware and relatively fewer humans. The math is simple: fire the people, use the savings to buy the GPUs.

Read this alongside Indeed and Glassdoor's parent company Recruit Holdings cutting 1,300 jobs to 'shift toward AI-driven operations.' The platforms that exist to help people find jobs are eliminating their own jobs because of AI. And Capgemini is cutting 2,400 roles in France, explicitly citing AI-led demand shifts reshaping which service lines clients want. The pattern across all three: legacy intermediaries are not gradually adapting to AI. They're performing emergency surgery on their own organizations to survive the transition.

The Vibe Coding Security Gap: A Linter for the Holes AI Leaves Behind

↗r/SideProject ↗Hacker News

A developer built prodlint — a free, open-source linter specifically designed to catch the security flaws that AI coding assistants routinely generate. After noticing that vibe-coded apps often compile perfectly with zero TypeScript or ESLint warnings but lack vital security guardrails, the builder tested it against seven open-source repos built with AI tools. Six of seven had critical security issues. One trading bot had API key fallbacks hardcoded in the source. A Supabase app had zero access controls on its database tables.

This arrives alongside a widely-shared thread from a builder who vibe-coded and shipped an app in three days, then got hacked twice. The pattern is consistent: AI coding tools are excellent at generating functional code and terrible at generating secure code. They optimize for 'does it work?' not 'can it be exploited?' The gap between implementation time (which has collapsed to near-zero) and security knowledge (which hasn't grown to match) is producing a new class of vulnerability: apps that look complete and ship fast but are structurally exposed.

For builders shipping with AI tools: run npx prodlint in your project directory. It runs in about 100ms and checks for the specific patterns AI gets wrong: missing database security, hardcoded credentials, empty error handling, hallucinated package imports. The tool is MIT licensed and on GitHub. Whether or not you use this specific linter, the underlying lesson is clear: AI can write the code, but it can't yet audit its own security assumptions. That's still your job.
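To make one of those patterns concrete, here is a toy check for the 'API key fallback hardcoded in the source' case. It is not how prodlint works internally; the regex, the function name, and the sample snippet are all hypothetical, and a single-line pattern match like this will miss plenty of real cases.

```python
import re

# Flags lines like:  process.env.SOME_API_KEY || "literal-value"
# Only env var names containing KEY, SECRET, or TOKEN are considered.
FALLBACK = re.compile(
    r"""process\.env\.(\w*(?:KEY|SECRET|TOKEN)\w*)\s*(?:\|\||\?\?)\s*(["'])(.+?)\2"""
)


def find_hardcoded_fallbacks(source):
    """Return (line_number, env_var_name) for each suspicious fallback."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = FALLBACK.search(line)
        if match:
            findings.append((lineno, match.group(1)))
    return findings


# Hypothetical TypeScript snippet of the kind described above.
sample = "\n".join([
    'const apiKey = process.env.EXCHANGE_API_KEY || "sk-demo-123";',
    'const port = process.env.PORT || "3000";',
    "const token = process.env.AUTH_TOKEN;",
])
print(find_hardcoded_fallbacks(sample))
```

The first line is flagged because a secret-looking variable falls back to a string literal. The port default and the bare token read are left alone. Code like this compiles and lints cleanly, which is exactly why it slips through.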

Nvidia's GB10 Brings Datacenter-Class Inference to the Living Room

↗PCMag (via Hacker News)

PCMag reports running 'serious AI models' on Nvidia's GB10 Superchip at home. This is the consumer hardware side of the inference cost curve — while Taalas attacks the problem with burned-in ASICs at 16,000 tokens per second, Nvidia's approach preserves flexibility. You can swap models, experiment with architectures, run whatever weights you want.

The tradeoff is clear: the GB10 won't match dedicated silicon on raw throughput for a single model, but it won't lock you into one architecture at fabrication time either. For builders evaluating their inference strategy, the local hardware options are now genuinely competitive with cloud for many workloads. The question has shifted from 'can I run this locally?' to 'what's the right mix of local and cloud for my specific latency, cost, and privacy requirements?'

Klarna's AI Agent Strategy Threatens Traditional SaaS Valuations

↗Hacker News

Klarna's public shift toward using internal AI agents to replace external enterprise SaaS tools is being analyzed as a structural threat to traditional B2B software valuations. The buy-vs-build math has changed: when AI agents can replicate the functionality of a $50K/year enterprise tool in weeks, the question becomes why you'd keep paying the subscription.

This connects directly to Atlassian's hiring freeze and Anthropic's security tool sending cyber stocks into freefall. The enterprise software moat was always 'it's too hard and expensive to build this yourself.' When that assumption breaks — when a motivated team with AI agents can stand up equivalent functionality faster than the procurement cycle for the commercial alternative — the entire valuation framework for B2B SaaS comes under pressure. The toll collectors who sit between companies and their operational capabilities are discovering that the toll booth itself is now trivially cheap to replicate.

🧵Developing Stories

The Race to Sub-Penny Inference

Three simultaneous developments this morning: GPT-5.3-Codex-Spark hits 1,200 tok/s (30% faster), looped language models demonstrate 2.6B parameters outperforming 7B-8B through recursive latent reasoning, and Qwen3 Coder Next remains highly capable at aggressive 2-bit quantization. The inference cost curve is being attacked from every direction — speed improvements on frontier models, architectural innovations that trade time for capability, and quantization techniques that preserve quality at dramatically lower memory footprints.

The Machine-to-Machine Web: Rise of Autonomous Agent Networks

Karpathy's endorsement of 'Claws' as a category name for persistent, locally-running agent systems crystallizes what's been forming for weeks. The MCP ecosystem hit 79K GitHub stars. Agent Passport shipped OAuth-style identity for agents. The ROS Bridge enables LLMs to control physical robots. The infrastructure layer for autonomous agent networks is being built in public, by solo developers, at remarkable speed.

The Rise of Vibe Coding: From Snippets to Shipping

The security consequences of rapid AI-assisted shipping are becoming a concrete problem. A purpose-built linter (prodlint) found critical security issues in 6 of 7 AI-built repos tested. A builder who shipped in three days got hacked twice. Meanwhile, Simon Willison demonstrates the positive case — shipping five complex integrations in a morning with Claude Code. The gap between what AI coding tools enable (everything) and what they secure (nothing) is the emerging fault line.

MCP Adoption: From Protocol to Practice

The MCP ecosystem continues rapid expansion. Community servers surpassed 79,000 GitHub stars. Murl shipped a curl-like CLI making any MCP server scriptable from bash. A prediction market search engine (Attena) exposed 80K contracts via MCP. The protocol is winning the standardization race for agent-to-world communication.

The CEO who bypasses their own org chart to ship with AI is doing the same thing as the developer who builds their wife's bakery a production management system instead of buying the expensive generic one. They're both discovering that the coordination overhead — whether it's a management hierarchy or a SaaS subscription — was never solving their problem. It was solving the old scarcity problem: implementation was hard. Now that it isn't, the question everyone is asking, from the C-suite to the kitchen table, is the same: what was I paying for?

This edition: 12 stories · $0.18 to produce

Generated 13:45 UTC · anthropic/claude-opus-4.6