Eclecta The frontier, distilled Daily brief 2026-05-14

Thursday, May 14, 2026

OpenAI hand-builds a Windows sandbox for its Codex agent and discloses an npm worm that forced a code-signing certificate rotation, while Microsoft Research opens up the mimalloc allocator.

An agent sandbox where the OS offers none

OpenAI published an engineer’s account of building a filesystem- and network-isolation sandbox for its Codex coding agent on Windows, which, unlike macOS (Seatbelt) and Linux (seccomp, bubblewrap), ships no primitive that maps to running an autonomous agent safely. The value is the dead-end analysis: it explains why each off-the-shelf option fails. AppContainer is too narrow for open-ended development work; Windows Sandbox is a throwaway VM, missing on Home editions and blind to the real checkout; Mandatory Integrity Control relabeling turns the workspace into a write sink for every low-integrity process on the host.

The first prototype scoped file writes with a synthetic SID and a write-restricted token, so a write succeeded only when both a normal owner ACL and a restricted-SID grant allowed it. Network blocking was advisory only, via poisoned proxy environment variables, and any program that opened a socket directly bypassed it. That gap drove a redesign: Windows Firewall can match a binary, user, or port, but not a restricted token’s SID, so the team now runs agent commands as one of two dedicated local users (CodexSandboxOffline, which is firewall-blocked, and CodexSandboxOnline) with DPAPI-encrypted credentials. A privilege wall at CreateProcessAsUserW forced splitting execution into a separate command-runner that runs as the sandbox user and mints the restricted token itself. The shipped design spans four binaries plus an elevated setup step. The account is a single-author, first-person narrative: no benchmarks, no red-team results, and the safety claims are self-reported.
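The dual check behind the first prototype can be sketched as a toy model. This is an illustration of the rule "both the normal identity and the restricted SID must be granted write", not OpenAI's code and not the Windows API: the `Token` class, the ACL-as-dictionary shape, and the SID strings are all hypothetical, and real access checks also handle deny entries, inheritance, and privileges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Simplified stand-in for an access token (hypothetical, not a Windows type)."""
    sids: frozenset                          # ordinary identities: user and groups
    restricted_sids: frozenset = frozenset() # empty means an unrestricted token


def acl_allows_write(acl: dict, sids: frozenset) -> bool:
    """True if the ACL grants the "write" right to at least one of the SIDs."""
    return any("write" in acl.get(sid, set()) for sid in sids)


def can_write(acl: dict, token: Token) -> bool:
    """Model of a write-restricted check: two passes, and both must succeed."""
    # Pass 1: the ordinary identities must be allowed to write.
    if not acl_allows_write(acl, token.sids):
        return False
    # Pass 2: only for restricted tokens, the restricting SIDs must also be granted.
    if token.restricted_sids:
        return acl_allows_write(acl, token.restricted_sids)
    return True


# Hypothetical example: the workspace carries an extra grant for the synthetic SID,
# the home directory does not.
workspace_acl = {"user": {"read", "write"}, "synthetic-sandbox-sid": {"write"}}
home_acl = {"user": {"read", "write"}}
agent = Token(frozenset({"user"}), frozenset({"synthetic-sandbox-sid"}))
```

Under this model the agent token can write to the workspace but not to the home directory, even though the underlying user owns both. Nothing in it touches the network, which is consistent with the gap the redesign had to close.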

A worm in OpenAI’s dependency tree

OpenAI disclosed that the TanStack npm library was compromised on May 11, 2026 as part of the “Mini Shai-Hulud” supply-chain worm, infecting two employee devices that its phased rollout of package controls had not yet reached. OpenAI says the worm reached a limited subset of internal repositories and exfiltrated only limited credential material, with no evidence of customer-data, intellectual-property, or production-system compromise and no misuse of the stolen credentials; an unnamed forensics firm was engaged. The consequence: those repos held code-signing certificates for iOS, macOS, Windows, and Android, so OpenAI rotated all of them. Only macOS users must act (updating ChatGPT Desktop, the Codex app and CLI, and Atlas) by a deadline extended from June 12 to June 26 in coordination with Apple. The controls that would have blocked it, npm’s minimumReleaseAge plus CI/CD credential hardening and provenance validation, shipped after an earlier Axios incident but had not reached the two devices.
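The idea behind a minimum-release-age control is a cooldown: a freshly published version is not installable until it has been public long enough for a compromise to be noticed. The sketch below shows only that idea. The function name, the tuple shape, and the three-day threshold are assumptions for illustration, and this is not how any package manager actually resolves versions.

```python
from datetime import datetime, timedelta, timezone


def pick_version(versions, now, minimum_release_age):
    """Return the most recently published version that has aged past the cooldown.

    versions: list of (version_string, published_at) tuples (hypothetical shape).
    Returns None when every version is still inside the cooldown window.
    """
    eligible = [
        (published_at, version)
        for version, published_at in versions
        if now - published_at >= minimum_release_age
    ]
    if not eligible:
        return None
    # Newest eligible release wins; ordering by publish time avoids semver parsing.
    return max(eligible)[1]


# Hypothetical timeline: a clean release a month old and a release two hours old.
now = datetime(2026, 5, 14, 12, 0, tzinfo=timezone.utc)
versions = [
    ("1.0.0", now - timedelta(days=30)),
    ("1.0.1", now - timedelta(hours=2)),
]
chosen = pick_version(versions, now, timedelta(days=3))
```

With a three-day cooldown the two-hour-old release is skipped and the month-old one is chosen. The trade-off is that legitimate security fixes are delayed by the same window.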

Systems internals

Microsoft Research’s RiSE group published an annotated walkthrough of mimalloc, its roughly 12,000-line C allocator and drop-in malloc/free replacement. Each thread owns a heap of ~64 KiB pages split into size classes, so most allocations and frees need no synchronization; only a cross-thread free takes an atomic compare-and-swap. Each page keeps three free lists, and with thousands of lists, concurrent frees to the same page stay statistically rare, an approach the authors compare to randomized algorithms. Microsoft reports an ~800-thread workload where mimalloc committed 1.3 times its live data against roughly 4 times for an unnamed competitor; the competitor and methodology are undisclosed, and the full technical report is promised later. The allocator powers NoGIL CPython 3.13+, Unreal Engine, and Bing.
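A toy model may help show why most frees need no synchronization. It is a sketch, not mimalloc's implementation: the list names follow mimalloc's published design (free, local_free, thread_free), but the `Page` class, the integer thread ids, and the block numbering are hypothetical, and a `threading.Lock` stands in for the atomic compare-and-swap.

```python
import threading


class Page:
    """Toy model of one page with three free lists (not real mimalloc code)."""

    def __init__(self, owner, block_count):
        self.owner = owner                    # id of the owning thread (hypothetical)
        self.free = list(range(block_count))  # blocks ready to hand out
        self.local_free = []                  # freed by the owner: no synchronization
        self.thread_free = []                 # freed by other threads
        self._lock = threading.Lock()         # stand-in for an atomic compare-and-swap

    def alloc(self):
        """Pop a block; refill from the other two lists only when empty."""
        if not self.free:
            self._collect()
        return self.free.pop() if self.free else None

    def free_block(self, block, caller):
        """Return a block; only a cross-thread free takes the synchronized path."""
        if caller == self.owner:
            self.local_free.append(block)
        else:
            with self._lock:
                self.thread_free.append(block)

    def _collect(self):
        """Move everything freed so far back onto the allocation list."""
        with self._lock:
            reclaimed, self.thread_free = self.thread_free, []
        self.free.extend(self.local_free)
        self.free.extend(reclaimed)
        self.local_free = []
```

In this model the owner's allocation and free paths touch only `free` and `local_free`, so they never contend. The synchronized step is confined to cross-thread frees and to the occasional refill.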

Cloudflare rebuilt its Browser Run headless-browser service on Durable Object-backed Containers, reporting 120 concurrent browsers, four times the prior 30, and Quick Action latency down by more than half. The citable lesson: Workers KV’s roughly 30-second eventual consistency caused race conditions when used to track live container availability, so Cloudflare moved that state to D1 for atomic assignment and batched the heartbeat writes through Queues in 100-row batches. All figures are vendor-reported.
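The atomic-assignment pattern can be sketched locally with Python's built-in sqlite3 module, used here as a stand-in because D1 is SQLite-based. The table name, column names, and function names are hypothetical and this is not Cloudflare's schema or code. The point is the conditional UPDATE: a claim succeeds only if the row is still unassigned at write time, which an eventually consistent read cannot guarantee.

```python
import sqlite3


def make_db(container_count):
    """Create an in-memory table of unassigned containers (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE containers (id INTEGER PRIMARY KEY, session TEXT, last_seen INTEGER)"
    )
    db.executemany(
        "INSERT INTO containers (id, session, last_seen) VALUES (?, NULL, 0)",
        [(i,) for i in range(1, container_count + 1)],
    )
    db.commit()
    return db


def claim_container(db, session_id):
    """Assign a free container to a session, or return None if none is free."""
    while True:
        row = db.execute(
            "SELECT id FROM containers WHERE session IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # Compare-and-set: the WHERE clause re-checks that the row is still free.
        cur = db.execute(
            "UPDATE containers SET session = ? WHERE id = ? AND session IS NULL",
            (session_id, row[0]),
        )
        db.commit()
        if cur.rowcount == 1:
            return row[0]
        # Another writer claimed it first; look for the next free row.


def write_heartbeats(db, beats, batch_size=100):
    """Apply (timestamp, container_id) heartbeats in fixed-size batches."""
    batches = 0
    for start in range(0, len(beats), batch_size):
        chunk = beats[start:start + batch_size]
        db.executemany("UPDATE containers SET last_seen = ? WHERE id = ?", chunk)
        db.commit()
        batches += 1
    return batches
```

Batching the heartbeats trades a little freshness in `last_seen` for far fewer write transactions, which is the same trade the 100-row batches make.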

Separately, Microsoft Research released GridSFM, an open-weight (research-use-only) neural operator that approximates AC optimal power flow across grids from 500 to 80,000 buses, claiming a 2.23% median cost gap against a solver and roughly 1,000 times faster solves; the numbers are vendor-reported with the white paper still to come.

What to watch today

  • macOS users of OpenAI’s apps face a June 26 update deadline; new notarization on the old certificate is already blocked.
  • mimalloc’s promised technical report, with the Azure Cosmos DB team, would let the 1.3x-vs-4x page-stealing claim be checked against named competitors.
  • GridSFM’s white paper is due to substantiate its AC-OPF benchmarks and open the production-scale (80,000-bus) Premier tier.
