Eclecta The frontier, distilled Daily brief 2026-05-25
Monday, May 25, 2026

A maintainer puts hard numbers to open source's agent-traffic problem, an AI disproves an 80-year-old Erdős conjecture, SPEC's new CPU benchmark gets its first independent teardown, and a CISA contractor publishes the agency's own cloud keys.

The human bottleneck

Armin Ronacher published 90 days of issue-tracker data on what agent traffic does to open-source maintenance, drawn from co-maintaining Pi, an AI coding agent built using itself. Of 3,145 external issues and pull requests, 2,504 were auto-closed as coming from non-approved contributors; about 17% were reopened, or 26% counting those a later fixing commit referenced. Of 714 auto-closed PRs, 60 were eventually merged, roughly 8%. His sharper point is qualitative: LLM-laundered issue reports, confident and scope-expanded and often wrong, are worse than no diagnosis, because an agent takes the issue body as evidence and follows its wrong path. A custom /is command that tells the agent to derive its own analysis from the code only partly helps. His thesis: AI multiplied code, projects, and issues without adding maintainers or users.
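The shares above can be rechecked from the reported counts. A minimal sketch: the variable names are illustrative, and the only inputs are the four figures quoted in the paragraph.

```python
# Counts as reported in the brief; nothing here is new data.
external_items = 3145    # external issues and pull requests over 90 days
auto_closed = 2504       # auto-closed as coming from non-approved contributors
auto_closed_prs = 714    # auto-closed pull requests
merged_later = 60        # of those PRs, eventually merged


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


auto_closed_share = share(auto_closed, external_items)
merged_share = share(merged_later, auto_closed_prs)
print(auto_closed_share, merged_share)
```

On these counts, about four in five external submissions were auto-closed, and the merged share of auto-closed PRs rounds to the "roughly 8%" cited.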

Two essays circle the same constraint. Addy Osmani, drawing on a Google I/O panel, names the orchestration tax: running coding agents in parallel does not multiply output, because review and merge route through one serial resource, the developer’s attention; he ties fleet size to review rate, usually low single digits. Benedict Evans argues per-occupation AI job-exposure scores are impossible in principle: a century of accounting automation coincided with rising accountant headcount, because cheaper analysis produces more work, not less.
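Osmani's orchestration tax can be read as a simple bottleneck model. The sketch below is an illustration under stated assumptions, not his formulation: the function name and every rate in it are hypothetical, and it assumes each agent-drafted change needs exactly one human review before merge.

```python
def merged_per_day(agents, drafts_per_agent_per_day, reviews_per_day):
    """Changes merged per day when every draft needs one human review.

    Output is capped by whichever is smaller: what the fleet drafts
    or what the single reviewer can get through.
    """
    produced = agents * drafts_per_agent_per_day
    return min(produced, reviews_per_day)


# Hypothetical rates: each agent drafts 4 changes a day, the developer reviews 12.
for fleet in (1, 3, 10):
    print(fleet, merged_per_day(fleet, 4, 12))
```

With these made-up rates, output stops growing at three agents; adding more only lengthens the review queue, which is the sense in which fleet size is tied to review rate.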

AI disproves an Erdős conjecture

On May 20, OpenAI said an internal model produced a counterexample to Erdős’s 1946 unit-distances conjecture, which held that unit distances among n planar points grow no faster than n^(1+ε) for every ε>0. Writing at Computational Complexity, Bill Gasarch lays out the result and the dispute over who, or what, deserves credit. The construction dropped Erdős’s Gaussian-integer grid for number fields of unbounded degree, a tool from algebraic number theory not standard in combinatorial geometry. OpenAI’s paper lists no human co-authors: Lijie Chen ran the model; Mark Sellke and Mehtaab Sawhney verified it. Gary Marcus calls the work AI-assisted, not AI-generated. Humans then improved it fast: a streamlined proof by Noga Alon, W.T. Gowers, Will Sawin and others fixed ε near 6×10^-38, and Sawin sharpened it to about 0.014. Gasarch’s caveat: this may be a rare alignment of a known problem with an achievable counterexample, not a repeatable capability.

SPEC ships CPU2026

Chips and Cheese published the first independent teardown of SPEC CPU2026, the CPU benchmark vendors will cite for the next decade. The suite grows to 52 workloads from CPU2017’s 43, with larger source footprints. Running GCC 14.2.0 on Zen 5 and Intel Lion Cove, the author finds it leans harder on core throughput and less on branch prediction and last-level cache; its new lowest-IPC integer tests are the gcc and llvm compilation workloads. He faults the decade-old Ampere eMAG 8180 picked as the 1.0 reference, which an FX-8350 beats, and the dropping of 505.mcf and 520.omnetpp, CPU2017’s best proxies for memory-latency-bound game code. The run is single-system, and a Lion Cove sample that crashed at 5.7GHz ran at 5.5GHz, possibly understating Intel’s results. Separately, MatX CEO and ex-Google TPU architect Reiner Pope derives chip design from logic gates up in a Dwarkesh Patel lecture, explaining why arithmetic area scales with the square of bit width, which is the case for low-precision formats such as FP4 and FP8. Patel is a MatX investor.
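The bit-width claim from Pope's lecture can be made concrete with a toy calculation. This is a rough sketch that takes the quadratic rule at face value and applies it to the whole format width; the function name and the 16-bit baseline are assumptions for illustration, and real designs differ (for floating-point formats the multiplier array covers only part of the bits).

```python
def relative_multiplier_area(bits, baseline_bits=16):
    """Area of a bits-wide multiplier relative to a baseline width.

    Assumes area grows with the square of bit width, as stated in the text.
    """
    return (bits / baseline_bits) ** 2


for bits in (16, 8, 4):
    print(bits, relative_multiplier_area(bits))
```

Under that assumption each halving of width cuts multiplier area to a quarter, so a 4-bit unit takes one sixteenth the area of a 16-bit one.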

CISA leaks its own keys

KrebsOnSecurity reported that a CISA contractor with administrative access published plaintext credentials to dozens of the agency’s internal systems on a public GitHub profile named “Private-CISA,” a scratchpad for syncing between work and home machines. The exposed secrets included AWS GovCloud keys and an RSA private key that, per TruffleHog’s Dylan Ayrey, granted org-wide access to read every CISA-IT repository and tamper with its CI/CD pipelines; commit logs show GitHub’s secret-scanning protection was deliberately disabled. GitGuardian flagged the leak, yet CISA was still rotating keys more than a week later while stating no sensitive data was compromised. Congressional Democrats tied the lapse to a workforce cut by over a third after buyouts. Risky Business’s Adam Boileau called it a failure no technical control could catch: a contractor syncing work through an unmanaged personal account.

What to watch today

  • Whether CISA finishes rotating the exposed credentials, and how it answers the oversight letters from Sen. Hassan and Reps. Thompson and Ramirez.
  • First official vendor submissions to SPEC CPU2026, the test of whether its slow eMAG 8180 reference flatters new silicon.
  • Whether more AI-produced proofs follow the unit-distances result, testing Gasarch’s “perfect storm” caveat.
