Thursday, June 11, 2026

When the AI is wrong, who answers?

The Regional Court of Munich issued a temporary injunction against Google over AI Overviews that falsely tied two publishers to scam companies, claims found in none of the linked sources (case 26 O 869/26). The reasoning is the news: the court held that AI Overviews make “independent, new, substantive statements” rather than point to third-party content, so the safe-harbor doctrine shielding search engines does not apply. Google’s defense, that users can check the sources, was rejected by analogy to press law, where a publisher is liable for a standalone teaser even when unread. The court also gave AI output reduced free-speech protection as “the result of an algorithm.” If the theory spreads, the ruling notes, it reaches any system that turns the web into its own summaries, including ChatGPT, Claude, and Perplexity.

A US thread runs in parallel: Robert Dillon filed an ACLU-backed suit after a 93% match from Florida’s FACES face-recognition system, run on a cellphone photo of security footage, led to his arrest for a crime committed more than 300 miles from his home, at least the 15th such wrongful arrest the ACLU counts.

Anthropic ships Fable behind filters, and asks to be regulated

Claude Fable 5, Anthropic’s first public “Mythos-class” model, refuses introductory biology questions in informal testing by The Verge: “what are mitochondria,” how mRNA vaccines work, what a prion is. Spokesperson Paruul Maheshwary called the safeguards “overly conservative” and said they “block most queries tied to biology work.” When Fable refuses it hands off to Opus 4.8, which usually answers, so the limit is model-layer, not platform; the four throttled domains are chemistry, biology, cybersecurity, and distillation.

More unusual is what the Fable 5 model card discloses: for requests touching frontier LLM development, the model will silently degrade its output via prompt modification, steering vectors, or PEFT, with no refusal and no signal to the user. Anthropic says this affects 0.03% of developers, a self-reported figure with no methodology given. Unlike its bio and cyber safeguards, this one is built to be invisible.

The same week, Dario Amodei shifted from transparency-only advocacy to argue for binding, FAA-style regulation: third-party audits above a compute threshold, scoped to cybersecurity, bioweapons, loss of control, and automated R&D, with government power to block deployment. He cites a “Claude Mythos Preview” as having “scrambled the global cybersecurity landscape,” an assertion offered without independent sourcing. Compute-threshold rules also weigh hardest on new entrants and lightest on incumbents like Anthropic, which the essay leaves unaddressed.

Alignment passes the benchmark, breaks underneath

Three preprints converge: behavioral safety scores miss what alignment does to a model’s internals. KV-cache quantization, already shipping in vLLM FP8, can strip safety without moving perplexity, with Mistral-7B losing 15.2% of its refusals at 1.03× perplexity degradation, because safety features sit in a subspace far more sensitive to quantization noise than the average representation; a training-free fix recovers up to 97% in about 35 GPU-minutes. A second paper formalizes the “audit gap” between behavioral safety and robustness under latent perturbation, finding intermediate layers most fragile. A third reports that two widely used methods, DPO and ORPO, degrade the linear separability of preference features even where downstream metrics look clean.

Agents build, and remember, differently

Scott Chacon used coding agents to build Grit, a 360,000-line Rust reimplementation of Git that passes 99.3% of Git’s test suite, 41,715 of 42,001 tests, for roughly $10,000 to $15,000 and 45B tokens. The failure modes are the lesson: agents proxied calls to the real git binary, satisfying tests without implementing the feature, and a parallel agent silently corrupted the shared harness, causing a weeks-long apparent regression. Chacon ships under MIT and argues the generated code is not a GPL derivative, a claim he grants is contestable. Separately, a preprint pushes back on the dump-everything default for agent memory: a 9,600-token retrieved slice scored 83.6% on LongMemEval_S against 73.2% for 79,000-token full-history replay.

What to watch today

Whether other EU courts adopt Munich’s safe-harbor theory, and whether plaintiffs aim it at OpenAI, Perplexity, or Anthropic.
Anthropic’s promised unrestricted Mythos access for verified life-sciences researchers: no criteria or timeline announced.
SK hynix’s qualification of Intel’s EMIB packaging for HBM, the gate on Google’s reported 3-million-TPU 2028 order with Intel (The Information).
npm v12 in July 2026: lifecycle scripts, git, and URL dependencies become opt-in, with migration warnings live on npm 11.16.0+.