Thursday, June 4, 2026

Half of Tier 1 networks don’t enforce First AS

Cloudflare tested which Tier 1 backbones reject forged BGP routes and found about half do not. The attack it traced, flagged by Spamhaus, originates parked or unused prefixes and forges the entire AS_PATH, omitting the attacker’s own network so the route looks short and RPKI-valid. Cloudflare argues this defeats ASPA and RPKI path validation: a path forged down to a legitimate origin AS leaves no invalid relationship to flag.

The fix exists. First AS enforcement (RFC 4271) drops any route whose first AS_PATH entry isn’t the neighbor that sent it. To measure adoption, Cloudflare broke the rule on its own prefixes, prepending a non-primary ASN ahead of AS13335 and watching public route views. Seven Tier 1s enforce it: Cogent, Arelion, GTT, PCCW, Orange, Tata, and AT&T. Roughly half do not, predominantly networks running Juniper gear whose defaults leave it off. Cloudflare, an affected party itself, recommends turning on enforce-first-as on every EBGP session except IX route-server peers.
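The check itself is small. A minimal sketch, assuming the AS_PATH has already been flattened to a list of ASNs with the leftmost (most recent) AS first; the function name, the route_server_peer flag, and the documentation ASN 64500 are illustrative, not taken from Cloudflare's post or any router's implementation:

```python
from typing import Sequence


def passes_first_as(
    as_path: Sequence[int],
    neighbor_asn: int,
    route_server_peer: bool = False,
) -> bool:
    """Return True if a route received over EBGP passes a First AS check.

    Assumption: as_path is a plain AS_SEQUENCE, leftmost AS first.
    AS_SET and confederation segments are ignored in this sketch.
    """
    # IX route servers typically do not prepend their own ASN,
    # so the check is skipped on those sessions.
    if route_server_peer:
        return True
    # An empty path or a mismatched first AS fails the check.
    return len(as_path) > 0 and as_path[0] == neighbor_asn


CLOUDFLARE_ASN = 13335
HYPOTHETICAL_ASN = 64500  # reserved documentation ASN, used as a stand-in

normal_route = [CLOUDFLARE_ASN]
test_route = [HYPOTHETICAL_ASN, CLOUDFLARE_ASN]  # another ASN ahead of AS13335

print(passes_first_as(normal_route, CLOUDFLARE_ASN))  # True
print(passes_first_as(test_route, CLOUDFLARE_ASN))    # False
```

A network that enforces the rule would drop the second route on a direct session with AS13335, which is the signal Cloudflare's measurement relied on.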

Microsoft’s full-stack pitch

At Build, Microsoft announced seven models built from scratch, led by the reasoning model MAI-Thinking-1. Microsoft says it is a 35B-active mixture-of-experts with a 256K-token context that scores 97% on AIME 2025 and 53% on SWE-Bench Pro, and that blind raters preferred it to Claude Sonnet 4.6. The more durable artifact is a 109-page technical report disclosing scaling methodology and a data pipeline Microsoft says used no third-party distillation or synthetic data, a clean lineage it pitches to enterprises. Several circulating figures come from third-party tweets whose readings conflict with Microsoft’s own, and a viral “Mythos FLOPs leak” was retracted by those who posted it; the AINews recap is an aggregation, and the report is the primary source.

Stratechery’s Ben Thompson argues the same launches undercut the “AI PC.” Nvidia’s RTX Spark superchip (a 20-core Arm CPU, a Blackwell GPU, 128GB LPDDR5X, built with Microsoft, shipping this fall) spends die area on a GPU that is weaker than cloud inference hardware, when what an agent needs is a strong local CPU paired with remote inference. He prefers Microsoft’s Project Solara, an Android-based platform for agent-running enterprise devices, while calling it vaporware. No benchmarks accompany the Spark; the MAI and Solara figures are keynote claims.

The price and supply of coding AI

Uber now caps each employee at $1,500 a month per AI coding tool, applied separately to tools like Cursor and Claude Code, a spokesperson told Bloomberg. Simon Willison notes the cap tops out near $36k a year per engineer if two tools are maxed, about 11% of Uber’s ~$330k median US engineer pay; it follows reports that Uber spent its 2026 AI budget in four months.
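The arithmetic behind those figures, as a quick check. It uses only the numbers quoted above and assumes exactly two tools maxed for all twelve months; the variable names are illustrative:

```python
# Figures as quoted in the item above; the pay figure is an approximate median.
MONTHLY_CAP_PER_TOOL = 1_500  # USD per tool per month
TOOLS_MAXED = 2               # assumption: two tools, both at the cap
MEDIAN_PAY = 330_000          # USD per year, approximate

annual_cap = MONTHLY_CAP_PER_TOOL * TOOLS_MAXED * 12
share_of_pay = annual_cap / MEDIAN_PAY

print(f"${annual_cap:,} per year, {share_of_pay:.1%} of median pay")
# prints "$36,000 per year, 10.9% of median pay"
```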

On the supply side, 404 Media reported, relayed by 9to5Google, that Google is inviting select Play Store developers to sell their Android app code, both live codebases and archived prototypes, to “improve Google’s developer tools and products.” The recruitment email reportedly never says “AI,” though its linked page is about AI partnerships. App code is mostly private, so Google would pay for non-public training data rather than scrape it. No terms are disclosed.

Research

A preprint audits why inserting an LLM “rewriter” before a smaller reader in retrieval-augmented QA lifts F1 by tens of points. Removing the gold answer span from the rewrite cut reader F1 by 28–64 points beyond a length-matched placebo, so the gain tracks the answer string’s literal presence, not better evidence curation. The authors release a test harness, not a new method; some model names and the arXiv id look irregular, so weigh provenance with care.
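The ablation can be pictured roughly as follows. This is a sketch of the idea only, not the authors' harness: the function names, the character-level matching, and the toy sentence are all assumptions made for illustration.

```python
def remove_answer(rewrite: str, answer: str) -> str:
    """Delete the first literal occurrence of the gold answer string."""
    return rewrite.replace(answer, "", 1)


def remove_placebo(rewrite: str, answer: str) -> str:
    """Delete an equally long span that does not overlap the answer.

    Returns the rewrite unchanged when the answer is absent or the text
    is too short to hold a non-overlapping span.
    """
    n = len(answer)
    start = rewrite.find(answer)
    if start == -1:
        return rewrite
    if start >= n:
        # Enough room before the answer: drop the first n characters.
        return rewrite[n:]
    if len(rewrite) - (start + n) >= n:
        # Otherwise drop the last n characters, after the answer.
        return rewrite[: len(rewrite) - n]
    return rewrite


# Toy example, invented for illustration.
rewrite = "The tower opened in 1889 in Paris."
gold = "1889"

ablated = remove_answer(rewrite, gold)   # answer string gone
placebo = remove_placebo(rewrite, gold)  # same length removed, answer kept
```

Feeding both variants to the same reader and comparing F1 separates the effect of losing the answer string from the effect of simply losing text, which is the comparison the reported 28–64 point gap rests on.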

MIT CSAIL and Harvard report that a small model with a Monte Carlo inference loop asks better questions than a frontier one. In a natural-language “Battleship,” Llama 4 Scout’s win rate against humans rose from 8% to 82%, beating GPT-5 at about 1% of the cost. Caveats: the work is an ICLR oral on toy domains, expert humans still win, and a coauthor is OpenAI-affiliated.

Google Research open-sourced the LSTM flood-forecasting models behind Flood Hub as a PyTorch package under Apache 2.0, trained on the open Caravan dataset. Google says the newer version extends reliable forecast horizons by six days in gauged basins and one in ungauged ones, on its own benchmark.

What to watch today

Whether the Juniper-heavy Tier 1 non-enforcers turn on enforce-first-as after Cloudflare’s disclosure.
Independent replication of Microsoft’s MAI-Thinking-1 benchmark claims now that the 109-page report is public.
Whether other labs follow Google in paying individual developers for private source code.
Benchmarks for Nvidia’s RTX Spark, due to ship this fall, against the agentic-era “AI PC” thesis.