Models

Frontier and open-weight model releases and what they can do.

openai.com | 2026-06-26 | AI models | rel 9/10 | score 7.5

Previewing GPT-5.6 Sol: a next-generation model

GPT-5.6 Sol brings significant performance improvements on coding, biology, and cybersecurity tasks, along with enhanced safety measures.

GPT-5.6 series includes Sol (flagship), Terra (balanced), and Luna (fast and affordable) models
Sol sets state-of-the-art on Terminal-Bench 2.1 and GeneBench v1, with strong cybersecurity performance on ExploitBench² and ExploitGym
Models priced per 1M tokens: Sol ($5 input / $30 output), Terra ($2.50 input / $15 output), Luna ($1 input / $6 output)

Full summary

OpenAI introduces the GPT-5.6 series: Sol is the flagship model, Terra offers balanced performance at half the cost of GPT-5.5, and Luna offers strong capabilities at the lowest cost. Sol excels at coding, biology, and cybersecurity tasks. It achieves state-of-the-art results on Terminal-Bench 2.1 and GeneBench v1, and is competitive on ExploitBench² and ExploitGym while using fewer tokens than previous models. The series adds safety features including layered safeguards, real-time checks, account-level reviews, and differentiated access, all tested through extensive red-teaming. Pricing is tiered by model capability: per 1M tokens, Sol costs $5 input / $30 output, Terra $2.50 input / $15 output, and Luna $1 input / $6 output.
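To make the tiered pricing concrete, here is a minimal sketch that turns the per-1M-token prices quoted above into a per-request cost. The dictionary keys (`sol`, `terra`, `luna`) and the function name `request_cost` are illustrative labels chosen for this example, not official API model identifiers.

```python
# USD per 1M tokens as (input, output), copied from the summary above.
# The keys are hypothetical labels, not official model IDs.
PRICES = {
    "sol": (5.00, 30.00),
    "terra": (2.50, 15.00),
    "luna": (1.00, 6.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request at the listed prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Compare the three tiers on the same workload.
    for name in PRICES:
        print(name, request_cost(name, input_tokens=200_000, output_tokens=50_000))
```

Because output tokens cost six times as much as input tokens on every tier, output-heavy workloads dominate the bill regardless of which model is chosen.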

arxiv.org | 2026-07-02 | AI models safety | rel 8/10 | score 5.0

Mind the Heads: Topological Representation Alignment for Multimodal LLMs

HeRA offers a novel approach to aligning multimodal representations at the granularity of individual attention heads, potentially improving the accuracy and reliability of multimodal large language models (MLLMs).

Details

Proposes Head-Wise Representation Alignment (HeRA) method
Focuses on preserving topological structure using Mutual K-Nearest Neighbor (MKNN) alignment metric
Improves performance on challenging vision-centric tasks across multiple MLLMs and benchmarks

The paper introduces Head-Wise Representation Alignment (HeRA), a method that enforces cross-modal alignment at the level of individual attention heads in multimodal large language models (MLLMs). HeRA uses the Mutual K-Nearest Neighbor (MKNN) alignment metric to preserve topological structure across modalities. Evaluations show that aligning the poorly aligned heads yields significant performance improvements on vision-centric tasks and reduces visual hallucinations by mitigating over-reliance on linguistic priors.
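For readers unfamiliar with mutual k-nearest-neighbor alignment, the sketch below shows the commonly used form of the metric: for each sample, find its k nearest neighbors in two representation spaces and measure how much the two neighbor sets overlap. This illustrates the general idea only. The paper's exact formulation and the way HeRA applies it per attention head may differ, and the function names and the choice of cosine similarity are assumptions.

```python
import numpy as np


def knn_indices(feats: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k nearest neighbors of each row under cosine similarity."""
    # Assumes no all-zero rows, so every norm is positive.
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a sample is not its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]


def mutual_knn_alignment(feats_a: np.ndarray, feats_b: np.ndarray, k: int) -> float:
    """Mean fraction of shared k-nearest neighbors between two feature spaces.

    Row i of feats_a and row i of feats_b must describe the same sample,
    for example an image embedding and the matching text embedding.
    """
    nn_a = knn_indices(feats_a, k)
    nn_b = knn_indices(feats_b, k)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlaps))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vision = rng.normal(size=(40, 16))
    text = rng.normal(size=(40, 16))
    print(mutual_knn_alignment(vision, vision, k=5))  # identical spaces
    print(mutual_knn_alignment(vision, text, k=5))    # unrelated spaces
```

The score depends only on which samples are neighbors, not on the raw distances, which is why it can be read as a check on neighborhood (topological) structure rather than on coordinates.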

research.google | 2026-07-01 | AI models | rel 8/10 | score 5.8

Introducing TabFM: A zero-shot foundation model for tabular data

TabFM offers a zero-shot approach to tabular data prediction, eliminating the need for manual feature engineering and hyperparameter tuning, thus significantly simplifying machine learning workflows.

Details

TabFM uses in-context learning (ICL) to process tabular data without traditional training phases
Trained on hundreds of millions of synthetic datasets generated by structural causal models (SCMs)
Evaluations show superior performance compared to industry-standard supervised algorithms on TabArena benchmarks

Google Research introduces TabFM, a zero-shot foundation model designed for classification and regression on tabular data. By using in-context learning (ICL), TabFM removes the need for manual feature engineering and hyperparameter tuning. It is trained on hundreds of millions of synthetic datasets generated from structural causal models (SCMs), and evaluations on TabArena benchmarks show it outperforming industry-standard supervised algorithms. The model is being integrated into Google BigQuery, where users will be able to invoke it with a simple SQL command.
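As a rough illustration of what "synthetic datasets generated from structural causal models" can mean, the sketch below samples a random acyclic graph, computes each column as a noisy nonlinear function of its parent columns, and uses the last node as a binary label. It is a toy under stated assumptions, not Google's generator: the function name, the tanh mechanism, the edge density, and the median-split label are all choices made for this example.

```python
import numpy as np


def sample_scm_dataset(n_rows: int, n_features: int, seed: int = 0):
    """Sample one toy tabular dataset from a random structural causal model."""
    rng = np.random.default_rng(seed)
    n_nodes = n_features + 1  # the last node becomes the target

    # Strictly upper-triangular weights: node j depends only on nodes i < j,
    # so the causal graph is guaranteed to be acyclic.
    weights = np.triu(rng.normal(size=(n_nodes, n_nodes)), k=1)
    # Keep roughly half of the possible edges to get a sparser graph.
    weights *= rng.random((n_nodes, n_nodes)) < 0.5

    values = np.zeros((n_rows, n_nodes))
    for j in range(n_nodes):
        noise = rng.normal(size=n_rows)
        # Each node is a nonlinear function of its parents plus noise.
        values[:, j] = np.tanh(values[:, :j] @ weights[:j, j]) + noise

    X = values[:, :-1]
    # Binarize the target at its median to get a balanced classification task.
    y = (values[:, -1] > np.median(values[:, -1])).astype(int)
    return X, y


if __name__ == "__main__":
    X, y = sample_scm_dataset(n_rows=100, n_features=5, seed=42)
    print(X.shape, y.shape, y.mean())
```

Drawing a different seed yields a different graph and therefore a different prediction task, which is how a generator of this kind can produce a large variety of training datasets.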

vllm.ai | 2026-06-30 | AI models agents | rel 8/10 | score 5.7

Micro-Agent: Beat Frontier Models with Collaboration inside Model API

vLLM Semantic Router introduces a new paradigm for AI request routing, enabling cost optimization, safety enforcement, and improved response quality without changing client integration.

Details

vLLM Semantic Router uses patterns like Confidence, Ratings, ReMoM, Fusion, and Workflows to handle requests
Evaluation shows VSR Closed outperforming other models on LiveCodeBench (92.6) and GPQA-Diamond (96.0)
The system maintains a single API surface while allowing operators to control the recipe

The vLLM Semantic Router introduces a new approach to AI request routing by implementing collaboration patterns inside the router itself. These patterns (Confidence, Ratings, ReMoM, Fusion, and Workflows) are used to optimize cost, enforce safety policies, and improve response quality. The system evaluates each request based on evidence and selects an appropriate model pool or collaboration recipe. Evaluation results show VSR Closed outperforming other models on LiveCodeBench (92.6) and GPQA-Diamond (96.0). Clients still see a single API surface while operators control the underlying recipe, which makes the system programmable, observable, and open at the serving layer.
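The summary names a "Confidence" pattern without describing its mechanics. One plausible reading is a cascade: answer with a cheaper model first and escalate to a stronger one only when the first answer's confidence is low. The sketch below shows that idea with stub models. Every name in it (`route_with_confidence`, `cheap_model`, `strong_model`, the 0.8 threshold) is hypothetical and is not the vLLM Semantic Router API.

```python
from typing import Callable, Tuple

# A model is any callable returning (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def route_with_confidence(
    prompt: str, cheap: Model, strong: Model, threshold: float = 0.8
) -> Tuple[str, str]:
    """Return (answer, tier used), escalating when the cheap model is unsure."""
    answer, confidence = cheap(prompt)
    if confidence >= threshold:
        return answer, "cheap"
    answer, _ = strong(prompt)
    return answer, "strong"


def cheap_model(prompt: str) -> Tuple[str, float]:
    # Hypothetical stub: confident only on short prompts.
    confidence = 0.9 if len(prompt) < 20 else 0.3
    return f"cheap:{prompt}", confidence


def strong_model(prompt: str) -> Tuple[str, float]:
    # Hypothetical stub standing in for a larger, more expensive model.
    return f"strong:{prompt}", 0.95


if __name__ == "__main__":
    print(route_with_confidence("2+2?", cheap_model, strong_model))
    print(route_with_confidence("Prove the claim in full detail.", cheap_model, strong_model))
```

The caller sends one request and receives one answer in both cases, which mirrors the single-API-surface property described above: the escalation logic lives entirely behind the routing function.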