Mikhail Doroshenko

May 28, 2026

AI Benchmark Digest — 2026-05-28

=== DAILY ===

NEW SCORES FROM TOP-10 MODELS (1)
- GPT-5.5 (xHigh) on SWE-rebench: 62.73 Resolved (%) (#1/82)

NEW #1 LEADERS (2)
- Kaggle FACTS Grounding (Score (%)): Gemma 4 26B A4B (80.87) beat GPT-5.2 (76.17) by 4.7
- PinchBench (Success Rate (%)): Qwen Max (93.44) beat Grok 0.1 (92.07) by 1.37


