Mikhail Doroshenko

June 3, 2026

AI Benchmark Digest — 2026-06-03

=== DAILY ===

NEW SCORES FROM TOP-10 MODELS (2)
- GPT-5.5 Pro on IUMB: 100.0 Score (%) (#2/55)
- Gemini 3 Deep Think on IUMB: 87.5 Score (%) (#6/55)

NEW #1 LEADERS (4)
- MathArena - Kangaroo 2025 Levels 11-12 (Accuracy (%)): Claude Opus 4.8 (Thinking) (100.0) beat GPT-5.4 (xHigh) (98.33) by 1.67
- MathArena - APEX 2025 (Accuracy (%)): Claude Opus 4.8 (Thinking) (81.25) beat GPT-5.5 (xHigh) (80.21) by 1.04
- MathArena - Kangaroo 2025 Levels 7-8 (Accuracy (%)): Claude Opus 4.8 (Thinking) (96.67) beat GPT-5.4 (xHigh) (95.83) by 0.84
- MathArena - AIME 2026 (Accuracy (%)): Claude Opus 4.8 (Thinking) (100.0) beat GPT-5.4 (xHigh) (99.17) by 0.83


