Mikhail Doroshenko

May 22, 2026

AI Benchmark Digest — 2026-05-22

=== DAILY ===

NEW SCORES FROM TOP-10 MODELS (1)
- GPT-5.5 (High) on Sycophancy (Lechmazur): sycophancy rate 3.5% (lower is better) (#11/31)

NEW #1 LEADERS (2)
- UGI - Writing (Writing Score): gemini-3.5-flash (thinking_level=medium) (72.54) beat gemini-3.1-pro-preview (thinking_level=low) (72.15) by 0.39
- Arabic Broad Leaderboard (Average Score (0-10)): gemini-3.5-flash (9.253) beat gemini-3-pro-preview (9.2) by 0.05


