AI Benchmark Digest — 2026-05-22
=== DAILY ===

NEW SCORES FROM TOP-10 MODELS (1)
- GPT-5.5 (High) on Sycophancy (Lechmazur): sycophancy rate 3.5% (lower is better) (#11/31)

NEW #1 LEADERS (2)
- UGI - Writing (Writing Score): gemini-3.5-flash (thinking_level=medium) (72.54) beat gemini-3.1-pro-preview (thinking_level=low) (72.15) by 0.39
- Arabic Broad Leaderboard (Average Score (0-10)): gemini-3.5-flash (9.253) beat gemini-3-pro-preview (9.2) by 0.05