TildAlice Dev Weekly

February 28, 2026

GPT-4 vs Claude 3.5 vs Gemini: MMLU Zero-Shot Accuracy

GPT-4 beats Claude 3.5 by just 1.8% on zero-shot MMLU, a much narrower gap than official benchmarks claim. The accuracy numbers come from a 1,000-question run.

Read the full article: GPT-4 vs Claude 3.5 vs Gemini: MMLU Zero-Shot Accuracy


#LLM, #GPT-4, #Claude, #Gemini, #MMLU
