AI Benchmark Digest — 2026-04-27


            
April 27, 2026
    
    
=== DAILY ===
NEW MODELS (8)
  - medllama3-v11 — ELO 1333, #776/1047 (above: ai-medical-model-32bit, below: ollama_v7)
  - Collaiborator-MEDLLM-Llama-3-8B-v2-5 — ELO 1321, #802/1047 (above: suzume-llama-3-8B-multilingual, below: Collaiborator-MEDLLM-Llama-3-8B-v2-6)
  - Collaiborator-MEDLLM-Llama-3-8B-v2-6 — ELO 1321, #803/1047 (above: Collaiborator-MEDLLM-Llama-3-8B-v2-5, below: JSL-MedMNX-7B)
  - Collaiborator-MEDLLM-Llama-3-8B — ELO 1320, #806/1047 (above: Gemma 3n E4B Instructed LiteRT Preview, below: Collaiborator-MEDLLM-Llama-3-8B-v2-1)
  - Collaiborator-MEDLLM-Llama-3-8B-v2-1 — ELO 1320, #807/1047 (above: Collaiborator-MEDLLM-Llama-3-8B, below: falcon-180B)
  - Collaiborator-MEDLLM-Llama-3-8B-v2-4 — ELO 1318, #813/1047 (above: JSL-MedMNX-7B-SFT, below: Llama 2 70B)
      Open Medical LLM - PubMedQA: 78.6 (#3/168)
  - Collaiborator-MEDLLM-Llama-3-8B-v2-3 — ELO 1315, #821/1047 (above: Parrot-7B, below: Granite 4.0 Micro)
  - Llama-8B-1807 — ELO 1039, #1028/1047 (above: pythia-2.8B-deduped, below: GPT-2)
NEW #1 LEADERS (1)
  - Design Arena (Website) (Elo): claude-opus-4-6 (1349.0) beat claude-opus-4-7-thinking (1348.0) by 1.0
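
To put the 1.0-point margin above in perspective, here is a minimal sketch that converts an Elo gap into an expected head-to-head score. It assumes the conventional logistic Elo model with a 400-point scale; the digest does not state which rating formula Design Arena uses, so treat the result as illustrative only. The function name `elo_expected_score` is hypothetical.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard logistic Elo model.

    Assumption: the 400-point scale is the common convention, not something
    the digest confirms for this leaderboard.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


leader = 1349.0     # claude-opus-4-6, as listed in the digest
runner_up = 1348.0  # claude-opus-4-7-thinking, as listed in the digest

p = elo_expected_score(leader, runner_up)
print(f"Expected score for the leader: {p:.4f}")  # about 0.5014
```

Under that assumption, a 1-point gap is close to a coin flip, so the change at #1 may be within the noise of the ratings.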

    

                                Don't miss what's next. Subscribe to Mikhail Doroshenko:
                            
                        