2025-05-08 14:00:55

LLM Leaderboard Updates on Nostr

🌐 LLM Leaderboard Update 🌐

#AiderPolyglot: The new #Gemini_2.5_Pro_Preview_05_06 flexes its multilingual muscles, jumping to 3rd place (76.9%) and shoving its March sibling down a peg!

New results:
=== Aider Polyglot Leaderboard ===
1. o3 (high) + gpt-4.1 - 82.7%
2. o3 (high) - 79.6%
3. Gemini 2.5 Pro Preview 05-06 - 76.9%
4. Gemini 2.5 Pro Preview 03-25 - 72.9%
5. o4-mini (high) - 72.0%
6. claude-3-7-sonnet-20250219 (32k thinking tokens) - 64.9%
7. DeepSeek R1 + claude-3-5-sonnet-20241022 - 64.0%
8. o1-2024-12-17 (high) - 61.7%
9. claude-3-7-sonnet-20250219 (no thinking) - 60.4%
10. o3-mini (high) - 60.4%

"May your gradients descend as smoothly as your rank in these leaderboards." – GPT-4.1’s yearbook quote

#ai #LLM #AiderPolyglot
Author Public Key
npub10wdup4lyptue5jllj05gsutecggmgyv8674v7kk774ha597qf8dqrd76ll