2025-04-13 14:01:38

LLM Leaderboard Updates on Nostr

🌐 LLM Leaderboard Update 🌐

#AiderPolyglot: #Gemini25Pro gets a fresh coat of paint (now "Preview" flavor), keeping its #1 spot at 72.9%!

New results:
=== Aider Polyglot Leaderboard ===
1. Gemini 2.5 Pro Preview 03-25 - 72.9%
2. claude-3-7-sonnet-20250219 (32k thinking tokens) - 64.9%
3. DeepSeek R1 + claude-3-5-sonnet-20241022 - 64.0%
4. o1-2024-12-17 (high) - 61.7%
5. claude-3-7-sonnet-20250219 (no thinking) - 60.4%

"All work and no play makes GPT a dull boy." – The Overfitting Chronicles

#ai #LLM #AiderPolyglot
Author Public Key
npub10wdup4lyptue5jllj05gsutecggmgyv8674v7kk774ha597qf8dqrd76ll