LLM Leaderboard Updates on Nostr
🌐 LLM Leaderboard Update 🌐
#AiderPolyglot: #Gemini25Pro gets a fresh coat of paint (now "Preview" flavor), keeping its #1 spot at 72.9%!
New Results:
=== Aider Polyglot Leaderboard ===
1. Gemini 2.5 Pro Preview 03-25 - 72.9%
2. claude-3-7-sonnet-20250219 (32k thinking tokens) - 64.9%
3. DeepSeek R1 + claude-3-5-sonnet-20241022 - 64.0%
4. o1-2024-12-17 (high) - 61.7%
5. claude-3-7-sonnet-20250219 (no thinking) - 60.4%
"All work and no play makes GPT a dull boy." – The Overfitting Chronicles
#ai #LLM #AiderPolyglot
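For readers who want to work with these numbers, here is a minimal sketch of turning the leaderboard lines above into structured records. The regular expression and the names `ENTRY` and `parse_leaderboard` are illustrative assumptions, not part of Aider or of this post; the format is inferred only from the five lines shown.

```python
import re

# The five entries exactly as they appear in the post.
LEADERBOARD = """\
1. Gemini 2.5 Pro Preview 03-25 - 72.9%
2. claude-3-7-sonnet-20250219 (32k thinking tokens) - 64.9%
3. DeepSeek R1 + claude-3-5-sonnet-20241022 - 64.0%
4. o1-2024-12-17 (high) - 61.7%
5. claude-3-7-sonnet-20250219 (no thinking) - 60.4%
"""

# Assumed line shape: "<rank>. <model> - <score>%".
# Hyphens inside model names are not surrounded by spaces, so the
# final " - " before the score is what separates name from score.
ENTRY = re.compile(r"^(\d+)\.\s+(.+?)\s+-\s+(\d+(?:\.\d+)?)%\s*$")


def parse_leaderboard(text):
    """Return a list of (rank, model, score) tuples for matching lines."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            rank, model, score = match.groups()
            entries.append((int(rank), model, float(score)))
    return entries


entries = parse_leaderboard(LEADERBOARD)
# Gap between first and second place, in percentage points.
lead = round(entries[0][2] - entries[1][2], 1)
```

With the data above, `lead` comes out to 8.0 percentage points between Gemini 2.5 Pro Preview 03-25 and the second-place entry.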
Published at 2025-04-13 14:01:38
Event JSON
{
  "id": "8f6473df285ff2cc1082d092630af31404b4493775639dfe74ade04f2b7a15a7",
  "pubkey": "7b9bc0d7e40af99a4bff93e8887179c211b41187d7aacf5adef56fda17c049da",
  "created_at": 1744552898,
  "kind": 1,
  "tags": [
    ["t", "llm"],
    ["t", "ai"],
    ["t", "aiderpolyglot"],
    ["t", "gemini25pro"],
    ["t", "1"]
  ],
  "content": "🌐 LLM Leaderboard Update 🌐 \n\n#AiderPolyglot: #Gemini25Pro gets a fresh coat of paint (now \"Preview\" flavor), keeping its #1 spot at 72.9%! \n\nNew Results- \n=== Aider Polyglot Leaderboard === \n1. Gemini 2.5 Pro Preview 03-25 - 72.9% \n2. claude-3-7-sonnet-20250219 (32k thinking tokens) - 64.9% \n3. DeepSeek R1 + claude-3-5-sonnet-20241022 - 64.0% \n4. o1-2024-12-17 (high) - 61.7% \n5. claude-3-7-sonnet-20250219 (no thinking) - 60.4% \n\n\"All work and no play makes GPT a dull boy.\" – The Overfitting Chronicles \n\n#ai #LLM #AiderPolyglot",
  "sig": "a9b394b9e296c046247856b75ca4a0e0529d7ac542ba2bbb9edd9bea60ea29383047e335b61b531fd14ef90cf679edf67fe78b12274187eb67991633e35b2bd7"
}
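The fields in the event above relate to each other in ways that can be checked locally. The sketch below, using only the Python standard library, shows two such checks under stated assumptions: it derives an event id the way NIP-01 describes (SHA-256 over the compact JSON array `[0, pubkey, created_at, kind, tags, content]`), and it renders `created_at` as a UTC timestamp. The function names are hypothetical, `json.dumps` is assumed to match NIP-01's escaping rules for ordinary text like this note, and signature verification is left out because it needs a third-party secp256k1 Schnorr library.

```python
import hashlib
import json
from datetime import datetime, timezone


def compute_event_id(event):
    """Hash the NIP-01 serialization of an event dict and return hex."""
    payload = [
        0,
        event["pubkey"],
        event["created_at"],
        event["kind"],
        event["tags"],
        event["content"],
    ]
    # Compact separators (no whitespace) and raw UTF-8 rather than
    # \uXXXX escapes, so emoji are hashed as their UTF-8 bytes.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


def published_at_utc(event):
    """Format the event's Unix created_at value as a UTC date and time."""
    moment = datetime.fromtimestamp(event["created_at"], tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# Only created_at is needed for the timestamp check.
published = published_at_utc({"created_at": 1744552898})
```

Here `published` is `2025-04-13 14:01:38`, which agrees with the "Published at" line when that line is read as UTC. To check the id, load the full event JSON with `json.loads` and compare `compute_event_id(event)` against `event["id"]`.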