Simon Willison on Nostr:
The only way to evaluate an LLM continues to be on its vibes
The vibes of Claude 3 are looking /really/ good right now: people whose opinion I trust are treating it as a step up from GPT-4!
I've not spent enough time with it yet, but my impressions so far have been very positive
Published at
2024-03-07 01:31:18
Event JSON
{
  "id": "491d1cf3bba2195102ab345231334cbffdf99ad232018e6abbe8dac76ea4c178",
  "pubkey": "8b0be93ed69c30e9a68159fd384fd8308ce4bbf16c39e840e0803dcb6c08720e",
  "created_at": 1709775078,
  "kind": 1,
  "tags": [
    [
      "proxy",
      "https://fedi.simonwillison.net/users/simon/statuses/112051819546669778",
      "activitypub"
    ]
  ],
  "content": "The only way to evaluate an LLM continues to be on its vibes\n\nThe vibes of Claude 3 are looking /really/ good right now: people whose opinion I trust are treating it as a step up from GPT-4!\n\nI've not spent enough time with it yet, but my impressions so far have been very positive",
  "sig": "d73d65e6e40caca5ec5fe536ad986b9c8efeaffd12c81f1dd0da8c537d5d6e4ba8b1384bca00be2df873393e602023c99fc36684018021c655b3b4bd1c81b807"
}