Tom Walker on Nostr
Every so often I see a post about how LLMs fail logic puzzles.
And... yes? Of course they do. The only way an LLM could solve one is if it has seen the puzzle before, or a substantially similar one. (But that might cause it to give the answer to the similar one, not the correct answer.)
Why is this even tested so often or considered surprising? It is, in essence, an autocomplete. It does not understand logic. It has no concept of a correct answer. It gives the most likely completion.
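As a rough illustration of "the most likely completion", here is a toy bigram autocomplete in Python. It is only a sketch: the corpus and function names are invented for the example, and real LLMs are neural networks over tokens, not lookup tables, but the selection principle sketched here, emit whichever continuation is most likely given the context, is the point the post is making.

```python
# Toy sketch (not any real LLM): an "autocomplete" that, given a context word,
# returns the continuation seen most often in its training text.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows each word in the corpus.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def complete(word: str) -> str:
    """Return the most frequent continuation seen after `word`."""
    counts = following.get(word)
    if not counts:
        return "<unseen context: no basis for an answer>"
    return counts.most_common(1)[0][0]

print(complete("the"))     # -> "cat": the most common continuation in the corpus
print(complete("puzzle"))  # -> unseen context: nothing to pattern-match against
```

A model like this has no notion of a correct answer, only of frequent continuations; a novel logic puzzle is simply an unseen (or misleadingly similar) context.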
Published at: 2023-09-12 11:01:00 UTC

Event JSON:
{
  "id": "54c999421a732f75a9a32f8fd37b333ff8f13ea2983cfd69dc417e8246835aaa",
  "pubkey": "01cf6009dc9a2a2a7029452898cbdcb50c91449ce757443f7472094cd9f44f31",
  "created_at": 1694516460,
  "kind": 1,
  "tags": [
    [
      "proxy",
      "https://mastodon.social/users/tomw/statuses/111051830787605917",
      "activitypub"
    ]
  ],
  "content": "Every so often I see a post about how LLMs fail logic puzzles.\n\nAnd... yes? Of course they do. The only way it could solve it is if it has seen the puzzle before or a substantially similar one. (But that might cause it to give the answer to the similar one, not the correct answer.)\n\nWhy is this even tested so often or considered surprising? It is, in essence, an autocomplete. It does not understand logic. It has no concept of a correct answer. It gives the most likely completion.",
  "sig": "9a7d8f0bea882ad8d38731d7452fdef39b2c5029b59c1a237a894fc5955e63e45af84c5f1652081a83fd0ee3b7eac32e0a41eefea22e0b1caf6736676e550afb"
}
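A minimal sketch of how an event "id" like the one above is derived under NIP-01: it is the SHA-256 of the JSON serialization of [0, pubkey, created_at, kind, tags, content] with no whitespace. This assumes Python's json.dumps matches the canonical escaping for this particular content; the content string is abbreviated below, so the final check only passes with the full text, and verifying the Schnorr "sig" over the id is omitted.

```python
# Sketch: deriving a Nostr event id per NIP-01
# (SHA-256 over the serialization [0, pubkey, created_at, kind, tags, content]).
import hashlib
import json

event = {
    "id": "54c999421a732f75a9a32f8fd37b333ff8f13ea2983cfd69dc417e8246835aaa",
    "pubkey": "01cf6009dc9a2a2a7029452898cbdcb50c91449ce757443f7472094cd9f44f31",
    "created_at": 1694516460,
    "kind": 1,
    "tags": [["proxy", "https://mastodon.social/users/tomw/statuses/111051830787605917", "activitypub"]],
    "content": "Every so often I see a post about how LLMs fail logic puzzles.\n\n...",  # abbreviated here
}

# Canonical serialization: a JSON array, no whitespace, UTF-8.
serialized = json.dumps(
    [0, event["pubkey"], event["created_at"], event["kind"], event["tags"], event["content"]],
    separators=(",", ":"),
    ensure_ascii=False,
)
computed_id = hashlib.sha256(serialized.encode("utf-8")).hexdigest()
print(computed_id == event["id"])  # True only when the full, unmodified content string is used
```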