Jon Udell on Nostr:
Here's a test I'll be adding to my repertoire of LLM tests.
"I left Singapore at 9PM on Oct 20, flying east to San Francisco where I will arrive at 9PM on Oct 20. How much daylight will I see looking out the window?"
All of them - ChatGPT4, Claude, Bard - got it spectacularly wrong, confidently asserting I'd see lots of daylight.
When I told them I'm 10 hours into the flight and have seen none so far they were all like "oh, yeah, right, sorry about that."
#llm
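The arithmetic behind the puzzle can be sketched in a few lines. This is only an illustration under stated assumptions: the year is taken to be 2023 (from the publication date), Singapore is treated as a fixed UTC+8, and San Francisco as UTC-7 (daylight time in late October). It shows how long the flight lasts even though the local clock reads 9PM on Oct 20 at both ends; it does not model where the sun is along the route.

```python
from datetime import datetime, timedelta, timezone

# Assumptions (not stated in the post): fixed UTC offsets for both cities.
SGT = timezone(timedelta(hours=8), "SGT")    # Singapore, UTC+8
PDT = timezone(timedelta(hours=-7), "PDT")   # San Francisco in late October, UTC-7

# Local departure and arrival times from the post; the year 2023 is assumed.
depart = datetime(2023, 10, 20, 21, 0, tzinfo=SGT)
arrive = datetime(2023, 10, 20, 21, 0, tzinfo=PDT)

# Subtracting timezone-aware datetimes compares them on the same UTC timeline.
elapsed = arrive - depart

print(depart.astimezone(timezone.utc))  # 2023-10-20 13:00:00+00:00
print(arrive.astimezone(timezone.utc))  # 2023-10-21 04:00:00+00:00
print(elapsed)                          # 15:00:00
```

Under these assumptions the same wall-clock reading at both ends hides a 15-hour flight, which is consistent with being "10 hours into the flight" with several hours still to go.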
Published at 2023-10-21 00:28:04

Event JSON
{
  "id": "101430b7843a4b8442c6da769c855260a58ee943f2e029c4163da489eae473c1",
  "pubkey": "1c835acda48047e2a9591e81623af0d9aefae1d29fc9831fde8f0169b9e50470",
  "created_at": 1697848084,
  "kind": 1,
  "tags": [
    [
      "t",
      "llm"
    ],
    [
      "proxy",
      "https://social.coop/users/judell/statuses/111270172086022868",
      "activitypub"
    ]
  ],
  "content": "Here's a test I'll be adding to my repertoire of LLM tests.\n\n\"I left Singapore at 9PM on Oct 20, flying east to San Francisco where I will arrive at 9PM on Oct 20. How much daylight will I see looking out the window?\"\n\nAll of them - ChatGPT4, Claude, Bard - got it spectacularly wrong, confidently asserting I'd see lots of daylight.\n\nWhen I told them I'm 10 hours into the flight and have seen none so far they were all like \"oh, yeah, right, sorry about that.\"\n\n#llm",
  "sig": "40a45fab805ee20852b28d08c65ee0d2062451010f90245bd601c5fe95b81cc0ded368ed8f146c01239e3696d1e2ffb094dda486eadc150e06c470fcdae80812"
}
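For readers unfamiliar with the event fields, here is a rough sketch of how two of them relate to the rest. It assumes the "Published at" time is `created_at` rendered in UTC, and that `id` follows the NIP-01 convention (SHA-256 of the compact JSON array `[0, pubkey, created_at, kind, tags, content]`). The helper names `published_at` and `compute_event_id` are made up for illustration, and the comparison against the listed `id` is printed rather than asserted, since it has not been checked here.

```python
import hashlib
import json
from datetime import datetime, timezone

# Fields copied from the event above; "sig" is left out because it is not
# part of the data that the id is derived from.
event = {
    "id": "101430b7843a4b8442c6da769c855260a58ee943f2e029c4163da489eae473c1",
    "pubkey": "1c835acda48047e2a9591e81623af0d9aefae1d29fc9831fde8f0169b9e50470",
    "created_at": 1697848084,
    "kind": 1,
    "tags": [
        ["t", "llm"],
        ["proxy", "https://social.coop/users/judell/statuses/111270172086022868", "activitypub"],
    ],
    "content": (
        "Here's a test I'll be adding to my repertoire of LLM tests.\n\n"
        "\"I left Singapore at 9PM on Oct 20, flying east to San Francisco where I will "
        "arrive at 9PM on Oct 20. How much daylight will I see looking out the window?\"\n\n"
        "All of them - ChatGPT4, Claude, Bard - got it spectacularly wrong, confidently "
        "asserting I'd see lots of daylight.\n\n"
        "When I told them I'm 10 hours into the flight and have seen none so far they were "
        "all like \"oh, yeah, right, sorry about that.\"\n\n"
        "#llm"
    ),
}


def published_at(ev: dict) -> datetime:
    """Interpret created_at as Unix seconds and return a UTC datetime."""
    return datetime.fromtimestamp(ev["created_at"], tz=timezone.utc)


def compute_event_id(ev: dict) -> str:
    """Hypothetical helper: hash the NIP-01 style serialization of the event."""
    payload = [0, ev["pubkey"], ev["created_at"], ev["kind"], ev["tags"], ev["content"]]
    # Compact separators and raw UTF-8 output, as the serialization is assumed to require.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


print(published_at(event).strftime("%Y-%m-%d %H:%M:%S"))  # 2023-10-21 00:28:04
print(compute_event_id(event) == event["id"])  # expected True if the assumptions hold
```

The `sig` field would be checked separately against `pubkey` with a Schnorr signature library, which is outside the scope of this sketch.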