Terence Tao on Nostr: An interesting experiment on #MathOverflow, where a user gave 15 different MO ...
An interesting experiment on #MathOverflow, where a user gave 15 different MO problems for o-1 to answer, with the aim of verifying and then rewriting the answer into a presentable form if the AI generated answer was correct. The outcome was: one question answered correctly, verified, and rewritten; one question given a useful lead, which led the experimenter to find a more direct answer; one possibly correct answer that the experimenter was not able to verify; and the remainder described as "a ton of time consuming chaos", in which the experimenter spent much time trying to verify a hallucinated response before giving up.
https://meta.mathoverflow.net/questions/6114/capabilities-and-limits-of-ai-on-mathoverflow

I found the discussion for possible AI disclosure policies for MO in the post to also be interesting.
Published at
2025-03-11 17:38:20
Event JSON
{
  "id": "05835acc9d3c0369b38810b177d8c17dd6e6bdc6d99c9959a52d68cb4c9676b5",
  "pubkey": "bc13e579bf49294633747700e25eb4f71d0c2aa134e89f30d36d7e1630e1bba3",
  "created_at": 1741714700,
  "kind": 1,
  "tags": [
    [
      "t",
      "mathoverflow"
    ],
    [
      "proxy",
      "https://mathstodon.xyz/users/tao/statuses/114145014631171295",
      "activitypub"
    ]
  ],
  "content": "An interesting experiment on #MathOverflow, where a user gave 15 different MO problems for o-1 to answer, with the aim of verifying and then rewriting the answer into a presentable form if the AI generated answer was correct. The outcome was: one question answered correctly, verified, and rewritten; one question given a useful lead, which led the experimenter to find a more direct answer; one possibly correct answer that the experimenter was not able to verify; and the remainder described as \"a ton of time consuming chaos\", in which the experimenter spent much time trying to verify a hallucinated response before giving up. https://meta.mathoverflow.net/questions/6114/capabilities-and-limits-of-ai-on-mathoverflow\n\nI found the discussion for possible AI disclosure policies for MO in the post to also be interesting.",
  "sig": "ddc00ac15ef060de1ef2d073f4219da5f0b6b5df1fc2c303ca7eb352b8f67330464f2b020ca5244396049841a1d2d2bf0fc17c9eee93456e8610cc7380dfca9d"
}