WikiResearch on Nostr: "WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts ...
Published at
2024-07-01 14:09:53
Event JSON
{
  "id": "787b1fe944dc02bec374b454b7397026701d189351b8ad12b9f770c0d1322e77",
  "pubkey": "3eee297347e2cad49e0095a31aca38ffdcd46865ae75ea8ac5963999ce24d8a1",
  "created_at": 1719842993,
  "kind": 1,
  "tags": [
    [
      "proxy",
      "https://mastodon.social/users/wikiresearch/statuses/112711630428754314",
      "activitypub"
    ]
  ],
  "content": "\"WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia\" when given passages containing contradictory facts, models struggle to generate answers reflecting the conflicting nature of the context.\n\n(Hou et al, 2024)\n\nhttps://t.co/cwZyqq42g6 https://t.co/TWWABihKwh\n\nvia https://twitter.com/WikiResearch/status/1807778222425112719",
  "sig": "c8714ef7627b4421e39e5618cfc244b84c3ee62180070926b648856954a520af4305c78e667b7e846ced940cdca24a469d50d2c3284e108017ee2e6ce8e87021"
}
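
The fields in the event JSON above can be read programmatically. The following is a minimal sketch, not part of the original note, that converts `created_at` (Unix seconds) into the "Published at" timestamp and pulls the origin URL out of the `proxy` tag. The helper names `published_at` and `proxy_source` are hypothetical, and the sketch assumes the page displays the time in UTC and that a `proxy` tag has the shape `[name, url, protocol]`, as it does in this event.

```python
from datetime import datetime, timezone

# Fields copied from the event JSON above (content and sig omitted for brevity).
event = {
    "created_at": 1719842993,
    "kind": 1,
    "tags": [
        [
            "proxy",
            "https://mastodon.social/users/wikiresearch/statuses/112711630428754314",
            "activitypub",
        ],
    ],
}


def published_at(ev):
    """Format created_at (Unix seconds) as a UTC timestamp string.

    Assumption: the page shows the publication time in UTC.
    """
    moment = datetime.fromtimestamp(ev["created_at"], tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def proxy_source(ev):
    """Return (url, protocol) from the first "proxy" tag, or None if absent.

    Assumption: a proxy tag looks like ["proxy", <url>, <protocol>].
    """
    for tag in ev["tags"]:
        if len(tag) >= 3 and tag[0] == "proxy":
            return tag[1], tag[2]
    return None


print(published_at(event))
print(proxy_source(event))
```

Under these assumptions, the first line printed matches the timestamp shown under "Published at", and the second identifies the Mastodon status the note appears to have been bridged from.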