fiatjaf on Nostr:
What prevents LLM data from being poisoned by sheer quantity of garbage?
If they're crawling the internet for data to feed into the LLMs, doesn't that mean that data that appears _more_ often will carry more weight than data that is "better"?
In other words, what is the "pagerank" of LLMs?
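
A toy sketch of the concern, assuming a "model" that does nothing but count which word follows another. Real LLM training is far more involved, and the corpus, the sentences, and the `next_word_counts` helper are all made up for illustration:

```python
from collections import Counter


def next_word_counts(corpus, context):
    """Count which word follows `context` across all documents (hypothetical helper)."""
    counts = Counter()
    for doc in corpus:
        words = doc.split()
        # Walk over adjacent word pairs and tally the successors of `context`.
        for prev, nxt in zip(words, words[1:]):
            if prev == context:
                counts[nxt] += 1
    return counts


# Hypothetical corpus: one careful page and nine copies of a garbage page.
good = "the sky is blue"
garbage = "the sky is green"
corpus = [good] + [garbage] * 9

# Counting over the raw crawl: the repeated page dominates.
raw = next_word_counts(corpus, "is")

# Counting after dropping exact duplicates (dict.fromkeys keeps first occurrences).
deduped = next_word_counts(list(dict.fromkeys(corpus)), "is")

print(raw.most_common())
print(deduped.most_common())
```

In this toy setup the repeated page wins purely by volume. Dropping exact duplicates evens out the counts, but it does nothing to tell which page is "better", which is the gap the question points at.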
Published at 2025-03-25 01:27:57
Event JSON
{
  "id": "0000120274ef1fffbd779b23af4a2ba1fe197272699b3cc3d02701214dead13a",
  "pubkey": "3bf0c63fcb93463407af97a5e5ee64fa883d107ef9e558472c4eb9aaaefa459d",
  "created_at": 1742866077,
  "kind": 1,
  "tags": [
    [
      "nonce",
      "4611686018427391580",
      "16"
    ]
  ],
  "content": "What prevents LLM data from being poisoned by sheer quantity of garbage?\n\nIf they're crawling the internet for data to be fed into the LLMs doesn't that mean that data that appears _more_ will have more importance, instead of data that is \"better\"?\n\nIn other words, what is the \"pagerank\" of LLMs?",
  "sig": "14fccbb5b0c2750e74d84c94056fe6942afe0350e64676b3b6afdac4eeae6ca147a45160a705e7e90575ef5e89ac0d4759de62a942092704d18c4d1dab07b97f"
}