James Grimmelmann on Nostr:
We say that a model has “memorized” a piece of training data when:
(1) it is possible to reconstruct from the model
(2) a near-exact copy of
(3) a substantial portion of
(4) that specific piece of training data.
We think that this is the most useful definition for legal conversations, and we explain how it relates to other terms in common use, such as “learning,” “extraction,” and “regurgitation.”
Published at 2024-07-18 23:26:34

Event JSON
{
"id": "b1df83f0f63f85c0c5d7b80871bccb7cd64ebe1481bc19cd59a839f257c19e79",
"pubkey": "de6ebfbd07446d84070e4167857889594bb4f78b9670cb800fae5e4c5c20ed66",
"created_at": 1721345194,
"kind": 1,
"tags": [
[
"e",
"c997218a62af1a5f423ffd330eeabe477a1591cbd0463892cb38e61dac9605f4",
"wss://relay.mostr.pub",
"reply"
],
[
"proxy",
"https://mastodon.lawprofs.org/users/jtlg/statuses/112810078691859883",
"activitypub"
]
],
"content": "We say that a model has “memorized” a piece of training data when:\n(1) it is possible to reconstruct from the model \n(2) a near-exact copy of \n(3) a substantial portion of \n(4) that specific piece of training data. \nWe think that this is the most useful definition for legal conversations, and we explain how it relates to other terms in common use, such as “learning,” “extraction,” and “regurgitation.”",
"sig": "df6e8ad2feff1200634ed38c912211cf16f1665ca13e5a840b97db2284892c190e401f1c9b9ff372a8b1310ef95f4521f4d60162807507329401a780f87fbf1b"
}
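The "id" field in the event JSON above is not arbitrary: under the Nostr protocol's NIP-01, an event id is the SHA-256 hash of a canonical UTF-8 JSON serialization of the array [0, pubkey, created_at, kind, tags, content], with no extra whitespace. A minimal sketch in Python (the function name and the sample event are illustrative; reproducing the exact id above would require matching NIP-01's character-escaping rules byte for byte):

```python
import hashlib
import json

def nostr_event_id(event: dict) -> str:
    """Sketch of NIP-01 id computation: SHA-256 of the compact JSON
    serialization of [0, pubkey, created_at, kind, tags, content],
    with non-ASCII characters left unescaped (UTF-8)."""
    payload = [
        0,
        event["pubkey"],
        event["created_at"],
        event["kind"],
        event["tags"],
        event["content"],
    ]
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()

# Illustrative event (same pubkey/timestamp/kind as above, simplified
# tags and content, so the resulting id will differ from the real one):
event = {
    "pubkey": "de6ebfbd07446d84070e4167857889594bb4f78b9670cb800fae5e4c5c20ed66",
    "created_at": 1721345194,
    "kind": 1,
    "tags": [],
    "content": "hello nostr",
}
print(nostr_event_id(event))  # a 64-character lowercase hex digest
```

Clients and relays recompute this hash to check that an event's claimed id matches its contents before verifying the Schnorr signature in "sig" against "pubkey".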