Mark Pesce on Nostr: How RLHF Preference Model Tuning Works (And How Things May Go Wrong)
Published at 2023-08-09 21:38:25
Event JSON
{
  "id": "55d4664e72fba78fcd5d082a3779fbb81ad8f7e879f866cb6eaa8abf50515c96",
  "pubkey": "8b0ef49d11c147634fe81e5df544407f35ac0ca01d288082da576d0404fbbd8f",
  "created_at": 1691617105,
  "kind": 1,
  "tags": [
    [
      "proxy",
      "https://arvr.social/users/mpesce/statuses/110861818632474013",
      "activitypub"
    ]
  ],
  "content": "How RLHF Preference Model Tuning Works (And How Things May Go Wrong)\n\nLarge Language Models like ChatGPT are trained with Reinforcement Learning From Human Feedback (RLHF) to learn human preferences. Let’s uncover how RLHF works and survey its current strongest limitations.\n\nhttps://www.assemblyai.com/blog/how-rlhf-preference-model-tuning-works-and-how-things-may-go-wrong/",
  "sig": "223184c78a9256ddf1084dfbe8d64fd1c2ff0ca443320fd03d476c9990b772e50ef7ed4c95510ecb41a0b60b5c0ea6611ebd310e73341de6b83762308938dc80"
}
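
For reference, the "id" field above is not arbitrary: under Nostr's NIP-01, it is the SHA-256 hash of a canonical serialization of the event fields, and "sig" is a Schnorr (BIP-340) signature over that id made with the key corresponding to "pubkey". Below is a minimal Python sketch of the id derivation; the function name compute_event_id is only illustrative.

import hashlib
import json

def compute_event_id(event: dict) -> str:
    # NIP-01: the event id is the lowercase hex SHA-256 of the JSON array
    # [0, pubkey, created_at, kind, tags, content], serialized as UTF-8
    # with no extra whitespace.
    serialized = json.dumps(
        [0, event["pubkey"], event["created_at"], event["kind"],
         event["tags"], event["content"]],
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()

Calling compute_event_id on a dict parsed from the event JSON above should reproduce the "id" value shown, assuming the event is well-formed.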