Reinforcement Learning from Human Feedback (RLHF) is just a vibe check "You'd train ...

2024-08-08 18:35:08

Reinforcement Learning from Human Feedback (RLHF) is just a vibe check

"You'd train it to agree with the human judgement on average. Once we have a Reward Model vibe check, you run RL with respect to it, learning to play the moves that lead to good vibes. Clearly, this would not have led anywhere too interesting in Go."

https://x.com/karpathy/status/1821277264996352246

Author Public Key

npub1x63s0q69wcpvzuktgpxh02679x0skt6gdjnregct8f9lqmjuq3rsdhtc9e


Assaf 🥥🌴 on Nostr