drmonalidesai on Nostr: DeepSeek unveils new AI reasoning method as anticipation for its next-gen model ...
DeepSeek unveils new AI reasoning method as anticipation for its next-gen model rises. In collaboration with researchers from Tsinghua University, DeepSeek developed a technique that combines methods referred to as generative reward modelling (GRM) and self-principled critique tuning, according to a paper published on Friday. The dual approach aims to enable LLMs to deliver better and faster results to general queries. The resulting DeepSeek-GRM models outperformed existing methods, having “achieved competitive performance” with strong public reward models, the researchers wrote. Reward modelling is a process that guides an LLM towards human preferences.
{
"id":"b069b95111e9848e939e04358ff70d072ad14ec2208051458ef912503cf130d2",
"pubkey":"d6133367256042a6777318655bf1f38d8cdd944f4baa6c0b70ef4b025994af04",
"created_at":1744029409,
"kind":1,
"tags": [],
"content":"DeepSeek unveils new AI reasoning method as anticipation for its next-gen model rises. In collaboration with researchers from Tsinghua University, DeepSeek developed a technique that combines methods referred to as generative reward modelling (GRM) and self-principled critique tuning, according to a paper published on Friday. The dual approach aims to enable LLMs to deliver better and faster results to general queries.The resulting DeepSeek-GRM models outperformed existing methods, having “achieved competitive performance” with strong public reward models, the researchers wrote. Reward modeling is a process that guides an LLM towards human preferences.\n\nhttps://amp.scmp.com/tech/tech-trends/article/3305259/deepseek-unveils-new-ai-reasoning-method-anticipation-its-next-gen-model-rises\n\n\nhttps://m.primal.net/QHQX.jpg",
"sig":"9e872589b022841ec01618463183c3f905e31562fc7924d24b8b9f908411ad78db507731669b28c1185408620446c568b8d5eba3b4fef4aa64aa71a0398d3fa7"
}