Daniel Wigton on Nostr
I really don't get the DeepSeek love. I haven't tried the full model, but the 70B-parameter distill is trash. It isn't actually a reasoning model. It merely apes being a reasoning model. It is really good at sounding like it is reasoning, but it hallucinates far more than the Llama 3.3 model on which it is based.
I suspect the full model has similar features. It is reassuring to users to see that it is attempting a rationalization, but the actual output isn't that great.
Published at 2025-02-12 23:40:54
Event JSON
{
"id": "5ea2072c861a49fe3c20d9f29e234e57063f5a42ffba59c427228c3ee3d3429b",
"pubkey": "75656740209960c74fe373e6943f8a21ab896889d8691276a60f86aadbc8f92a",
"created_at": 1739403654,
"kind": 1,
"tags": [],
"content": "I really don't get the deepSeek love. I haven't tried the full model, but the 70B parameter distill is trash. It isn't actually a reasoning model. It merely apes being a reasoning model. It is really good at sounding like it is reasoning but it hallucinates far more than the llama3.3 model on which it is based.\n\nI suspect the full model has similar features. It is reassuring to users to see that it is attempting a rationalization but the actual output isn't that great.",
"sig": "b4c284176466bd80324761d1145ec39200d4f26d384077b61dcc0c6984f283d239ca4cee1c5d041340c293c9ddf88de4414be9a3dc9a67cb5b6a81d7156ecb6e"
}