someone on Nostr: i switched to this swift tool for fine tuning LLMs. works very well. very easy. ...
i switched to this swift tool for fine tuning LLMs.
https://github.com/modelscope/ms-swift
works very well. very easy. llama-factory is probably easier, but i found this to be more capable, e.g. it properly distributes lora fine tuning across GPUs.
previously i fine tuned a 70B model with the fsdp-qlora method using llama-factory. now i am doing lora with rank 32 using swift. batch_size=2 helped a lot with avoiding overfitting.
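for a rough sense of what rank 32 means in size: lora freezes the original weight matrix and trains two small matrices, one of shape (rank x d_in) and one of shape (d_out x rank). here is a minimal sketch of that count. the 8192 x 8192 layer size is only an assumption for illustration (the post does not say which 70B model was used), and the function name is hypothetical, not part of swift or llama-factory.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer.

    LoRA trains A (rank x d_in) and B (d_out x rank) and leaves the
    original weight frozen, so the count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


# Assumed layer size, for illustration only; not taken from the post.
HIDDEN = 8192
RANK = 32  # the rank mentioned in the post

full_params = HIDDEN * HIDDEN
lora_params = lora_param_count(HIDDEN, HIDDEN, RANK)
fraction = lora_params / full_params

print(f"full layer:  {full_params} params")
print(f"lora update: {lora_params} params ({fraction:.4%} of the layer)")
```

under these assumptions the rank 32 adapter for such a layer is under 1% of the layer's own weights, which is why lora is so much cheaper to train than full fine tuning.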
if you want to ask questions to the most capable model, the most based, the weirdest answers (compared to mainstream) dm me. i will give you a link.
Published at
2024-10-07 15:53:24
Event JSON
{
"id": "49895bdd0fd46d498dce01bd575c84c3d1b0cd83f0339ad32752d98d013aa1ae",
"pubkey": "9fec72d579baaa772af9e71e638b529215721ace6e0f8320725ecbf9f77f85b1",
"created_at": 1728316404,
"kind": 1,
"tags": [],
"content": "i switched to this swift tool for fine tuning LLMs. \n\nhttps://github.com/modelscope/ms-swift\n\nworks very well. very easy. llama-factory is probably easier but i found this to be more capable like distributing lora fine tuning properly to GPUs.\n\npreviously i did fine tuning of a 70B model in fsdp-qlora method using llama-factory. now i am doing lora with rank 32 using swift. batch_size=2 helped a lot with avoiding overfitting.\n\nif you want to ask questions to the most capable model, the most based, the weirdest answers (compared to mainstream) dm me. i will give you a link. ",
"sig": "9b9070c8105bdbf925af84c28a808dafd0c89dfaa861c126013e2dfa28af01d4a63dcc148db35bcb94208b89b2a8f7f3f6df39a714263f0ece2d8f70742009cc"
}