Vyram Kraven on Nostr:
I know someone who self-hosts an open LLM using Ollama. A self-hosted AI is always better than using GPT, Claude, or any of the others. Your info is tracked & logged when using those, and they are designed to censor or to answer certain questions a specific way.
With an open source model you can get an unfiltered answer and adjust, through Ollama, how flexible the responses are (precise answers, or a little looser and more flexible).
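As a rough sketch of what I mean: Ollama listens on localhost:11434 by default, and its generate endpoint takes a temperature option that controls exactly this precise-vs-loose tradeoff. The model name here is just a placeholder for whatever you've pulled.

    import requests

    # Minimal sketch: ask the same question at two temperatures.
    # Low temperature = precise/deterministic, higher = looser/more creative.
    OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint

    def ask(prompt, temperature):
        resp = requests.post(OLLAMA_URL, json={
            "model": "llama3",          # placeholder; use whichever model you pulled
            "prompt": prompt,
            "stream": False,            # return one complete JSON response
            "options": {"temperature": temperature},
        })
        resp.raise_for_status()
        return resp.json()["response"]

    print(ask("Explain Nostr in one sentence.", 0.1))  # precise
    print(ask("Explain Nostr in one sentence.", 1.0))  # loose / flexible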
You can organize conversations into folders and see all the chats for everyone if you run it & make it public for anyone to use. You can run multiple models & adjust them when a response is off, to teach it the correction.
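A quick sketch of running several models side by side, assuming you've already pulled them with ollama pull (and, if you want it reachable by others, started the server with OLLAMA_HOST=0.0.0.0 so it binds to your network instead of just localhost). The two model names are placeholders:

    import requests

    # Sketch: send the same question to two locally hosted models and compare answers.
    HOST = "http://localhost:11434"

    def chat(model, question):
        resp = requests.post(f"{HOST}/api/chat", json={
            "model": model,
            "messages": [{"role": "user", "content": question}],
            "stream": False,
        })
        resp.raise_for_status()
        return resp.json()["message"]["content"]

    question = "What is a relay in Nostr?"
    for model in ("llama3", "mistral"):   # placeholder model names
        print(f"--- {model} ---")
        print(chat(model, question))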
The drawback is that bigger models require a ton of RAM & GPU memory; if your machine can't handle it, your computer will freeze while it's trying to generate a response. I really recommend 128GB of RAM & a very high-quality GPU. If you go public with account creation & are planning to make a public service, you might want to look into multi-GPU servers.
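A rough back-of-envelope estimate (my own approximation, not an official figure): the weights alone need roughly parameter count times bytes per weight, so a 70B model is around 140 GB at 16-bit but closer to 35 GB with 4-bit quantization, plus a few extra GB for the KV cache and context.

    # Rough, weights-only memory estimate; real usage adds KV cache,
    # context length, and runtime overhead (budget a few GB on top).
    BYTES_PER_PARAM = {"fp16": 2.0, "q8_0": 1.0, "q4_0": 0.5}  # approx bytes per weight

    def weight_gb(params_billion, quant="q4_0"):
        return params_billion * 1e9 * BYTES_PER_PARAM[quant] / 1e9

    for size in (7, 13, 70):
        print(f"{size}B model: ~{weight_gb(size, 'fp16'):.0f} GB fp16, "
              f"~{weight_gb(size, 'q4_0'):.0f} GB 4-bit")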
With multiple users on it at the same time, you will hear your GPU getting loud. It's worth it in the end to host your own, because the data is yours. What's needed is more developers so it can be optimized to work with less power.
Published at 2024-09-13 14:25:04

Event JSON
{
"id": "b726f13b916b05ce5f15cea2b30cb15662d246fd74b1a871a9705030fb9a6917",
"pubkey": "b3f585f3e038f1dccdfaf2d7a1449a418605c392cd33ba7f137bca24842f5f90",
"created_at": 1726237504,
"kind": 1,
"tags": [
[
"e",
"e346d2f0b777971dd1764663b40c0f7cd2cfbe6646f9e3c77d1157066712d886",
"",
"root"
],
[
"p",
"2c309c7e4fc66d43a8d3beefcfd721e96ef34135c3c8b440eb7f1eb8e0fdfdd0"
]
],
"content": "I know someone who self hosts an opem llm using ollama. A self hosted ai is always better then using gpt, claude, or any others. Your info is tracked \u0026 logged when using those also they are designed to censor or answer certain questions a specific way.\n\nWith an open source model you can get a unfiltered answer, adjust through ollama the weight of flexibility in response. (precise responding or a little lose to be flexible)\nYou can folder conversations see all the chats for anyone if you run it \u0026 make it public for anyone to use, you can run multiple models \u0026 adjust them if they are off in response to teach it correction.\n\nThe drawback is bigger models require a ton of RAM \u0026 GPU memory if you can't handle it your computer will freeze when it's trying to generate a response. I really recommend a 128GB RAM \u0026 a very high quality gpu. If you go public with account creation \u0026 are planning to make a public service you might want to look into multiple gpu servers. \n\nMultiple users using it at the same time you will hear your gpu going loud. It's worth it in the end to host your own because in the end the data is yours. What is needed is more developers so it can be optimized to work with less power.",
"sig": "0d8d42daedbeb59f1e9706f2bef02d3c87b21841fc74964d79855af67ea00c067bf37347f5a4d9d76d33f95c214bc7b772a9a44f8b2a29a5198e5f109fcd8df6"
}