Matty on Nostr: Do LLMs load their entire model in VRAM?
Published at 2024-12-16 00:04:06

Event JSON
{
  "id": "ee0d889dcacad37776bd20a7eef6068199ae64c2a4f1a351e27bb098bcf3fc92",
  "pubkey": "8b347916be2cb3ab9687c9eb78a8d05224c045bce5b416bdd50169965eb0f45c",
  "created_at": 1734307446,
  "kind": 1,
  "tags": [
    [
      "p",
      "ec98af9bf345d05a4059d2b0687bc2dfb9a420f0baa8027b1b778ae9cae3e384",
      "wss://relay.mostr.pub"
    ],
    [
      "e",
      "d2886563448700032bc651747d4932431a612f2c941c0831ab0b46a5899b9d79",
      "wss://relay.mostr.pub",
      "reply"
    ],
    [
      "proxy",
      "https://nicecrew.digital/objects/84c9f662-7ea5-4757-ac79-6055cc12facb",
      "activitypub"
    ]
  ],
  "content": "Do LLMs load their entire model in VRAM?",
  "sig": "86772afd033f97fdbb8c11f196f2c98d37c8c4baf909a8074fe06c60f62f54be25096f864396def25a8b13a70d4fdf017066f08ff0660a184e8764c25add83af"
}
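
The JSON above is a standard Nostr kind-1 text note. Per NIP-01, the `id` field is the SHA-256 hash of a canonical serialization of the other fields (`[0, pubkey, created_at, kind, tags, content]`, JSON-encoded with no extra whitespace), so any client can recompute it to check the event's integrity. A minimal sketch in Python (the function name is illustrative, not part of any Nostr library):

```python
import hashlib
import json

def nostr_event_id(pubkey, created_at, kind, tags, content):
    # NIP-01 canonical form: a JSON array [0, pubkey, created_at, kind,
    # tags, content] with no whitespace between tokens and non-ASCII
    # characters left unescaped. The event id is its SHA-256 hex digest.
    serialized = json.dumps(
        [0, pubkey, created_at, kind, tags, content],
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()
```

Feeding in the `pubkey`, `created_at`, `kind`, `tags`, and `content` values from the event above should reproduce its `id`; the `sig` is then a Schnorr signature over that id, verifiable against the `pubkey`.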