Matt Lavender on Nostr: npub1u2hl9…4z4d7 this may be true, but unfortunately the inference cost of the ...
npub1u2hl9r48emg8eq82nh8zwadl3xr93u7vkk37qavfcg7cu3dmrrzqe4z4d7 (npub1u2h…z4d7) this may be true, but unfortunately the inference cost of the models at this scale is completely unsustainable for that sort of use, or anything equivalent.
It takes 128 GPUs per instance to run GPT-4.
The cost per conversation on GPT-3 was about 38c ($0.38).
GPT-4 is roughly 3 times as expensive to run, or $1.14 PER CONVERSATION.
At those rates, routine casual usage is just not viable...even assuming we could give every child 128 dedicated GPUs, which we definitely cannot.
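The arithmetic behind the post's figures can be sketched out directly. Note that the 38-cent GPT-3 cost and the 3x multiplier are the post's own claims, not independently verified numbers:

```python
# Back-of-the-envelope check of the cost figures cited in the post.
# Both inputs are the post's claims, not verified data.
gpt3_cost_per_conversation = 0.38  # dollars, as claimed in the post
gpt4_multiplier = 3                # "roughly 3 times as expensive"

gpt4_cost_per_conversation = gpt3_cost_per_conversation * gpt4_multiplier
print(f"${gpt4_cost_per_conversation:.2f} per conversation")
# prints "$1.14 per conversation"
```

At scale the same multiplication shows why the post calls casual use unviable: a single user having ten conversations a day would cost over $11 daily at the claimed GPT-4 rate.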
Published at 2023-12-06 17:39:15

Event JSON
{
  "id": "9ecbdd3563c2a7c7f0154da6525bbf5ef3ecf734c4380a66ed381fd853e5e174",
  "pubkey": "5cab5e3f8e16e9d9983c5f50d9985e04a09d211d742bfc83fdd27d7001fd8be7",
  "created_at": 1701884355,
  "kind": 1,
  "tags": [
    [
      "p",
      "e2aff28ea7ced07c80ea9dce2775bf898658f3ccb5a3e07589c23d8e45bb18c4",
      "wss://relay.mostr.pub"
    ],
    [
      "p",
      "ddc6c81c03da216550654f73121985d8f30636aac98903de01993746bab7bdb3",
      "wss://relay.mostr.pub"
    ],
    [
      "e",
      "7ae0179f06997dc37ea9c3bd14ccce3d3a61b5cfeb0df74f9f742c6b9ede804a",
      "wss://relay.mostr.pub",
      "reply"
    ],
    [
      "proxy",
      "https://journa.host/users/mattlav1250/statuses/111534693146358138",
      "activitypub"
    ]
  ],
  "content": "nostr:npub1u2hl9r48emg8eq82nh8zwadl3xr93u7vkk37qavfcg7cu3dmrrzqe4z4d7 this may be true, but unfortunately the inference cost of the models at this scale is completely unsustainable for that sort of use, or anything equivalent. \n\nIt takes 128 GPUs per instance to run GPT-4.\n\n The cost per conversation on GPT-3 was 38c.\n\n GPT-4 is roughly 3 times as expensive to run: or $1.14 PER CONVERSATION.\n\nAt those rates, routine casual usage is just not viable...even assuming we could give every child 128 dedicated GPUs, which we definitely cannot.",
  "sig": "d7189acdf5350f001610cbf8b472d6e41feb98d9156f2e4f000789738c1bfa063488f2f233f431e2718ed78542a64146c4387a6a677683963772bbb6579cccfd"
}