Henry Saputra on Nostr: NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference | ...
Published at
2024-12-19 01:03:12
Event JSON
{
"id": "3271509fd87d0051fb27224cf69a803a93b915a0b67d065e3333fcd514595037",
"pubkey": "113ba2d5aa88e97df8be825240ab525ca052f7bc6bb8eb05d62a87bfcbd38f2d",
"created_at": 1734570192,
"kind": 1,
"tags": [
[
"proxy",
"https://sigmoid.social/users/Kingwulf/statuses/113676792123820894",
"activitypub"
]
],
"content": "NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference | NVIDIA Technical Blog\nhttps://developer.nvidia.com/blog/nvidia-tensorrt-llm-now-supports-recurrent-drafting-for-optimizing-llm-inference/",
"sig": "3b673bdd8cc5526d228e05b507864d63cff1cb6473813849d6bf0aa9863813de4bd8c1ca15f14b32cc85dc623bde20c91cf71a0bbc893f840c1bd7865f92aa30"
}