2024-10-17 16:41:38

TechPostsFromX on Nostr:

nanoGPT speedrun: Nice work from @kellerjordan0 adapting the nanoGPT/llm.c PyTorch training code into a benchmark: train a 124M-parameter Transformer to a fixed validation-loss target. The current SOTA is 3.8X more token-efficient (2.7B vs. 10B tokens).
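
The benchmark's rule is worth spelling out: train until validation loss first drops to a fixed target, then score the run by the tokens (or wall-clock time) consumed to get there. Below is a minimal, self-contained PyTorch sketch of that loop; the toy model, the synthetic data, and the 3.28 target value are illustrative assumptions on my part, not the actual speedrun harness.

import torch
import torch.nn as nn

torch.manual_seed(0)

vocab_size, d_model, seq_len, batch_size = 256, 64, 32, 16
target_val_loss = 3.28  # assumed stand-in for the benchmark's fixed target

# Toy stand-in for the 124M-parameter Transformer (same scoring rule, tiny scale).
model = nn.Sequential(
    nn.Embedding(vocab_size, d_model),
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    nn.Linear(d_model, vocab_size),
)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

def make_batch():
    # Synthetic, learnable sequences (each token = previous + 1 mod vocab);
    # the real benchmark streams pretraining-corpus tokens instead.
    start = torch.randint(0, vocab_size, (batch_size, 1))
    seq = (start + torch.arange(seq_len)) % vocab_size
    return seq[:, :-1], seq[:, 1:]  # inputs, next-token targets

@torch.no_grad()
def val_loss():
    model.eval()
    x, y = make_batch()
    loss = loss_fn(model(x).flatten(0, 1), y.flatten())
    model.train()
    return loss.item()

tokens_seen = 0
for step in range(1, 10_001):
    x, y = make_batch()
    loss = loss_fn(model(x).flatten(0, 1), y.flatten())
    opt.zero_grad()
    loss.backward()
    opt.step()
    tokens_seen += x.numel()
    if step % 50 == 0 and val_loss() <= target_val_loss:
        # The benchmark score: how few tokens (or seconds) reach the target.
        print(f"hit target {target_val_loss} after {tokens_seen:,} tokens")
        break

Holding the quality target fixed and minimizing tokens (or time) is what makes claims like "3.8X more token-efficient" directly comparable across entries.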

Source: x.com/karpathy/status/1846790537262571739
Author Public Key: npub12tg3narznz500vypswuk6nn6k4xkmuy9xvp66j3u89qsyregvy5sh8u9ad