TechPostsFromX on Nostr: nanoGPT speedrun
nanoGPT speedrun: Nice work from @kellerjordan0 adapting the nanoGPT/llmc PyTorch training code into a benchmark training a 124M Transformer to a fixed validation loss target. Current SOTA is 3.8X more token-efficient training (2.7B vs. 10B tokens)
Source: x.com/karpathy/status/1846790537262571739
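The benchmark described in the post works by training until a fixed validation loss target is reached and scoring the run by how many training tokens that took (2.7B for the current record vs. 10B for the baseline). Below is a minimal sketch of that stopping rule in PyTorch; the model, data loaders, and the 3.28 target value are illustrative stand-ins, not the actual nanoGPT/llmc harness.

import torch
import torch.nn.functional as F

def evaluate(model, val_loader):
    # Mean next-token cross-entropy over the held-out set.
    model.eval()
    total, n = 0.0, 0
    with torch.no_grad():
        for x, y in val_loader:
            logits = model(x)                      # assumed shape: (batch, seq, vocab)
            total += F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1)).item()
            n += 1
    return total / max(n, 1)

def speedrun(model, optimizer, train_loader, val_loader,
             target_val_loss=3.28,                 # illustrative; the real target is fixed by the benchmark
             eval_every=100, max_steps=100_000):
    # Train until validation loss first reaches the target; the benchmark
    # score is the number of training tokens consumed at that point.
    tokens_seen = 0
    for step, (x, y) in enumerate(train_loader):
        model.train()
        logits = model(x)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
        optimizer.zero_grad(set_to_none=True)
        loss.backward()
        optimizer.step()
        tokens_seen += x.numel()

        if step % eval_every == 0 and evaluate(model, val_loader) <= target_val_loss:
            return tokens_seen                     # e.g. ~2.7B tokens for the record vs. ~10B for the baseline
        if step >= max_steps:
            break
    return tokens_seen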
Published at: 2024-10-17 16:41:38
Event JSON:
{
"id": "b8cddcccfdbad2b2fe117ba94c0b5a030c8a609f05a1214f13f35b31491c36c7",
"pubkey": "52d119f46298a8f7b08183b96d4e7ab54d6df0853303ad4a3c3941020f286129",
"created_at": 1729183298,
"kind": 1,
"tags": [],
"content": "nanoGPT speedrun: Nice work from @kellerjordan0 adapting the nanoGPT/llmc PyTorch training code into a benchmark training a 124M Transformer to a fixed validation loss target. Current SOTA is 3.8X more token-efficient training (2.7B vs. 10B tokens)\nhttps://image.nostr.build/4a89f30e5528bfb885e82409d4a07f8b4ff2169b9541d82218a13af3d47919d0.png\n\nSource: x.com/karpathy/status/1846790537262571739",
"sig": "c054f181c848e8da6cd22f35a6756a21155e4b144a1314494f200f991cb0b7e3389c741c6a941d3f0e140d6cecf6ac11e9543308bf64439c29cc7f87e0576f27"
}
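For reference, this is a standard Nostr NIP-01 event: kind 1 marks a plain-text note, created_at is a Unix timestamp, and the id is the SHA-256 hash of a canonical serialization of the other fields, which sig (a BIP-340 Schnorr signature by pubkey) then signs. A minimal Python sketch of the id check, assuming the event JSON above is saved to a file (the filename and usage lines are hypothetical; verifying sig would additionally require a secp256k1/Schnorr library, not shown):

import hashlib
import json

def nostr_event_id(event: dict) -> str:
    # NIP-01: the id is the sha256 of the UTF-8 JSON serialization of
    # [0, pubkey, created_at, kind, tags, content] with no extra whitespace.
    serialized = json.dumps(
        [0, event["pubkey"], event["created_at"], event["kind"],
         event["tags"], event["content"]],
        separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()

# Usage (hypothetical file name): recompute the hash and compare to the "id" field.
# event = json.load(open("event.json"))
# assert nostr_event_id(event) == event["id"]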