2024-02-12 13:11:53

Volker on Nostr: If you're interested in how your 12 words relate to your private key, here's an ...

If you're interested in how your 12 words relate to your private key, here's an explainer for normies. Might be a good idea to repost that if you find it interesting, so it gets visible across time zones.

From entropy to private key - a short overview of a surprisingly complex process
In the beginning, there is entropy. That entropy is typically generated by your device (HW wallet or computer), and can be augmented by adding your own entropy to it, typically in the form of dice or coin throws. Depending on how many bits of entropy you choose, you will end up with either a 12-word (for 128 bits of entropy) or 24-word (for 256 bits) mnemonic sentence, aka your "seed phrase" or "backup words". Note that in any case, your private key will be 256 bits long, although it will be significantly "less random" if you use less entropy.
In order to make the seed backup somewhat error proof, a checksum of the entropy is generated by hashing it using SHA256. The first few bits of that hash (4 for 128 bits of entropy, 8 for 256) are then appended to the entropy itself, yielding 132 or 264 bits, respectively. (To make things less cluttered, we will stick with the case of 128 bits of entropy for the remainder of this explainer. We also skip the odd case of 192 bits of entropy completely. The principles remain the same.)
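As a rough illustration of the checksum step, here is a minimal Python sketch. The function name entropy_with_checksum is made up for this example, and the all-zero entropy is only a placeholder (never use it for a real wallet).

```python
import hashlib

def entropy_with_checksum(entropy: bytes) -> str:
    """Return the entropy followed by its checksum, as a string of '0'/'1'."""
    ent_bits = len(entropy) * 8      # 128 for 16 bytes, 256 for 32 bytes
    cs_len = ent_bits // 32          # 4 checksum bits for 128, 8 for 256
    digest = hashlib.sha256(entropy).digest()
    # Render both values as fixed-width bit strings (keep leading zeros).
    entropy_bits = bin(int.from_bytes(entropy, "big"))[2:].zfill(ent_bits)
    digest_bits = bin(int.from_bytes(digest, "big"))[2:].zfill(256)
    # Only the first cs_len bits of the hash are appended.
    return entropy_bits + digest_bits[:cs_len]

bits = entropy_with_checksum(bytes(16))  # placeholder: 16 zero bytes
print(len(bits))  # 132
```

With 32 bytes of entropy the same function returns 264 bits.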
Now these 132 bits are split into 12 segments of 11 bits each, and each segment is interpreted as an integer. An 11-bit integer ranges from 0 to 2047, as 2^11 equals 2048, so you can identify 2048 different things (in this case: numbers) using 11 bits.
That integer is now used as the item number in the BIP39 word list. BIP39 is a widely (although not universally) used standard for mapping bits into more easily memorable words. The integer number represented by the 11 bits is simply the offset into the array of words. So if your first 11 bits are, let's say, "110 0010 0000", that represents the number 1568, and if you look at the BIP39 list, you'll find the word "series" is entry number 1568 (it's labeled as number 1569, because the list starts at 1 instead of 0. If anybody knows why, I'd be very happy to learn about it).
The process of slicing off 11 bits, and converting them to a word, is repeated until you end up with your twelve word backup phrase.
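The slicing into 11-bit chunks can be sketched like this in Python. The helper name split_into_indices and the example bit string (all-zero entropy plus its 4 checksum bits) are just for illustration. Turning the indices into words would additionally need the BIP39 word list loaded into a list, which is left out here.

```python
def split_into_indices(bits: str) -> list[int]:
    """Cut a bit string into 11-bit chunks and read each chunk as an integer."""
    if len(bits) % 11 != 0:
        raise ValueError("bit string length must be a multiple of 11")
    return [int(bits[i:i + 11], 2) for i in range(0, len(bits), 11)]

# The example from the text: 11 bits interpreted as a number.
print(int("11000100000", 2))  # 1568

# Placeholder input: 128 zero bits of entropy followed by checksum bits "0011".
example_bits = "0" * 128 + "0011"
indices = split_into_indices(example_bits)
print(len(indices))  # 12
print(indices)       # eleven 0s followed by a 3

# With a word list loaded (not shown), the phrase would be:
# words = [wordlist[i] for i in indices]
```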
The whole set of words is now joined (the full words, not just the 4-character abbreviations, separated by single spaces) into a single string, and that string gets fed as the "password" into a specific key stretching function called PBKDF2 (Password Based Key Derivation Function 2), which requires a secondary parameter as "salt" (to make lookup table attacks harder). The salt is the word "mnemonic" followed by your own optional passphrase. If you don't supply a passphrase, the salt is just "mnemonic".
PBKDF2 uses HMAC-SHA512 internally and runs it for 2048 iterations, in order to slow it down enough to make brute force attacks hopefully unattractive. The result of that process is a 512-bit value, usually just called the "seed".
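For the curious, this step can be reproduced with Python's standard library. This is a minimal sketch following BIP39 as I understand it: the words joined by single spaces are the password, and the salt is the string "mnemonic" followed by the optional passphrase. The function name mnemonic_to_seed is made up for this example, and the mnemonic below is a well-known test phrase, not something to store funds on.

```python
import hashlib
import unicodedata

def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch a mnemonic sentence into a 64-byte (512-bit) seed."""
    # BIP39 asks for NFKD normalization of both inputs before encoding.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = ("mnemonic" + unicodedata.normalize("NFKD", passphrase)).encode("utf-8")
    # 2048 iterations of HMAC-SHA512, output length 64 bytes.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)

words = " ".join(["abandon"] * 11 + ["about"])  # test phrase only
seed = mnemonic_to_seed(words)
print(len(seed) * 8)  # 512
```

Changing the passphrase changes the salt, so the same twelve words lead to a completely different seed.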
This 512-bit seed is now fed into yet another hashing function, namely HMAC-SHA512. HMAC takes a secondary parameter as well, a key, and here the string "Bitcoin seed" is used.
The result of this hash function is another 512 bits and those bits are split into two parts of 256 bits each: the left side is the "master private key" and the right side is called the "master chain code".
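Here is a small Python sketch of that last step, following BIP32, where the left 256 bits are the master private key and the right 256 bits are the master chain code. The function name master_key_from_seed and the placeholder seed are assumptions for this example. A real implementation must also reject the rare case where the left half is zero or not below the secp256k1 curve order, which is skipped here.

```python
import hashlib
import hmac

def master_key_from_seed(seed: bytes) -> tuple[bytes, bytes]:
    """Return (master_private_key, master_chain_code), 32 bytes each."""
    # "Bitcoin seed" is the HMAC key, the seed is the message.
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    # Left half: private key. Right half: chain code.
    return digest[:32], digest[32:]

example_seed = bytes(range(64))  # placeholder 512-bit seed, illustration only
private_key, chain_code = master_key_from_seed(example_seed)
print(len(private_key) * 8, len(chain_code) * 8)  # 256 256
```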
The real private keys, the ones used for Bitcoin transactions, are derived from this master key and chain code by walking a derivation path (BIP32), with another round of HMAC-SHA512 at each step, but I won't get into that here.

If you are sure that anything here is wrong, please do respond and explain. It's a pretty confusing process, but I hope I got it right.
Author Public Key
npub1g4jdvuxv9dgkcqtn5fupf2l9mr9xp27glzp6eq450dwfszrhfp9scxfm70