đź“… Original date posted:2018-01-05
📝 Original message:I’m not a fan of language specific word lists within the current BIP-39 standard. Very few wallets support anything other than English, which can lead to vendor lock-in and long term loss of funds if a rare non-English wallet disappears.
However, because people can memorize things better in their native tongue, supporting multiple languages seems quite useful.
I would prefer a new standard where words are mapped to integers rather than to a literal string. For each language a mapping from words to integers would be published. In addition to that, there would be a mapping from original language words to matching (in terms of integer value, not meaning) English words that people can print on an A4 paper. This would allow them to enter a mnemonic into e.g. a hardware wallet that only support English. Such lists are more likely to be around 100 years from now than some ancient piece of software.
This would not work with the current BIP-39 (duress) password, but this feature could be replaced by appending words (with or without a checksum for that addition).
A replacement for BIP-39 would be a good opportunity to produce a better English dictionary as Nic Johnson suggested a while ago:
• all words are 4-8 characters
• all 4-character prefixes are unique (very useful for hardware wallets)
• no two words have edit distance < 2
Wallets need to be able to distinguish between the old and new standard, so un-upgraded BIP 39 wallets should consider all new mnemonics invalid. At the same time, some new wallets may not wish to support BIP39. They shouldn't be burdened with storing the old word list.
A solution is to sort the new word list such that reused words appear first. When generating a mnemonic, at least one word unique to the new list must be present. A wallet only needs to know the index of the last BIP39 overlapping word. They reject a proposed mnemonic if none of the elements use a word with a higher index.
For my above point and some related ideas, see: https://github.com/satoshilabs/slips/issues/103
Sjors
> Op 5 jan. 2018, om 14:58 heeft nullius via bitcoin-dev <bitcoin-dev at lists.linuxfoundation.org> het volgende geschreven:
>
> I propose and request as an enhancement that the BIP 39 wordlist set should specify canonical native language strings to identify each wordlist, as well as short ASCII language codes. At present, the languages are identified only by their names in English.
>
> Strings properly vetted and recommended by native speakers should facilitate language identification in user interface options or menus. Specification of language identifier strings would also promote interface consistency between implementations; this may be important if a user creates a mnemonic in Implementation A, then restores a wallet using that mnemonic in Implementation B.
>
> As an independent implementer who does not know *all* these different languages, I monkey-pasted language-native strings from a popular wiki site. I cannot guarantee that they be all accurate, sensible, or even non-embarrassing.
>
> https://github.com/nym-zone/easyseed/blob/1a6e48bbdac9366d9d5d1912dc062dfc3f0db2c6/easyseed.c#L99
> ```
> LANG(english, u8"English", "en", ascii_space ),
> LANG(chinese_simplified, u8"汉čŻ", "zh-CN",ascii_space ),
> LANG(chinese_traditional, u8"漢語", "zh-TW",ascii_space ),
> LANG(french, u8"Français", "fr", ascii_space ),
> LANG(italian, u8"Italiano", "it", ascii_space ),
> LANG(japanese, u8"日本語", "ja", u8"\u3000" ),
> LANG(korean, u8"í•śęµě–´", "ko", ascii_space ),
> LANG(spanish, u8"Español", "es", ascii_space )
> ```
>
> Per the comment at #L85 of the quoted file, I also know that for my short identifiers for Chinese, “zh-CN” and “zh-TW”, are imprecise at best—insofar as Hong Kong uses Traditional; and overseas Chinese may use either. For differentiating the two Chinese writing variants, are there any appropriate standardized or customary short ASCII language IDs similar to ISO 3166-1 alpha-2 which are purely linguistic, and not fit to present-day political boundaries?
>
> My general suggestion is that the specification of appropriate strings in bitcoin:bips/bip-0039/bip-0039-wordlists.md be made part of the process for accepting new wordlists. My specific request is that such strings be ascertained for the wordlists already existing, preferably from the persons involved in the original pull requests therefor.
>
> Should this proposal be “concept ACKed” by appropriate parties, then I may open a pull request suggesting an appropriate format for specifying this information in the repository. However, I will must needs leave the vetting of appropriate strings to native speakers or experts in the respective languages.
>
> Prior references: The wordlist additions at PRs #92, #130 (Japanese); #100 (Spanish); #114 (Chinese, both variants); #152 (French); #306 (Italian); #570 (Korean); #621 (Indonesian, *proposed*, open).
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20180105/988669cf/attachment.sig>