2024-03-17 18:57:11

waxwing on Nostr: Utxo set analysis time! I have found a "toolchain" to extract taproot utxo pubkeys ...

Utxo set analysis time!

I have found a "toolchain" to extract taproot utxo pubkeys that's at least *reasonably* efficient - more on that at the end, for the engineers.

But here's an analysis of a snapshot of the whole 167M utxo set as of 16th March 2024.

Of the 167M utxos, a full 39M are taproot (in other words, about 1 in 4 of all the individual "bits of bitcoin" that exist in our global consensus, are taproot - but not 1 in 4 *bitcoins*, i.e. not by value)!

Of that 39M, 33M are *sub-1000 sats*, i.e. basically dust or near dust. Pretty obviously, these will be of the "data carrying" type (probably ordinals stuff? sorry I don't know the details). Here's a rough breakdown of the taproot outputs in the utxo set by value (counts are cumulative, i.e. each row includes the rows above it):

Amount in sats       Number of utxos (taproot only, cumulative)
> 5 million          51674
> 2.5 million        81512
> 1 million          154130
> 500k               238060
> 250k               352235
> 100k               800843
> 50k                1043038
> 25k                1333547
> 10k                2853756
> 1000               6084116
> 100 (i.e. ~all)    39034007
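
Since the table is cumulative, the per-band counts (and the "33M sub-1000 sats" figure) fall out by subtracting adjacent rows. A minimal sketch of that arithmetic, using only the numbers from the table; the names `CUMULATIVE` and `bands` are just illustrative:

```python
# Cumulative counts copied from the table: (threshold in sats, utxos above it).
CUMULATIVE = [
    (5_000_000, 51_674),
    (2_500_000, 81_512),
    (1_000_000, 154_130),
    (500_000, 238_060),
    (250_000, 352_235),
    (100_000, 800_843),
    (50_000, 1_043_038),
    (25_000, 1_333_547),
    (10_000, 2_853_756),
    (1_000, 6_084_116),
    (100, 39_034_007),
]


def bands(cumulative):
    """Turn cumulative '> threshold' counts into disjoint per-band counts.

    Expects rows ordered from the highest threshold down, as in the table.
    Each result row is (lower bound of the band, utxos in that band only).
    """
    out = []
    prev = 0
    for threshold, count in cumulative:
        out.append((threshold, count - prev))
        prev = count
    return out


per_band = bands(CUMULATIVE)
total = CUMULATIVE[-1][1]
sub_1000 = dict(per_band)[100]  # the band between 100 and 1000 sats

print("taproot utxos below 1000 sats:", sub_1000)
print("share of all taproot utxos:", round(sub_1000 / total, 3))
```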

This will not be news to most. IMO taproot *economic* usage only picks up when Lightning implementations start using it; there is only fairly limited other incentive, for now.

For my taproot based 'anonymous usage project' (see recent posts), a filter of about 500k sats makes sense to me - per the table that leaves roughly 240k utxos, and anon sets of 250k are pretty decent, though as we've seen, we can definitely support much larger sets.

About the "toolchain".

Step 1 is to run the dumptxoutset RPC call against Core. As noted, this currently writes out an 11GB file covering 167 million coins, so be aware if your setup is size constrained.
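
For Step 1, a rough sketch of driving the RPC from Python, assuming `bitcoin-cli` is on the PATH and already configured to talk to your node. The helper names are made up for illustration, and `-rpcclienttimeout=0` is only there because the dump can take longer than the client's default timeout. Newer Core releases may expect extra arguments to `dumptxoutset`, so check `bitcoin-cli help dumptxoutset` for your version.

```python
import subprocess
from pathlib import Path


def dumptxoutset_command(out_path, cli="bitcoin-cli"):
    """Build the argument list for the dumptxoutset RPC call.

    out_path is where Core should write the snapshot file.
    """
    return [cli, "-rpcclienttimeout=0", "dumptxoutset", str(Path(out_path))]


def dump_utxo_set(out_path, cli="bitcoin-cli"):
    """Run the RPC and return whatever bitcoin-cli prints on success.

    Raises subprocess.CalledProcessError if the call fails.
    """
    result = subprocess.run(
        dumptxoutset_command(out_path, cli),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Only builds and shows the command; call dump_utxo_set(...) to actually run it.
print(" ".join(dumptxoutset_command("/tmp/utxos.dat")))
```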

Step 2 is to parse the custom format of this data set (it's Core's own serialized snapshot format rather than Level DB, which is what the node's chainstate directory uses). I found the easiest way was to run this useful tool: https://github.com/theStack/utxo_dump_tools/ against the file created by Step 1. This creates a sqlite database with intelligible columns (like 'value', 'scriptpubkey') in the 'utxos' table.

Step 3: I wrote a primitive Python script to do a SELECT ... FROM utxos WHERE value >= ? AND scriptpubkey LIKE '5120%', or something along those lines.
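
Something like the following is what that Step 3 script boils down to. This is a sketch, not the actual script: it assumes a `utxos` table with `value` in sats and `scriptpubkey` as lowercase hex, and the function name and the extra length check are my own additions. A taproot scriptPubKey is `5120` followed by the 32-byte x-only pubkey, so 68 hex characters in total. The demo runs against a tiny in-memory table instead of the real multi-GB database.

```python
import sqlite3

TAPROOT_PREFIX = "5120"  # OP_1 followed by a 32-byte push
TAPROOT_SPK_HEX_LEN = 68  # 4 hex chars of prefix + 64 hex chars of pubkey


def taproot_pubkeys(conn, min_sats):
    """Return x-only pubkeys (hex) of taproot utxos worth at least min_sats."""
    query = (
        "SELECT scriptpubkey FROM utxos "
        "WHERE value >= ? AND scriptpubkey LIKE ? AND length(scriptpubkey) = ?"
    )
    rows = conn.execute(query, (min_sats, TAPROOT_PREFIX + "%", TAPROOT_SPK_HEX_LEN))
    # Strip the 5120 prefix so only the 32-byte key remains.
    return [spk[len(TAPROOT_PREFIX):] for (spk,) in rows]


# Tiny stand-in for the real database, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE utxos (value INTEGER, scriptpubkey TEXT)")
conn.executemany(
    "INSERT INTO utxos VALUES (?, ?)",
    [
        (600_000, "5120" + "ab" * 32),  # taproot, passes a 500k filter
        (546, "5120" + "cd" * 32),      # taproot, but dust
        (900_000, "0014" + "ef" * 20),  # not taproot (segwit v0 keyhash)
    ],
)
keys = taproot_pubkeys(conn, 500_000)
print(len(keys), "pubkeys selected")
```

For the real thing, swap the in-memory connection for `sqlite3.connect(...)` on the file produced in Step 2.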

I can directly take the output of Step 3 as input to the aut-ct tool I've been talking about recently to create tokens.
Author Public Key
npub1vadcfln4ugt2h9ruwsuwu5vu5am4xaka7pw6m7axy79aqyhp6u5q9knuu7