Gregory Maxwell [ARCHIVE] on Nostr:
📅 Original date posted:2016-05-10
📝 Original message:On Tue, May 10, 2016 at 5:28 AM, Rusty Russell via bitcoin-dev
<bitcoin-dev at lists.linuxfoundation.org> wrote:
> I used variable-length bit encodings, and used the shortest encoding
> which is unique to you (including mempool). It's a little more work,
> but for an average node transmitting a block with 1300 txs and another
> ~3000 in the mempool, you expect about 12 bits per transaction. IOW,
> about 1/5 of your current size. Critically, we might be able to fit in
> two or three TCP packets.
Hm. 12 bits sounds very small even given those figures. What failure
rate were you targeting?
I've mostly been thinking in terms of 3000 txn and 20k mempools, and
blocks which are 90% consistent with the remote mempool, targeting
1/100000 failure rates (which is roughly where it should be to put it
well below link failure levels).
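As a rough sketch of the failure-rate arithmetic above (a birthday-style upper bound over the combined block-plus-mempool set; the variable-length scheme Rusty describes is not modeled here):

```python
def collision_prob(num_tx: int, id_bits: int) -> float:
    """Birthday-style upper bound on the probability that any two of
    num_tx transactions share the same id_bits-bit short ID."""
    pairs = num_tx * (num_tx - 1) / 2
    return min(1.0, pairs / float(2 ** id_bits))

# Scenario from the post: a 3000-tx block plus a 20k mempool.
m = 3000 + 20000
for bits in (32, 40, 48, 64):
    print(f"{bits}-bit IDs: P(collision) ~ {collision_prob(m, bits):.2e}")
```

Under these assumptions, 48-bit IDs already land comfortably below the 1/100000 target, while 32-bit IDs do not.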
If going down the path of more complexity, set reconciliation is
enormously more efficient (e.g. 90% reduction), which no amount of
packing/twiddling can achieve.
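A toy illustration of why set reconciliation is so much more efficient: its communication cost can scale with the size of the set *difference* rather than the whole set. The single-element XOR sketch below is the simplest possible instance (real schemes generalize this to larger differences); the hash choice and ID size are illustrative, not from the draft.

```python
import hashlib
from functools import reduce

def short_id(txid: bytes) -> int:
    # 64-bit fingerprint of a txid (illustrative; real designs salt this).
    return int.from_bytes(hashlib.sha256(txid).digest()[:8], "big")

def xor_sketch(ids) -> int:
    return reduce(lambda a, b: a ^ b, ids, 0)

# Sender's block and receiver's mempool differ by exactly one transaction.
block   = [b"tx%d" % i for i in range(1300)]
mempool = block[:-1]          # receiver is missing b"tx1299"

# The sender transmits only 8 bytes; the receiver XORs in everything it
# has, and the missing element's fingerprint pops out algebraically.
diff = xor_sketch(map(short_id, block)) ^ xor_sketch(map(short_id, mempool))
assert diff == short_id(b"tx1299")
```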
But the savings of going from 20kb to 3kb is not interesting enough to
justify it*. My expectation is that later we'll deploy set
reconciliation to fix relay efficiency, where the savings is _much_
larger, and then with the infrastructure in place we could define
another compactblock mode that used it.
(*Not interesting because it mostly reduces exposure to loss and the
gods of TCP, but since those are the long poles in the latency tent,
it's best to escape them entirely, see Matt's udp_wip branch.)
> I would also avoid the nonce to save recalculating for each node, and
> instead define an id as:
Doing this would greatly increase the cost of a collision though, as
it would happen in many places in the network at once, rather than
just happening on a single link and thus hardly impacting overall
propagation.
(The downside of the nonce is that you get an exponential increase in
the rate that a collision happens "somewhere", but links fail
"somewhere" all the time-- propagation overall doesn't care about
that.)
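The parenthetical arithmetic can be sketched as follows; the link count here is an invented illustrative figure, not a measurement:

```python
p_link = 1e-5     # per-link collision probability with salted IDs
n_links = 50_000  # assumed number of links relaying a block network-wide

# Independent per-link salts: a collision happens *somewhere* quite often...
p_somewhere = 1 - (1 - p_link) ** n_links
# ...but each individual link still fails with probability only 1e-5,
# so the block still floods the network around the odd failed link.
# Shared nonce-free IDs invert this: one unlucky block collides on
# every link at once, stalling propagation globally, not on one hop.
print(f"P(collision somewhere) = {p_somewhere:.3f}")
```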
Using the same nonce means you also would not get a recovery gain from
jointly decoding using compact blocks sent from multiple peers (which
you'll have anyways in high bandwidth mode).
With a nonce a sender does have the option of reusing what they got--
but the actual encoding cost is negligible, for a 2500 transaction
block it's 27 microseconds (once per block, shared across all peers)
using Pieter's suggestion of siphash 1-3 instead of the cheaper
construct in the current draft.
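For illustration, a per-block salted short-ID computation might look like the sketch below. SipHash-1-3 isn't in Python's standard library, so a keyed BLAKE2b stands in, and the 6-byte ID length and 16-byte nonce are assumptions for the sketch rather than values from the draft.

```python
import hashlib, os, time

def short_ids(txids, nonce: bytes, id_len: int = 6):
    """Per-block salted short IDs: keyed BLAKE2b as a stand-in for the
    SipHash construction discussed in the post."""
    return [hashlib.blake2b(t, key=nonce, digest_size=id_len).digest()
            for t in txids]

nonce = os.urandom(16)                      # fresh salt per block
txids = [os.urandom(32) for _ in range(2500)]

t0 = time.perf_counter()
ids = short_ids(txids, nonce)
elapsed = time.perf_counter() - t0
# Computed once per block and shared across all peers; even in Python
# this is cheap, and the C++ SipHash figure quoted above is ~27 us.
print(f"2500 short IDs in {elapsed * 1e6:.0f} us")
```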
Of course, if you're going to check your whole mempool to reroll the
nonce, that's another matter-- but that seems wasteful compared to just
using a table-driven size with a known negligible failure rate.
64-bits as a maximum length is high enough that the collision rate
would be negligible even under fairly unrealistic assumptions-- so
long as it's salted. :)
> As Peter R points out, we could later enhance receiver to brute force
> collisions (you could speed that by sending a XOR of all the txids, but
> really if there are more than a few collisions, give up).
The band between "no collisions" and "infeasible many" is fairly
narrow. You can add a small amount more space to the ids and
immediately be in the no collision zone.
Some earlier work we had would send a small amount of erasure-coding
data over the next couple of bytes of the IDs. E.g. the receiver
fills in all the IDs it knows, marks the totally unknown IDs as
erased, and lets the error correction fix the rest. This lets you
algebraically resolve collisions _far_ beyond what could feasibly be
brute-forced. Pieter went and implemented it... but the added cost of
encoding and the software complexity seemed not worth it.
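A minimal sketch of that erasure-coding idea, assuming a single ambiguous slot and a 2-byte XOR parity over the "next couple bytes" (the actual proposal's code and parameters differed; the hash and helper names here are invented for illustration):

```python
import hashlib
from functools import reduce

def fingerprint(txid: bytes) -> bytes:
    # Stand-in for the salted per-block hash in the real scheme.
    return hashlib.sha256(txid).digest()

def xor2(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(txids):
    """Sender side: 6-byte short IDs plus a 2-byte XOR parity taken
    over bytes 6..8 of every transaction's fingerprint."""
    fps = [fingerprint(t) for t in txids]
    return [f[:6] for f in fps], reduce(xor2, (f[6:8] for f in fps))

def resolve(known_fps, parity, candidates):
    """Receiver side: exactly one slot is ambiguous. XOR the known
    transactions' bytes 6..8 out of the parity; what remains must be
    bytes 6..8 of the missing transaction, which disambiguates it."""
    missing = reduce(xor2, (f[6:8] for f in known_fps), parity)
    return [c for c in candidates if fingerprint(c)[6:8] == missing]

txids = [b"tx-a", b"tx-b", b"tx-c"]
ids, parity = encode(txids)
known = [fingerprint(b"tx-a"), fingerprint(b"tx-b")]
resolved = resolve(known, parity, [b"tx-c", b"tx-imposter"])
assert b"tx-c" in resolved
```

With one 2-byte parity you can resolve one erased slot algebraically; sending a few more parity bytes extends this to several ambiguous slots at once, which is what makes it so much stronger than brute-forcing collisions.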
Published at 2023-06-07 17:50:21