Pieter Wuille [ARCHIVE] on Nostr:
📅 Original date posted:2018-06-26
📝 Original message:
On Tue, Jun 26, 2018 at 8:33 AM, matejcik via bitcoin-dev
<bitcoin-dev at lists.linuxfoundation.org> wrote:
> I'm still going to argue against the key-value model though.
>
> It's true that this is not significant in terms of space. But I'm more
> concerned about human readability, i.e., confusing future implementers.
> At this point, the key-value model is there "for historical reasons",
> except those reasons aren't valid even before the format is finalized. The
> original rationale for using key-values seems to be gone (no key-based
> lookups are necessary). As for combining and deduplication, whether key
> data is present or not is now purely a stand-in for a "repeatable" flag.
> We could just as easily say, e.g., that the high bit of "type" specifies
> whether this record can be repeated.
I understand this is a philosophical point, but to me it's the
opposite. The file conveys "the script is X", "the signature for key X
is Y", "the derivation for key X is Y" - all extra metadata added to
inputs of the form "the X is Y". In a typed record model, you still
have Xes, but they are restricted to a single number (the record
type). In cases where that is insufficient, your solution is adding a
repeatable flag to switch from "the first byte needs to be unique" to
"the entire record needs to be unique". Why just those two? It seems
much more natural to have a length that directly tells you how many of
the first bytes need to be unique (which brings you back to the
key-value model).
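
To make the comparison concrete, here is a toy Python sketch (the
in-memory representation and names are invented for illustration, not
part of the proposal) of the two uniqueness rules: the key-value rule,
where the key itself determines how many leading bytes must be unique,
versus a typed-record rule with a "repeatable" flag:

    # Key-value model: uniqueness is over (type, key); the key length is
    # exactly the "how many leading bytes must be unique" knob.
    def dedup_key_value(records):
        seen = {}
        for rec_type, key, value in records:
            seen.setdefault((rec_type, key), value)  # later duplicates dropped
        return [(t, k, v) for (t, k), v in seen.items()]

    # Typed-record model with a repeatable flag: uniqueness is either the
    # type byte alone or the whole record, with nothing in between.
    def dedup_typed_record(records, repeatable_types):
        out, seen_types, seen_whole = [], set(), set()
        for rec_type, value in records:
            if rec_type in repeatable_types:
                if (rec_type, value) in seen_whole:
                    continue
                seen_whole.add((rec_type, value))
            else:
                if rec_type in seen_types:
                    continue
                seen_types.add(rec_type)
            out.append((rec_type, value))
        return out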
Since the redundant script hashes were removed by making the scripts
per-input, I think the most compelling reason (size advantages) for a
record-based model is gone.
> (Moreover, as I wrote previously, the Combiner seems like a weirdly
> placed role. I still don't see its significance, or why it is important
> to correctly combine PSBTs by agents that don't understand them. If you
> have a usecase in mind, please explain.
Forward compatibility with new script types. A transaction may spend
inputs from different outputs, with different script types. Perhaps
some of these are highly specialized things only implemented by some
software (say HTLCs of a particular structure), in non-overlapping
ways where no piece of software can handle all scripts involved in a
single transaction. If Combiners cannot deal with unknown fields, they
won't be able to deal with unknown scripts. That would mean that
combining must be done independently by Combiner implementations for
each script type involved. As this is easily avoided by adding a
slight bit of structure (parts of the fields that need to be unique -
"keys"), this seems the preferable option.
> ISTM a Combiner could just as well combine based on whole-record
> uniqueness, and leave the duplicate detection to the Finalizer. In case
> the incoming PSBTs have incompatible unique fields, the Combiner would
> have to fail anyway, so the Finalizer might as well do it. Perhaps it
> would be good to leave out the Combiner role entirely?)
No, a Combiner can pick any of the values in case different PSBTs have
different values for the same key. That's the point: by having a
key-value structure the choice of fields can be made such that
Combiners don't need to care about the contents. Finalizers do need to
understand the contents, but they only operate once at the end.
Combiners may be involved in any PSBT passing from one entity to
another.
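
For example, building on the combine_input_maps sketch above (the type
byte and values are invented for illustration), two PSBTs carrying
different values under the same key simply combine to whichever value
the Combiner happens to keep:

    input_a = {(0x02, bytes.fromhex('02' + '11' * 32)): b'sig-from-signer-a'}
    input_b = {(0x02, bytes.fromhex('02' + '11' * 32)): b'sig-from-signer-b'}
    combined = combine_input_maps(input_a, input_b)
    # combined keeps one of the two values; the Combiner never inspects it.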
> There are two remaining types where key data is used: BIP32 derivations
> and partial signatures. In case of BIP32 derivation, the key data is
> redundant ( pubkey = derive(value) ), so I'd argue we should leave that
> out and save space. In case of partial signatures, it's simple enough to
> make the pubkey part of the value.
In case of BIP32 derivation, computing the pubkeys is possibly
expensive. A simple signer can choose to just sign with whatever keys
are present, but that is not the only way to implement a signer, much
less the only software interacting with this format. Others may
want to use a matching approach to find keys that are relevant;
without pubkeys in the format, they're forced to perform derivations
for all keys present.
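
To sketch the matching approach (helper names invented, not a proposed
API), a signer that knows which pubkeys it controls can select the
relevant derivation entries with a plain lookup:

    # derivation_map: {pubkey_bytes: (fingerprint, path)} as carried in the
    # key data; my_pubkeys: the set of pubkeys this signer controls.
    def relevant_entries(derivation_map, my_pubkeys):
        return {pk: d for pk, d in derivation_map.items() if pk in my_pubkeys}

    # Without pubkeys in the key data, the same selection would require
    # deriving every (fingerprint, path) entry first, just to learn whether
    # it is relevant at all.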
And yes, it's simple enough to make the key part of the value
everywhere, but in that case it becomes legal for a PSBT to contain
multiple signatures for a key, for example, and all software needs to
deal with that possibility. With a stronger uniqueness constraint,
only Combiners need to deal with repetitions.
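
With the pubkey in the key, that constraint is structural; as a toy
sketch (helper invented for illustration):

    # One signature per key: a second, conflicting signature for the same
    # pubkey cannot silently coexist within a single PSBT.
    def add_partial_sig(input_map, pubkey, sig):
        if pubkey in input_map and input_map[pubkey] != sig:
            raise ValueError('conflicting signature for the same key')
        input_map[pubkey] = sig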
> Thing is: BIP174 *is basically protobuf* (v2) as it stands. If I'm
> successful in convincing you to switch to a record set model, it's going
> to be "protobuf with different varint".
If you take the records model, and then additionally drop the
whole-record uniqueness constraint, yes, though that seems pushing it
a bit by moving even more guarantees from the file format to
application-level code. I'd like to hear opinions of other people who
have worked on implementations about changing this.
Cheers,
--
Pieter
Published at 2023-06-07 18:13:13