Fabrice Drouin [ARCHIVE] on Nostr:
📅 Original date posted:2019-01-08
📝 Original message:
On Tue, 8 Jan 2019 at 17:11, Christian Decker
<decker.christian at gmail.com> wrote:
>
> Rusty Russell <rusty at rustcorp.com.au> writes:
> > Fortunately, this seems fairly easy to handle: discard the newer
> > duplicate (unless > 1 week old). For future more advanced
> > reconstruction schemes (eg. INV or minisketch), we could remember the
> > latest timestamp of the duplicate, so we can avoid requesting it again.
>
> Unfortunately this assumes that you have a single update partner, and
> still results in flaps, and might even result in a stuck state for some
> channels.
>
> Assume that we have a network in which a node D receives the updates
> from a node A through two or more separate paths:
>
> A --- B --- D
> \--- C ---/
>
> And let's assume that some channel of A (c_A) is flapping (not the ones
> to B and C). A will send out two updates, one disables and the other one
> re-enables c_A, otherwise they are identical (timestamp and signature
> are different as well of course). The flush interval in B is sufficient
> to see both updates before flushing, hence both updates get dropped and
> nothing apparently changed (D doesn't get told about anything from
> B). The flush interval of C triggers after getting the re-enable, and D
> gets the disabling update, followed by the enabling update once C's
> flush interval triggers again. Worse if the connection A-C gets severed
> between the updates, now C and D learned that the channel is disabled
> and will not get the re-enabling update since B has dropped that one
> altogether. If B now gets told by D about the disable, it'll also go
> "ok, I'll disable it as well", leaving the entire network believing that
> the channel is disabled.
>
> This is really hard to debug, since A has sent a re-enabling
> channel_update, but everybody is stuck in the old state.
I think there may even be a simpler case where not replacing updates
will result in nodes not knowing that a channel has been re-enabled:
suppose you get three updates U1, U2, U3 for the same channel, where U2
disables it and U3 re-enables it with the same content as U1. If you
discard U3 and just keep U1, and your peer has U2, how will you tell
them that the channel has been enabled again? Unless "discard" here
means keep the update but don't broadcast it?
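
(A minimal sketch of the U1/U2/U3 case, with illustrative types and
names rather than any implementation's code: gossip only accepts
strictly newer timestamps, so once U3 is discarded the peer holding U2
can never be moved off the disabled state.)

    from dataclasses import dataclass

    @dataclass
    class ChannelUpdate:
        channel_id: str
        timestamp: int
        disabled: bool

    class Node:
        def __init__(self):
            self.updates = {}                    # channel_id -> best update seen

        def apply(self, upd):
            # Accept an update only if it is strictly newer than what we hold.
            current = self.updates.get(upd.channel_id)
            if current is not None and upd.timestamp <= current.timestamp:
                return False                     # stale or duplicate: rejected
            self.updates[upd.channel_id] = upd
            return True

    u1 = ChannelUpdate("chan", timestamp=100, disabled=False)
    u2 = ChannelUpdate("chan", timestamp=200, disabled=True)
    # U3 (timestamp=300, disabled=False) is discarded because it duplicates U1.

    me, peer = Node(), Node()
    me.apply(u1)                                 # I kept U1 and dropped U2/U3
    peer.apply(u1)
    peer.apply(u2)                               # the peer learned of the disable

    # Re-sending U1 cannot help: its timestamp is older than U2's.
    assert peer.apply(me.updates["chan"]) is False
    print(peer.updates["chan"].disabled)         # True: peer is stuck on "disabled"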
> At least locally updating timestamp and signature for identical updates
> and then not broadcasting if they were the only changes would at least
> prevent the last issue of overriding a dropped state with an earlier
> one, but it'd still leave C and D in an inconsistent state until we have
> some sort of passive sync that compares routing tables and fixes these
> issues.
But then there's a risk that nodes would discard channels as stale
because they don't get new updates when they reconnect.
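
(The concern, sketched under the assumption of the usual two-week
pruning policy; the exact cutoff is a local choice, not fixed here.)

    import time

    TWO_WEEKS = 14 * 24 * 3600

    def prune_stale(latest_update_ts, now=None):
        """Drop channels whose newest channel_update looks too old to be alive."""
        now = time.time() if now is None else now
        return {cid: ts for cid, ts in latest_update_ts.items()
                if now - ts < TWO_WEEKS}

    # If A only refreshes timestamps locally and never rebroadcasts, its peers'
    # stored updates keep ageing and the channel eventually gets pruned here,
    # even though it is alive.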
> I think all the bolted on things are pretty much overkill at this point,
> it is unlikely that we will get any consistency in our views of the
> routing table, but that's actually not needed to route, and we should
> consider this a best effort gossip protocol anyway. If the routing
> protocol is too chatty, we should make efforts towards local policies at
> the senders of the update to reduce the number of flapping updates, not
> build in-network deduplications. Maybe something like "eager-disable"
> and "lazy-enable" is what we should go for, in which disables are sent
> right away, and enables are put on an exponential backoff timeout (after
> all what use are flappy nodes for routing?).
Yes, there are probably heuristics that would help reduce gossip
traffic, and I see your point, but I was thinking about doing the
opposite: "eager-enable" and "lazy-disable", because from a sender's
point of view, trying to use a disabled channel is better than ignoring
an enabled channel.
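
(A minimal sketch of such an eager-enable / lazy-disable policy, with
illustrative names only: enables are gossiped immediately, disables are
held back for a delay and cancelled if the channel comes back in the
meantime.)

    import threading

    class UpdateScheduler:
        def __init__(self, broadcast, disable_delay=600.0):
            self.broadcast = broadcast          # callback that gossips an update
            self.disable_delay = disable_delay  # how long to sit on a disable
            self.pending = {}                   # channel_id -> pending disable timer

        def on_enable(self, channel_id):
            timer = self.pending.pop(channel_id, None)
            if timer is not None:
                timer.cancel()                  # the channel came back: drop the disable
            self.broadcast(channel_id, False)   # eager-enable: send right away

        def on_disable(self, channel_id):
            if channel_id in self.pending:
                return                          # a disable is already queued
            timer = threading.Timer(self.disable_delay, self._flush_disable, [channel_id])
            self.pending[channel_id] = timer
            timer.start()                       # lazy-disable: wait before gossiping

        def _flush_disable(self, channel_id):
            if self.pending.pop(channel_id, None) is not None:
                self.broadcast(channel_id, True)

The trade-off is the one above: during the delay the rest of the
network may still try to route through a channel that is down, which
costs a failed attempt rather than a missed route.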
Cheers,
Fabrice