2025-05-03 22:54:12

Dr. Hax on Nostr: OK, I'm looking for some help from a fellow lnd node runner. Problem 1: My connection ...

OK, I'm looking for some help from a fellow lnd node runner.

Problem 1: My connection to my peer keeps dropping about every 3.5 minutes.

Logs say "pong response failure" and "timeout while waiting for pong response -- disconnecting".

The peer's log shows the same error: it also timed out while waiting for a pong response.

I can reconnect with no problem at all.

There are no network issues such as packet loss (see the notes on the pcap below for the evidence to back this up). Tor is not in the mix for this test.

I am able to send sats if I do so quickly after connecting to the peer. So it seems like things can work properly if the connection issue can be sorted out.

I took a pcap of a connection, transaction and disconnection. Near the end, I see the client (node which initiated the connection) just absolutely slamming PSH,ACKs to the tune of 37 of them in just under 0.001 seconds. Then it sends a TCP Retransmission 0.006 seconds later and gets an ACK 0.036 seconds later, which is a perfectly reasonable response time.
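For anyone who wants to find bursts like this without eyeballing the capture, here is a minimal sketch. It assumes the packet timestamps have already been exported from the pcap as plain seconds; `find_bursts`, its `window` and `min_packets` parameters, and the sample numbers are all made up for illustration and are not anything lnd or Wireshark provides.

```python
from typing import List, Tuple


def find_bursts(
    timestamps: List[float],
    window: float = 0.001,    # hypothetical threshold: 1 ms
    min_packets: int = 10,    # hypothetical threshold: what counts as a "burst"
) -> List[Tuple[float, int]]:
    """Return (start_time, packet_count) for each run of packets
    whose timestamps all fall within `window` seconds of the run's first packet."""
    ts = sorted(timestamps)
    bursts = []
    i = 0
    while i < len(ts):
        j = i
        # Extend the run while the next packet is still inside the window.
        while j + 1 < len(ts) and ts[j + 1] - ts[i] < window:
            j += 1
        count = j - i + 1
        if count >= min_packets:
            bursts.append((ts[i], count))
            i = j + 1  # skip past the packets already counted in this burst
        else:
            i += 1
    return bursts


# Made-up timestamps shaped like the capture described above:
# two quiet packets, then 37 packets inside one millisecond,
# then a retransmission 0.007 s and an ACK 0.043 s after the burst began.
sample = [0.0, 1.0] + [2.0 + k * 0.00002 for k in range(37)] + [2.007, 2.043]
print(find_bursts(sample))
```

Running the same function over the timestamps from both directions of the connection would show whether only one side is sending in bursts.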

The next batch is some TCP keepalives and keepalive ACKs, then some PSH,ACKs and ACKs with sub-ms response times, followed by a retransmit and an ACK from the other side.

Finally, 2 more keepalives and keepalive ACKs in 0.012 seconds, and then we get the FIN,ACK from the client followed by the RST,ACK from the server (the remote peer to which we connected).

The FIN,ACK did come 5 seconds after the last ACK, so I feel like the server should have responded sooner, but at the same time I don't feel like a 5 second lag should cause a connection to be dropped and no attempt to ever be made to connect to it again. Also, these blitzkriegs of packets within 1ms are absurd.

Any ideas on where I should look next? I guess take pcaps on both sides and compare them?
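If pcaps are taken on both sides, one rough way to compare them is to list the TCP segments one side sent and check which never show up in the other side's capture. The sketch below assumes each capture has already been reduced to `(sequence number, payload length)` pairs for a single direction, using the same sequence numbering on both sides. `missing_on_receiver` and the sample values are hypothetical. It will also report false positives if anything on the path re-segments the stream (for example segmentation offload on the capturing host).

```python
from typing import Iterable, List, Tuple

Segment = Tuple[int, int]  # (TCP sequence number, payload length)


def missing_on_receiver(
    sent: Iterable[Segment], received: Iterable[Segment]
) -> List[Segment]:
    """Return segments seen in the sender's capture but not in the receiver's,
    in the order first sent, with retransmitted duplicates listed once."""
    seen = set(received)
    missing: List[Segment] = []
    for seg in sent:
        if seg not in seen and seg not in missing:
            missing.append(seg)
    return missing


# Made-up example: the segment at sequence 1100 was sent twice
# (original plus retransmission) and never arrived.
sent_side = [(1000, 100), (1100, 100), (1100, 100), (1200, 50)]
recv_side = [(1000, 100), (1200, 50)]
print(missing_on_receiver(sent_side, recv_side))
```

An empty result for both directions would back up the claim that nothing is being lost on the network, and would point back at the nodes themselves.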

This is absolutely brutal. I wouldn't expect most sysadmins to go to this much trouble to track down this issue, let alone expect any normal human to do so.
Author Public Key
npub16v82nr4xt62nlydtj0mtxr49r6enc5r0sl2f7cq2zwdw7q92j5gs8meqha