Mike Dilger on Nostr: GOSSIP LMDB progress
GOSSIP LMDB progress:
Things are going well. I'm about 2/3rds through the work.
As for performance (I know this is what people want to know about), I just ran a test by searching for the string "uploaded" in events. There are about 75,000 feed-displayable events in my local store. The SQL code took about 3 seconds to load the matching events; the LMDB code took about 1 second. So it is roughly three times as fast. Not mind-blowing, but worthwhile, especially since it is cleaning up the code so much.
The LMDB search method uses speedy (de)serialization of my rusty Event type (long ago converted from JSON) on EVERY event so it can apply the filters and determine whether it matches (that's a lot of allocation/deserialization, and there is room for improvement here if I did something like what jb55 (npub1xts…kk5s) has started). It runs a Rust Regex (case-insensitive and Unicode) against the content and four human-readable tags, and uses the Rust stdlib's sort_unstable_by() to put the results in descending date order. I didn't check that the results are identical in the two cases, but they are designed to be.
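For anyone who wants the shape of that spelled out, here is a minimal sketch of the scan-and-filter approach, not gossip's actual code: the Event fields, the raw_values input, and the tag handling are simplified stand-ins, but speedy's read_from_buffer, the case-insensitive/Unicode RegexBuilder, and sort_unstable_by are the pieces described above.

use regex::RegexBuilder;
use speedy::Readable;

// Simplified stand-in for gossip's Event type (the real one has many more fields).
#[derive(Readable)]
struct Event {
    created_at: i64,
    content: String,
    tags: Vec<Vec<String>>,
}

// Scan every stored value, deserialize it, and keep the ones that match.
fn search(raw_values: &[Vec<u8>], pattern: &str) -> Result<Vec<Event>, Box<dyn std::error::Error>> {
    let re = RegexBuilder::new(pattern)
        .case_insensitive(true)
        .unicode(true)
        .build()?;

    let mut matches = Vec::new();
    for bytes in raw_values {
        // Deserializing EVERY event just to test it against the filter is the
        // allocation/deserialization cost mentioned above.
        let event = Event::read_from_buffer(bytes)?;
        // The real code only checks the content and a few human-readable tags;
        // here we simply check every tag string.
        let tag_hit = event.tags.iter().flatten().any(|t| re.is_match(t));
        if re.is_match(&event.content) || tag_hit {
            matches.push(event);
        }
    }

    // Descending date order, newest first.
    matches.sort_unstable_by(|a, b| b.created_at.cmp(&a.created_at));
    Ok(matches)
}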
I have not pushed the branch because I keep rewriting the commits to keep the complex progress organized enough that I don't confuse myself, and I don't want people stranded on commits that disappear.
The code that imports from SQLite was done quite a while ago. The majority of the work is replacing the functions that did SQL calls. As a first pass, some of these replacements are dumb. They scan every record. From there I will create more structures to optimize.
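To make "dumb first pass" concrete, here is roughly what one of those replacements looks like in spirit. Everything here is hypothetical (this Event and the raw_events input are stand-ins, not gossip's code): a query that used to be a single SQL WHERE clause becomes a scan over every stored value, and the later "more structures" step would add something like a separate LMDB table keyed by pubkey so the same call becomes an indexed lookup.

use speedy::Readable;

// Simplified stand-in event (hex pubkey for brevity).
#[derive(Readable)]
struct Event {
    pubkey: String,
    created_at: i64,
    content: String,
}

// First pass: what used to be `SELECT * FROM event WHERE pubkey = ?1`
// becomes a scan that deserializes and tests every stored value.
fn events_by_author(raw_events: impl Iterator<Item = Vec<u8>>, author: &str) -> Vec<Event> {
    raw_events
        .filter_map(|bytes| Event::read_from_buffer(&bytes).ok())
        .filter(|e| e.pubkey == author)
        .collect()
    // Later optimization: keep an extra LMDB table mapping pubkey -> event ids
    // (an index), so this becomes a keyed lookup instead of a full scan.
}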
I've replaced settings, local_settings, event_seen_on_relay, event_viewed, hashtags, relays, and event_tags tables. I'm in the middle of replacing the events table. Finally I will have to do person and person_relay tables.
I'm retiring a lot of code that is no longer needed or that I discovered was a bad idea anyway. The number of lines of code will be quite a lot smaller after this is done.
To paraphrase a Toy Story character: "To LMDB and beyond!"
Published at 2023-07-22 00:40:39
Event JSON
{
"id": "5693eacb88233e07f4db12209bb22e69db0a189d20cfa914f4ff230c0241f743",
"pubkey": "ee11a5dff40c19a555f41fe42b48f00e618c91225622ae37b6c2bb67b76c4e49",
"created_at": 1689986439,
"kind": 1,
"tags": [
[
"client",
"gossip"
],
[
"p",
"32e1827635450ebb3c5a7d12c1f8e7b2b514439ac10a67eef3d9fd9c5c68e245"
]
],
"content": "GOSSIP LMDB progress:\n\nThings are going well. I'm about 2/3rds through the work.\n\nAs for performance (I know this is what people want to know about) I just ran a test by searching for the string \"uploaded\" in events. There are about 75,000 feed-displayable events in my local store. The SQL code took about 3 seconds to load the matching events. The LMDB code took about 1 second to load the matching events. So it is about 300% faster. Not mind-blowing, but worthwhile especially as it is cleaning up the code so much.\n\nThe LMDB search method is using speedy (de)serialization from my rusty Event type (long ago convered from JSON) on EVERY event so it can apply filters to it to determine if it matches (that's a lot of allocation/deserialization and there is room for improvement here if I did something like what nostr:npub1xtscya34g58tk0z605fvr788k263gsu6cy9x0mhnm87echrgufzsevkk5s has started) rust Regex (case insensitive and unicode) against content and four human-readable tags, and rust stdldb sort_unstable_by() to put them in descending date order. I didnt check that the results are identical in the two cases but they are designed to be.\n\nI have not pushed the branch because I keep rewriting the commits to keep the complex progress organized enough that I don't confuse myself, and I don't want people stranded on commits that disappear.\n\nThe code that imports from SQLite was done quite a while ago. The majority of the work is replacing the functions that did SQL calls. As a first pass, some of these replacements are dumb. They scan every record. From there I will create more structures to optimize.\n\nI've replaced settings, local_settings, event_seen_on_relay, event_viewed, hashtags, relays, and event_tags tables. I'm in the middle of replacing the events table. Finally I will have to do person and person_relay tables.\n\nI'm retiring a lot of code that is no longer needed or that I discovered was a bad idea anyways. The number of lines of code will be quite a lot shorter after this is done.\n\nTo paraphrase a toy story character: \"To LMDB and beyond!\"",
"sig": "e20076fab295fee278c37ce5b3a951a48903c10021a7eba1496429c9cd970ff53b8dafd56fd5db24e1ed4f1b5f01bf281c20c442e1d49f94c43d397ed24a9367"
}