Very cool. Thanks for the explanations.
On the separate mmap, I write the serialized OpaqueEvent first, then write the indexes. On a crash we might be left with a useless event that never got indexed, but nothing would be corrupt. The writer takes a write lock, appends, updates the 'end' pointer at the beginning of the map, and then releases the lock.
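A minimal sketch of that write path, assuming a plain byte buffer as a stand-in for the real mmap. The `EventLog` type, the 8-byte little-endian header, and the `read` helper are hypothetical names for illustration, not the actual code. The point is the ordering: the event bytes land first and the 'end' pointer moves last, so a crash in between leaves only ignored bytes past 'end'.

```rust
use std::sync::RwLock;

/// Size of the header holding the 'end' pointer (assumed: little-endian u64).
const HEADER_LEN: usize = 8;

/// Hypothetical stand-in for the mmap'd event file: a fixed-size buffer whose
/// first 8 bytes store the offset one past the last committed event.
pub struct EventLog {
    map: RwLock<Vec<u8>>,
}

impl EventLog {
    pub fn new(capacity: usize) -> Self {
        let mut buf = vec![0u8; HEADER_LEN + capacity];
        // An empty log ends right after the header.
        buf[..HEADER_LEN].copy_from_slice(&(HEADER_LEN as u64).to_le_bytes());
        EventLog { map: RwLock::new(buf) }
    }

    fn end_of(buf: &[u8]) -> usize {
        let mut raw = [0u8; HEADER_LEN];
        raw.copy_from_slice(&buf[..HEADER_LEN]);
        u64::from_le_bytes(raw) as usize
    }

    /// Current committed end of the log.
    pub fn end(&self) -> usize {
        Self::end_of(&self.map.read().unwrap())
    }

    /// Append one serialized event; returns the offset to store in the indexes.
    pub fn append(&self, event: &[u8]) -> Option<usize> {
        let mut buf = self.map.write().unwrap(); // take the write lock
        let offset = Self::end_of(&buf);
        let new_end = offset.checked_add(event.len())?;
        if new_end > buf.len() {
            return None; // out of space; a real map would grow or rotate
        }
        // 1. Write the event bytes past the current end.
        buf[offset..new_end].copy_from_slice(event);
        // 2. Only then publish them by moving the 'end' pointer.
        buf[..HEADER_LEN].copy_from_slice(&(new_end as u64).to_le_bytes());
        Some(offset)
    } // write lock released when `buf` is dropped

    /// Read back `len` committed bytes starting at `offset`.
    pub fn read(&self, offset: usize, len: usize) -> Option<Vec<u8>> {
        let buf = self.map.read().unwrap();
        let stop = offset.checked_add(len)?;
        if offset < HEADER_LEN || stop > Self::end_of(&buf) {
            return None;
        }
        Some(buf[offset..stop].to_vec())
    }
}
```

The index entries would be written only after `append` returns an offset, which is why a crash can leave an unindexed event but not a dangling index entry.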
Also, these events are opaque (customer-ready, pre-validated). I never access the fields.
I noticed strfry does an index scan of these flatbuffers (of stripped down events) when necessary, using the fields to filter. I'm instead trying to use multiple btree-like indexes:
id -> offset
(created_at, id) -> offset
(pubkey, created_at, id) -> offset
(event_kind, created_at, id) -> offset
(tagdata, created_at, id) -> offset
Each key has to be unique per event, hence the id on each. And the created_at is to scan a since/until range. It's a lot of data to store, but I think that is okay on a personal relay. I don't think it would cause more page faults and cache misses.
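To make the key layout concrete, here is a rough sketch of one of these indexes using an in-memory `BTreeMap`. This is an assumption for illustration only: the real indexes are btree-like structures of my own, and `PubkeyIndex`, `scan`, and the 32-byte array types are hypothetical names. It shows why the id keeps keys unique and how created_at in the middle of the key gives a since/until range scan.

```rust
use std::collections::BTreeMap;

type Id = [u8; 32];
type Pubkey = [u8; 32];

/// (pubkey, created_at, id) -> offset
pub struct PubkeyIndex {
    tree: BTreeMap<(Pubkey, u64, Id), u64>,
}

impl PubkeyIndex {
    pub fn new() -> Self {
        PubkeyIndex { tree: BTreeMap::new() }
    }

    /// The trailing id makes the key unique even when one pubkey
    /// publishes several events with the same created_at.
    pub fn insert(&mut self, pubkey: Pubkey, created_at: u64, id: Id, offset: u64) {
        self.tree.insert((pubkey, created_at, id), offset);
    }

    /// Offsets of all events by `pubkey` with since <= created_at <= until,
    /// in key order (created_at ascending, then id).
    pub fn scan(&self, pubkey: Pubkey, since: u64, until: u64) -> Vec<u64> {
        if since > until {
            return Vec::new(); // BTreeMap::range panics on an inverted range
        }
        // Smallest and largest possible ids bracket the whole time range.
        let lo = (pubkey, since, [0x00u8; 32]);
        let hi = (pubkey, until, [0xffu8; 32]);
        self.tree.range(lo..=hi).map(|(_, &offset)| offset).collect()
    }
}
```

The other indexes would follow the same shape with a different leading component (event_kind or tagdata).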
I'm finding the set of events that match the filter by doing a functional intersection over the results from these indexes. I say functional because I don't collect each result into its own set and then intersect them; I intersect as I go. First index: collect the offsets in a hashset. Second index: check each offset against the previous hashset and, if it exists, copy it to the next hashset, otherwise drop it (thus functionally being a set intersection). And so on for each remaining index.
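Roughly, the intersect-as-you-go idea looks like the sketch below. The function name and the use of `u64` offsets are assumptions for illustration; each inner iterator stands for the offsets produced by one index scan.

```rust
use std::collections::HashSet;

/// Intersect the offsets produced by several index scans without first
/// materializing every scan as its own set.
pub fn intersect_as_you_go<I, J>(index_results: I) -> HashSet<u64>
where
    I: IntoIterator<Item = J>,
    J: IntoIterator<Item = u64>,
{
    let mut results = index_results.into_iter();

    // First index: collect every offset it yields.
    let mut current: HashSet<u64> = match results.next() {
        Some(first) => first.into_iter().collect(),
        None => return HashSet::new(),
    };

    // Each later index: keep only offsets already present in the previous set.
    for scan in results {
        let mut next = HashSet::new();
        for offset in scan {
            if current.contains(&offset) {
                next.insert(offset);
            }
        }
        current = next;
        if current.is_empty() {
            break; // nothing can match any more, skip the remaining scans
        }
    }
    current
}
```

Scanning the most selective index first keeps the intermediate hashsets small, since every later set can only be a subset of the one before it.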
Maybe this is a dumb algorithm. Databases are more my brother's thing, but I'm stubbornly trying to reinvent them myself (I would never learn as well otherwise).
I think deleted events will just be marked as such in another hash table, and a periodic rewrite will filter them out. I haven't really worked that out and now that I'm thinking about it, it could be quite expensive.
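A very rough sketch of what such a rewrite might involve, purely as an assumption since none of this is worked out: `compact`, its arguments, and the (offset, bytes) record representation are all hypothetical. It illustrates where the cost comes from. Every surviving event moves to a new offset, so every index entry pointing at it has to be rewritten as well.

```rust
use std::collections::{HashMap, HashSet};

/// Copy all non-deleted events into a fresh log and report where each
/// surviving event moved, so the indexes can be rebuilt or patched.
pub fn compact(
    old_log: &[(u64, Vec<u8>)], // (old offset, serialized event)
    deleted: &HashSet<u64>,     // old offsets marked as deleted
    first_offset: u64,          // where events start in the new log
) -> (Vec<(u64, Vec<u8>)>, HashMap<u64, u64>) {
    let mut rewritten = Vec::new();
    let mut remap = HashMap::new(); // old offset -> new offset
    let mut next = first_offset;

    for (old_offset, bytes) in old_log {
        if deleted.contains(old_offset) {
            continue; // filtered out by the rewrite
        }
        remap.insert(*old_offset, next);
        rewritten.push((next, bytes.clone()));
        next += bytes.len() as u64;
    }
    (rewritten, remap)
}
```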
A lot of these OS-level features like sendfile() or pwritev() become very difficult to use when relying on an upstream library that doesn't expose its API in a way that lets you use them. If I use tungstenite for websockets, I don't think I can write into its outbound buffers at a low level.
At this point I'm really still exploring. It will change.