As everyone knows by now, I am working on adding private groups to Coracle using the new NIP-44 encryption standard currently being audited. I wanted to share why I think this is a viable approach, and how combining relays and encryption can help make it happen.

This discussion gets complicated very quickly, so to begin, I thought I’d cover the two components that everyone thinks they understand first (relays and encryption), and move on to the one that combines the existing primitives in new and exciting ways (groups).

What even are relays, anyway?

If you saw my talk at Nostrasia, this section is just a reiteration of it. In fact, I wrote this blog post before the conference to help me with my presentation. So hello, from the past.

Relays are dumb. They’re just servers, which speak a particular protocol, and which implement some minimal logic to support that protocol. Relays have to be interoperable, which is why it’s important to design new parts of the nostr protocol in such a way as to be backwards-compatible, and simple to implement.

As a result, relay development has historically focused on performance to the exclusion of new functionality. You can see this with the success of strfry and Will’s adaptation of the underlying technique to nostrdb.

One of my favorite programming talks is Simple Made Easy by Rich Hickey, the author of Clojure. One of the points he makes in that talk is that object-oriented programming is such a mess because it “complects” behavior and state. Hickey’s solution to this problem is to use functions and values rather than methods and state.

I think that this same mistake is one we are at risk of making with nostr relays. I’ll come back to this in a bit, but keep that distinction in the back of your mind: separating data and behavior can lead to a much simpler and more robust system.

Despite the emphasis on interoperability, relays shouldn’t all be exact clones. There are a few different ways relays can differentiate themselves, either by providing specific functionality, by storing different sets of events, or by exposing a different interface.

Added functionality

Some implementations have pushed features forward, for example rbr.bio which implemented the COUNT verb from NIP 45 early on. However, these efforts are usually either supported by special clients (like nostr.band or primal’s caching service), or fail to gain wide adoption and are eventually abandoned. The result is that NIPs specifying optional functionality tend to die without sufficient user demand.

The reason for this is that if you’re building a client that selects relays based on user preferences, it can’t rely on functionality that doesn’t exist on the majority of relays. Without widespread support, search results or feeds requested based on Primal’s special syntax will be incomplete or fail entirely. As a result, clients often build proprietary solutions to these problems, increasing centralization risk.

Subtracted functionality

Relays can helpfully differentiate themselves, however, by offering a subset of protocol functionality. For example, purplepag.es is a reliable source for profile data and relay preferences, but it doesn’t require clients to do anything different in order to be useful. Instead, users can simply add purplepag.es to their relay list and instantly get a better experience. A similar approach many commodity relays take is blocking certain event kinds, for example 1059 gift wraps or NIP 94 image headers. This fortifies their particular purpose, which is to not support more exotic protocol features.

Other kinds of relays don’t support this guarantee, for example relay.noswhere.com only supports search if the search term is not accompanied by a filter, responding to standard requests with “filters: not planned (non-search queries)”. This can still be useful, but it is not a relay, because it requires clients to special case how it’s used.
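
To make the kind-blocking idea concrete, here is a minimal sketch of a write policy for a commodity relay. The function name and return shape are hypothetical, not taken from any relay implementation; the kinds are the ones mentioned above (1059 for gift wraps, and 1063, which is the kind NIP 94 uses for file metadata).

```python
# Hypothetical write policy: a relay that opts out of some event kinds.
# 1059 = gift wrap, 1063 = NIP 94 file metadata.
BLOCKED_KINDS = {1059, 1063}


def accept_event(event: dict) -> tuple[bool, str]:
    """Return (accepted, message), in the spirit of a relay's OK response."""
    if event.get("kind") in BLOCKED_KINDS:
        return False, "blocked: event kind not supported by this relay"
    return True, ""
```

Clients need no special handling here: a rejected publish looks like any other rejected publish, which is what keeps this kind of relay interoperable.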

Data relevance

There is a different kind of differentiation relays can implement, based not on functionality, but on content. This is what I mean when I talk about “relay de-commodification”. The value proposition this type of relay offers is not “we can do something no one else does”, but “we have the information you’re looking for”.

There are two ways to accomplish this: content curation and access controls. The difference is that content curation is context-independent, and might be based on topic, keyword, or LLM analysis, while access controls only accept content published by certain accounts. Several relay implementations and hosting providers (for example relay.tools) support both of these approaches, and of course it’s easy to combine them.
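
As a rough illustration of how the two approaches combine, the sketch below models each as a predicate over events. All names here (by_keyword, by_pubkey, all_of) are made up for the example, and keyword matching stands in for whatever curation heuristic a relay actually uses.

```python
from typing import Callable

Event = dict
Policy = Callable[[Event], bool]


def by_keyword(keywords: set[str]) -> Policy:
    """Content curation: judge the event by what it says."""
    def policy(event: Event) -> bool:
        content = event.get("content", "").lower()
        return any(word in content for word in keywords)
    return policy


def by_pubkey(allowed: set[str]) -> Policy:
    """Access control: judge the event by who signed it."""
    return lambda event: event.get("pubkey") in allowed


def all_of(*policies: Policy) -> Policy:
    """Combine policies; an event must pass every one of them."""
    return lambda event: all(policy(event) for policy in policies)


# A relay that only accepts bitcoin-related notes from two known accounts.
accept = all_of(by_keyword({"bitcoin"}), by_pubkey({"alice", "bob"}))
```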

Likewise, there are two ways for a relay to gain access to the desired data, leading to two different value propositions.

A relay that focuses on topical data might scrape the wider nostr network in order to ensure it is able to reliably serve everything relevant to its niche. This might be done using any number of heuristics, including filtering by pubkey, topic, or sentiment, but in any case the relay is not intended primarily to be written to, but to be read from. This is primarily an additive model, focusing on offering a complete view of the given topic.

Another approach is to let the data come to you instead of scraping it from the wider network by only accepting events published by certain people, or being the go-to relay for a certain topic. In order for this model to work though, there needs to be some way for the relay to keep this data to itself by preventing scraper-powered relays from harvesting its data. This effectively means that these relays will not only have policies for who can write to them, but also who can read from them. These might be the same set of people (for example with a community relay), or more people might be allowed to read from the relay than are able to write to it, as in the case of content producers who want a paywalled private social media enclave.
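
The two cases above differ only in how the reader and writer sets relate. A minimal sketch, assuming the relay learns the reader's pubkey via AUTH (the class and field names are hypothetical):

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AccessPolicy:
    writers: set[str] = field(default_factory=set)
    readers: set[str] = field(default_factory=set)

    def can_write(self, pubkey: str) -> bool:
        return pubkey in self.writers

    def can_read(self, authed_pubkey: Optional[str]) -> bool:
        # None means the client has not authenticated at all.
        return authed_pubkey is not None and authed_pubkey in self.readers


# Community relay: the same people read and write.
members = {"alice", "bob"}
community = AccessPolicy(writers=set(members), readers=set(members))

# Paywalled enclave: one producer writes, subscribers can only read.
paywalled = AccessPolicy(writers={"creator"}, readers={"creator", "fan"})
```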

Keep relays weird

Most relay implementations expose their data over websockets in line with the spec, but this isn’t actually inherent to the function of a relay. We’ve already seen several projects implement relays in-app for caching purposes. These relays don’t usually go through the ceremony of spec adherence, but some (e.g. nostrdb) do.

Different transport mechanisms and hosting options are possible for relays, and would only require clients to use a specific adapter to get at them. Some examples:

A relay running on your phone as a service that other nostr clients can share to reduce data storage requirements
A relay running on a LAN, only accessible from a particular location
A relay running via any other transport, for example radio or carrier pigeon

The idea I’m trying to get across here is that relays aren’t necessarily websocket servers. More abstractly, they are a name attached to a set of events that can be accessed in a standard way.
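
One way to picture this abstraction is as an interface with no transport in it at all. The sketch below is an assumption about what such an adapter might look like, not an existing API. It matches only a subset of the standard filter fields (ids, authors, kinds, since, until) and leaves out tag filters and limit for brevity.

```python
from typing import Protocol


class Relay(Protocol):
    """Anything that can store events and answer filters, over any transport."""
    def publish(self, event: dict) -> bool: ...
    def query(self, filters: list[dict]) -> list[dict]: ...


def matches(event: dict, flt: dict) -> bool:
    """Check an event against one filter (subset of the standard fields)."""
    if "ids" in flt and event["id"] not in flt["ids"]:
        return False
    if "authors" in flt and event["pubkey"] not in flt["authors"]:
        return False
    if "kinds" in flt and event["kind"] not in flt["kinds"]:
        return False
    if "since" in flt and event["created_at"] < flt["since"]:
        return False
    if "until" in flt and event["created_at"] > flt["until"]:
        return False
    return True


class MemoryRelay:
    """An in-app relay: same interface, no websocket involved."""
    def __init__(self) -> None:
        self.events: dict[str, dict] = {}

    def publish(self, event: dict) -> bool:
        self.events[event["id"]] = event
        return True

    def query(self, filters: list[dict]) -> list[dict]:
        # An event is returned if it matches any of the filters.
        found = [e for e in self.events.values() if any(matches(e, f) for f in filters)]
        return sorted(found, key=lambda e: e["created_at"], reverse=True)
```

A LAN relay or a relay running as a phone service would implement the same two methods behind a different adapter.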

So, what are relays?

To recap, here’s my definition of a relay:

They correctly support standard functionality
They should only support additional functionality related to curating their data set
They may restrict the data types they accept and serve
They may restrict the data they accept and serve based on content or pubkey

One additional feature that isn’t currently standard but ought to be is strfry’s negentropy synchronization feature. This is the opposite of AUTH, which protects a relay’s own dataset, in that it allows relays to more efficiently share their data with peers. A nice bonus is that it also allows clients to use their local cache more effectively.

Simple Made Easy

Going back to Rich Hickey’s talk, separating behavior and state allows you to work directly with your data, freely implementing new behavior whenever you need to. This applies to sets of events stored on relays just as much as it applies to in-memory data.

One thing that makes this possible is cryptographic hash functions, which cleanly transform state into values - the difference being that “state” couples a name with data that may change, while a hash is a compact representation of immutable data. If two hashes of event ids don’t match, they’re not the same set of events.
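
As a small illustration of turning a set of events into a value, the sketch below hashes a collection of event ids into a single fingerprint. This is not the fingerprint negentropy actually uses; it only shows the principle, and it assumes ids are hex strings of equal length (as nostr event ids are).

```python
import hashlib


def fingerprint(event_ids: list[str]) -> str:
    """Hash a set of hex event ids into one value.

    Sorting and de-duplicating first means order and repeats don't matter.
    Ids are assumed to be fixed-length, so concatenation is unambiguous.
    """
    digest = hashlib.sha256()
    for event_id in sorted(set(event_ids)):
        digest.update(bytes.fromhex(event_id))
    return digest.hexdigest()
```

Two parties holding the same events compute the same fingerprint regardless of the order in which they received them; any mismatch means the sets differ.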

How does this apply to relays? Instead of adding new behavior to relays directly (“methods”), we can instead call “functions” on a relay’s data.

This function can be anything - a centralized server, client code, or a DVM - that takes REQ filters as input. The function gathers the needed data (maybe from a cache, maybe using a sync command, maybe from multiple relays for completeness), performs the transformation, and returns it. The result is that relays become simple, interoperable repositories for data, without limiting what server-side computation can be applied to events.
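
For example, counting could live outside the relay entirely. In this sketch a relay is modeled as nothing more than a callable that answers filters; the Fetch type and the count function are hypothetical names, and a real version would fetch over the network or from a cache.

```python
from typing import Callable

# A relay, for our purposes: something that takes filters and returns events.
Fetch = Callable[[list[dict]], list[dict]]


def count(filters: list[dict], relays: list[Fetch]) -> int:
    """Count distinct events matching the filters across several relays."""
    seen: set[str] = set()
    for fetch in relays:
        for event in fetch(filters):
            seen.add(event["id"])  # the same event may live on many relays
    return len(seen)
```

The relays themselves only ever see an ordinary request; the counting behavior belongs to whoever runs the function.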

While huge search indexes and centralized servers have a cost in terms of centralization, I think DVMs solve this problem brilliantly - the common interface for this augmented behavior is open, and can admit anyone who wants to start competing for search, counting, recommendations, or anything else you might want to do with events.

As Alan Perlis said, “It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures.”

Encrypted Messages

Nostr DMs were a huge mistake, and I want to do it again.

NIP 04 has some obvious problems, most notably metadata leakage. The cryptography also hasn’t been audited, and has some potential theoretical attack vectors. I’ve been working with some other nostr devs on a new cryptographic primitive, which will be audited and hopefully result in a better foundation for future work.

But fixing the cryptography still doesn’t fix DMs. Nostr is not really the right architecture for secure private messaging, because ratchets require state. Nostr is designed such that there are no hard dependencies between events, which are replicated across multiple relays. This makes it very hard to manage the kind of state that protocols like SimpleX or Signal require.

Even apart from the difficulty of implementing stateful messaging protocols on nostr, we have two other problems to contend with: metadata leaks and spotty deliverability. Both of these problems come from the reliance of nostr on relays, which generally are not trustworthy or reliable.

Metadata Leakage

NIP 04 is really bad. Anyone can see who you corresponded with, when, how frequently, and how much you said just by looking at the event. Vitor’s NIP 24 chat proposal aims to fix this with gift wraps, which use ephemeral keys to obscure the sender, padding to obscure message size, timestamp randomization to obscure timing, and double-wrapping to prevent leakage of the signed message.
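
Two of these techniques are easy to sketch without touching any cryptography. The numbers below (power-of-two buckets with a 32-byte minimum, a two-day window) are assumptions chosen for illustration, not values taken from the proposal.

```python
import secrets

TWO_DAYS = 2 * 24 * 60 * 60  # assumed window, in seconds


def padded_length(length: int, minimum: int = 32) -> int:
    """Round a plaintext length up to the next power-of-two bucket."""
    size = minimum
    while size < length:
        size *= 2
    return size


def randomized_timestamp(now: int, window: int = TWO_DAYS) -> int:
    """Pick a created_at somewhere in the last `window` seconds."""
    return now - secrets.randbelow(window + 1)
```

With bucketed padding, an observer learns only which bucket a message falls into, and with a randomized timestamp the event's created_at no longer says when it was really written.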

This is a big improvement, but there are other forms of metadata that are still visible. To start, the recipient is still visible, along with how many messages they have received and their general size. So you can tell if someone receives lots of messages, or not a lot, but that’s about it.

If we don’t assume relays are trustworthy though, there are all kinds of implicit metadata that can be used to infer information about the sender, including:

Client fingerprinting
Relay selection for publishing and queries
IP address collection
First seen timestamp
Identification of users by AUTH
Correlation of the event with other messages sent/received during a session

These issues are much harder to solve, because they are part of the process of delivering the message, rather than just constructing it.

Transport and deliverability

The key feature of nostr that makes it work is relays. Relays provide storage redundancy and heterogeneity of purpose with a common interface shared by all instances. This is great for censorship resistance, but everything comes with a tradeoff.

Completeness and consistency are not guaranteed on nostr. This is ok in a social media context, because there’s no way to keep up with everything everyone you follow says anyway, and you can rely on interactions and algorithmic tools to catch you up on anything important you missed.

But with private messages, every payload counts. Receiving messages out of order, with a delay, or not at all can severely disrupt conversations between two people, leading to misunderstandings and worse. This is a common experience on nostr; even fiatjaf has expressed skepticism about the viability of private messages on nostr.

This flakiness comes from a combination of low relay quality and poor relay selection on the part of clients. Relays can’t deliver messages if they can’t keep up with clients, go offline, or drop messages (accidentally or deliberately). Likewise, clients can’t find messages if they look for them in the wrong place.

Relays fixes this

Let me just put things into perspective really quick. Twitter DMs are not e2e encrypted, and weren’t encrypted at all until earlier this year. Mastodon messages are not encrypted - instance admins can read everything. Facebook has e2ee messages, but only between normal users, not in communities, groups, business chats, or marketplace chats.

So while nostr’s architecture is not sufficient to make secure encrypted communication a value proposition of the protocol in itself, there is a lot we can do to improve on the status quo. Because nostr has no admins, the only option for privacy of any kind is end-to-end encryption. Metadata leakage notwithstanding, nostr messages can be “good enough” for some definition of that phrase.

It’s important to not just punt on making DMs available in a standard way, because there is an expectation that a complete social media solution will have a way to establish a private communication channel with someone else in the network. Otherwise you’re stuck with public correspondence and link sharing in order to hop to another protocol.

Let’s just assume that you agree with me at this point, and that until there is a javascript SDK for SimpleX or MLS, we’re stuck with nostr DMs of some kind. So how can we reduce metadata leakage and improve deliverability?

Fixing Flakiness

Let’s start with deliverability first. NIP 65’s inbox/outbox model is great. While it doesn’t fix deliverability on its own, if every relay recommended was in fact reliable, and reliably followed by clients, there would be a very clear indication of where a particular user’s messages could be found. Neither of these things is really true right now, but I do think clients are moving in this direction. Hopefully we can educate users better on how to pick good relays.

There are some limits to this model of course. For example, if you want to read notes from 1000 pubkeys at once, your client will likely end up connecting to hundreds of relays unless it has a cap, in which case it will miss a lot of notes.
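
To see the tradeoff a cap introduces, here is a sketch of one possible selection strategy: greedily pick the relay that covers the most not-yet-covered pubkeys until the cap is hit. This is an assumption about how a client might do it, not a description of any particular client, and the function name is made up.

```python
def select_relays(write_relays: dict[str, set[str]], cap: int) -> tuple[list[str], set[str]]:
    """Pick at most `cap` relays; return (selected relays, pubkeys left uncovered).

    `write_relays` maps each pubkey to the relays it publishes to.
    """
    # Invert the mapping: relay url -> pubkeys that write there.
    coverage: dict[str, set[str]] = {}
    for pubkey, relays in write_relays.items():
        for url in relays:
            coverage.setdefault(url, set()).add(pubkey)

    uncovered = set(write_relays)
    selected: list[str] = []
    while uncovered and coverage and len(selected) < cap:
        # Sorting first makes ties resolve alphabetically, so output is stable.
        best = max(sorted(coverage), key=lambda url: len(coverage[url] & uncovered))
        gained = coverage[best] & uncovered
        if not gained:
            break  # remaining pubkeys have no known relays
        selected.append(best)
        uncovered -= gained
    return selected, uncovered
```

Whatever ends up in the second return value is exactly the set of people whose notes the client will miss.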

One clean solution to this problem is relay proxies. There are a few implementations, but the one I run at mux.coracle.social is a stateless proxy that uses a wrapper protocol so clients can still select and dispatch to relays. I originally built this to improve mobile data use by deduplicating events on the server, and it does that pretty well, although it currently does it in the most naive way possible.

Multiplextr currently does not qualify as a “real” relay based on my four rules, since it uses a wrapper protocol. However, a smarter implementation would be able to route requests and events effectively by relying on client authentication and filter analysis, without changing the protocol. This would also allow clients to completely offload the complex business of relay selection to their proxy. Clients would also be able to combine multiple proxies just like they combine relays - because proxies are relays!

Proxying Privacy

Another downside of NIP 65 is that it encourages clients to connect indiscriminately to hundreds of different relays based on tag hints, other users’ preferences, and relays baked into bech32 identifiers. This is not only bad for battery life and performance, but also allows attackers to easily trick your client into connecting to their relay so they can start analyzing your traffic.

The best way to defend yourself against nosy relays is to not connect to them at all. And the easiest way to do that is to only connect to relays you control (or trust), and let them do the dirty work for you.

With a proxy, even if you don’t control it, you can reduce the number of parties who can snoop on your traffic by orders of magnitude. And since other people use the proxy too, your traffic gets mixed with theirs, making it much harder to analyze. Of course, the downside of using a public proxy is that the proxy sees all messages you send and receive, and the relays you select for each request. Even worse, an untrusted multiplexer can hijack an authenticated session, exfiltrating private data from protected relays.

A proxy you control (or trust, either via social relationships or economic ones) is the best of both worlds. This proxy could also do some smart things for you to improve privacy as well, for example:

Delayed publishing of gift wrapped messages so relays can’t guess as easily when the event was first seen.
Closing and re-opening connections to reduce the number of messages sent over an AUTH’d session.
Blocking suspicious messages on the fly, for example spam or notes with phishing links.
Randomizing which relays are used when an abundance of options is available, or avoiding less reputable ones.
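
Two of these ideas, delayed publishing and randomized relay selection, can be sketched together as a "publish plan". Everything here is hypothetical: the reputation scores, the thresholds, and the function itself are placeholders for whatever a real proxy would use.

```python
import random
from typing import Optional


def publish_plan(
    relays: list[str],
    reputation: dict[str, float],
    max_relays: int = 3,
    max_delay: float = 300.0,
    min_reputation: float = 0.5,
    rng: Optional[random.Random] = None,
) -> list[tuple[str, float]]:
    """Return (relay, delay in seconds) pairs for one outgoing event."""
    rng = rng or random.Random()
    # Drop relays with a low (or unknown) reputation score.
    reputable = [url for url in relays if reputation.get(url, 0.0) >= min_reputation]
    # Pick a random subset so the same relays aren't used every time.
    chosen = rng.sample(reputable, k=min(max_relays, len(reputable)))
    # Give each relay its own delay so first-seen times don't line up.
    return [(url, rng.uniform(0.0, max_delay)) for url in chosen]
```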

So I guess what I’m saying here is “run a node”. But not just any node - in the early days of nostr there were a lot of people spinning up relays because they thought it helped decentralization. Well, as it turns out, just like in bitcoin you have to use your node in order for it to be useful.

Beyond DMs

I’m not done yet, though. What’s the point of all this? Why put all this effort into “good enough” DMs when we could spend the effort adopting SimpleX or MLS (said Semisol, who lives rent free in my head)?

Well, because there isn’t a single “best” DM option. As I’ve dug into the topic, I’ve discovered that a ton of tradeoffs exist between complexity, privacy, and scale. The way I understand it, there are three basic options here:

One-to-one encryption. This is how both SimpleX and Vitor’s group message draft work. It allows for things like ratchets to be added on top, and is fairly simple - but fails to scale because every message has to be encrypted separately for every group member, resulting in n*m messages, where n is the number of group members, and m is the number of messages sent.
Encryption via shared key. This is how WhatsApp works, and it scales really well because messages don’t have to be duplicated. And while it does support forward secrecy, it does not support post-compromise security, so it is significantly less secure.
MLS has a unique hierarchical approach to managing keys, which improves on the complexity of one-to-one encryption so that it’s logarithmic rather than linear in the number of group members. However, MLS is highly stateful and still has scaling limitations.
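
To put rough numbers on these options, the sketch below counts stored events for the first two, using n*m as described above. For the third it computes the depth of a balanced binary tree over the members, as a stand-in for the kind of logarithmic growth the MLS item refers to. It is an order-of-magnitude illustration, not a model of MLS itself.

```python
def stored_events(members: int, messages: int) -> dict[str, int]:
    """Events a relay ends up storing under each scheme."""
    return {
        "one_to_one": members * messages,  # one copy per member per message
        "shared_key": messages,            # one copy, readable by everyone
    }


def tree_depth(members: int) -> int:
    """ceil(log2(members)): depth of a balanced binary tree over the group."""
    return (members - 1).bit_length() if members > 1 else 0


print(stored_events(1000, 50))  # {'one_to_one': 50000, 'shared_key': 50}
print(tree_depth(1000))         # 10
```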

My takeaway is that there is no perfect solution. Ratchets are the gold standard, but can’t really be implemented using nostr relays, since they frequently require servers to have very specific characteristics, like the ability to deliver messages in order.

The fact remains though, that social media needs encrypted messaging of some kind, even if we encourage users to bail to a more complete solution if they want to be truly private. And encryption is useful beyond small group chats - think of Facebook groups, which scale to millions of people and whose users have a much lower expectation of privacy. This is what nostr is good at - social data, whether encrypted or in plaintext.

Belt and Suspenders

For a long time I have waffled between two visions of how private groups ought to work. Should they be encrypted messages sprinkled across the nostr network like confetti? Or should they live on one or two relays in plaintext, protected by AUTH?

Both of these options have issues. Sprinkling encrypted messages across the network almost guarantees deliverability problems, since many commodity relays already block gift wraps. And that’s ok! They have no obligation to store your encrypted messages.

Storing group messages in plaintext is also sub-optimal, even if it’s on a dedicated relay. Of course we’re not going for military-grade secrecy here, but it would be trivially easy through a bug, or even a client that doesn’t know how to deal with groups, to re-broadcast events that should stay private. We could do something wacky, like strip the signature from events hosted on a relay, but if we’re going to do something non-standard, we might as well do the best we can.

So why not both? Group messages should be encrypted, and they should be stored on a particular set of relays. Again, assuming the relays used are trustworthy, this fixes both privacy and deliverability problems, because it’s very clear where messages posted to the group should go.

Promoting relays to trusted status has some nice, unintended benefits too. One common objection to using encrypted content is that it becomes hard to filter. But if a relay is trusted to store a group’s messages, it can also be trusted to be admitted as a member to the group. This would allow the relay to decrypt events and filter them to respond to requests by authenticated group members.

Of course, this would violate rule #1 for relays, since they would no longer be supporting standard functionality, but a guy can dream, can’t he?
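
For the sake of the dream, here is a sketch of what a relay-as-group-member might do with a request. The decrypt argument is a placeholder for real group decryption (JSON parsing stands in for it in the usage lines), and returning the still-encrypted wrappers rather than the plaintext is a design assumption of this example.

```python
import json
from typing import Callable


def query_group(
    stored: list[dict],
    kinds: set[int],
    requester: str,
    members: set[str],
    decrypt: Callable[[dict], dict],
) -> list[dict]:
    """Filter encrypted group events by the kind of the event inside them."""
    if requester not in members:
        return []  # only authenticated group members get an answer
    # Decrypt to inspect, but hand back the encrypted wrapper unchanged.
    return [wrapper for wrapper in stored if decrypt(wrapper)["kind"] in kinds]


# Usage, with JSON parsing standing in for decryption:
fake_decrypt = lambda wrapper: json.loads(wrapper["content"])
stored = [
    {"id": "a", "content": json.dumps({"kind": 1})},
    {"id": "b", "content": json.dumps({"kind": 7})},
]
notes = query_group(stored, {1}, "alice", {"alice", "bob"}, fake_decrypt)
```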

Conclusion

When I started working on nostr I had no idea it would end up being this complex to accomplish what I set out to do. Building a twitter clone was only a first step to an MVP, but I always had my eye on the prize of serving my local church community, which doesn’t give a sat about social media per se.

Private groups are important to me not because I’m setting out to support political dissidents or journalists (although I hope they find nostr useful too), but because I want to get my friends off of Facebook, and the only way to do that is to create private marketplaces, calendars, communities, and more.

hodlbod on Nostr