Background
The primary way I use email is through notmuch and notmuch-emacs on my laptop, which reads email from a local maildir. Sending mail is done using the venerable sendmail(8) command, which is backed by nullmailer to send it out via SMTP to the relay server of my email provider. Incoming mail also arrives at the server of my email provider and needs to make its way to my personal devices. And that's painful.
The standard mechanism for this is IMAP, so that is what I have to use. For a long time I used OfflineIMAP to move my mail from the IMAP server to my personal devices. Since I also wanted bi-directional sync of tags I ended up writing the python-cffi bindings for notmuch, because iterating over all my mail every few minutes is something that suits PyPy. While these bindings are not as robust as I hoped, they are a lot better than the previous ones and ended up becoming the default python3 bindings for notmuch.
Anyway, this worked well enough for a while. Of course OfflineIMAP is incredibly slow and each sync takes up to 5 minutes. Yes, I know about IMAP IDLE, though when I tried it OfflineIMAP was even less reliable. So I run a "oneshot" sync in the background every 5-10 minutes and I don't care if I don't receive my email the exact instant it lands in my mailbox.
Until one day the Python3 migration broke OfflineIMAP. I could have tried fixing that I guess (someone else did eventually), but it was more interesting to build my own tool: imapsyncer.
The main reason I built my own tool was that I wanted to know what really makes OfflineIMAP so slow: how fast can you go? Secondly, it was a good opportunity to learn IMAP and understand its limitations. I built it on the email stack (then) used by Delta Chat because I am also involved in that project and it seemed like a good idea to get to know these libraries.
Of course I didn't complete this project: OfflineIMAP became usable again and I just kept using my existing tools. But it did accomplish some things: I fixed some nasty bugs in the Delta Chat email stack that only showed up under stress testing, and I learned how IMAP works. And now I know why OfflineIMAP is slow: syncing IMAP just isn't going to be fast. It's IMAP.
Goals
From the above what I want from an email protocol is:
- Primarily fast sync. Getting all new mail should take very little effort and be fast. Syncing tags should be similarly fast.
- Sending mail on the same connection. If we have to deal with connectivity management for one protocol we may as well use that same connection to send mail, equally fast.
- Syncing only part of the mailbox is not important. All clients can store all email. Even if they can't they're free to delete some data locally. This is a major difference from IMAP and JMAP.
- Searching is out of scope. All devices are powerful enough to build their own search index offline. This is a solved problem, and leaving it out avoids lots of search technology and protocol complexity.
- Emails are organised by tags. Tags can be added and removed by clients at any time and have to synchronise back to the server and other clients.
While these things follow from how I personally use email, they also happen to match very closely how Delta Chat handles email. And Delta Chat is a mobile-first application that's been around for a good number of years. So this way of looking at email is not as idiosyncratic as it first might seem.
Outlines of a Protocol
So what would such a protocol look like? I started wondering.
- Email messages should be immutable, whether you send them or receive them.
- Thus emails are identified by their hash. The protocol can use this to give clients some flexibility: they don't have to store all emails and can still retrieve them later. And you can still allow deletion of specific emails on the server.
- Every change to the mailbox is tracked in an append-only log. This is crucial for fast sync: all a client has to do is send its current offset in the log and it gets all updates.
- Tags will have to be some kind of Conflict-free Replicated Data Type (CRDT) stored in an append-only log. Each client can make modifications and merge changes from others independently. Luckily this is a pretty simple set-based CRDT.
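To make the set-based CRDT idea concrete, here is a minimal Python sketch of one textbook option, an observed-remove set. It is an assumption for illustration, not necessarily the design the prototype will end up with; the class name `TagSet` and its methods are hypothetical. Every add gets a unique token, a remove only cancels the tokens that replica has seen, and merging is a plain set union.

```python
import uuid


class TagSet:
    """Observed-remove set of tags for one message (illustrative sketch)."""

    def __init__(self):
        self.adds = set()     # (tag, unique token) pairs ever added
        self.removed = set()  # pairs that some replica observed and removed

    def add(self, tag):
        # A fresh token makes every add distinct, so re-adding a tag
        # after it was removed works.
        self.adds.add((tag, uuid.uuid4().hex))

    def remove(self, tag):
        # Only cancel the adds this replica has actually observed.
        self.removed |= {pair for pair in self.adds if pair[0] == tag}

    def merge(self, other):
        # Set union is commutative, associative and idempotent, so
        # replicas converge regardless of the order of merges.
        self.adds |= other.adds
        self.removed |= other.removed

    def tags(self):
        # A tag is present if at least one of its adds was not removed.
        return {tag for tag, _token in self.adds - self.removed}
```

With this shape, a phone that removes "inbox" while the laptop concurrently adds "todo" ends up with the same tag set as the laptop once both have merged each other's state, whichever merge happens first.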
The Append-Only Log
Since there is a blob store for the actual email payload, the append-only log only needs to carry metadata about email. Hence I've been referring to this as the metadata log in my prototype. The kinds of records it needs to store are reasonably limited; for now I plan to start with these:
- Store
Indicating an email was stored. This record includes the hash of the email. This is what you'd do for incoming mail.
- Send
Indicating an email was sent. This record also includes the hash of the email.
As soon as a server receives a request to send an email it appends this record to the metadata log, before it tries to hand the email off to the SMTP network. The server will have to track internally which messages have actually been sent over SMTP, and possibly create a Delivery Status Notification (DSN) message in a new Store record if one can't be sent.
- Tags
Indicating a change in tags for a message. The record includes the message's hash and the new tag state with a vector clock for the CRDT.
- Scrub
This record indicates email(s) have been deleted. The record does not contain the hash of the deleted email, but does contain all the offsets in the metadata log which have been scrubbed. This allows other clients to scrub the same records from their metadata log as well as delete all the email blobs referred to by the scrubbed records. This makes it possible to delete email on both the server and all clients without storing the hash of the deleted email.
Each of these records gets added to the append-only metadata log on the server. Only the server can build this log, and the order in which the server stores the records defines the offset of each record. This sequential offset is crucial to being able to request changes quickly: all that is needed is a client's current offset and any new records can be returned.
Deletion and append-only logs don't go well together. The compromise I plan to make is the Scrub record: you leave in place the knowledge that there was a record at that offset, but delete the record itself.
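The interplay between offsets and Scrub records can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the prototype's code: the names `Record`, `MetadataLog` and `mirror` are hypothetical, a scrubbed slot is represented as `None`, and the sketch ignores the case where another, unscrubbed record still refers to the same email hash.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    kind: str       # "store", "send", "tags" or "scrub"
    payload: tuple  # (email_hash, ...) or, for "scrub", the erased offsets


class MetadataLog:
    """List whose indices are the log offsets; None marks a scrubbed slot."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # offset of the new record

    def since(self, offset):
        # All (offset, record) pairs from the given offset onwards.
        return list(enumerate(self.records))[offset:]

    def erase(self, offsets):
        """Blank the given slots; return the email hashes they referred to."""
        hashes = set()
        for off in offsets:
            rec = self.records[off]
            if rec is not None and rec.kind in ("store", "send"):
                hashes.add(rec.payload[0])
            self.records[off] = None  # the offset stays, the content goes
        return hashes

    def scrub(self, offsets):
        """Server side: erase, then log which offsets were erased."""
        hashes = self.erase(offsets)
        self.append(Record("scrub", tuple(offsets)))
        return hashes


def mirror(replica, blobs, record):
    """Client side: apply one record received from the server."""
    if record is not None and record.kind == "scrub":
        for email_hash in replica.erase(record.payload):
            blobs.pop(email_hash, None)  # drop the local copy of the email
    replica.append(record)
```

Note that the Scrub record itself only carries offsets. A client that already holds the scrubbed records looks up the hashes locally before erasing them, while a client syncing from scratch only ever sees the empty slots.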
Protocol Details
This is still only a sketch, not a specification. I have only a partial prototype of this so far. But it should be sufficient to build something useful.
Transport
QUIC is a decent modern transport that gives us some nice features we can leverage to be a fast protocol.
- It has multiple independent streams which do not block each other on packet loss. Opening a stream costs no extra round trip: a stream is created simply by sending data on it, just like sending data on an existing stream.
- Streams have built-in acknowledgement.
- Streams can stay open arbitrarily long. At least in practical terms.
- It can carry 0-RTT data. That means that in some circumstances, such as a client resuming a connection to a server it has talked to before, it can send application data in its very first flight of packets, further reducing latency. The main concern is that 0-RTT data can be replayed by an attacker, so any request sent this way must be idempotent.
- It offers connection migration to clients, e.g. when switching between wifi and mobile networks.
Commands
The protocol is primarily request-response based, with the client issuing all requests. Each request is made on a new bi-directional QUIC stream, which allows sending multiple requests concurrently.
- Log
Returns all metadata log entries starting at a given cursor offset.
The response is an infinite stream with all the requested log records. Infinite means it will never be terminated by the server: once the existing records have been sent, the server idles while waiting for new records to appear. The client must close the stream if it wants it to stop.
Requesting from offset 0 will include all changes from the start of history.
- Head
Returns the current offset of the append-only metadata log.
Because new mail can arrive at any time the returned offset should be considered out of date as soon as it is received. It can still be useful for a client to know when to stop the Log response for "oneshot" syncs.
- Store
Stores an email blob.
The Blake3 hash of the email stored is returned. If the email was already in the store this does not result in a new metadata log record, but otherwise the response is the same. This makes it idempotent for most practical purposes.
My current prototype doesn't even care if this is a valid email or not. You could store anything you want, that's possibly reasonable behaviour?
- Fetch
Returns an email blob for the requested hash to the client.
For now this is the only way to retrieve email data. QUIC allows packing of multiple streams in a single packet so when making bulk requests this could be sufficient.
It is also quite possible this won't be sufficient and it may need more ways of requesting blobs, like a bulk fetch, or some kind of Log + Fetch which automatically pushes the mail for any Store records being sent in the log. I'll have to see from prototyping how well this works, but I would like it if it could remain this simple.
- Send
Requests for an email to be sent.
This results in a new Send record in the metadata log. The server does some basic checks, like whether the message can be parsed as a valid email, maybe whether the sender address is allowed for this account, etc. Then the server must respond with either a success or a failure response.
Sending is idempotent: sending the same message again will not result in the server sending it twice or adding it twice to the metadata log. Its hash stays the same, so it is the same email.
- Tag
Updates the tags for a message.
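How Store, Fetch, Head and Log fit together for a "oneshot" sync can be modelled with a small in-memory Python sketch. Everything here is an assumption for illustration: the `Server` class and `oneshot_sync` function are hypothetical, there is no networking, BLAKE2b from the standard library stands in for BLAKE3, Head is taken to mean the offset the next record will get, and the Log response ends at the current head instead of staying open.

```python
import hashlib


class Server:
    """In-memory stand-in for the server side of Store, Fetch, Head, Log."""

    def __init__(self):
        self.blobs = {}  # hash -> raw email bytes
        self.log = []    # metadata log; the list index is the offset

    def store(self, data):
        # Assumption: BLAKE2b stands in for BLAKE3, which is not part of
        # the Python standard library.
        email_hash = hashlib.blake2b(data, digest_size=32).hexdigest()
        if email_hash not in self.blobs:
            # Only a new blob produces a Store record, so repeating the
            # request is harmless.
            self.blobs[email_hash] = data
            self.log.append(("store", email_hash))
        return email_hash

    def fetch(self, email_hash):
        return self.blobs[email_hash]

    def head(self):
        # Assumption: the offset the next record will get.
        return len(self.log)

    def log_from(self, cursor):
        # The real response would never end; this one stops at the head.
        return self.log[cursor:]


def oneshot_sync(server, cursor, local_blobs):
    """Fetch all mail stored since cursor; return the new cursor."""
    target = server.head()  # may already be stale, but tells us when to stop
    for kind, email_hash in server.log_from(cursor)[: target - cursor]:
        if kind == "store" and email_hash not in local_blobs:
            local_blobs[email_hash] = server.fetch(email_hash)
    return target
```

A client would persist the returned cursor and pass it to the next sync, so each run only touches records it has not seen yet.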
This is a pretty basic set of commands, and I believe just these will enable almost all the functionality I personally think I need. However, there are probably some pieces of functionality that could be tempting to add. E.g. maybe you could allow requesting just part of some messages: the server could parse the headers or MIME structure and allow requesting just those bits you want to know about.
On the other hand you can probably do an awful lot with just tags. One idea is that you could run a client on the server without giving it any storage. It could still react to new records and decide to modify tags for the messages it sees.
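As a rough illustration of such a storage-less client, the Python sketch below reacts to Store records and proposes tags based on header rules. It is purely hypothetical: the `auto_tag` function, the `(kind, hash)` record shape, the `fetch` callable and the rule format are all assumptions made for this example, not part of the prototype.

```python
from email import message_from_bytes


def auto_tag(records, fetch, rules):
    """Yield (email_hash, tag) pairs to send as Tag requests.

    records: iterable of (kind, email_hash) pairs from the metadata log
    fetch:   callable returning the raw email bytes for a hash
    rules:   list of (header name, substring, tag) triples
    """
    for kind, email_hash in records:
        if kind != "store":
            continue  # only newly stored mail is inspected
        # The message is parsed in memory and never written to disk.
        msg = message_from_bytes(fetch(email_hash))
        for header, needle, tag in rules:
            if needle in str(msg.get(header, "")):
                yield email_hash, tag
```

Such a client would sit on the Log stream, call Fetch for each new Store record and issue Tag requests for the matches, which is essentially server-side filtering built from the same commands every other client uses.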
A Prototype
Some of this exists in some form in a prototype I started writing. Some of it isn't implemented in any form yet; tags especially are in the latter category, as you can tell from my sparse description of them. Sending has also barely been started. Of course things will change as I actually start to use something like this.
The road to adopting a new server isn't that crazy either. It's feasible to hook up receiving by forwarding from an existing account and sending via an existing SMTP relay. So you could experiment with this while still keeping all email available in an existing traditional account.
Feedback
Feel free to reach out to me on Mastodon if you find any of this somewhat interesting.