Question about 1Password cloud Sync architecture

suparngp · August 2019

Hi,
Following up on this tweet here

https://twitter.com/suparngp/status/1161548418961018880?s=21

I am really fascinated by the instantaneous sync between two iOS devices using a 1password.com account. The sync happens right away. It even updates a filtered list of records with the new entries, unlike every other password manager app, or even apps in general, which don't reflect an update unless you clear the filter and reapply it. I have a feeling that this level of responsiveness cannot be achieved with just silent push notifications (at least I have never been able to sync reliably with them). The Apple Notes app, for example, uses CloudKit, I believe, and I would assume you must have something similar for non-iCloud sync. It is almost as if something like Firebase or WebSockets is powering the sync. So I am really curious how you implemented your sync at scale, if someone in engineering is okay discussing it.

rickfillion · August 2019

Hi @suparngp,

Thanks for the compliment. We put a lot of thought and work into sync for 1Password.com. It wasn't our first foray into the sync world, as we'd previously built syncers to work via Dropbox, iCloud (both the old-school Documents system that was retired ~5 years ago, and CloudKit), Wi-Fi, and a few other systems. There are parts of each of those systems that we quite liked, and other parts that we found very frustrating, whether in speed, reliability, or debuggability. We used a pretty big hammer to try to solve those problems: own the sync from start to finish.

1Password.com's syncer is built around a couple of things:

A websocket connection providing a persistent connection to our server while the app is unlocked. This connection is used to tell clients that data has changed. It doesn't say which data changed, at least not yet. Our server is responsible for finding all devices interested in a change, and queues up a message to send to them.
A very efficient sync protocol. The websocket message received on the client simply triggers sync. Other events as you use the app can also trigger sync: unlock, edit of an item, switching vaults, adding an account, etc... The sync protocol is efficient enough that we can be relatively eager about triggering it. The first network call done during sync is one that our server is designed to handle extremely quickly, and it provides the app everything it needs to know in order to complete sync: which vaults it should add, which it should remove, which vaults have extra content it needs to sync down, etc... In the case of syncing down data within a vault, which is the most common case, the app can download a delta from our server from its last checkin point to a new checkin point. The server makes sure to batch as many changes together as it can while keeping the payload size reasonable.
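
To make the overview-then-delta idea concrete, here is a minimal in-memory sketch in Python. It is not 1Password's actual protocol or API: the `Vault`, `Server`, and `Client` classes, the `overview` and `delta` method names, the integer checkpoints, and the `max_batch` cap are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Vault:
    """Server-side vault modeled as an append-only change log (hypothetical)."""
    changes: list = field(default_factory=list)  # entry i is version i + 1

    def delta(self, since, max_batch):
        """Changes after checkpoint `since`, capped so the payload stays small."""
        batch = self.changes[since:since + max_batch]
        new_checkpoint = since + len(batch)
        return batch, new_checkpoint, new_checkpoint < len(self.changes)


class Server:
    def __init__(self):
        self.vaults = {}  # vault name -> Vault

    def overview(self, checkpoints):
        """The cheap first call: everything the client needs to plan its sync."""
        return {
            "add": [v for v in self.vaults if v not in checkpoints],
            "remove": [v for v in checkpoints if v not in self.vaults],
            "pull": [v for v, cp in checkpoints.items()
                     if v in self.vaults and cp < len(self.vaults[v].changes)],
        }


class Client:
    def __init__(self, server, max_batch=2):
        self.server = server
        self.max_batch = max_batch
        self.checkpoints = {}  # vault name -> last check-in point
        self.items = {}        # vault name -> {item_id: value}
        self.requests = 0      # simulated network calls, for illustration

    def sync(self):
        self.requests += 1  # the overview call
        plan = self.server.overview(self.checkpoints)
        for name in plan["remove"]:
            del self.checkpoints[name], self.items[name]
        for name in plan["add"]:
            self.checkpoints[name], self.items[name] = 0, {}
        for name in plan["add"] + plan["pull"]:
            more = True
            while more:  # download the delta in batches until caught up
                self.requests += 1
                batch, checkpoint, more = self.server.vaults[name].delta(
                    self.checkpoints[name], self.max_batch)
                self.items[name].update(batch)
                self.checkpoints[name] = checkpoint
```

In this sketch a client that is already up to date pays for a single `overview` call and nothing else, which is what makes it reasonable to trigger sync eagerly.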

If you have two devices unlocked sitting next to one another, doing an edit looks like this:

Device 1: User saves edit
Device 1: Triggers sync
Device 1: Performs a sync overview call to see what needs to be done. Sees that it's up to date except the one edit to push up.
Device 1: Pushes change up to server
Server: Accepts (or rejects) change to the vault
Server: Finds all users who would need to be notified about this change
Server: Queues a notification to be sent over the websocket connection
Server: Returns the data to device 1 that it needs to know that everything was accepted.
Device 1: Done.
Notification Service: Receives queued event
Notification Service: Finds all devices currently connected associated with the user list in the event
Notification Service: Sends a message via the websocket connection to those devices
Device 2: Receives websocket notification
Device 2: Triggers sync
Device 2: Performs a sync overview call to see what needs to be done. Sees that it needs to download content for a vault.
Device 2: Pulls content for the vault
Device 2: Applies the diff to its local dataset (dealing with conflicts if needed)
Device 2: Refreshes its UI to show the new data
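
The steps above can be simulated in a few lines of Python. This is only a sketch under stated assumptions: one vault, everything in memory, and hypothetical names throughout (`Server`, `Device`, `NotificationService`, `drain`, and so on). Conflict handling and the UI refresh are left out.

```python
from collections import deque


class Server:
    def __init__(self, vault_users):
        self.vault_users = set(vault_users)  # users with access to the vault
        self.changes = []                    # accepted edits, in order
        self.queue = deque()                 # events for the notification service

    def overview(self, checkpoint):
        """Sync overview: how many changes the caller still has to download."""
        return len(self.changes) - checkpoint

    def pull(self, checkpoint):
        return self.changes[checkpoint:], len(self.changes)

    def push(self, user, edits):
        if user not in self.vault_users:
            raise PermissionError("change rejected")
        self.changes.extend(edits)                           # accept the change
        self.queue.append({"users": set(self.vault_users)})  # queue a notification
        return len(self.changes)                             # confirm acceptance


class Device:
    def __init__(self, user, server):
        self.user, self.server = user, server
        self.checkpoint = 0
        self.items = {}
        self.pending = []  # local edits not yet pushed

    def save_edit(self, item_id, value):
        self.items[item_id] = value
        self.pending.append((item_id, value))
        self.sync()  # saving an edit triggers sync

    def on_websocket_message(self):
        self.sync()  # the message carries no data; it only triggers sync

    def sync(self):
        if self.server.overview(self.checkpoint) > 0:
            diff, self.checkpoint = self.server.pull(self.checkpoint)
            self.items.update(diff)  # a real client would resolve conflicts here
        if self.pending:
            self.checkpoint = self.server.push(self.user, self.pending)
            self.pending = []


class NotificationService:
    def __init__(self, server):
        self.server = server
        self.connected = []  # devices with a live websocket

    def drain(self):
        """Deliver every queued event to the connected devices of its users."""
        while self.server.queue:
            event = self.server.queue.popleft()
            for device in self.connected:
                if device.user in event["users"]:
                    device.on_websocket_message()
```

In this simulation the editing device also receives the notification, and its resulting sync finds nothing to do after the overview call. Whether the real service skips the originating device is not stated above.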

You might wonder why we don't simply ship the diff within the websocket message itself. That could let us skip a bunch of steps. There are technical reasons why we haven't yet taken that step, but the one that matters most is that we don't want clients to rely solely on the websocket messages. We wanted the websocket to be something that helps the app feel snappier, not a core part of sync. This was a good idea, because we've found that several networks out there actually end up blocking the websocket connection. Since the fallback is needed anyway, we decided to make the fallback the happy path and optimize that. Now that we're happy with the fallback path, we've been contemplating upping our game with the websocket approach and finding good ways to give clients what they need to either sync more efficiently or skip sync entirely beyond receiving the message.
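
A small Python sketch of that design choice, assuming hypothetical names (`Server`, `Client`, `trigger_sync`, `deliver_websocket_message`): the websocket message is just one more reason to run the same sync routine, so a client on a network that blocks websockets still converges the next time any other trigger fires.

```python
class Server:
    def __init__(self):
        self.changes = []  # list of (item_id, value) pairs, in order

    def changes_since(self, checkpoint):
        return self.changes[checkpoint:], len(self.changes)


class Client:
    def __init__(self, server, websocket_works=True):
        self.server = server
        self.websocket_works = websocket_works  # False models a blocking network
        self.checkpoint = 0
        self.items = {}
        self.sync_reasons = []

    def trigger_sync(self, reason):
        """The single sync path, shared by every trigger."""
        self.sync_reasons.append(reason)
        delta, self.checkpoint = self.server.changes_since(self.checkpoint)
        self.items.update(delta)

    def deliver_websocket_message(self):
        """Return True if the nudge arrived; it never carries data itself."""
        if not self.websocket_works:
            return False
        self.trigger_sync("websocket")
        return True


server = Server()
server.changes.append(("wifi", "correct horse"))

fast = Client(server)
blocked = Client(server, websocket_works=False)

fast.deliver_websocket_message()     # arrives and triggers sync right away
blocked.deliver_websocket_message()  # dropped by the network, nothing happens
blocked.trigger_sync("unlock")       # a later unlock runs the same sync path
```

Both clients end with identical data. The only difference is how soon the blocked one catches up.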

Sync is a super fun problem to work on. I'm quite proud of what we've built for 1Password.com. There are parts of it that haven't aged as well as others, but overall I think the design has been a good one. Many users have noticed the significantly faster sync times, and we've reduced the amount of time we need to spend debugging problems as compared to our older solutions. Owning the whole process isn't an option that is available to everyone as there are definitely more development costs associated with it, but in our case it was absolutely worth it.

Rick
