Tuesday, April 29, 2014

Google Drive Realtime API - First Thoughts

Technologies tend to arrive with a bang of hyperbole, and then settle into the valley of despair before reaching the plains of enlightenment. It's a well known process. Look it up.

Google's Realtime API is probably going to buck that trend. Well, unless a lot more people read my blog than I know about, because this is a technology that is so sufficiently advanced it's essentially magic. But few people know about it.

I'm assuming you've used Google Docs at some point, and had the pleasure of watching other peoples' cursors (or your own) wandering through the document. Collaborative text editing is older than Sub-etha-edit, but Docs took it to the point where it "just worked" across the entire Internet. Do not underestimate that achievement.

The Realtime API is that.

In short, you can write Javascript web pages that hold a shared 'document' object, and every copy of that data structure in every other browser in the world updates to include any changes.

Seriously, stop and think about that. Text boxes which update when anyone changes their content. We used to laugh about such things being depicted in bad 80's action movies, but that's now state-of-the-art.

I have made my own attempts at such a technology, which is why I find their solution to be so familiar. And I also understand the limitations... that while the API does its best to appear magical in operation, you pay for it in other ways: atomicity being the primary one.

So before going into it's strengths, let's go over the weaknesses of the OT (Operational Transform) approach, in order to better dance around the landmines.

The big one: binary data. OT depends on having a structured understanding of the data it's transforming - it wants to be 'git' (the version control system) but without the possibility of ever having 'unresolved edit conflicts' that require manual intervention.

Binary blobs are - by definition - unstructured, and the realtime API cannot patch large blocks of binary data without fundamentally stepping all over the toes of everyone else trying to do the same thing. the upshot: patches can't be combined, so changes go missing.

So, don't keep BLOBs in Realtime.

A DOM-like object tree is the exact opposite. It is so structured that every branch insertion, every node deletion, can be tracked as a separate "mutation". That's great! The OT system has a stream of micro-operations that can be 'transformed' against each other in a more granular way. Google sat down and figured out the full "theory of patches" for that limited case.

Text strings are a kind of half-way between the two, and one where the OT 'rules' are simple enough that most programmers could sit down and work them out in a half hour.

Creating an OT 'grammar' is necessary for each datatype. The rules which work when combining text edits in a "description" box are not adequate to make sure that two "legal' edits to a JSON string result in a syntax legal combination. The strings may combine to produce invalid JSON... bad if that field is storing program config data.

If you know that two binary blobs represent GIF images, then extracting the 'differences' between two versions (with the intention of applying the 'difference' to a third image) is a simple set of photoshop operations. Without that knowledge, a 'standard binary merge' is only going to corrupt the GIF file.

Clearly, the OT rules for combining images are useless for combining XML data. Every datatype needs an OT definition, and it's not proven (or possibly proveable) that all datatypes can have one.

The academic area that looks into this is called the "Theory of Patches". If you read the papers in the hopes of finding a solution, what you tend to get is "Oh no, it is so much worse than you thought... have you considered these pathological merge cases?" and then your head hurts.

The best thing about the "Theory of Patches" is that, in academic style, at least it lays out the general shape of the minefield, and mentions some particularly impressive craters from past attempts to get through it.

For the moment, the Drive API only has built-in rules for three datatypes: Strings, Lists, and Maps. ('Custom' objects are possible, but they're really Maps in disguise) And frankly, Lists are a pain in the ass.

But since you can build pretty much any data tree structure you want out of those, you're generally good. And by doing so, your 'document model' is granular enough that OT can merge your micro-patches with everyone else's version and keep that tree in synch.

Then there are the consequences... because you have a data structure that doesn't only change when you tell it to, but when any bloody person in the world does. What happens when, three steps into a dialog wizard, someone else deletes the file? Well, you're going to have to code a listener for that.

There are other hard limits: 10Mb per 'realtime document', and I think it was 500k per 'patch'. But you should plan to never hit those limits: if you store bulk data in realtime, you're doing something wrong. (That's what the normal drive API is for.) Realtime is for co-ordination and structure, not streaming, and not data transfer.

Google handles all the authentication, permissions can be set for files through the usual Drive web interface, which is nice. Realtime documents get their permissions from the 'base' drive file they're attached to - like 'conversations' that can be started about any existing file. (actually, file version. If you change the base file, you invoke a new realtime branch - watch that.)

Although the OAuth sign-in process is a lot slicker than it used to be, it still has problems... Mostly caused by pop-up blockers. But that's part of a much bigger discussion I want to save for another day.

And they have automatic undoUndoooo! Do you know how hard that is? How much of your time that saves? How happy your users will be?

What the "Realtime API" does is make Google into the biggest 'Chat Server' in the world. Every document is hooking up to its own little social networking hub, to discuss what buttons their users are pressing, and how their cursors are moving today. A billion tiny brainstorming sessions, between browsers.

There's a lot of guff being written about how people are leaving the social networks. That's fine... Google's social network isn't just for people. It's a peer message-passing layer for our software, arguably more useful and important to the internet's long-term future.

I really encourage you to start writing code that depends on this modern view. Spend time with the paradigm, learn its' flaws and graces - you can probably have a shared 'ToDo' list app working in a few hours that instantly scales to millions of users. But be prepared to let go of a lot of baggage. This is Star-Trek level technology, so don't try applying 20th century thinking. Start fresh.

Enlightenment awaits.

No comments:

Post a Comment