- 
                Notifications
    You must be signed in to change notification settings 
- Fork 37
Transaction model
To support undo and concurrent editing, we update our model following a set of "transactions".
The client is the JS code running on the user's browser.
The server is our cluster of web servers. They communicate with the datahorse, another cluster. We call it a datahorse to distinguish it from the database, which is the actual persistent storage layer, currently PostgreSQL.
The model is all the data in a single document set. It is too big to be copied between the database and the server, so only the database can see the complete model. But the server, being a crafty proxy, can query for any parts of the complete model that it needs, whenever it wants, so as far as the client is concerned, the server model is complete. The client model is at the lower end of the food chain.
The model contains several different types of objects: Documents, Tags, and Visualization Objects. Documents and tags are built-in, while visualization objects are created and processed by a visualization plug-in, running on worker, server, and client.
We model a document object as something with:
- a method to retrieve the text
- a method to display in browser
- metadata, which is an arbitrary set of key:value pairs
Maybe the metadata also allows recursive structures ala JSON. Would be nice. Then it would be easy to model applied tags as an array of tag IDs in the document metadata, which would make tags not a special case from the point of view of the transaction system.
A tag is a top level object, currently containing a tag name and color. Tags can only be applied to documents. The tag list on a document is distinct from the tag object itself. Ideally, the tag list on a document is modelled using the general document metadata system.
A visualization object is generated from a document set by a visualization plugin. I imagine it as something like a case class or JSON object, with a unique ID.
For the current tree view, nodes would be the only visualization object (documents are built in to the system, tags are metadata). For a scatterplot view, individual dots in the plot might be visualization objects, each with document ID, position, size, color, first visible zoom level, etc.
There are three parts to a visualization plugin. These provide the following APIs:
Worker:
- CreateViz: document set => visualization objects in the DB.
Datahorse:
- applyDelta: compressed delta => changed objects and expanded delta in DB
- changedObjects: version number, object list, view => returns set of objects that have been changed in the DB since version, or created within the given view, with newest values for each object
Client
- Manages UI drawing and interaction,
- generates compressed deltas when a makes a change
- generates expanded deltas from compressed deltas so that changes can be applied (predictively) on the client side
All changes are executed in atomic operations called "transactions." The server maintains the definitive current server version and orders transactions received by all clients editing the same document set.
A transaction is expressed as a delta, a change with respect to the current state. Deltas are initially generated as compressed deltas which allows shorthand lists of changed objects, as compressed scopes (see below.) A compressed delta can be expanded with respect to a current version, on either the client or a server. An expanded delta is a complete list of all objects changed, and their changes. It contains enough information to be inverted, that is, undone. It does not reference any objects that are not changed.
A compressed scope is a way to refer to a list of objects (documents, tags, visobjs.) A compressed scope is a list of "scope elements", each of which is one of
- a literal list of object IDs
- an object query, which selects objects based object type, a set of fields to match exactly, and given match values
- visualization scopes: these have arbitrary payload and are interpreted by the visualization plugin
For the tree viz, a compressed scope is akin to the current selection object, which is a set of nodes, tags, docs. There could also be a visualization-specific scope for the tree which selects a subtree rooted at a node, down to N levels. A scatterplot viz could use selection rectangle coordinates as viz-scope query
Conceptually, a compressed scope always expands to a set of objects.
A view is an object that defines "where the user is looking" in a visualization. The precise definition is that a view is sufficient to determine whether a newly created visualization object would be visible to a client with that view. Views are represented as visualization scopes.
Each visualization has its own current view.
For the tree, the view is the set of open nodes. For the scatter plot, the view is the bounding box and zoom level of the region of the scatterplot currently displayed.
Because a view is a visualization scope, the server side plugin can answer the question, is a given visualization object visible in this view?
The client can ask the server to send any set objects over the wire, specifying the objects via a compressed scope.
The client always has a current local version. It is based on the last server version that the client knew of, the base version. The client maintains several ordered lists of transactions:
- outbox: locally applied changes yet to be sent to the server
- diffbox: locally applied changes that have been sent to the server but not yet acknowledged as applied on the server side. If the diffbox is not empty, the client version is divergent.
- delta archive: an ordered list of all expanded deltas applied to the local version. It may overlap with outbox, divergent and the undo stack.
The user can make changes to the model either through built-in UI (e.g. the document list) or the visualization. When this happens, the client
- Creates a compressed delta, and assigns it GUID (which must be globally unique.)
- Expands the delta against the client version, which produces a list of objects and their changes.
- Applies the expanded delta to the client version as best it can (a predictive update)
- Adds the compressed delta to the outbox
- Adds the expanded delta to the delta archive.
Steps 1, 2, and 3 may involve calling a client-side visualization plugin, if the change was made through the visualization UI and/or references visualization objects.
The client produces a predictive update when it expands a delta locally. The client cannot always completely update the local state after a transaction, because the transaction may alter the state in ways the client cannot compute.
Suppose Alice adds a tag T to all documents in a node N. This requires the following updates to the client state:
- Update the count of documents with tag T on all parents and children of N. This can be computed locally because we knew the tag count for T on N before tagging, and now that is the number of docs in N. So we can add the difference to all loaded parents of N.
- Update the count of documents tagged with T. Again, we can use the difference between the old and new counts.
- Tag indicators on individual documents. Because each document on the client currently has associated with it the complete path (set of nodes that include the doc) we can do this.
- Document list for T. The client caches the first ten documents with tag T, ordered by description. This list can change if more documents are tagged, but the client does not have the full list of documents so it cannot make this change.
So, Alice's client can update 1-3 immediately, but must contact the server to update 4. The update of items 1-3 is called "predictive update," and it must be designed to match exactly what the server will produce.
When the client expands a delta, it produces a predictive update. In some cases the client expanded delta will be "complete" in sense it updates all necessary state, but in others, such as this case, we will need to get a later definitive update from the server.
A correct predictive update method will produce exactly the same values that the server eventually broadcasts. This is a great integration test.
Periodically (but with high priority) the client sends all compressed deltas in the outbox to the server, in order. (They may be batched into a single POST, but the order must always be preserved.)
When the server receives a delta from a client, it
- Checks that the user has access to the given document set
- Validates the delta for syntax and semantics (not checking anything on the database).
- Forwards the still compressed delta to the database worker
- Returns OK
After the server acknowledges receipt of each delta, that delta is moved from the outbox to the diffbox.
The datahorse is (potentially) a different process on a different machine. The point is to decouple server response from database execution, and provide a clear synchronization point that orders the transactions from multiple clients. When the datahorse receives a transaction from a server, it:
- Queues the transaction, if it's busy. This produces the definitive ordering of the transactions. There is one queue per document set.
- Translates the compressed delta into instructions for the database (e.g. SQL) to generate and apply an expanded delta
- Submits these instructions to the database and ensures they execute.
- Increments the server version by one
Steps 2 and 3 may involve calling a visualization plugin, if the delta references visualization objects. The visualization plugin translates the compressed delta into database manipulations on visualization objects, sufficient to apply and store an expanded delta.
Periodically, the client polls the server to check for updates from other clients, and to complete updates of the local state which cannot be computed locally. This is the most complicated step in transaction handling, and is executed as follows:
- The client sends the server a complete list of all locally cached objects, plus the current view for all visualizations, plus the number of the client's base version (this is the last server version seen before divergence.)
- The server sends a list of expanded deltas representing the changes made by all clients (including the calling client) since that version. Each expanded delta includes the (changed) contents of all objects which were listed in step 1, plus any newly created (or moved) visualization objects are in the client view.
- If the first (oldest) delta send by the client is not the same as the first (oldest) delta in the diffbox, the client rewinds to its last server version by inverting all expanded deltas in its diffbox and applying them in reverse order. The returns the divergent version to the server version.
- Then the client applies each delta from the server, in version number order, and applies it to the local state. The base version number is set to the server version number of each applied delta.
- During this process, it compares the GUID of each delta with those in its diffbox, and removes any matching deltas from the diffbox.
- Each applied expanded delta is added to the delta archive.
The updates retrieved from the server during this process are known as "definitive" updates, as opposed to client-only "predictive" updates.
The server forwards the request to the datahorse. This includes the set of objects {C} that the client set, plus a set of views {V} which determine where each visualization is looking, plus the last server version V that the client had.
The datahorse walks through each expanded delta in the database, for all transactions applied after V, in order. For each transaction it intersects {C} with the set of objects in the expanded delta, and accumulates into a final set of updated objects {U}
At each transaction which involves the creation of visualization objects, the datahorse calls the visualization plugin for each view in {V} to ask it whether any of the newly created objects are in view. If so, the objects in view are added to {U}
Finally, the datahorse returns the current (latest) value of each object in {U} to the server, for forwarding to the client.
For undo, the client needs a bit more machinery:
An undo list, a list which only stores the user's own expanded deltas. (This mimics the outbox more than it does the delta archive.) An undo list pointer, which helps us manage "redo" as well.
When the client performs a transaction, it:
- Truncates the undo list, removing everything after the undo list pointer
- Appends the expanded delta and its GUID to the undo list
- Increments the undo list pointer
When the user hits "undo", the client:
- Creates a compressed delta for the "undo" action, which consists of the GUID of the delta the undo list pointer points to
- Decrements the undo pointer
- Processes this delta using the normal steps in "client transactions."
The actual user-visible undo on the client side is handled by step 3 of client transaction processing, the predictive update. However, we still trigger an update cycle on the server, because (for example) the view may have changed, meaning that objects changed by the undo are newly in scope and we need their values from the server.
Note that for the server to interpret this compressed delta which represents undo, the server must already have received the original delta that is being undone. This is guaranteed by ensuring that the outbox preserves order of submitted transactions.
When the user hits "redo", the client:
- returns, if the undo list pointer points to the end of the list
- Increments the undo pointer
- Creates a compressed delta for the "redo" action, which consists of the GUID of the delta the undo list pointer points to
- Processes this delta using the normal steps in "client changes."
Aside from this, "undo" and "redo" transactions are identical to any other. The compressed delta consists solely of a GUID, and both the client and the server can expand it, based on their delta archives.
The client and server need to implement several locks to ensure consistency in this asynchronous environment.
- 
The datahorse is responsible for ordering the transactions on each document set, and executing them one at a time as essentially atomic operations. Reads to the database during transaction execution must return the previous (last complete) version. 
- 
When a server begins an update cycle, as requested by the client, it must be relative to the current version. No part of any subsequent versions can be read during the computation of changed or newly in-view objects, or the retrieval of the values of these objects. 
This may have implications for how the database is structured, since it must be possible to retrieve multiple object values at different versions simultaneously, for different clients which have started update cycles at different server versions. Alternatively, client update cycles can be serialized with transactions, so that only one transaction OR update cycle can be processing at any one time, and every update cycle starts at the latest version. This is not as horrible as it sounds, as each document set can be locked independently.
During the update cycle, after the client begins sending all cached updates to the server, the client:
- cannot request the value of additional objects from the server, because these values may come from a difference version than the one the client is about to be updated to
- cannot make changes to any client side objects (e.g. predictive updates), because these changes may conflict with the rollback-update cycle
- The client can only have one active update at any time.
It would be possible to remove client lock 1 if the client is aware of the latest server version, the version that the update will bring the client to, and asks the server for object values at that version. This implies that the client API also takes a version number, and again, that the server is capable of tracking multiple client versions.
Client lock 2 may be much more difficult to remove. It would involve duplicating the current state into two states A and B. Then A would be rolled back/updated while the user sees B and all new predictive updates are applied to B, and also queued as expanded deltas in a box C. When the update is finished, the contents of C can be applied to the updated A, which becomes the new master state (B is discarded.)
Both of these plans still require client lock 3.