Akvo Flow Sync API
The Akvo Flow Sync API (Sync API) provides a way of getting the changes that happened in the system. This is useful for synchronizing data changes out of Akvo Flow to an external system in near real time.
Before the Sync API, an Akvo Flow user who wanted to synchronize data out of the system had to read all of its data every single time and then try to "detect" what had changed. This process is onerous on both ends: for the server, which serves the API requests, and for the client, which receives repetitive data and must implement the change detection logic.
With the Sync API a client can obtain "events" for relevant entities that have changed in the system. It is even possible to detect entities that have been deleted, something that is not easy with the other approach.
The Sync API publishes change and delete events for:
- Survey
- Form
- Form Instance
- Data Point
The Sync API has the following properties that must be understood:
- Eventual consistency
- At-least-once delivery
- Data authorization
- Authentication and request headers
Because of the way the system records data change events, we can only guarantee that a Sync API client will eventually be "in sync" with the state of the database in Akvo Flow. That is, a client starts with some subset of the Akvo Flow data, consumes the published changes, and at some point reaches the same state as the Akvo Flow datastore. More info: https://en.wikipedia.org/wiki/Eventual_consistency
Akvo Flow publishes the changes that happen in the system (a new survey was created, a new form instance was ingested, etc.), but a particular event can show up more than once in the list of published changes. The client must be prepared to process possibly duplicated events.
The Sync API follows the exact same authorization model of Akvo Flow. Users of the Sync API will only see changes that are authorized. There are 2 important points to highlight:
- Due to data authorization, a client can receive an "empty batch" (the user is not allowed to "see" any of the changes in that batch). In this case the client should continue with the next batch.
- Events related to object deletions (surveyDeleted, formDeleted, formInstanceDeleted, dataPointDeleted) can't be authorized, so the client can receive ids for entities it doesn't know. The client must ignore those ids.
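The two points above can be sketched as follows. This is a minimal illustration, assuming the client keeps its entities in plain dictionaries keyed by id and that each Deleted list holds bare ids. `DELETED_KEYS`, `is_empty_batch`, `apply_deletions`, and the collection names are hypothetical and not part of the Sync API.

```python
# Deleted key -> local collection name (the collection names are assumptions).
DELETED_KEYS = {
    "surveyDeleted": "surveys",
    "formDeleted": "forms",
    "formInstanceDeleted": "form_instances",
    "dataPointDeleted": "data_points",
}


def is_empty_batch(changes):
    """Return True when no key in the batch carries any event."""
    return all(len(events) == 0 for events in changes.values())


def apply_deletions(store, changes):
    """Remove deleted ids from the local store; return how many were removed."""
    removed = 0
    for key, collection in DELETED_KEYS.items():
        bucket = store.setdefault(collection, {})
        for entity_id in changes.get(key, []):
            # Ids the client has never seen are skipped, not treated as errors.
            if entity_id in bucket:
                del bucket[entity_id]
                removed += 1
    return removed
```

An empty batch still carries a nextSyncUrl, so a client would typically store that URL and move on to the next request.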
- The Sync API uses the same authentication method as the Akvo Flow REST API.
- The same request header requirements apply.
The steps in more detail:
A client must obtain and store a Sync URL by making an initial request:
https://api-auth0.akvo.org/flow/orgs/<org>/sync?initial=true
Example response:
{
"nextSyncUrl": "https://api.akvo.org/flow/orgs/<org>/sync?next=true&cursor=6534"
}
- <org> is the organization subdomain
- nextSyncUrl is a unique, "use as is" URL that a client must store and use for consuming change events. NOTE: The client must not try to interpret it. The server is entitled to change its structure.
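A minimal sketch of the initial request, assuming the caller supplies an `http_get` callable that performs an authenticated GET with the headers the Akvo Flow REST API requires and returns the response body as text. `http_get`, `initial_sync_url`, and `start_sync_session` are hypothetical names used for illustration only.

```python
import json


def initial_sync_url(org):
    """Build the initial request URL for an organization subdomain."""
    return f"https://api-auth0.akvo.org/flow/orgs/{org}/sync?initial=true"


def start_sync_session(http_get, org):
    """Make the initial request and return nextSyncUrl untouched.

    http_get is an assumed callable: it takes a URL, performs an
    authenticated GET, and returns the response body as a string.
    """
    body = http_get(initial_sync_url(org))
    # The URL is opaque: return it as is, without parsing or rebuilding it.
    return json.loads(body)["nextSyncUrl"]
```

Keeping the HTTP call behind a callable leaves authentication out of the sketch and makes the function easy to exercise with a stub.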
Using the Akvo Flow REST API, the client reads and processes its current data.
Once the client has read and processed its data from the Akvo Flow REST API, the client is "ready" to start consuming change events.
Using the nextSyncUrl from Step 1, the client makes a GET request to know what has changed in the system.
Example response:
{
"changes": {
"dataPointDeleted": [],
"dataPointChanged": [],
"formChanged": [],
"surveyChanged": [
{
"id": "203260001",
"name": "Survey test - Sync API",
"registrationFormId": "",
"createdAt": "2020-02-10T20:55:42.645Z",
"modifiedAt": "2020-02-10T20:56:04.037Z"
}
],
"formDeleted": [],
"surveyDeleted": [],
"formInstanceChanged": [],
"formInstanceDeleted": []
},
"nextSyncUrl": "https://api-auth0.akvotest.org/flow/orgs/uat2/sync?next=true&cursor=6805"
}
The response contains 2 keys:
- changes: An object with a predefined set of keys: surveyChanged, surveyDeleted, formChanged, formDeleted, dataPointChanged, dataPointDeleted, formInstanceChanged, formInstanceDeleted
  - Each entity uses the same representation as in the Akvo Flow REST API, except that it doesn't contain "URL links", only data.
    - surveyChanged - no dataPointsUrl
    - formChanged - no formInstancesUrl
    - dataPointChanged
    - formInstanceChanged
  - For Deleted events, only entity ids are presented
  - Notice the "Changed" suffix in some keys. The Sync API makes no distinction between new and updated entities. The client can deal with this by using an upsert strategy.
- nextSyncUrl: Similar to the initial request. A "use as is" URL to be used for getting the next batch of changes.
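One way to implement an upsert strategy for the Changed keys is sketched below. It assumes an in-memory store of dictionaries keyed by entity id and that every changed entity carries an "id" field, as the survey in the example response does. `CHANGED_KEYS`, `upsert_changed`, and the collection names are hypothetical.

```python
# Changed key -> local collection name (the collection names are assumptions).
CHANGED_KEYS = {
    "surveyChanged": "surveys",
    "formChanged": "forms",
    "dataPointChanged": "data_points",
    "formInstanceChanged": "form_instances",
}


def upsert_changed(store, changes):
    """Insert or overwrite each changed entity by id; return how many were seen."""
    seen = 0
    for key, collection in CHANGED_KEYS.items():
        bucket = store.setdefault(collection, {})
        for entity in changes.get(key, []):
            # New and updated entities take the same path: overwrite by id.
            bucket[entity["id"]] = entity
            seen += 1
    return seen
```

Because the write is keyed by id, applying the same event a second time leaves the store unchanged, which also keeps duplicated events harmless.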
The client continues reading the changes using nextSyncUrl until no changes are available. When a client reaches the end of the list of changes:
- The server returns HTTP 204 (No Content)
- The server returns a response header Cache-Control: max-age=<value>
- The client must wait the value of max-age (e.g. 60 seconds) before attempting to get more changes using the last nextSyncUrl available
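The end-of-list behaviour can be sketched as a single polling step. This assumes a `fetch` callable supplied by the client that returns a `(status, headers, body)` tuple, with `headers` as a dictionary keyed by "Cache-Control" and `body` as the decoded JSON. `parse_max_age`, `poll_once`, and the 60-second fallback are illustrative assumptions.

```python
import re


def parse_max_age(cache_control, default=60):
    """Extract the max-age seconds from a Cache-Control header value."""
    match = re.search(r"max-age=(\d+)", cache_control or "")
    return int(match.group(1)) if match else default


def poll_once(fetch, sync_url, handle_batch):
    """Run one polling step; return (url_to_use_next, seconds_to_wait).

    fetch is an assumed callable returning (status, headers, body).
    """
    status, headers, body = fetch(sync_url)
    if status == 204:
        # End of the list: keep the same URL and wait for max-age seconds.
        return sync_url, parse_max_age(headers.get("Cache-Control"))
    if status != 200:
        raise RuntimeError(f"unexpected status {status}")
    handle_batch(body["changes"])
    # More changes may be pending, so continue without waiting.
    return body["nextSyncUrl"], 0
```

A driver loop would call `poll_once` repeatedly, store the returned URL, and sleep for the returned number of seconds between calls.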
Fetching data using the Akvo Flow REST API takes time and is a forward-only process. Consider the following scenario:
- The time spent fetching all the data is the interval between 2 timestamps (ts0 and ts1).
- Your process starts at ts0 and has processed more than half of the stored data.
- Some user(s) submit data to an already processed Survey.
- In this scenario you miss some Form Instances, due to the forward-only nature of the processing. That new data will not be available until the next full synchronization of the data.
With the call /sync?initial=true, you start a "synchronization session", a mark in the full log of events. Then, when your process finishes at ts1, you can make a request to know what has changed since ts0, which is to say: give me all change events after the mark.
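The ordering that makes this work can be sketched as a small orchestration function. All three arguments are assumed callables supplied by the client and the names are hypothetical. The point of the sketch is only that the mark is taken before the full read begins.

```python
def full_synchronization(start_session, read_all_data, consume_changes):
    """Run the steps in the order that avoids losing data.

    start_session() returns the initial nextSyncUrl, read_all_data()
    performs the full REST API read, and consume_changes(url) reads
    change events starting at url and returns the latest nextSyncUrl.
    """
    # Take the mark first (ts0), before the long-running full read.
    sync_url = start_session()
    # Anything submitted while this runs is recorded after the mark.
    read_all_data()
    # Catch up (ts1): replay every event published since the mark.
    return consume_changes(sync_url)
```

Some events replayed in the last step may describe data the full read already picked up, which is one more reason for the client to handle changes idempotently.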
To have a reliable synchronization process, the client must store the nextSyncUrl. If the client has lost the value, it must start from scratch, from Step 1 of the described process.
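A minimal sketch of storing the URL durably, assuming a local file is an acceptable place to keep it. `save_sync_url` and `load_sync_url` are hypothetical helpers, and a database row or key-value store would serve equally well.

```python
import os
import tempfile


def save_sync_url(path, sync_url):
    """Persist the URL atomically so a crash cannot leave a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(sync_url)
    # Renaming within the same directory replaces the old file in one step.
    os.replace(tmp_path, path)


def load_sync_url(path):
    """Return the stored URL, or None when the client must start from Step 1."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read().strip() or None
    except FileNotFoundError:
        return None
```

Saving the new URL only after a batch has been fully processed means a crash leads to the batch being read again rather than skipped.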