Incremental Updates - UIS data store sync #3207

Closed
dwsutherland opened this issue Jul 6, 2019 · 4 comments


dwsutherland commented Jul 6, 2019

For communicating Workflow Service (WS) changes:

  • Whether for User Interface Server (UIS) data-store sync or GraphQL updates to a text-based user interface, there needs to be a ZeroMQ message pattern for pushing/publishing these updates.
  • A scheme to sync/mirror the data-store between the WS and UIS needs to be developed/proposed.
  • The data sync scheme puts requirements on the choice of network pattern (PUB/SUB, PUSH/PULL, PAIR/PAIR, etc.).

To kick the discussion off: based on the work done by @oliver-sanders, the graphql-protobuf PR, and the incremental data-store generation PR (pending merge), the following details one method/strategy to sync/mirror the WS data-store...

Scheme Proposal
The idea would be for each dynamic data element, that is, any data element type containing fields that are updated and/or populated post-creation (e.g. task/family cycle instances), to contain a timestamped ID. These are already in place as the stamp field, for example:

        {
            "stamp": "qar.20170201T0000+12@1562416226.7455504",
            "id": "sutherlander/baz/20170201T0000+12/qar",
            "task": "sutherlander/baz/qar",
            "state": "succeeded",
            "cyclePoint": "20170201T0000+12",
            ...
        }

(edges have static fields, but are added and removed dynamically)
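
For illustration, a minimal sketch of generating and splitting such a stamp, assuming the name.cycle@update-time form seen above (these helpers are hypothetical, not cylc API):

import time


def make_stamp(name: str, cycle_point: str) -> str:
    # e.g. make_stamp("qar", "20170201T0000+12")
    #   -> "qar.20170201T0000+12@1562416226.7455504"
    return f"{name}.{cycle_point}@{time.time()}"


def split_stamp(stamp: str) -> tuple:
    # Split a stamp back into its element key and update time.
    key, _, update_time = stamp.rpartition("@")
    return key, float(update_time)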
These stamp IDs would be collated into lists by type; this would form a sort of skeleton of the data-store, constructed at each end (WS and UIS):

message PbSkeleton {
    string workflow = 1;
    repeated string tasks = 2;
    repeated string families = 3;
    repeated string jobs = 4;
    repeated string edges = 5;
}

(Definition data elements are static content, only changed on run/restart/reload.)
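
To make the construction concrete, a rough Python sketch of assembling that skeleton at either end (the nested store layout here is an assumption):

def make_skeleton(workflow_id, store):
    """Collate stamps by element type, mirroring PbSkeleton.

    Assumed store layout: {'tasks': {id: element}, 'families': ..., etc.}
    """
    return {
        'workflow': workflow_id,
        'tasks': [el.stamp for el in store['tasks'].values()],
        'families': [el.stamp for el in store['families'].values()],
        'jobs': [el.stamp for el in store['jobs'].values()],
        'edges': [el.stamp for el in store['edges'].values()],
    }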
An update message would contain two lists for the corresponding type: one containing all the stamp IDs for that type, and the other all updated/new data elements:

message PbTaskUpdate {
    repeated string stamp = 1;
    repeated PbTaskProxy tasks = 2;
}

These update messages could be communicated using the PUB/SUB pattern (although there are others), and this pattern could be used for other purposes, such as updates to a text interface via GraphQL.
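
As a rough illustration of the PUB/SUB leg (a sketch only; the endpoint, port, and topic here are assumptions, not the actual cylc configuration):

import zmq

context = zmq.Context()

# WS side: publish the serialised update on a per-type topic.
pub = context.socket(zmq.PUB)
pub.bind("tcp://*:5560")
payload = update_msg.SerializeToString()  # a populated PbTaskUpdate
pub.send_multipart([b"task_proxies", payload])

# UIS side: subscribe to the topic and deserialise on receipt.
sub = context.socket(zmq.SUB)
sub.connect("tcp://localhost:5560")
sub.setsockopt(zmq.SUBSCRIBE, b"task_proxies")
topic, data = sub.recv_multipart()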
On receipt, as a subscriber, the UIS would (see the sketch after this list):

  1. Add the new data elements to its own data-store accordingly.
  2. Compare the received stamp list with its own skeleton.
  3. Use the difference to delete data elements.
  4. Use the difference to request missing elements over REQ/RES.
  5. Update its own skeleton.
  6. Listen for the next update message and repeat.
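
A minimal sketch of those six steps for one element type, assuming an {id: element} store; stamp_to_id and request_elements are hypothetical helpers standing in for the real ID mapping and REQ/RES call:

def apply_update(store, update):
    """Apply one PbTaskUpdate-style message to the local (UIS) store."""
    # 1. Add/overwrite the new or updated elements.
    for element in update.tasks:
        store[element.id] = element

    # 2. Compare the received stamp list with the local skeleton.
    received = set(update.stamp)
    skeleton = {element.stamp for element in store.values()}

    # 3. Delete elements whose stamps the WS no longer lists.
    for stale in skeleton - received:
        store.pop(stamp_to_id(stale), None)  # hypothetical helper

    # 4. Request elements for WS stamps we do not hold (REQ/RES).
    for element in request_elements(received - skeleton):  # hypothetical
        store[element.id] = element

    # 5. Regenerate the skeleton, then 6. await the next message.
    return {element.stamp for element in store.values()}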

Now, we could just hash the list of stamps down to a single string and compare that; however, sending the complete list has utility in the deletion and request of missing elements...

Example:
Take the capitalised letters as the elements and the lowercase strings as the stamps; then:

WSData {
  stamps: [b2, c1, d1, e1, f1]
  elements: [B, C, D, E, F]
}
UISData {
  stamps: [a2, b1, c1]
  elements: [A, B, C]
}
SyncMsg {
  stamps: [b2, c1, d1, e1, f1]
  elements: [B, E, F]
}

So the SyncMsg, whose elements [B, E, F] could be populated with just the updated fields, is published/pushed to the UIS; the UISData elements are then updated/populated and new stamps are generated from them:

UISData {
  stamps: [a2, b2, c1, e1, f1]
  elements: [A, B, C, E, F]
}

(As you may have noticed, the UISData missed an update message containing element D. This is intentional; its purpose is to show how the scheme handles this.)

Now, with the stamps from the updated UISData and the SyncMsg, we can:

  • Use the UISData difference [a2, b2, c1, e1, f1] - [b2, c1, d1, e1, f1] = [a2] to delete the corresponding elements (those no longer in the WS data-store).
  • Use the SyncMsg difference [b2, c1, d1, e1, f1] - [a2, b2, c1, e1, f1] = [d1] to request any missing elements (i.e. where an update message was missed).

(In set operations that would be foo.difference(bar) and bar.difference(foo))
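
Plugging the example stamps into those set operations (plain Python, just to verify the worked example):

sync_stamps = {"b2", "c1", "d1", "e1", "f1"}  # stamps in the SyncMsg
uis_stamps = {"a2", "b2", "c1", "e1", "f1"}   # regenerated UISData stamps

assert uis_stamps.difference(sync_stamps) == {"a2"}  # prune A locally
assert sync_stamps.difference(uis_stamps) == {"d1"}  # request missing D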

After deletions and requests are complete, generate the stamps again.
(Hence why the full list of stamps is sent with each update; the other option would be to periodically request/publish the entire skeleton.)

@dwsutherland dwsutherland added this to the cylc-8.0a1 milestone Jul 6, 2019
@matthewrmshin (Contributor) commented:

Is this fixed by #3202?

@hjoliver (Member) commented:

Is this fixed by #3202?

No, that was just incremental update of the data store inside the WFS, not incremental updates from WFS to UIS.

@matthewrmshin matthewrmshin modified the milestones: cylc-8.0a1, cylc-8.0a2 Sep 3, 2019

dwsutherland commented Oct 15, 2019

Changes to the proposed scheme:
Delta messages look like this:

message PbWorkflow {
    string stamp = 1;
    string id = 2;
    string name = 3;
    string status = 4;
    string host = 5;
    int32 port = 6;
    string owner = 7;
    repeated string tasks = 8;
    repeated string families = 9;
    PbEdges edges = 10;
    int32 api_version = 11;
    string cylc_version = 12;
    double last_updated = 13;
    PbMeta meta = 14;
    string newest_runahead_cycle_point = 15;
    string newest_cycle_point = 16;
    string oldest_cycle_point = 17;
    bool reloading = 18;
    string run_mode = 19;
    string cycling_mode = 20;
    map<string, int32> state_totals = 21;
    string workflow_log_dir = 22;
    PbTimeZone time_zone_info = 23;
    int32 tree_depth = 24;
    repeated string job_log_names = 25;
    repeated string ns_defn_order = 26;
    repeated string states = 27;
    repeated string task_proxies = 28;
    repeated string family_proxies = 29;
    string status_msg = 30;
    int32 is_held_total = 31;
}

message EDeltas {
    double time = 1;
    int64 checksum = 2;
    repeated string pruned = 3;
    repeated PbEdge deltas = 4;
}

message FDeltas {
    double time = 1;
    int64 checksum = 2;
    repeated string pruned = 3;
    repeated PbFamily deltas = 4;
}

message FPDeltas {
    double time = 1;
    int64 checksum = 2;
    repeated string pruned = 3;
    repeated PbFamilyProxy deltas = 4;
}

message JDeltas {
    double time = 1;
    int64 checksum = 2;
    repeated string pruned = 3;
    repeated PbJob deltas = 4;
}

message TDeltas {
    double time = 1;
    int64 checksum = 2;
    repeated string pruned = 3;
    repeated PbTask deltas = 4;
}

message TPDeltas {
    double time = 1;
    int64 checksum = 2;
    repeated string pruned = 3;
    repeated PbTaskProxy deltas = 4;
}

(Yes, the workflow delta is just the PbWorkflow message.)

  • These are populated with the changes from a WS main-loop iteration, and only the updated fields of the sub-messages are populated (and applied at the WS in the same way as at the UIS).
  • Instead of a stamps field, I opted for a checksum field, which is a hash of the stamps; this reduces the message size and can be used to tell whether updates have been missed (in which case, just request the full data thereafter). See the sketch after this list.
  • The time field can be used to order updates in a queue and discard stale ones (e.g. discard updates received before a full-data request).
  • The pruned field is a list of IDs of items to be deleted from the data-store.
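
A minimal sketch of applying one of these delta messages at the receiving end, under the scheme above (the store layout and the local_checksum/request_full_data helpers are assumptions, not the actual implementation):

def apply_deltas(store, delta, last_applied_time):
    """Apply a TPDeltas-style message to a local {id: element} store."""
    # Use the time field to discard stale/out-of-order messages
    # (e.g. updates queued before a full data request was answered).
    if delta.time <= last_applied_time:
        return last_applied_time

    # Merge field-level updates; MergeFrom only overwrites the
    # populated fields, leaving the rest of the element untouched.
    for element in delta.deltas:
        store.setdefault(element.id, type(element)()).MergeFrom(element)

    # Remove pruned elements.
    for element_id in delta.pruned:
        store.pop(element_id, None)

    # Compare a local hash of the stamps against the checksum; on a
    # mismatch an update was missed, so fall back to a full request.
    if local_checksum(store) != delta.checksum:  # hypothetical helper
        request_full_data()  # hypothetical REQ/RES fallback

    return delta.time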

(WS end implemented in #3389 so far)

@dwsutherland (Member Author) commented:

Closing: the dual PRs have been merged.
