[discussion] Change optimistic/server update ordering #2482

mattkrick · 2018-06-27T19:49:07Z

tl;dr My app shouldn't break just because someone has slow internet.

Relay's PublishQueue has a race condition built into it.
Let's take an example from a test that was passing, but definitely should not have been:

      const increaseVolumeUpdater = {
        storeUpdater: storeProxy => {
          const amp = storeProxy.get('84872');
          amp.setValue(amp.getValue('volume') + 1, 'volume');
        },
      };

      const setVolumeTo10Updater = storeProxy => {
        const amp = storeProxy.get('84872');
        amp.setValue(10, 'volume');
      };

queue.applyUpdate(increaseVolumeUpdater);
queue.commitUpdate(setVolumeTo10Updater);

As written, one would expect the volume to be increased by 1, and then set to 10.
In the test, it is set to 10 and then increased to 11.
That's pretty awful. I called increase back when the volume was 3, now it's not safe to call. What if the max safe volume is 10? Broken app.

This is because internally, the publish queue runs all committed updates first, then all client updates, then optimistic updates. Here's a list of problems with that:

Not linearizable. if 2 mutations go out, the first MUST return before the 2nd or your app can break. see pending optimistic updaters always run after local updaters #2481 for proof.
Your server updates don't have access to the current state (any optimistic updates or client updates that were run after it was sent).
Your client updates don't have access to optimistic updates, at best causing extra renders, at worst, causing errors. <-- This is how i discovered the error. My app broke when latency increased.
Your optimistic updates are run when they shouldn't be, requiring all your safety checks to occur inside the optimistic updater instead of before the mutation is called.

This is the same approach used in operational transformations, databases, and even in the world of redux (https://github.com/mattkrick/redux-optimistic-ui).

fixes #2481

mattkrick · 2018-06-28T00:09:25Z

While this goes through review, I've made it available on npm.

To install:

yarn add @mattkrick/react-relay @mattkrick/relay-runtime
add this to your webpack config's resolve :

alias: {
      'react-relay': '@mattkrick/react-relay',
      'relay-runtime': '@mattkrick/relay-runtime',
    },

I'll remove the packages once this is merged & published.

mattkrick · 2018-07-01T09:59:21Z

packages/relay-runtime/mutations/RelayRecordSourceMutator.js

@@ -148,8 +149,19 @@ class RelayRecordSourceMutator {
  }

  copyFieldsFromRecord(record: Record, sinkID: DataID): void {
-    this.copyFields(RelayModernRecord.getDataID(record), sinkID);


calling copyFields is not desirable because as the code is written, the sink gets overwritten with the base, then the mutated sink overwrites itself. This itself is a 2nd bug that is worthy of its own PR, but since this PR depends on fixing the bug, I kept them at 1 PR.

See new test in RelayRecordSourceMutator-test.js.

this may be the same/related issue i'm experiencing here #2489?

sounds like it. those tests should pass in this PR. if they don't let me know.

mattkrick · 2018-07-06T06:05:22Z

packages/relay-runtime/store/RelayPublishQueue.js

+    if (optimisticUpdate) {
+      const updateIdx = findUpdaterIdx(this._pendingUpdates, optimisticUpdate);
+      if (updateIdx !== -1) {
+        this._pendingBackupRebase = true;


note that before with arbitrary transform application order, it had to rebase each time a commit happened (from server, source, or local). now, we only rebase if it comes in & is replacing an optimistic update. net win for performance.

sibelius · 2018-07-09T22:30:10Z

@kassens does this makes sense?

mattkrick · 2018-07-10T15:27:22Z

packages/relay-runtime/store/RelayPublishQueue.js

-              response,
-            } = optimisticUpdate;
-            const selectorStore = store.commitPayload(operation, response);
-            // TODO: Fix commitPayload so we don't have to run normalize twice


cleaned up logic so no more double normalizing. performance win!

mattkrick · 2018-07-10T15:33:32Z

packages/relay-runtime/store/__tests__/RelayPublishQueue-test.js

@@ -1334,11 +1348,14 @@ describe('RelayPublishQueue', () => {
      );
      // Query payload sets name to 'zuck'
      queue.commitPayload(createOperationSelector(NameQuery, {id: '4'}), {
-        me: {


this wasn't even a valid source. createOperationSelector supplied id: 4 but all that was in the store was me.
now you can't notify the store with garbage payloads 😄

mattkrick · 2018-08-07T00:54:24Z

@kassens any time for a review? Without this PR, it's unsafe to commit a local update while there is a pending server update. I think that's a pretty great feature.

mattkrick · 2018-08-21T05:13:33Z

@kassens @sibelius @JenniferWang any plans on merging this?

mattkrick · 2019-01-31T19:08:52Z

@josephsavona anything I can do to help get this merged?
general concept: instead of committing server updates, then local updates, then optimistic updates; simply apply updates as they're received. When server updates arrive, they take the place in line of their respective optimistic updates, which eliminates the race condition between server updates & local/optimistic updates.

aside from removing the race condition, it greatly reduces the number of rebases & cuts optimistic normalization calls by 50%. Wins all around!

josephsavona · 2019-02-01T00:31:27Z

@mattkrick Thanks for flagging this. Commit/update ordering is a complex issue and it's important to note that the system is working as designed[1]. There are a lot of tradeoffs involved and there isn't a single objectively "correct" approach. With that in mind there are two main considerations: what to do with the specific changes proposed in this PR and what (if any) changes to make around commit/update ordering in general.

As for this PR, the changes proposed here constitute a non-trivial breaking change that would likely require a careful staged rollout. As such we can't move forward directly with landing the PR: even if we decide that this is the exact right direction, we would have to figure out a way to incrementally implement the change without breaking Relay apps.

As for next steps:

I'll raise this with the team and we'll investigate this topic more. We can consider, for example, whether to change the default behavior along the lines proposed here or whether this should be a user-space option.
If you're interested, it could be helpful for you or other interested community members to implement the desired behavior in user-space by re-implementing the Environment interface via composition of existing relay-runtime primitives (including possibly composing the default Environment). That would allow users to opt-in to the new behavior without forking Relay, gather feedback about it, etc while the team considers this topic.

[1] caveat that #2489 is related and needs more investigation

EDIT: I just want to be clear that we appreciate you taking the time to investigate and raise this issue. It's definitely worth further investigation!

mattkrick · 2019-02-01T01:38:54Z

As a first step, would you be open to accepting a PR that only includes the changes to RelayModernRecord.js and RelayRecordSourceMutator.js? I'd love to reach consensus that those 2 changes fix existing bugs & do not introduce any breaking changes. Plus, if you check out the new test case that those fix you'll see the root cause of #2489.

As for breaking changes, the only one I can think of is that an updater could get called twice if there were an existing pending server update. That'd only be a problem if folks had side effects (eg popping a toast) in their updater functions.

If you can think of any other breaking changes please let me know! I want this to be a smooth transition for all. I know you said that ordering is subjective, but I really hope the example mentioned in the first comment might sway your mind. Lamport's happened-before relation is a staple in just about every distributed system & it'd be great to see it in Relay!

josephsavona · 2019-02-01T06:40:54Z

As a first step, would you be open to accepting a PR that only includes the changes to RelayModernRecord.js and RelayRecordSourceMutator.js?

Thanks for offering! However there's a simple fix for #2489 so I'm just going to import that PR and apply the fix there for simplicity. One note about the RelayModernRecord changes - it's invalid to update __typename when copying fields, that field is intentionally being skipped.

As for breaking changes, the only one I can think of is that an updater could get called twice if there were an existing pending server update.

By breaking change, I'm referring to the fact the current behavior is by design and existing Relay apps may depend on it for correctness. Although the change may "fix" some use-cases, it could easily break others. As an example, in your first comment the author could just as easily be implementing a UI where there is a base volume in addition to a relative volume adjustment control (i'm thinking of cameras where you can dial exposure up/down relative to the camera's automatic metering). The first applyUpdate could be used to implement the relative adjustment. In that case the app would break under the change proposed here. (There's a separate discussion about if that code would be the best way to implement the intent i'm describing, but the point stands that it's possible someone could write code with that intent)

mattkrick · 2019-02-01T20:59:04Z

it's invalid to update __typename when copying fields, that field is intentionally being skipped.

I'm really interested in this one. Consider a subscription that returns a Union type. Receiving 2 subscription payloads that are 2 different concrete types would result in the same auto-generated storageKey. How can relay maintain entity integrity?

The first applyUpdate could be used to implement the relative adjustment. In that case the app would break under the change proposed here.

If you don't mind, I'd like to prove that the the code today offers no guaranteed order. If you agree, then I hope we can agree there's no breaking change 🙂

Consider a counter with the value of 1:

setTimeout(() => commitLocalUpdate(environment, (store) => {
  const counter = store.get('viewer').getValue('counter')
  const nextValue = counter === 1 ? counter * 100 : counter + 1
  store.get('viewer').setValue(nextValue, 'counter')
}), 1000)

commitMutation(environment, {
  updater: (store) => {
    const counter = store.get('viewer').getValue('counter')
    const nextValue = counter === 1 ? counter + 1 : counter * 100
    store.get('viewer').setValue(nextValue, 'counter')
  }
})

If the server responds in less than 1 second, then the final result is 3. If the server response takes more than 1 second, the result is 10000.

Assuming no one is calling commitUpdate directly, the only apps that could break are the ones that are already broken due to built-in race conditions. At least that's my belief. If I'm wrong lemme know!

josephsavona · 2019-02-01T23:22:03Z

If you don't mind, I'd like to prove that the the code today offers no guaranteed order. If you agree, then I hope we can agree there's no breaking change

The current implementation applies updates in the order they are received from the network by design. This reflects the fact that typical applications are not implemented to provide guarantees that the server will reflect the ordering in which actions were first initiated on the client. Relay therefore leaves the ordering up to the user - for example, a network layer could choose from a variety of strategies such as processing operations serially or executing in parallel but resolving them in execution order. Relay isn't dictating the ordering, it's respecting the application's ordering.

The end result for a typical setup - where the network implementation processes requests independently - is that later server responses overwrite earlier responses. This is a reasonable heuristic that assumes more recent responses are more likely to correspond to the current server state than older responses. But again - the choice of heuristic is really in the user-defined network layer.

Therefore this is a breaking change in the sense that it chooses to stop respecting the network ordering and instead change to an imposed, serial ordering. Even for typical network implementations that don't guarantee an ordering, network response times generally follow a distribution as opposed to being purely random, this is absolutely an observable change (if only probabilistically so).

Again, we definitely agree that this topic is worthy of further exploration. But I hope I've clarified that this is a breaking change and that we have to proceed carefully with any changes. As I mentioned earlier, we'd definitely be interested to see exploration around this in user-space: an implementation of the Environment interface that behaves as you've described here, or something that can compose a Network and add serial processing and/or resolution serialization.

mattkrick · 2019-02-02T01:28:43Z

Relay isn't dictating the ordering, it's respecting the application's ordering.

I think we found a good disagreement! I have a local update that needs to execute after an optimistic update. How can I accomplish this? (with the state being the output of the optimistic update)

josephsavona · 2019-02-04T17:22:38Z

I think we found a good disagreement

@mattkrick Your language in this discussion (here and in #2481) repeatedly focuses on actively trying to find disagreements and proving that you're right, while not acknowledging the counterpoints raised by others. In case I wasn't clear, we agree that this topic is worthy of more exploration precisely because there are behaviors that are difficult to achieve in Relay today (e.g. linear ordering of updates) because of the tradeoffs that it makes. The only disagreement I see is that we cannot proceed with this PR as-is since it constitutes a breaking change (at least I assume it's a disagreement since you haven't actually responded to my comments around this).

I have a local update that needs to execute after an optimistic update. How can I accomplish this? (with the state being the output of the optimistic update)

Yup, this is one of two examples you've brought up that are difficult to achieve (technically speaking you can use lookup() to read the optimistic state inside your local updater - so it's possible but non-obvious). I'd encourage you to explore how to achieve these behaviors in user-space using the existing primitives - either it can be done and users can opt-in to using an extension, or it can't be done and we should consider changes to make it possible. Again, I believe this can all be achieved by some combination of composing the existing Environment, composing a Network to force linear resolution, etc. The one thing I can imagine is that it may be easiest to achieve this by reusing RelayModernEnvironment but passing a custom PublishQueue instance - in that case, we could consider making PublishQueue an argument (again, it would help to see some exploration here).

mattkrick · 2019-02-04T21:05:51Z

hey, sorry if I came across as argumentative, that certainly wasn't my intent.
My goal is seeking to understand. Resolving disagreements helps achieve understanding & finding them is wonderful because it's the first step. If you say Relay respects the application's ordering & I disagree, then I'd like to find out what you know so we can find agreement. If we agree that Relay doesn't respect the application ordering, we can move to the next step of asking if it should & again checking consensus. Being right for the sake of being right doesn't do me much good if it means this PR doesn't land. If I can do better, please let me know.

a user-defined publish queue would be great! Unfortunately, this PR is really 4 parts:

custom publish queue
custom environment (pass in optimistic response to couple with server response)
custom copyFieldsFromRecord (so base record doesn't overwrite sink)
custom copyFields (to guarantee entity integrity for union types)

I agree putting this in user space without a fork would be great, but fixing the last 2 items would require rewriting the imports for all of their dependents, which would touch the majority of the package.

josephsavona · 2019-02-04T23:47:32Z

I guess I just don't see any disagreements :-) I'm just trying to explain that the behaviors you're pointing out are by design -and the reasoning - for your and others context. We agree that it would be interesting to explore alternatives!

a user-defined publish queue would be great! Unfortunately, this PR is really 4 parts:

This type of concrete list is really helpful - I agree that the first two things can hopefully be explored in user-space. If there is anything that prevents you/others from doing so, the next step would be to file an issue about the specific issue so that we can discuss remediations (and then move on to a PR if we agree on the approach).

custom copyFieldsFromRecord (so base record doesn't overwrite sink)

This is orthogonal to the core issue being discussed, but regardless I included the fix in 55ce137.

custom copyFields (to guarantee entity integrity for union types)

You're right that __typename should be updated in this case - I misunderstood when replying about this earlier. We'd be happy to review a PR that addresses just this specific issue.

mattkrick · 2019-02-05T01:53:59Z

Awesome! Thanks for understanding 🙂 That commit solved part 3 & PR #2637 is up to solve part 4.

I've updated the PR to solve the remaining 2 issues. Let me know if this works.

mattkrick · 2019-03-07T20:14:11Z

NOTE: requires #2637 (excluded that in latest rebase to master)

mattkrick · 2019-05-02T14:11:05Z

friendly bump, pretty tiny PR for a swappable publish queue

mattkrick · 2019-07-11T00:39:58Z

closing in favor of #2791, please see new PR

facebook-github-bot added the CLA Signed label Jun 27, 2018

sibelius requested review from kassens and JenniferWang and removed request for kassens June 27, 2018 19:51

mattkrick commented Jul 1, 2018

View reviewed changes

steida mentioned this pull request Jul 3, 2018

Relay declarative configs and optimistic update are broken este/este#1491

Closed

mattkrick commented Jul 6, 2018

View reviewed changes

mattkrick mentioned this pull request Jul 7, 2018

pending optimistic updaters always run after local updaters #2481

Closed

mattkrick commented Jul 10, 2018

View reviewed changes

sibelius mentioned this pull request Sep 16, 2018

Race condition in disposing selection reference in QueryRenderer#_fetch? #1850

Open

mattkrick mentioned this pull request Sep 17, 2018

[Modern] Client schema docs and examples #1656

Closed

mattkrick force-pushed the fix-pub-queue-master branch from 7e54f7b to b743af6 Compare January 31, 2019 18:52

josephsavona changed the title ~~Fix broken linearizability~~ Change optimistic/server update ordering Feb 1, 2019

josephsavona changed the title ~~Change optimistic/server update ordering~~ [RFC] Change optimistic/server update ordering Feb 1, 2019

josephsavona changed the title ~~[RFC] Change optimistic/server update ordering~~ [discussion] Change optimistic/server update ordering Feb 1, 2019

mattkrick mentioned this pull request Feb 5, 2019

remove __typename as exclusion criterion in copyFields #2637

Closed

mattkrick force-pushed the fix-pub-queue-master branch from b743af6 to 2b6ba14 Compare February 5, 2019 01:50

reset to master & cherry-pick

35f4deb

mattkrick force-pushed the fix-pub-queue-master branch from b0afd37 to 35f4deb Compare March 6, 2019 03:38

mattkrick closed this Jul 11, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[discussion] Change optimistic/server update ordering #2482

[discussion] Change optimistic/server update ordering #2482

mattkrick commented Jun 27, 2018 •

edited

Loading

mattkrick commented Jun 28, 2018

mattkrick Jul 1, 2018 •

edited

Loading

dminkovsky Jul 16, 2018 •

edited

Loading

mattkrick Jul 17, 2018

mattkrick Jul 6, 2018

sibelius commented Jul 9, 2018

mattkrick Jul 10, 2018

mattkrick Jul 10, 2018

mattkrick commented Aug 7, 2018 •

edited

Loading

mattkrick commented Aug 21, 2018

mattkrick commented Jan 31, 2019

josephsavona commented Feb 1, 2019 •

edited

Loading

mattkrick commented Feb 1, 2019

josephsavona commented Feb 1, 2019

mattkrick commented Feb 1, 2019

josephsavona commented Feb 1, 2019 •

edited

Loading

mattkrick commented Feb 2, 2019 •

edited

Loading

josephsavona commented Feb 4, 2019

mattkrick commented Feb 4, 2019

josephsavona commented Feb 4, 2019

mattkrick commented Feb 5, 2019 •

edited

Loading

mattkrick commented Mar 7, 2019 •

edited

Loading

mattkrick commented May 2, 2019

mattkrick commented Jul 11, 2019

[discussion] Change optimistic/server update ordering #2482

[discussion] Change optimistic/server update ordering #2482

Conversation

mattkrick commented Jun 27, 2018 • edited Loading

mattkrick commented Jun 28, 2018

mattkrick Jul 1, 2018 • edited Loading

Choose a reason for hiding this comment

dminkovsky Jul 16, 2018 • edited Loading

Choose a reason for hiding this comment

mattkrick Jul 17, 2018

Choose a reason for hiding this comment

mattkrick Jul 6, 2018

Choose a reason for hiding this comment

sibelius commented Jul 9, 2018

mattkrick Jul 10, 2018

Choose a reason for hiding this comment

mattkrick Jul 10, 2018

Choose a reason for hiding this comment

mattkrick commented Aug 7, 2018 • edited Loading

mattkrick commented Aug 21, 2018

mattkrick commented Jan 31, 2019

josephsavona commented Feb 1, 2019 • edited Loading

mattkrick commented Feb 1, 2019

josephsavona commented Feb 1, 2019

mattkrick commented Feb 1, 2019

josephsavona commented Feb 1, 2019 • edited Loading

mattkrick commented Feb 2, 2019 • edited Loading

josephsavona commented Feb 4, 2019

mattkrick commented Feb 4, 2019

josephsavona commented Feb 4, 2019

mattkrick commented Feb 5, 2019 • edited Loading

mattkrick commented Mar 7, 2019 • edited Loading

mattkrick commented May 2, 2019

mattkrick commented Jul 11, 2019

mattkrick commented Jun 27, 2018 •

edited

Loading

mattkrick Jul 1, 2018 •

edited

Loading

dminkovsky Jul 16, 2018 •

edited

Loading

mattkrick commented Aug 7, 2018 •

edited

Loading

josephsavona commented Feb 1, 2019 •

edited

Loading

josephsavona commented Feb 1, 2019 •

edited

Loading

mattkrick commented Feb 2, 2019 •

edited

Loading

mattkrick commented Feb 5, 2019 •

edited

Loading

mattkrick commented Mar 7, 2019 •

edited

Loading