Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEWABLE] A second approach to storing non-serializable data in Redux #4047

Merged
merged 12 commits into from
May 5, 2020

Conversation

chrisbobbe
Copy link
Contributor

[don't merge] redux: Store parsed ZulipVersion in Redux state, instead of string.

This commit should be read for its contrast with its counterpart,
with the same title, at #3952.

I've tried to make this PR minimally different from that one, so a
git diff or git range-diff might be informative.

Here, we use the "inside-out" approach described there:

(i.e., locating the value by its being instanceof ZulipVersion),
which doesn't depend on knowing the value's location in the Redux
state

As expected, it requires writing a (straightforward) migration.

Unfortunately, I haven't gotten far enough to actually see this
working, but I expect that our code in a working version won't need
to be drastically different, so I'm submitting this for comparison
anyhow. There appear to be some inaccuracies in the Flow libdef I
cobbled together from the TypeScript with flowgen and flow-typed create-stub, but these shouldn't be difficult to fix.

The current blocker, which blocks any use of ImmutableJS with Redux
(an integration this PR depends on, as I noted at the bottom of
#3951 (comment)),
is that we're stuck with an old version of redux-persist.

Not until rt2zz/redux-persist@a94f29139, released in v6.0.0 (we're
on ^4.10.2), do we get the ability to supply our own
serialize/deserialize functions in the config passed to
persistStore.

The only other customization for replace/revive transforms that
redux-persist allows is via a Transform created by
createTransform, as we use in #3952. But its contract requires
that you supply functions that translate between serializable and
non-serializable objects, without allowing us to take responsibility
for the actual serialization. It would be a silly exercise, but it
might be possible, to make it work with createTransform with an
extra round-trip into JSAN [sic]. JSAN is what remotedev-serialize
serializes into, and it claims to be more flexible than JSON.

I believe #2915 is our latest written knowledge about the drawbacks
of redux-persist and why it ceases to be useful for us at version
5 and above, and it looks like we have a plan to vendor it.

@chrisbobbe chrisbobbe requested a review from gnprice April 17, 2020 00:47
@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Apr 17, 2020

Not until rt2zz/redux-persist@a94f291, released in v6.0.0 (we're
on ^4.10.2), do we get the ability to supply our own
serialize/deserialize functions in the config passed to
persistStore.

Note from that linked redux-persist commit this is a really straightforward change to make (I wish it had been there from the beginning!), so that's not the challenge, once we commit to and finish doing some version of #2915.

@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Apr 17, 2020

@gnprice, locally, I've just been attempting to do the work of #2915. I think there are enough merge conflicts that it'll be easiest just to do it from scratch, rather than just using that PR as-is, if we're going to do it. But there's a lot of useful context in that PR's commit messages.

I just found additional useful context at #2762 and in this discussion.

Without manually tweaking any lines of code, I've managed to reduce the problems flagged by tools/test down to 49 errors and 15 warnings from ESLint. Here's what I did this time around, in order:

  • make sure redux-persist@4.10.2 is in node_modules
  • mv node_modules/redux-persist/src src/redux-persist
  • mv node_modules/redux-persist/es/index.js.flow src/redux-persist/index.js.flow
  • commit:
    • No flow errors (🎉)
    • Prettier finds 12 messy files
    • ESLint finds 482 errors, 16 warnings; says "431 errors and 0 warnings potentially fixable with the --fix option."
    • All Jest tests pass (🎉)
  • run tools/test lint --fix (a nice shortcut to passing --fix to the ESLint command, thanks for that 🙂)
  • commit:
    • ESLint finds 49 errors, 15 warnings
    • Prettier finds 10 messy files
  • Do cmd+s on all files in src/redux-persist to trigger auto-saving, to fix Prettier errors
  • commit:
    • ESLint finds 49 errors, 15 warnings

@chrisbobbe
Copy link
Contributor Author

Here is that output from ESLint:

Running lint...

/Users/chrisbobbe/dev/zulip-mobile/src/redux-persist/autoRehydrate.js
   5:53  error    'defaultStateReconciler' was used before it was defined                       no-use-before-define
   8:24  error    'liftReducer' was used before it was defined                                  no-use-before-define
  11:23  error    'reducer' is already declared in the upper scope                              no-shadow
  11:55  error    'liftReducer' was used before it was defined                                  no-use-before-define
  26:11  error    'logPreRehydrate' was used before it was defined                              no-use-before-define
  42:5   warning  Unexpected console statement                                                  no-console
  59:16  error    Do not access Object.prototype method 'hasOwnProperty' from target object     no-prototype-builtins
  66:9   warning  Unexpected console statement                                                  no-console
  77:9   warning  Unexpected console statement                                                  no-console
  89:5   error    Closing curly brace does not appear on the same line as the subsequent block  brace-style
  95:7   warning  Unexpected console statement                                                  no-console

/Users/chrisbobbe/dev/zulip-mobile/src/redux-persist/createPersistor.js
    1:1   error    'json-stringify-safe' should be listed in the project's dependencies. Run 'npm i -S json-stringify-safe' to add it  import/no-extraneous-dependencies
    8:66  error    'defaultSerializer' was used before it was defined                                                                  no-use-before-define
    9:68  error    'defaultDeserializer' was used before it was defined                                                                no-use-before-define
   17:21  error    Unexpected dangling '_' in '_stateInit'                                                                             no-underscore-dangle
   18:25  error    Unexpected dangling '_' in '_stateIterator'                                                                         no-underscore-dangle
   18:50  error    'defaultStateIterator' was used before it was defined                                                               no-use-before-define
   19:23  error    Unexpected dangling '_' in '_stateGetter'                                                                           no-underscore-dangle
   19:46  error    'defaultStateGetter' was used before it was defined                                                                 no-use-before-define
   20:23  error    Unexpected dangling '_' in '_stateSetter'                                                                           no-underscore-dangle
   20:46  error    'defaultStateSetter' was used before it was defined                                                                 no-use-before-define
   42:12  error    'passWhitelistBlacklist' was used before it was defined                                                             no-use-before-define
   66:28  error    'createStorageKey' was used before it was defined                                                                   no-use-before-define
   72:61  error    'warnIfSetError' was used before it was defined                                                                     no-use-before-define
  103:13  warning  Unexpected console statement                                                                                        no-console
  111:20  error    'rehydrateAction' was used before it was defined                                                                    no-use-before-define
  135:7   warning  Unexpected console statement                                                                                        no-console

/Users/chrisbobbe/dev/zulip-mobile/src/redux-persist/defaults/asyncLocalStorage.js
    6:5   warning  Unexpected console statement                                           no-console
   13:1   error    Unexpected dangling '_' in '_hasStorage'                               no-underscore-dangle
   26:7   warning  Unexpected console statement                                           no-console
   41:10  error    Expected to return a value at the end of function 'getStorage'         consistent-return
   65:13  error    Expected an assignment or function call and instead saw an expression  no-unused-expressions
   69:11  error    Expected an assignment or function call and instead saw an expression  no-unused-expressions
   79:13  error    Expected an assignment or function call and instead saw an expression  no-unused-expressions
   83:11  error    Expected an assignment or function call and instead saw an expression  no-unused-expressions
   93:13  error    Expected an assignment or function call and instead saw an expression  no-unused-expressions
   97:11  error    Expected an assignment or function call and instead saw an expression  no-unused-expressions
  107:13  error    Expected an assignment or function call and instead saw an expression  no-unused-expressions
  111:11  error    Expected an assignment or function call and instead saw an expression  no-unused-expressions

/Users/chrisbobbe/dev/zulip-mobile/src/redux-persist/getStoredState.js
   4:25  error    Expected to return a value at the end of function 'getStoredState'  consistent-return
   6:68  error    'defaultDeserializer' was used before it was defined                no-use-before-define
  23:9   warning  Unexpected console statement                                        no-console
  25:7   error    'complete' was used before it was defined                           no-use-before-define
  31:46  error    'passWhitelistBlacklist' was used before it was defined             no-use-before-define
  35:7   error    'complete' was used before it was defined                           no-use-before-define
  38:23  error    'createStorageKey' was used before it was defined                   no-use-before-define
  38:47  error    'err' is already declared in the upper scope                        no-shadow
  40:11  warning  Unexpected console statement                                        no-console
  42:32  error    'rehydrate' was used before it was defined                          no-use-before-define
  46:11  error    'complete' was used before it was defined                           no-use-before-define
  63:9   warning  Unexpected console statement                                        no-console
  70:26  error    'restoredState' is already declared in the upper scope              no-shadow
  90:26  error    'restoredState' is already declared in the upper scope              no-shadow

/Users/chrisbobbe/dev/zulip-mobile/src/redux-persist/index.js
  10:3  warning  Unexpected console statement  no-console

/Users/chrisbobbe/dev/zulip-mobile/src/redux-persist/persistStore.js
  11:5   warning  Unexpected console statement                                           no-console
  27:11  error    'complete' was used before it was defined                              no-use-before-define
  39:26  error    'rehydrateAction' was used before it was defined                       no-use-before-define
  41:11  error    'complete' was used before it was defined                              no-use-before-define
  46:18  error    'complete' was used before it was defined                              no-use-before-define
  51:5   error    Expected an assignment or function call and instead saw an expression  no-unused-expressions

/Users/chrisbobbe/dev/zulip-mobile/src/redux-persist/purgeStoredState.js
  23:13  warning  Unexpected console statement                        no-console
  41:64  error    'warnIfRemoveError' was used before it was defined  no-use-before-define
  49:7   warning  Unexpected console statement                        no-console

/Users/chrisbobbe/dev/zulip-mobile/src/redux-persist/utils/isStatePlainEnough.js
  1:1  error  'lodash' should be listed in the project's dependencies. Run 'npm i -S lodash' to add it  import/no-extraneous-dependencies

✖ 64 problems (49 errors, 15 warnings)

@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Apr 17, 2020

Would you suggest manually fixing those remaining problems? I can try to keep a careful record of exactly what I did to fix them, so it's less necessary to just assume I've done it all correctly. Alternatively, we could have ESLint ignore these errors, to avoid the possibility that a person (such as myself) accidentally introduced bugs by trying to make this output go away with manual tweaking.

@gnprice
Copy link
Member

gnprice commented Apr 17, 2020

For the part that's in the PR now: I think I quite like this approach! It feels like a lot less code than the "outside-in" approach in #3952. Thanks for prototyping both 😄

In particular, when I imagine going and converting things to Map or Immutable.Map and so on, this approach looks like it will mean a lot less code to maintain -- and the code in question is (a) boilerplate that (b) is hard to get fully type-checked, which is a bad combination.


For vendoring redux-persist so we can apply that patch to it:

I think let's start by leaving that code in its current style, and just telling ESLint and Prettier not to operate on it.

For ESLint, you can do this by adding to the overrides: section of our .eslintrc.yaml. E.g. append something like the following:

- files: ['src/redux-persist/**']
  rules:
    semi: off
    curly: off
    nonblock-statement-body-position: off
    space-before-function-paren: off
    arrow-body-style: off
    object-curly-spacing: off
    no-underscore-dangle: off
    import/order: off
    func-names: off
    object-shorthand: off

    no-use-before-define: off
    no-console: off
    prefer-const: off
    consistent-return: off
    no-unused-expressions: off
    no-var: off
    vars-on-top: off
    no-shadow: off
    no-prototype-builtins: off

Based on a rough rebase of #2915, I think that eliminates all the lint errors other than missing dependencies. And I think all of these are not direct signs of problems -- they're either stylistic choices, or asking the code to use helpful features of modern JS. But it'd be useful for you to go through them to check that for yourself, too (and perhaps organize and/or comment better than my rough example above.)

Then we can always follow up to modernize and assimilate the code. But that lets us move forward so we can do that at our leisure. (Even if the code is old-fashioned or sometimes messy, it's the same code we were relying on before anyway!)

@gnprice
Copy link
Member

gnprice commented Apr 17, 2020

For Prettier, it looks like adding to .prettierignore is the way:
https://prettier.io/docs/en/ignore.html#ignoring-files

Like this, since it's apparently gitignore syntax:

/src/redux-persist/

@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Apr 21, 2020

Hi @gnprice, I just pushed my changes, getting our fork of redux-persist working. At the tip, there's a commit that (as far as I've seen) has state.narrows working as an Immutable.Map, with this approach.

I agree that it's much nicer to see that pretty much all of the changes in that commit are there strictly to adapt to Immutable.Map's API; there's none of the boilerplate we saw in #3952.

I figured it would be nice to show you the state.narrows example in the context of the ZulipVersion changes, then (if we're going with this approach) we'll probably put that into its own PR with type-checking for those tests. I mentioned one kind of question I had, about tests, in the commit message, but there are others.

@rk-for-zulip
Copy link
Contributor

rk-for-zulip commented Apr 21, 2020

For the part that's in the PR now: I think I quite like this approach! It feels like a lot less code than the "outside-in" approach in #3952. Thanks for prototyping both

It's less code, but it's less code in the way that eval is less code than a parser. It's a security hole. (Or it opens the way for them, at least.)

(EDIT: this reads like hyperbole, and in this world, it is; it arises from my error below.)

From #3952 (comment):

For this to work, naturally it's important that you don't have any objects that accidentally have __serializedType__ as a key. I think that's actually true for us: the keys in our objects-as-maps all live in well-defined namespaces, namely numbers, emails, comma-separated numbers or emails, and JSON-serialized narrows. This is one of the good reasons not to expand our use of objects-as-maps and instead to work to reduce it.

The problem is that this is a terrifyingly fragile global property. It's true today – by chance, rather than by design! – but it's totally plausible that a year from now no one will remember that this is something we need to worry about, and that a new object-as-map will be added which can have arbitrary keys.

With the types we have at hand, I don't believe this could do more than cause a denial-of-service (e.g., after connecting to a Zulip server with a poorly- or maliciously-chosen emoji name, the app crashes on startup until it's uninstalled); but as more types are added to the deserializer, the chances that someone will be able to inject a construct that has more subtle security implications increases combinatorically.

This is unfortunately true of every deserialization mechanism which, like remotedev-serialize, takes advantage of JSON.{stringify,parse}'s replacer and reviver functions. (The worst part is that this could have been easily mitigated if only those functions received the full path to a (de)serialized element.)

(EDIT: As shown by the later solution to this issue, this last was incorrect.)

@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Apr 21, 2020

Hmm, OK.

Take a look at UserPresence:

/**
 * A user's presence status, including all information from all their clients.
 *
 * The `aggregated` property equals one of the others.  For details, see:
 *   https://zulipchat.com/api/get-presence
 *
 * See also the app's `getAggregatedPresence`, which reimplements a version
 * of the logic to compute `aggregated`.
 */
export type UserPresence = {|
  aggregated: ClientPresence,
  [client: string]: ClientPresence,
|};

I haven't gone through to verify whether and how an arbitrary client string could end up there, but the effort it would take to do so supports the idea that this opens us up to some hard-to-plug security holes. Assuming that's possible, it sounds like the denial of service would go something like this, if we're at the commit at the tip of this branch:

  1. A client maliciously causes a client to be __serializedType__ in the app of a victim, like so, and it gets persisted verbatim:

In state.presence:

{
  email@example.com: {
    aggregated: { ... },
    __serializedType__: { ... }
  }
}

where { ... } is, e.g., { status: 'idle', timestamp: 123456, client: '__serializedType'__ } (a legal value).

  1. Then (regardless of whether 'data' exists as another client, which is possible and would allow for further manipulation) the entire value at email@example.com is revived as an Immutable.Map.

Here's an excerpt of the default reviver that accomplishes that, in node_modules/immutable/serialize.js:

  function reviver(key, value) {
    if (typeof value === 'object' && value !== null && '__serializedType__'  in value) {
      var data = value.data;
      switch (value.__serializedType__) {
        case 'ImmutableMap': return Immutable.Map(data);
        // ... several other cases
        default: return data;
      }
    }
    return value;
  }
  1. All the places the victim's app expects to consume that value as an object-as-map break because it's an Immutable.Map.

Does this sound like a valid example of a simple denial-of-service you're thinking of? I don't mean it to distract from your larger concern that

as more types are added to the deserializer, the chances that someone will be able to inject a construct that has more subtle security implications increases combinatorically.

@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Apr 21, 2020

And would it be an acceptable defense against all such attacks to make the replacer drop all user-supplied keys that are __serializedType__? I suppose we can't assume that, in all cases, it's OK to just drop even malicious data on the floor, without causing weird bugs or another denial-of-service. Can we?

Or to use some kind of secret to prove that a key was persisted as __serializedType__ because my own replacer asked it to be?

@rk-for-zulip
Copy link
Contributor

  1. Then (regardless of whether 'data' exists as another client, which is possible and would allow for further manipulation) the entire value at email@example.com is revived as an Immutable.Map.
  2. All the places the victim's app expects to consume that value as an object-as-map break because it's an Immutable.Map.

Does this sound like a valid example of a simple denial-of-service you're thinking of?

Yes, although I think there's a shorter path to it: the rehydrator attempts to revive the entire value at email@example.com as an Immutable.List – but the constructor of Immutable.List complains of an invalid value, so an unhandled exception is thrown during rehydration, so the app never starts. (Or possibly handles the exception by throwing away all the rehydrated data and starting with a fresh app-state – which is arguably not denial-of-service, but is still a concerning bug.)

And would it be an acceptable defense against all such attacks to make the replacer drop all user-supplied keys that are __serializedType__? I suppose we can't assume that, in all cases, it's OK to just drop data on the floor, without causing weird bugs or another denial-of-service. Can we?

We can't. Although I'm pretty sure it won't be true in the ClientPresence case, for arbitrary other objects-as-map, there may be references to the dropped data in the values of some other subtree of the state – going back to my hypothetical emoji example, perhaps lists of reactions on messages.

(Also, Christopher Null has a few opinions on the subject.)

@rk-for-zulip
Copy link
Contributor

rk-for-zulip commented Apr 21, 2020

Or to use some kind of secret to prove that a key was persisted as __serializedType__ because my own replacer asked it to be?

Now we're getting into something feasible, I think. I can't see any way to make a secret work, but one might be able to escape any key matching /^(!*)__serializedType__$/ as /^!\1__serializedType__$/, and invert that transformation during rehydration.

EDIT: Which suggests, I think, that we should really be characterizing this entire class of security holes as failures of the serialization function to be injective.

@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Apr 21, 2020

I just found a flaw in my example above; glad I spelled out the steps.

I assumed it would be revived as an Immutable.Map, but actually, that wouldn't happen unless the value at __serializedType__ were 'ImmutableMap', which { ... } isn't. Following through the default case, it checks the data property, which is missing, so the value at email@example.com would be undefined, I think. Still damaging, and all the rest is still possible to creep in elsewhere, including errors at rehydrate time, as you've pointed out.

I agree it would be problematic to drop data on purpose, too. Even in the presence case, that would violate the UserPresence JSDoc (above):

 * The `aggregated` property equals one of the others.  For details, see:
 *   https://zulipchat.com/api/get-presence

(It's already violated, but in a minor way that doesn't violate our types — still, I should probably fix that JSDoc.)

(Also, Christopher Null has a few opinions on the subject.)

Lol. 😆

@gnprice
Copy link
Member

gnprice commented Apr 21, 2020

one might be able to escape any key matching /^(!*)__serializedType__$/ as /^!\1__serializedType__$/, and invert that transformation during rehydration.

I think this is a straightforward solution to all the problems discussed in the last day's messages on this thread.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @chrisbobbe ! This looks great.

A couple of remarks below about the workflow for libdefs. Then I think the one substantive thing remaining is to make the replacer/reviver fully round-trip-clean, by escaping __serializedType__ properties as discussed above.

When doing that, one thing that will probably help is to split up the main "redux: Store parsed ZulipVersion in Redux state" into two commits:

  • One inserts redux-persist into things, plus our fix for proper round-tripping, but doesn't yet do anything about ZulipVersion.
  • Then the next starts replacing/reviving ZulipVersion objects. ... hmm and maybe even stops there, touching only the replacer and reviver, so splitting into three commits?
  • And then the last commit is the one that actually changes the reducer to maintain a ZulipVersion object instead of a string, and adds the migration to match.
    • In fact if we wanted we could even split out as a fourth commit the changes to the action types, and so the changes at the various places we dispatch those actions. But that would add a bit of churn at the reducers, and the reducer + action changes aren't a particularly complex diff combined, so I think it's just as good to keep these together as one commit.

Comment on lines 17 to 21
/**
* Flowtype definitions for index
* Generated by Flowgen from a Typescript Definition
* Flowgen v1.10.0
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting!

Where did this file come from -- did you have to run Flowgen yourself (and how)? Or did you run something like yarn update-libdefs and find that that ran Flowgen automatically?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... I'm also kind of puzzled because I don't see any TypeScript in that library!

$ find node_modules/remotedev-serialize/
node_modules/remotedev-serialize/
node_modules/remotedev-serialize/README.md
node_modules/remotedev-serialize/helpers
node_modules/remotedev-serialize/helpers/index.js
node_modules/remotedev-serialize/immutable
node_modules/remotedev-serialize/immutable/serialize.js
node_modules/remotedev-serialize/immutable/index.js
node_modules/remotedev-serialize/package.json
node_modules/remotedev-serialize/index.js
node_modules/remotedev-serialize/constants
node_modules/remotedev-serialize/constants/options.js

So I wonder where the information is coming from.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! It comes from DefinitelyTyped, and I did indeed run Flowgen myself. I'll get this all meticulously documented in the next revision.

declare export type Reviver = (key: string, value: any, reviver: DefaultReviver) => any;
declare export function immutable(
immutable: any,
refs?: Refs | null,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to find a way that this kind of fix won't get automatically clobbered when we, say, upgrade this library. (Or when we just run yarn update-libdefs, after upgrading any of our libraries.) How to do that depends on what the workflow is that generated these files in the first place, though. Maybe discuss in a chat thread?

Copy link
Contributor Author

@chrisbobbe chrisbobbe Apr 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! If it can fit in seamlessly enough, perhaps, to pick up this thread? That's where I learned that putting things in flowtyped/npm means they might get clobbered.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure -- a relevant existing thread is even better than a new one 🙂

chrisbobbe pushed a commit to chrisbobbe/zulip-mobile that referenced this pull request Apr 29, 2020
…ing.

`remotedev-serialize` has not been used yet (it was only added in
the previous commit); this work is purely preventive.

Ray points out [1] a security hole (or eventual security hole) that
would arise from a standard use of `remotedev-serialize`, where its
replacers/revivers (custom or default) operate on objects-as-map in
Redux where the keys are arbitrary (i.e., not within well-defined
namespaces, such as numbers, emails, JSON-serialized narrows, etc.),
especially if those keys can be manipulated by users.

`remotedev-serialize` relies on a magic string [2],
'__serializedType__'. This string acts as an identifier, but it gets
thrust into an area where data belongs (and has always belonged), so
collisions are a problem unless we do something to ensure that it
remains unique against all keys that represent data.

So, solve this by escaping that magic string where it occurs in
data, so that it doesn't activate the hidden functionality. Develop
and recommend a wrapper that has largely the same API as
`remotedev-serialize` [3] and takes care of all the necessary
escaping, so development can continue without having to think about
and make judgments related to this security hole.

[1]: zulip#4047 (comment)

[2]: zulip#4062 (comment)

[3]: except that it strongly recommends using a `const` variable
containing '__serializedType__', instead of making new string
literals every time. This variable is also used in the wrapper's
implementation, so consumers can be certain that it's the same one
that is properly escaped.
@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Apr 29, 2020

Edited to update commit IDs / topic lines, following my force-push noted at #4047 (comment).

OK, I've pushed my next revision. I explain why tests are failing at the tip, at the tip:

89c5d96 [DO NOT MERGE]: Rehydrate time trial results.

That commit, and its parent (they're both labeled [DO NOT MERGE]), are NEVER intended to be merged and exist only for discussion.

I'm hoping

235458e deps: Add immutable and remotedev-serialize.

answers all your questions about the workflow for libdefs.

@gnprice said:

One inserts redux-persist into things, plus our fix for proper round-tripping, but doesn't yet do anything about ZulipVersion.

I think you might have meant remotedev-serialize, rather than redux-persist? The round-tripping fix is tied in with remotedev-serialize; the vendoring of redux-persist isn't related to the security hole that the escaping addresses.

From "insert remotedev-serialize into things", through the proper round-tripping fix, but without doing anything about ZulipVersion, is this series:

235458e deps: Add immutable and remotedev-serialize.
d94379a [DO NOT MERGE] redux [security]: Add remotedev-serialize wrapper with proper escaping.
10d4fdf libdefs: Tweak immutable export from remotedev-serialize.
443ba2d redux: Scaffolding for custom replace/revive logic.

I think it makes sense that these are four separate commits; what do you think? (Arguably, I could split "Add immutable and remotedev-serialize" into two commits, one for each dependency. Immutable is kind of a peer dependency of remotedev-serialize, in that it functions as a plugin, though they've listed it as a dev dependency.)

Then the next starts replacing/reviving ZulipVersion objects. ... hmm and maybe even stops there, touching only the replacer and reviver, so splitting into three commits?

Done; that's

3fa900a redux: Replace/revive ZulipVersion instances.

And then the last commit is the one that actually changes the reducer to maintain a ZulipVersion object instead of a string, and adds the migration to match.

Done as

a904564 redux: Store parsed ZulipVersion in Redux state, instead of string.

In fact if we wanted we could even split out as a fourth commit the changes to the action types, and so the changes at the various places we dispatch those actions. But that would add a bit of churn at the reducers, and the reducer + action changes aren't a particularly complex diff combined, so I think it's just as good to keep these together as one commit.

Makes sense. If you change your mind, I'm happy to split this out.

@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Apr 29, 2020

As always, I'm happy to be convinced that my approach, or explicit or implicit assumptions, are wrong. 🙂

I do have some questions. As I mentioned here, it's harder to test that escape() and unescape() are called in the correct places, than to test those functions themselves. So, those tests are left out. (In retrospect, this should mean

724a3be redux [security]: Add remotedev-serialize wrapper with proper escaping.

gets a [DO NOT MERGE] tag — I'll go and add that right now. EDIT: Done, as

d94379a [DO NOT MERGE] redux [security]: Add remotedev-serialize wrapper with proper escaping.

.)

@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Apr 29, 2020

Another thing is that the types for renamedEntries and untouchedEntries, in the function returned by the makeObjectKeyRenamer factory, are very verbose, though I think they're functionally correct:

const renamedEntries: $ElementType<$Call<typeof objectEntries, InexactObject>, number>[] = [];
const untouchedEntries: $ElementType<$Call<typeof objectEntries, InexactObject>, number>[] = [];

I can't use the return type of objectEntries directly because it's a $ReadOnlyArray, and I need to push to renamedEntries and untouchedEntries. So, that weird circumlocution.

chrisbobbe pushed a commit to chrisbobbe/zulip-mobile that referenced this pull request Apr 29, 2020
…th proper escaping.

Query: I have tests for `escape` and `unescape`, here, but I don't
have tests to ensure these functions are called in the correct
places. This is harder to test for (shall I mock
`remotedev-serialize`?).

Below the line is my draft commit message, written before this query
occurred to me.

-------------------------------------------------------------------

`remotedev-serialize` has not been used yet (it was only added in
the previous commit); this work is purely preventive.

Ray points out [1] a security hole (or eventual security hole) that
would arise from a standard use of `remotedev-serialize`, where its
replacers/revivers (custom or default) operate on objects-as-map in
Redux where the keys are arbitrary (i.e., not within well-defined
namespaces, such as numbers, emails, JSON-serialized narrows, etc.),
especially if those keys can be manipulated by users.

`remotedev-serialize` relies on a magic string [2],
'__serializedType__'. This string acts as an identifier, but it gets
thrust into an area where data belongs (and has always belonged), so
collisions are a problem unless we do something to ensure that it
remains unique against all keys that represent data.

So, solve this by escaping that magic string where it occurs in
data, so that it doesn't activate the hidden functionality. Develop
and recommend a wrapper that has largely the same API as
`remotedev-serialize` [3] and takes care of all the necessary
escaping, so development can continue without having to think about
and make judgments related to this security hole.

[1]: zulip#4047 (comment)

[2]: zulip#4062 (comment)

[3]: except that it strongly recommends using a `const` variable
containing '__serializedType__', instead of making new string
literals every time. This variable is also used in the wrapper's
implementation, so consumers can be certain that it's the same one
that is properly escaped.
@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Apr 29, 2020

(Added @ray-kraesig as a reviewer because he prompted the preventive measures about the security hole and may like to request or follow revisions to the implementation.)

@chrisbobbe chrisbobbe changed the title A second approach to storing non-serializable data in Redux [REVIEWABLE] A second approach to storing non-serializable data in Redux Apr 29, 2020
});

describe('escape and unescape are inverses of each other', () => {
for (let i = 0; i < 100; i++) {
Copy link
Contributor Author

@chrisbobbe chrisbobbe Apr 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EDIT: now moot, see #4047 (comment).

I'm not sure why I chose 100 here. We can't (or I can't!) say with what probability this will flake, which I suppose was a prerequisite for using a randomized algorithm in src/utils/__tests__/backoffMachine-test.js, under 'timeouts are random from zero to 100ms, 200ms, 400ms, 800ms...'.

}
};

const escapeKeys = makeObjectKeyRenamer(escape);
Copy link
Contributor Author

@chrisbobbe chrisbobbe Apr 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EDIT: this and my comment that follows it are now moot; see #4047 (comment).

I wouldn't insist on using a makeObjectKeyRenamer factory; we could just have a renameObjectKeys(obj, renameFn) and pass escape or unescape as renameFn every time. But, since escapeKeys and unescapeKeys are used several times, I think it's more readable that they exist as separate functions, with those names, and a factory function seemed a natural way to accomplish that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, since escapeKeys and unescapeKeys are used several times

— er, no they're not! 🙃 I must have been thinking of a previous iteration I never submitted for review.

@chrisbobbe chrisbobbe force-pushed the pr-v2-zulip-version-redux branch 2 times, most recently from 05e003b to f7ccec3 Compare May 1, 2020 19:02
@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented May 1, 2020

OK, I've pushed my branch, using zulip/remotedev-serialize@5f9f759a4 as Greg put together yesterday (thanks again!), instead of the wrapper I wrote with the /^(!*)__serializedType__$/ to /^!\1__serializedType__$/ escape. I'll go edit my comments that look like they're still waiting for a reply, but are now moot.

@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented May 4, 2020

Oh, and this is still a draft because of

f7ccec3 [DO NOT MERGE] Proof-of-concept for state.narrows being Immutable.Map.

at the tip; see note at #4047 (comment).

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

Small comments below; otherwise I think this is all ready to merge (apart from the demo commit at the tip.)

flow-typed/remotedev-serialize_vx.x.x.js Outdated Show resolved Hide resolved
package.json Outdated Show resolved Hide resolved
package.json Outdated Show resolved Hide resolved
src/boot/store.js Outdated Show resolved Hide resolved
src/boot/store.js Outdated Show resolved Hide resolved
Chris Bobbe added 12 commits May 5, 2020 10:37
Keys with value `undefined` don't survive round-trips into JSON, and
the zulipVersion will be persisted in JSON. Use `null`, to maintain
that Account objects always have the same number of keys, and we
don't have to worry about the distinction between an optional
`string` property and a property that's `string | void`.

The same strategy is in use for the ackedPushToken.

As a side-effect, it's much simpler to distinguish between the
second and third behaviors of `makeAccount` in the list below, for
tests. We don't need the `shouldHaveZulipVersion` param anymore:

1. Supply our own valid string value for the `zulipVersion` field in
the returned Account object

2. Make the returned Account object have an empty `zulipVersion`

3. Make the returned Account object have a default, valid string
value for `zulipVersion`
In zulip#2762 (never merged), we tried to use v5 of redux-persist and
found that it had breaking changes that unacceptably reduced our
flexibility with migrations, so we decided never to use it, and
instead to vendor it and make our own changes as needed.

More background on the limitations of v5 is at
https://chat.zulip.org/#narrow/stream/243-mobile-team/subject/redux-persist.20v5/near/624465.
Its new way of doing migrations means you can't write a migration
that is aware of the entire state tree, unless you limit yourself to
storing the entire tree in a single key-value entry. We need to be
able to split it up for performance (avoiding a rewrite to disk of
everything each time a small change is made) and because each
key-value pair is has a hard limit of 2MB, after which the data is
just dropped. Even after compression, our entire state tree would be
within a factor of 4 of this limit, which is unacceptable.

More generally, the redux-persist maintainers seem not to have deep
familiarity with apps that depend on correct, effective migrations
to persist data across updates.

Our immediate need is to incorporate the changes from
rt2zz/redux-persist@a94f29139 and rt2zz/redux-persist@13a0b600b (not
released until 6.0.0) so that we can pass custom
serialize/deserialize functions for replace/revive logic so we can
store non-serializable data (such as a ZulipVersion) in Redux.

PR zulip#2915 was opened a while ago to vendor redux-persist, but it grew
stale and gathered merge conflicts.

First, ensure that the version of redux-persist in node_modules is
4.10.2, which is what we'd been using. Then, copy the contents of
`node_modules/redux-persist/src` to `src/third/redux-persist` (this
contains all the needed source code) and copy
`node_modules/redux-persist/es/index.js.flow` there too; it has the
needed type definitions.

Then, address redux-persist's two dependencies that weren't listed
in our own package.json:

- `json-stringify-safe`: Add to package.json.

- `lodash`: The only bit of lodash that it uses is `isPlainObject`,
so add `lodash.isplainobject` to the package.json and make this
change, the only change to our vendored copy's source code:

  In `src/third/redux-persist/utils/isStatePlainEnough.js:

  ```diff
  - import isPlainObject from 'lodash/isPlainObject'
  + import isPlainObject from 'lodash.isplainobject'
  ```

Then run `yarn`.

Rather than fixing all the issues ESLint and Prettier flagged, tell
those tools to ignore them -- we didn't introduce them, and we'd
rather not accidentally introduce bugs by even these small changes.
Some of the output indicated potential code-quality issues (see note
at `no-use-before-define` in .eslintrc.yaml), some, just differing
style choices (e.g., no semicolons), and others, just the fact that
redux-persist wasn't using the very latest JavaScript features.
See the previous commit for motivation.
Incorporate the changes from rt2zz/redux-persist@a94f29139 and
rt2zz/redux-persist@13a0b600b so that we can pass custom
serialize/deserialize functions for replace/revive logic, to
eventually store non-serializable data (such as a ZulipVersion;
that's this series) in Redux.
In an upcoming commit, we will set up a system of "replacer" and
"reviver" transforms so that we can store non-serializable objects,
such as ZulipVersion instances, in Redux, and have a serializable
format be persisted to disk.

This same system will be used for non-serializable objects besides
ZulipVersion instances; see zulip#3949 and zulip#3950, which will be much
easier after this series.

We use our own fork of `remotedev-serialize`, at 5f9f759a4, which
Greg authored atop `zalmoxisus/remotedev-serialize.git` at 0.1.8,
because it fixes an important bug. Also, the project seems
unmaintained, and the maintainer hasn't been active anywhere on
GitHub.

The bug is that data fails to round-trip when it contains a
"__serializedType__" key, which is used by the reviver to identify
and reverse a particular replace transform. Such data, if present,
could trigger hidden functionality.

Also, explain the process of getting the `remotedev-serialize`
libdef set up, in docs/howto/libdefs.md.
This sets up how we'd like to plug in custom replacers and revivers,
but doesn't add any yet.

We can use the '__serializedType__' string freely because we're
using `zulip/remotedev-serialize@5f9f759a4`, which contains a fix
for proper round-tripping of that string, to avoid collisions when
that string occurs as a key representing data, and the unplanned
behavior such collisions would cause.
No ZulipVersion instances are stored in Redux in this commit; this
just handles the replace/revive logic for when they are, in an
upcoming commit.
Storing the parsed ZulipVersion instance in Redux makes it easier to
work with, so we don't have to think about constructing a new
instance as part of every server feature check.

The one tricky bit is that our Redux state is serialized by
redux-persist so that it can be stored in ZulipAsyncStorage, but the
ZulipVersion instance itself is not serializable. Thankfully, it can
be fully represented by its raw version string, which is
serializable.

So, on entry into ZulipAsyncStorage, we just have to convert the
ZulipVersion instance into the raw version string, with .raw()
("replace" it), and on exit, make a new ZulipVersion instance, by
calling the constructor with that raw version string ("revive" it).

We considered two main strategies for locating the bit of state to
be transformed (in this case, the `zulipVersion` field, which stores
a ZulipVersion value, in elements of the `state.accounts` array):

1) The "outside-in" strategy, of identifying the value by the
extrinsic property of where it is; i.e., that it's at the field
named 'zulipVersion' on elements of the `state.accounts` array, or

2) The "inside-out" strategy, of identifying the value by its
intrinsic property of being `instanceof ZulipVersion`.

We chose the latter. When we work on zulip#3950, converting our
object-as-map fields to use Map or Immutable.Map, we'll be making
similar, sweeping changes to many different parts of the state, so
it's most natural for the bulk of our logic to be independent of the
location in state, and focus instead on the type of non-serializable
object being stored. This approach conveniently clears the path to
use ImmutableJS for additional reasons discussed later.

An exploration of the "outside-in" approach is left as the un-merged
PR zulip#3952. The main advantage of that approach is that we wouldn't
need to write migrations, but it had the significant drawback of
requiring a lot of code (for locating the bit of state to be
transformed) that was 1) boilerplate, and 2) difficult to get fully
type-checked, which is a bad combination. These concerns were not
sufficiently alleviated by the complexity of that boilerplate being
bounded by the complexity of the reducer(s) in charge of the
corresponding part(s) of the state: it's better just to not have to
write the boilerplate.

There's nothing stopping us from mixing the two approaches, in
future, but it would be nice to stick to one as far as possible, for
simplicity.

For the "inside-out" implementation, we use `remotedev-serialize`
(added in a recent commit) [1], with custom replacer and reviver
functions, defined in the previous commit. As Greg explains at
zulip#3952 (comment):

"""
* at save time, values that are `instanceof ZulipVersion` get turned
into objects like `{__serializedType__: 'ZulipVersion', data:
'2.0.0-rc1'}`, before being JSON-serialized

* at load time, after JSON deserialization, values that are objects
with a `__serializedType__` property get transformed in a way
identified by that property's value: in particular if
`__serializedType__: 'ZulipVersion'`, then `data` is fed to `new
ZulipVersion`
"""

Since we've never dealt with the Zulip version in this "live" (i.e.,
non-serializable) form, we have to write a migration. This was
straightforward, but we MUST remember to write this kind of
migration in the future.

For making a value "live", where it wasn't before, the migration
needs to:

1) As input, take the previous shape of the data. Don't confuse this
with the *current* way of storing the "dead" shape. Just like any
other migration, the previous shape is the input.

2) As output, give the "live" form of the data. Once it's in Redux,
the replacer will take care of persisting it in the correct "dead"
form.

As mentioned above, this approach further clears the way for
ImmutableJS; zulip#3949 and zulip#3950 will be much easier after this. I've
neglected to mention that the primary purpose of
`remotedev-serialize` is to replace and revive ImmutableJS objects!
Its "default" replacers and revivers do this. The custom
replacer/reviver functions we use here are a nice feature provided
on the side.

(We added `immutable` as a dependency in a recent commit, since
`remotedev-serialize` depends on it.)

Fixes: zulip#3951

[1]: We actually use our own fork,
`zulip/remotedev-serialize@5f9f759a4`, which fixes a bug where
"__serializedType__" keys in data fail to round-trip properly.
It may feel smoother not to have to call the constructor when doing
server version checks.
@chrisbobbe
Copy link
Contributor Author

OK! Just pushed those changes.

@gnprice gnprice marked this pull request as ready for review May 5, 2020 19:29
@gnprice gnprice force-pushed the pr-v2-zulip-version-redux branch from cf728a8 to 8e7c7b8 Compare May 5, 2020 19:29
@gnprice gnprice merged commit 8e7c7b8 into zulip:master May 5, 2020
@gnprice
Copy link
Member

gnprice commented May 5, 2020

Merged! Thanks again.

For the workflow: I think after seeing this experiment, what I'd like to do in a similar situation in the future is:

  • Have the main PR stop at a version that's mergeable.
  • For any non-mergeable demos, debugging, perf-testing, etc., have another branch on top of the main one. That branch could be a second, draft PR; or perhaps just mention the branch name and people can fetch it.

This way the usual PR workflows can behave as normal for the main branch.

Does that make sense? We can talk more in #mobile-team, too.

@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented May 6, 2020

Merged! Thanks again.

Thank you! Glad to see a clearer way to replacing our objects-as-map with good data structures, etc.

Does that make sense? We can talk more in #mobile-team, too.

Sure, that makes sense!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants