Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generalized persistence #903

Merged
merged 23 commits into from
Sep 16, 2019
Merged

Generalized persistence #903

merged 23 commits into from
Sep 16, 2019

Conversation

alexcjohnson
Copy link
Collaborator

@alexcjohnson alexcjohnson commented Sep 4, 2019

Building generalized prop persistence into the renderer. Engineered somewhat similarly to uirevision in plotly.js, but geared more toward saving state across page reloads, user changes are saved and, when the entire component or larger is refreshed with the same original value, these changes are reapplied.

Components can use this by defining props persistence, persisted_props, and persistence_type - and if necessary the class property persistenceTransforms. See the comment at the top of persistence.js for details.

Linked PR coming in dash-table as the primary use case.

  • I have added tests, or extended existing tests, to cover any new features or bugs fixed in this PR
  • I have added entry in the CHANGELOG.md
  • If this PR needs a follow-up in dash docs, community thread, I have mentioned the relevant URLS as follow
    • this github #PR number updates the dash docs
    • here is the show and tell thread in plotly dash community

return;
}

forEachObjIndexed((newPropVal, propName) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The persisted props will be few vs. the actual props of the components. Maybe we could loop on the persisted_props instead and const newPropVal = newProps[propName] instead? These actions are user driven, so not in a tight loop, either way, I do not have a strong opinion.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the time newProps only has one key, regardless of how many are in the component. Table is a bit of an outlier in that the 11 derived_* props all arrive at once, but it also seems to call setProps twice in many cases - once with just input prop, and again with all the derived_* - so I think even there it's roughly equal cost either way: as I have it, looping over 1 then 11 items; flipping it we'd loop over 6+6 items.

That said, if(newProps[propName]) is probably much faster than if(persisted_props.indexOf(propName) !== -1) and most components other than table will probably only have one or two persisted props. So yeah, I'll make the change. 🎉

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setItem(key, value) {
const setVal = value === void 0 ? UNDEFINED : JSON.stringify(value);
this._storage.setItem(storePrefix + key, setVal);
}
Copy link
Contributor

@Marc-Andre-Rivet Marc-Andre-Rivet Sep 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think setItem calls might need to be more defensive.

  1. Some browsers and some setups will not define window.localStorage and/or window.sessionStorage (I do not think this is a concern in 2019 anymore... -- everything >= IE8 supports this afaik)
  2. Some browsers will expose local/session storage with storage size of 0 byte - user configurations on certain browsers could also cause an issue here
  3. setItem willthrow an error if the storage space is exceeded - this will disrupt the app's render flow

e.g. Each prop in the loop at https://github.com/plotly/dash/pull/903/files#diff-1f9fdd7834ce2c4e31ca7ce6aa890a1fR203 could be guarded in a try ... catch and/or LocalStorage be defaulted to MemoryStorage -- a failing localStorage will fail to re-apply the values after an update.

More generally, should failure to store be conveyed to the user in debug mode and how?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, good call. (1) I think you're right, not an issue. But (2) and (3) are worth protecting against. Also I notice dcc.Store needs this treatment -> plotly/dash-core-components#635

(2) seems like all we can do is fall back on memory.
(3) we perhaps have some more options: I can imagine in fact app developers filling up their storage with data from old apps - perhaps we can clear out data associated with IDs that don't exist on the page? Certainly not ideal, as those IDs may pop up later. I suppose we could do better by including a timestamp with the data, and delete the oldest ones first. That's a lot of overhead though for a hopefully rare edge case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(2) seems like all we can do is fall back on memory.

Yes. At least in memory will preserve the behavior while in a session.

(3) we perhaps have some more options: I can imagine in fact app developers filling up their storage with data from old apps [...]

Interesting. All callbacks are defined though.. maybe we could use (dependencyGraph + persisted_props) to identify ids that are not significant for this app? Seems weird..

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defensive behavior added in 83b8cd3

  • Don't try to access stores until requested; That way the app is ready, so we can dispatch any errors to devtools.
  • Try to add some reasonably large data to the store - I chose 65K chars (130kB) as a meaningful quantity (more than most persistence needs, unless you store table data or something) but still far less than normal webstore limits
  • If it fails on initialization, try to clear out the persistence-related keys and try again
  • Then give up and use memory if it still fails
  • If it fails later on (not during init), don't try to recover or fall back - just report the error to both the console and devtools. We could explore ways to be smarter about this but it's more involved - see TODO 83b8cd3#diff-1f9fdd7834ce2c4e31ca7ce6aa890a1fR134

* [propName]: {
* extract: propValue => valueToStore,
* apply: (storedValue, propValue) => newPropValue
* }
Copy link
Contributor

@Marc-Andre-Rivet Marc-Andre-Rivet Sep 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we only support one extract from a complex prop or should we support an array of extracts? If so, should we consider a way to expose/select these extracts in the persistence_props.

e.g. Today someone can change the name nested prop in columns, tomorrow maybe they are fully customizing the table along with their data and want to save their columns[i].format settings

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, that was the point of the TODO comment below https://github.com/plotly/dash/pull/903/files#diff-1f9fdd7834ce2c4e31ca7ce6aa890a1fR195

Even if we don't have a current use case, the intent would certainly be clearer if we listed the sub-prop specifically. And then we don't lock ourselves out for later.

So what should this look like? Try to match the structure fully ("columns[i].name" or "columns[].name") or just a simple namespace of the outer prop ("columns.name")? I think I'd go for the simpler one; I don't think we want to try to automate extract/apply, there are too many cases to consider. What if we want to extract names keyed by column ID, for example?

(Also note to self: if we do this we also need to only store a change if the extracted piece changed vs. the previous value)

Copy link
Contributor

@Marc-Andre-Rivet Marc-Andre-Rivet Sep 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed the TODO 🤦‍♂

namespace of the outer prop ("columns.name")

Yes. That's what I had in mind.

I don't think we want to try to automate extract/apply

I don't think so either. Might be possible for simple cases, but that could be added later if we have a clear target.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namespaced in 78df979

@Marc-Andre-Rivet
Copy link
Contributor

One optimization, one API question, one storage question. Otherwise this looks pretty good!

@Marc-Andre-Rivet
Copy link
Contributor

Marc-Andre-Rivet commented Sep 5, 2019

Random thought: IndexedDB does not suffer the same limitations as LocalStorage/SessionStorage:

  • it's faster (no need to parse / stringify - supports blobs and files natively) and non-blocking (sync / async APIs)
  • it allows larger data sets (up to ~50MB)
  • it's also widely supported (>= IE 10; at least for the major parts of the API; https://caniuse.com/#feat=indexeddb)
  • ...but is also more complex to implement.
  • if there are security concerns, unlike local/session storage, the content is not automatically accessible by other scripts in the page

https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API
https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API/Using_IndexedDB

If it works well with our scenarios, this could be added as an additional persistence_type later for use cases if/where performance is an issue.

s += s;
}
return s;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😀

* dispatch as a parameter, so it can report OOM to devtools
*/
setItem(key, value, dispatch) {
const setVal = value === void 0 ? UNDEFINED : JSON.stringify(value);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First time I've seen someone use void 0 in a while in JS :)

As ES5 explicitly states that undefined is read-only, I don't think we need to protect against reassignment.

IE9 and Edge10 have full ES5 support so I don't think this is a concern.

Was there an additional concern here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, just satisfying the linter but you're right, it would be better to just disable that check in the linter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔪 no-undefined 15ca0e5

}
// TODO: Should we clear storage here? Or fall back to memory?
// Probably not, unless we want to handle this at a higher level
// so we can keep all 3 items in sync
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine for our first implementation of persistence.

  1. Make sure there's some space available in storage, clear if not, fallback to memory if still not.
  2. Write until there's no space left.

Not sure what the internal error looks like but maybe we want to make it cleaner / user focused, like in the initialization step.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the internal error looks like but maybe we want to make it cleaner / user focused, like in the initialization step.

good point - gave it a nicer error message -> 3669df6

@Marc-Andre-Rivet
Copy link
Contributor

@alexcjohnson Looks good, waiting for the tests / changelog / etc. and 📤

@@ -152,7 +156,7 @@ jobs:
name: ️️🏗️ build misc
command: |
. venv/bin/activate && pip install --upgrade -e . --quiet && mkdir packages
git clone --depth 1 https://github.com/plotly/dash-table.git
git clone --depth 1 -b persistence https://github.com/plotly/dash-table.git
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@byronz there seems to be something funny about how these built packages get picked up by the next step. The first time I ran these tests it looked like they got the old dash-table, but then running them again everything worked.

Perhaps something funny about having the two steps build-misc and build-core both writing to the same directory packages? Maybe they should each write to a separate dir, and we have the test job either recombine them into one before installing them, or just install from the two separate dirs?

}
}
"presets": ["@babel/preset-env", "@babel/preset-react"],
"plugins": ["@babel/plugin-proposal-object-rest-spread"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not absolutely 100% certain but I don't think the spread still needs to be set explicitly.

Copy link
Contributor

@Marc-Andre-Rivet Marc-Andre-Rivet Sep 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not strictly necessary but while updating the toolchain, might as well move away from the babel/polyfill core-js@2 wrapper to use the newer core-js@3 polyfills directly instead.

To do so, remove babel/polyfill from package deps, add core-js, update this file to

{
    "presets": [["@babel/preset-env", {
        "useBuiltIns": "usage",
        "corejs": 3
        }], "@babel/preset-react"],
    "plugins": ["@babel/plugin-proposal-object-rest-spread"]
}

and remove babel/polyfill from the renderer's webpack entrypoint.

core-js 3 is a more modular implementation than v2, with better support for es2018 features (v2 has been on feature freeze for a long time now) and usage will only pull what's necessary for the renderer, testing this locally reduces the build from ~253kB to ~206kB for the production bundle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whatwg-fetch is still necessary afaict as core-js will not support it b/c it's not part of the ECMAScript standard.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the hints! Confirmed, plugin-proposal-object-rest-spread is unnecessary, it has no effect. But I see the same benefit you do, 253K -> 206K - that's a pretty big win just from a build change 🏆 -> ba8d5ba

@@ -164,6 +165,7 @@ class TreeContainer extends Component {
_dashprivate_dependencies,
_dashprivate_dispatch,
_dashprivate_path,
_dashprivate_layout,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YAL yet another layout

return app


def rename_and_hide(dash_duo, rename=0, new_name='x', hide=1):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think it makes sense to create a small set of APIs to handle common dash operations in the future?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably make the new_name more "newer" than a simple 'x', I also like to add sometimes randomness to avoid mistakes in tests,

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think it makes sense to create a small set of APIs to handle common dash operations in the future?

Sure, but only after we find that we're reusing similar functionality in multiple places. So far this one seems pretty specialized to me.

I would probably make the new_name more "newer" than a simple 'x'

Good call -> 5fa4d55

I also like to add sometimes randomness to avoid mistakes in tests

As long as the test outcome is completely deterministic (to the best of our abilities) I suppose, but my preference is single-use unusual but not really random values.

@@ -92,6 +96,7 @@ jobs:
command: |
sudo pip install --upgrade virtualenv
python -m venv venv || virtualenv venv && . venv/bin/activate
sed -i '' '/dash/d' requires-install.txt
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@byronz as discussed - to prevent the possible double-install of the rest of dash core, since we're going to be building and installing these from specific branches/commits anyway (to resolve the test failures https://github.com/plotly/dash/pull/903/files#r323830675)

@@ -96,6 +96,7 @@ jobs:
command: |
sudo pip install --upgrade virtualenv
python -m venv venv || virtualenv venv && . venv/bin/activate
sed -i '' '/dash/d' requires-install.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐧

@alexcjohnson
Copy link
Collaborator Author

After discussions with a client today, I'm planning one significant change to this PR: instead of a single persisted value for each prop, we should support one for each persistence value. That way if you change the persistence value (from one truthy thing to another), you will reapply the user preferences for that new value.

For example, let's say you can select from many portfolios, and the user may choose different currencies for each: portfolio A displayed in dollars, portfolio B in euros, for example. Use the portfolio ID or name as persistence, and then we'll store a different currency for each portfolio.

@Marc-Andre-Rivet
Copy link
Contributor

Marc-Andre-Rivet commented Sep 13, 2019

instead of a single persisted value for each prop, we should support one for each persistence value

The concerns below might be premature, just want to make sure we keep the simple usage in mind as we iterate.

Storing a single value for each prop seems to provide a lot of value for very little added complexity for the app developer (e.g. persistence=true). If the enhanced version of this feature ends up requiring a more involved API, I'd suggest keeping both the basic and the advance versions of the API so that the complexity the app developer is exposed to grows with the complexity of the task he's trying to accomplish.

Specifically, I'm wondering how we plan to clean up the storage if all keys are kept, apart from clearing everything when space runs out. We'll have to invalidate them at some point or have a set of persistence keys to be kept? Do we have a mechanism in mind (clear on persistence=false?), or a way of avoiding the problem altogether?

@alexcjohnson
Copy link
Collaborator Author

I don't think the API needs to change at all (unless perhaps we add an option to use the current single-key behavior instead, but I'd leave that out until requested, would be easy to add later), all that changes is our code and the structure of the info we store. And for the simple use case persistence=True none of the behavior will change either, nor will it consume more storage. It'll only affect multi-key behavior and storage consumption.

@alexcjohnson
Copy link
Collaborator Author

multi-key persistence ended up being pretty easy, and actually simplifies our storage usage -> 09b9393

Copy link
Contributor

@Marc-Andre-Rivet Marc-Andre-Rivet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💃

@bg-bi
Copy link

bg-bi commented Nov 13, 2019

I noticed an error in the documentation, please see #1012. Thanks.

HammadTheOne pushed a commit to HammadTheOne/dash that referenced this pull request May 28, 2021
HammadTheOne pushed a commit that referenced this pull request Jul 23, 2021
HammadTheOne added a commit that referenced this pull request Jul 23, 2021
* Update dash-info.yaml

* Updated CHANGELOG

* Update CHANGELOG description
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants