From 9c5f20920d3181450a9feefc340e77935f13ec1c Mon Sep 17 00:00:00 2001 From: Ville Brofeldt Date: Wed, 2 Mar 2022 12:55:34 +0200 Subject: [PATCH] update docs --- UPDATING.md | 2 +- docs/docs/installation/cache.mdx | 37 ++++++++++++++++---------------- superset/config.py | 2 ++ 3 files changed, 21 insertions(+), 20 deletions(-) diff --git a/UPDATING.md b/UPDATING.md index 1ebbd1adda6f6..77a9f7da9b754 100644 --- a/UPDATING.md +++ b/UPDATING.md @@ -26,7 +26,7 @@ assists people when migrating to a new version. ### Breaking Changes -- [18976](https://github.com/apache/superset/pull/18976): A new `DEFAULT_CACHE_CONFIG_FUNC` parameter has been introduced in `config.py` which makes it possible to define a default cache config that will be used as the basis for all cache configs. When running the app in debug mode, the app will default to use `SimpleCache`; in other cases the default cache type will be `NullCache`. In addition, `DEFAULT_CACHE_TIMEOUT` has been deprecated and moved into `DEFAULT_CACHE_CONFIG_FUNC` (will be removed in Superset 2.0). For installations using Redis or other caching backends, it is recommended to set the default cache options in `DEFAULT_CACHE_CONFIG_FUNC` to ensure the primary cache is always used if new caches are added. +- [18976](https://github.com/apache/superset/pull/18976): A new `DEFAULT_CACHE_CONFIG` parameter has been introduced in `config.py` which makes it possible to define a default cache config that will be used as the basis for all cache configs. When running the app in debug mode, the app will default to use `SimpleCache`; in other cases the default cache type will be `NullCache`. In addition, `DEFAULT_CACHE_TIMEOUT` has been deprecated and moved into `DEFAULT_CACHE_CONFIG` (will be removed in Superset 2.0). For installations using Redis or other caching backends, it is recommended to set the default cache options in `DEFAULT_CACHE_CONFIG` to ensure the primary cache is always used if new caches are added. 
- [17881](https://github.com/apache/superset/pull/17881): Previously simple adhoc filter values on string columns were stripped of enclosing single and double quotes. To fully support literal quotes in filters, both single and double quotes will no longer be removed from filter values. - [17984](https://github.com/apache/superset/pull/17984): Default Flask SECRET_KEY has changed for security reasons. You should always override with your own secret. Set `PREVIOUS_SECRET_KEY` (ex: PREVIOUS_SECRET_KEY = "\2\1thisismyscretkey\1\2\\e\\y\\y\\h") with your previous key and use `superset re-encrypt-secrets` to rotate you current secrets - [15254](https://github.com/apache/superset/pull/15254): Previously `QUERY_COST_FORMATTERS_BY_ENGINE`, `SQL_VALIDATORS_BY_ENGINE` and `SCHEDULED_QUERIES` were expected to be defined in the feature flag dictionary in the `config.py` file. These should now be defined as a top-level config, with the feature flag dictionary being reserved for boolean only values. diff --git a/docs/docs/installation/cache.mdx b/docs/docs/installation/cache.mdx index 4a4258a60e4a5..c91cd51176004 100644 --- a/docs/docs/installation/cache.mdx +++ b/docs/docs/installation/cache.mdx @@ -7,20 +7,29 @@ version: 1 ## Caching -Superset uses [Flask-Caching](https://flask-caching.readthedocs.io/) for caching purpose. For security reasons, -there are two separate cache configs for Superset's own metadata (`CACHE_CONFIG`) and charting data queried from -connected datasources (`DATA_CACHE_CONFIG`). However, Query results from SQL Lab are stored in another backend -called `RESULTS_BACKEND`, See [Async Queries via Celery](/docs/installation/async-queries-celery) for details. - -Configuring caching is as easy as providing `CACHE_CONFIG` and `DATA_CACHE_CONFIG` in your +Superset uses [Flask-Caching](https://flask-caching.readthedocs.io/) for caching purposes. Default caching options +can be set by overriding `DEFAULT_CACHE_CONFIG` in your `superset_config.py`. 
Unless overridden, the default +cache type will be set to `SimpleCache` when running in debug mode, and `NullCache` otherwise. + +Currently there are five separate cache configurations to provide additional security and more granular customization options: +- Metadata cache (optional): `CACHE_CONFIG` +- Charting data queried from datasets (optional): `DATA_CACHE_CONFIG` +- SQL Lab query results (optional): `RESULTS_BACKEND`. See [Async Queries via Celery](/docs/installation/async-queries-celery) for details +- Dashboard filter state (required): `FILTER_STATE_CACHE_CONFIG` +- Explore chart form data (required): `EXPLORE_FORM_DATA_CACHE_CONFIG` + +Configuring caching is as easy as providing a custom cache config in your `superset_config.py` that complies with [the Flask-Caching specifications](https://flask-caching.readthedocs.io/en/latest/#configuring-flask-caching). - Flask-Caching supports various caching backends, including Redis, Memcached, SimpleCache (in-memory), or the -local filesystem. +local filesystem. Custom cache backends are also supported. See [here](https://flask-caching.readthedocs.io/en/latest/#custom-cache-backends) for specifics. +Note that Dashboard and Explore caching is required, and configuring the application with either of these caches set to `NullCache` will +cause the application to fail on startup. Also keep in mind that when running Superset on a multi-worker setup, a dedicated cache is required. +For this we recommend running either Redis or Memcached: + +- Redis (recommended): we recommend the [redis](https://pypi.python.org/pypi/redis) Python package - Memcached: we recommend using [pylibmc](https://pypi.org/project/pylibmc/) client library as `python-memcached` does not handle storing binary data correctly. -- Redis: we recommend the [redis](https://pypi.python.org/pypi/redis) Python package Both of these libraries can be installed using pip. 
@@ -28,16 +37,6 @@ For chart data, Superset goes up a “timeout search path”, from a slice's con to the datasource’s, the database’s, then ultimately falls back to the global default defined in `DATA_CACHE_CONFIG`. -``` -DATA_CACHE_CONFIG = { - 'CACHE_TYPE': 'redis', - 'CACHE_DEFAULT_TIMEOUT': 60 * 60 * 24, # 1 day default (in secs) - 'CACHE_KEY_PREFIX': 'superset_results', - 'CACHE_REDIS_URL': 'redis://localhost:6379/0', -} -``` - -Custom cache backends are also supported. See [here](https://flask-caching.readthedocs.io/en/latest/#custom-cache-backends) for specifics. Superset has a Celery task that will periodically warm up the cache based on different strategies. To use it, add the following to the `CELERYBEAT_SCHEDULE` section in `config.py`: diff --git a/superset/config.py b/superset/config.py index d751eda480ed7..59eb069e5dd51 100644 --- a/superset/config.py +++ b/superset/config.py @@ -592,12 +592,14 @@ def _try_json_readsha(filepath: str, length: int) -> Optional[str]: # Cache for filters state (will be merged with DEFAULT_CACHE_CONFIG) FILTER_STATE_CACHE_CONFIG: CacheConfig = { "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=90).total_seconds()), + # should the timeout be reset when retrieving a cached value "REFRESH_TIMEOUT_ON_RETRIEVAL": True, } # Cache for chart form data (will be merged with DEFAULT_CACHE_CONFIG) EXPLORE_FORM_DATA_CACHE_CONFIG: CacheConfig = { "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=7).total_seconds()), + # should the timeout be reset when retrieving a cached value "REFRESH_TIMEOUT_ON_RETRIEVAL": True, }
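
For reviewers, here is a minimal sketch of how a `superset_config.py` might set a Redis-backed `DEFAULT_CACHE_CONFIG` as described in this patch. The Redis URL, key prefix, and one-day timeout are illustrative assumptions, and `"RedisCache"` assumes a Flask-Caching version that accepts that backend name. The final dict merge (and the `effective_filter_state_config` name) is hypothetical: it only illustrates what "merged with DEFAULT_CACHE_CONFIG" could mean and is not taken from Superset's code.

```python
from datetime import timedelta

# Assumed values for illustration: adjust the URL and prefix to your deployment.
DEFAULT_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=1).total_seconds()),
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_URL": "redis://localhost:6379/0",
}

# Same shape as the default in superset/config.py from this patch.
FILTER_STATE_CACHE_CONFIG = {
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=90).total_seconds()),
    # should the timeout be reset when retrieving a cached value
    "REFRESH_TIMEOUT_ON_RETRIEVAL": True,
}

# Hypothetical illustration of the merge: cache-specific keys override the
# defaults, while the backend settings are inherited from the default config.
effective_filter_state_config = {
    **DEFAULT_CACHE_CONFIG,
    **FILTER_STATE_CACHE_CONFIG,
}
```

Under this reading, the filter state cache would use the Redis backend from the default config while keeping its own 90-day timeout.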