Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIP-23: Persist SQL Lab state in the backend #8060

Merged
merged 45 commits into from
Nov 14, 2019
Merged

Conversation

betodealmeida
Copy link
Member

@betodealmeida betodealmeida commented Aug 17, 2019

CATEGORY

Choose one

  • Bug Fix
  • Enhancement (new features, refinement)
  • Refactor
  • Add tests
  • Build / Development Environment
  • Documentation

SUMMARY

This PR implements SIP-23, moving the persistence of SQL Lab state from the browser's localStorage to the backend.

Currently, the redux store in localStorage has the following structure:

sqlLab: {
  activeSouthPaneTab: "Results",
  alerts: [],
  databases: {...},
  offline: false,
  queries: {...},
  queriesLastUpdate: 1565918142000,
  queryEditors: [...],
  tabHistory: [...],
  tables: [...],
}

With this PR, when loading SQL Lab the following keys are loaded from the DB:

  • databases
  • queries
  • queryEditors (from the new tables tab_state_view and table_schema)
  • tables (from the new table table_schema)

Note that only the active query editor is fully loaded; for the other tabs only the id and label is loaded, and the content (query, DB, schema, table schemas) is loaded on tab switch.

As the user interacts with SQL Lab, changes to queryEditors, tables and queries are synced to the backend automatically. This includes the SQL being written, the table schemas being displayed (expanded or not), the database/schema selected, the active tab, etc.

For queries, as soon as a new tab is open we create a row in the query table to store the query, even if it hasn't run. This and the syncing of queryEditors and tables is done through new custom views created in this PR, /tabstateview/ and /tableschemaview/.

For results, I changed the backend so that sync queries are also stored in the results backend if one is configured. This allows us to fetch the results when a tab is loaded, without having to store them in the database, since they can be large.

Migration

This PR requires a DB migration to set up the initial tables, tab_state and table_schema. When a user first visits SQL Lab after this migration, the state of the following keys will be a combination of the backend (nothing, initially) and their localStorage:

  • queryEditors
  • tables
  • tabHistory

When <TabbedSqlEditors> mounts, it will identify objects in these keys that come from localStorage, and migrate them to the backend with a custom action migrateLocalStorage. Once one of these objects has been migrated successfully, it is removed from localStorage, so that the migration doesn't have to be atomic.

Once everything has been migrated, the redux key in localStorage is removed.

Concurrent access from multiple browser tabs

It's a non-goal in this PR to handle users working on the same query editor from multiple tabs. The current behavior remains: the last update is persisted.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

There are no visible changes.

TEST PLAN

I added unit tests for all the modified Redux actions (which required fixing the reducer unit tests, since they were tightly coupled with actions). I also added a new integration test that runs the SQL Lab integration tests with the feature flag enabled.

I also performed manual testing, testing the migration, adding tabs, renaming them, running queries, removing tabs, loading table schema, etc.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

REVIEWERS

@graceguo-supercat @etr2460 @mistercrunch

@betodealmeida betodealmeida added sip Superset Improvement Proposal change:backend Requires changing the backend .database sqllab Namespace | Anything related to the SQL Lab risk:breaking-change Issues or PRs that will introduce breaking changes risk:db-migration PRs that require a DB migration labels Aug 18, 2019
@etr2460
Copy link
Member

etr2460 commented Aug 20, 2019

For such a major change, I think introducing a feature flag would be preferable since it might require significant testing to ensure the solution works at scale. I know the PR has a db migration, but it should be harmless to do the db migration regardless but feature flag all the other code

@betodealmeida
Copy link
Member Author

For such a major change, I think introducing a feature flag would be preferable since it might require significant testing to ensure the solution works at scale. I know the PR has a db migration, but it should be harmless to do the db migration regardless but feature flag all the other code

Sounds good, I'll do that.

@betodealmeida
Copy link
Member Author

@etr2460, I put all the functionality behind a feature flag. I tested locally and it works, and switching the flag on migrates everything to the DB.

I need to add Javascript unit tests and integration tests covering the feature flag, but I think this is ready for review.

Copy link
Member

@etr2460 etr2460 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple questions to start, i'm going to go through this again once i have a bit more time

table = Column(String(256))

# JSON describing the schema, partitions, latest partition, etc.
results = Column(Text)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this include any data from the datasource, or just metadata about it? If it's just metadata, I think that might be a better field name as results makes me think of the results from a query

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. This is only for metadata, the results are stored in the results backend (if configured), and retrieved when the tab loads. I'll rename it.

@@ -245,9 +245,6 @@ def execute_sql_statements(
db_engine_spec = database.db_engine_spec
db_engine_spec.patch()

if store_results and not results_backend:
raise SqlLabException("Results backend isn't configured.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we still throw an error? I'd imagine stuff wouldn't work right if you're trying to store results without a backend so we shouldn't let the user even try it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could change this to if async_ and not results_backend: raise ....

With this PR, now all queries have store_results set, since for sync queries we still want to try to store them in order to load them later. I can remove the store_results flag (since it's always true), and raise the exception if the query is async and there is not results backend.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of store_results is that the query payload (including results) should be persisted in the RESULTS_BACKEND. As it stands, this is causing synchronous queries to try and persist this data, even if async queries are disabled for a database, resulting in failure.

@betodealmeida
Copy link
Member Author

I'm fixing the unit tests and adding integration tests.

@betodealmeida
Copy link
Member Author

@etr2460, do you want to take another stab at reviewing this?

I'm currently working on fixing the failing unit tests, but I'm suspecting they might be unrelated and I'm having a hard time reproing them locally. But while I figure out the problem, I appreciate if you could do another pass on reviewing it, since this is a big PR and there have been other changes to SQL Lab in the meantime. Thanks!

@betodealmeida
Copy link
Member Author

I added new integration tests, but looks like Travis is not running them. I ran them locally and they passed:

$ tox -e cypress-sqllab-backend-persist
...
  (Run Finished)


      Spec                                                Tests  Passing  Failing  Pending  Skipped
  ┌────────────────────────────────────────────────────────────────────────────────────────────────┐
  │ ✔ sqllab/index.test.js                      00:11        5        4        -        1        - │
  └────────────────────────────────────────────────────────────────────────────────────────────────┘
    All specs passed!                           00:11        5        4        -        1        -


real    0m23.016s
user    0m27.370s
sys     0m5.313s
______________________________________________________________________________________________________________________________________ summary _______________________________________________________________________________________________________________________________________
  cypress-sqllab-backend-persist: commands succeeded
  congratulations :)

@codecov-io
Copy link

Codecov Report

Merging #8060 into master will decrease coverage by 7.71%.
The diff coverage is 54.76%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8060      +/-   ##
==========================================
- Coverage   73.71%   65.99%   -7.72%     
==========================================
  Files         115      479     +364     
  Lines       12059    23265   +11206     
  Branches        0     2599    +2599     
==========================================
+ Hits         8889    15354    +6465     
- Misses       3170     7767    +4597     
- Partials        0      144     +144
Impacted Files Coverage Δ
superset/assets/src/SqlLab/App.jsx 0% <ø> (ø)
superset/assets/src/reduxUtils.js 58.82% <0%> (ø)
...rset/assets/src/SqlLab/reducers/getInitialState.js 0% <0%> (ø)
superset/assets/src/SqlLab/reducers/sqlLab.js 41.2% <0%> (ø)
superset/sql_lab.py 77.46% <100%> (ø) ⬆️
superset/config.py 88.64% <100%> (ø) ⬆️
superset/assets/src/components/TableSelector.jsx 84.16% <100%> (ø)
...uperset/assets/src/SqlLab/components/ResultSet.jsx 79.77% <100%> (ø)
superset/assets/src/featureFlags.ts 90.9% <100%> (ø)
superset/views/core.py 71.23% <28.57%> (-0.24%) ⬇️
... and 371 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6aa10dc...94d7634. Read the comment docs.

@codecov-io
Copy link

codecov-io commented Aug 29, 2019

Codecov Report

Merging #8060 into master will decrease coverage by 0.15%.
The diff coverage is 53.25%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #8060      +/-   ##
=========================================
- Coverage   66.66%   66.5%   -0.16%     
=========================================
  Files         450     450              
  Lines       22734   23117     +383     
  Branches     2366    2451      +85     
=========================================
+ Hits        15155   15375     +220     
- Misses       7441    7593     +152     
- Partials      138     149      +11
Impacted Files Coverage Δ
superset/assets/src/SqlLab/App.jsx 0% <ø> (ø) ⬆️
superset/assets/src/reduxUtils.js 66.17% <0%> (-2.79%) ⬇️
...rset/assets/src/SqlLab/reducers/getInitialState.js 0% <0%> (ø) ⬆️
superset/config.py 90% <100%> (ø) ⬆️
superset/assets/src/components/TableSelector.jsx 77.69% <100%> (ø) ⬆️
superset/assets/src/SqlLab/components/App.jsx 77.77% <100%> (ø) ⬆️
superset/views/core.py 72.19% <100%> (+0.13%) ⬆️
superset/sql_lab.py 77.93% <100%> (ø) ⬆️
...rset/assets/src/SqlLab/components/LimitControl.jsx 91.17% <100%> (ø) ⬆️
...uperset/assets/src/SqlLab/components/ResultSet.jsx 79.77% <100%> (ø) ⬆️
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 59bc220...58cb383. Read the comment docs.

@@ -39,7 +39,7 @@ describe('TabbedSqlEditors', () => {
'newEditorId',
];

const tables = [Object.assign({}, table[0], {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a bug, table[0] is undefined.


const initialState = mockState.sqlLab;

describe('sqlLabReducer', () => {
describe('CLONE_QUERY_TO_NEW_TAB', () => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This action was removed.

qe = Object.assign({}, defaultQueryEditor);
newState = sqlLabReducer(newState, actions.addQueryEditor(qe));
qe = newState.queryEditors[newState.queryEditors.length - 1];
const action = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed a lot (all?) of the unit tests to be proper unit tests, removing the call to actions.

sa.Column("database_id", sa.Integer(), nullable=False),
sa.Column("schema", sa.String(length=256), nullable=True),
sa.Column("table", sa.String(length=256), nullable=True),
sa.Column("description", sa.Text(), nullable=True),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etr2460 I renamed results to description, since metadata is reserved in SQLAlchemy.

@graceguo-supercat
Copy link

hi @betodealmeida I have one concern for this PR: once this feature is released to production, the existed sql lab users open Sql lab, they will not see their old query tabs and query history.
I know the data is still in localStorage. Can we make this localStorage=>server-side transition more smooth? some of our sql lab users really, really rely on these history data.

@betodealmeida
Copy link
Member Author

hi @betodealmeida I have one concern for this PR: once this feature is released to production, the existed sql lab users open Sql lab, they will not see their old query tabs and query history.
I know the data is still in localStorage. Can we make this localStorage=>server-side transition more smooth? some of our sql lab users really, really rely on these history data.

@graceguo-supercat this PR does migrate the existing state to the backend, including the query tabs and table schemas. I'm not 100% sure about the query history, I'll test it and if not I'll make sure to include it in the migration as well.

@graceguo-supercat
Copy link

@betodealmeida thanks for the work. correction: I can see all the query tabs work correct, the missing ones are item under Query History tab. Thank you!

@betodealmeida
Copy link
Member Author

@graceguo-supercat I added code to migrate the query history as well. I'm now taking a look at the errors you found with the results backend.

Copy link

@graceguo-supercat graceguo-supercat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 🎉🎉

@betodealmeida betodealmeida merged commit d66bc5a into apache:master Nov 14, 2019
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.36.0 labels Feb 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels change:backend Requires changing the backend risk:breaking-change Issues or PRs that will introduce breaking changes risk:db-migration PRs that require a DB migration sip Superset Improvement Proposal size/XXL sqllab Namespace | Anything related to the SQL Lab 🚢 0.36.0
Projects
Status: Implemented / Done
Development

Successfully merging this pull request may close these issues.

7 participants