fix(chore): dashboard requests to database equal the number of slices it has #24709
Codecov Report
@@ Coverage Diff @@
## master #24709 +/- ##
=======================================
Coverage 68.97% 68.97%
=======================================
Files 1901 1901
Lines 74008 74008
Branches 8183 8183
=======================================
Hits 51047 51047
Misses 20840 20840
Partials 2121 2121
…removed joining slices
@@ -63,8 +63,6 @@ def get_by_id_or_slug(cls, id_or_slug: int | str) -> Dashboard:
        query = (
            db.session.query(Dashboard)
            .filter(id_or_slug_filter(id_or_slug))
            .outerjoin(Slice, Dashboard.slices)
@dpgaspar Hello! Can you look at the PR?
@john-bodley @dpgaspar Hi!
It seems safe to remove those outer joins, but it's not clear to me what kind of requests you are referring to; I'll assume you mean database requests/queries:
Tested with /api/v1/dashboard/8/charts (using the example dashboard Sales)
With this change, generated queries:
SELECT dashboards.uuid AS dashboards_uuid, dashboards.created_on AS dashboards_created_on, dashboards.changed_on AS dashboards_changed_on, dashboards.id AS dashboards_id, dashboards.dashboard_title AS dashboards_dashboard_title, dashboards.position_json AS dashboards_position_json, dashboards.description AS dashboards_description, dashboards.css AS dashboards_css, dashboards.certified_by AS dashboards_certified_by, dashboards.certification_details AS dashboards_certification_details, dashboards.json_metadata AS dashboards_json_metadata, dashboards.slug AS dashboards_slug, dashboards.published AS dashboards_published, dashboards.is_managed_externally AS dashboards_is_managed_externally, dashboards.external_url AS dashboards_external_url, dashboards.created_by_fk AS dashboards_created_by_fk, dashboards.changed_by_fk AS dashboards_changed_by_fk
superset_app | FROM dashboards LEFT OUTER JOIN (dashboard_user AS dashboard_user_1 JOIN ab_user ON ab_user.id = dashboard_user_1.user_id) ON dashboards.id = dashboard_user_1.dashboard_id LEFT OUTER JOIN (dashboard_roles AS dashboard_roles_1 JOIN ab_role ON ab_role.id = dashboard_roles_1.role_id) ON dashboards.id = dashboard_roles_1.dashboard_id
superset_app | WHERE dashboards.id = %(id_1)s
SELECT dashboards.uuid AS dashboards_uuid, dashboards.created_on AS dashboards_created_on, dashboards.changed_on AS dashboards_changed_on, dashboards.id AS dashboards_id, dashboards.dashboard_title AS dashboards_dashboard_title, dashboards.position_json AS dashboards_position_json, dashboards.description AS dashboards_description, dashboards.css AS dashboards_css, dashboards.certified_by AS dashboards_certified_by, dashboards.certification_details AS dashboards_certification_details, dashboards.json_metadata AS dashboards_json_metadata, dashboards.slug AS dashboards_slug, dashboards.published AS dashboards_published, dashboards.is_managed_externally AS dashboards_is_managed_externally, dashboards.external_url AS dashboards_external_url, dashboards.created_by_fk AS dashboards_created_by_fk, dashboards.changed_by_fk AS dashboards_changed_by_fk
superset_app | FROM dashboards LEFT OUTER JOIN (dashboard_user AS dashboard_user_1 JOIN ab_user ON ab_user.id = dashboard_user_1.user_id) ON dashboards.id = dashboard_user_1.dashboard_id LEFT OUTER JOIN (dashboard_roles AS dashboard_roles_1 JOIN ab_role ON ab_role.id = dashboard_roles_1.role_id) ON dashboards.id = dashboard_roles_1.dashboard_id
superset_app | WHERE dashboards.id = %(id_1)s
SELECT slices.uuid AS slices_uuid, slices.created_on AS slices_created_on, slices.changed_on AS slices_changed_on, slices.id AS slices_id, slices.slice_name AS slices_slice_name, slices.datasource_id AS slices_datasource_id, slices.datasource_type AS slices_datasource_type, slices.datasource_name AS slices_datasource_name, slices.viz_type AS slices_viz_type, slices.params AS slices_params, slices.query_context AS slices_query_context, slices.description AS slices_description, slices.cache_timeout AS slices_cache_timeout, slices.perm AS slices_perm, slices.schema_perm AS slices_schema_perm, slices.last_saved_at AS slices_last_saved_at, slices.last_saved_by_fk AS slices_last_saved_by_fk, slices.certified_by AS slices_certified_by, slices.certification_details AS slices_certification_details, slices.is_managed_externally AS slices_is_managed_externally, slices.external_url AS slices_external_url, slices.created_by_fk AS slices_created_by_fk, slices.changed_by_fk AS slices_changed_by_fk
superset_app | FROM slices, dashboard_slices
superset_app | WHERE %(param_1)s = dashboard_slices.dashboard_id AND slices.id = dashboard_slices.slice_id
SELECT tables.uuid AS tables_uuid, tables.created_on AS tables_created_on, tables.changed_on AS tables_changed_on, tables.id AS tables_id, tables.description AS tables_description, tables.default_endpoint AS tables_default_endpoint, tables.is_featured AS tables_is_featured, tables.filter_select_enabled AS tables_filter_select_enabled, tables."offset" AS tables_offset, tables.cache_timeout AS tables_cache_timeout, tables.params AS tables_params, tables.perm AS tables_perm, tables.schema_perm AS tables_schema_perm, tables.is_managed_externally AS tables_is_managed_externally, tables.external_url AS tables_external_url, tables.table_name AS tables_table_name, tables.main_dttm_col AS tables_main_dttm_col, tables.database_id AS tables_database_id, tables.fetch_values_predicate AS tables_fetch_values_predicate, tables.schema AS tables_schema, tables.sql AS tables_sql, tables.is_sqllab_view AS tables_is_sqllab_view, tables.template_params AS tables_template_params, tables.extra AS tables_extra, tables.normalize_columns AS tables_normalize_columns, tables.created_by_fk AS tables_created_by_fk, tables.changed_by_fk AS tables_changed_by_fk, anon_1.slices_datasource_id AS anon_1_slices_datasource_id, anon_1.slices_datasource_type AS anon_1_slices_datasource_type
superset_app | FROM (SELECT DISTINCT slices.datasource_id AS slices_datasource_id, slices.datasource_type AS slices_datasource_type
superset_app | FROM slices, dashboard_slices
superset_app | WHERE %(param_1)s = dashboard_slices.dashboard_id AND slices.id = dashboard_slices.slice_id) AS anon_1 JOIN tables ON anon_1.slices_datasource_id = tables.id AND anon_1.slices_datasource_type = %(datasource_type_1)s
Before this change:
SELECT dashboards.uuid AS dashboards_uuid, dashboards.created_on AS dashboards_created_on, dashboards.changed_on AS dashboards_changed_on, dashboards.id AS dashboards_id, dashboards.dashboard_title AS dashboards_dashboard_title, dashboards.position_json AS dashboards_position_json, dashboards.description AS dashboards_description, dashboards.css AS dashboards_css, dashboards.certified_by AS dashboards_certified_by, dashboards.certification_details AS dashboards_certification_details, dashboards.json_metadata AS dashboards_json_metadata, dashboards.slug AS dashboards_slug, dashboards.published AS dashboards_published, dashboards.is_managed_externally AS dashboards_is_managed_externally, dashboards.external_url AS dashboards_external_url, dashboards.created_by_fk AS dashboards_created_by_fk, dashboards.changed_by_fk AS dashboards_changed_by_fk
superset_app | FROM dashboards LEFT OUTER JOIN (dashboard_slices AS dashboard_slices_1 JOIN slices ON slices.id = dashboard_slices_1.slice_id) ON dashboards.id = dashboard_slices_1.dashboard_id LEFT OUTER JOIN tables ON slices.datasource_id = tables.id AND slices.datasource_type = %(datasource_type_1)s LEFT OUTER JOIN (dashboard_user AS dashboard_user_1 JOIN ab_user ON ab_user.id = dashboard_user_1.user_id) ON dashboards.id = dashboard_user_1.dashboard_id LEFT OUTER JOIN (dashboard_roles AS dashboard_roles_1 JOIN ab_role ON ab_role.id = dashboard_roles_1.role_id) ON dashboards.id = dashboard_roles_1.dashboard_id
superset_app | WHERE dashboards.id = %(id_1)s
SELECT slices.uuid AS slices_uuid, slices.created_on AS slices_created_on, slices.changed_on AS slices_changed_on, slices.id AS slices_id, slices.slice_name AS slices_slice_name, slices.datasource_id AS slices_datasource_id, slices.datasource_type AS slices_datasource_type, slices.datasource_name AS slices_datasource_name, slices.viz_type AS slices_viz_type, slices.params AS slices_params, slices.query_context AS slices_query_context, slices.description AS slices_description, slices.cache_timeout AS slices_cache_timeout, slices.perm AS slices_perm, slices.schema_perm AS slices_schema_perm, slices.last_saved_at AS slices_last_saved_at, slices.last_saved_by_fk AS slices_last_saved_by_fk, slices.certified_by AS slices_certified_by, slices.certification_details AS slices_certification_details, slices.is_managed_externally AS slices_is_managed_externally, slices.external_url AS slices_external_url, slices.created_by_fk AS slices_created_by_fk, slices.changed_by_fk AS slices_changed_by_fk
superset_app | FROM slices, dashboard_slices
superset_app | WHERE %(param_1)s = dashboard_slices.dashboard_id AND slices.id = dashboard_slices.slice_id
SELECT tables.uuid AS tables_uuid, tables.created_on AS tables_created_on, tables.changed_on AS tables_changed_on, tables.id AS tables_id, tables.description AS tables_description, tables.default_endpoint AS tables_default_endpoint, tables.is_featured AS tables_is_featured, tables.filter_select_enabled AS tables_filter_select_enabled, tables."offset" AS tables_offset, tables.cache_timeout AS tables_cache_timeout, tables.params AS tables_params, tables.perm AS tables_perm, tables.schema_perm AS tables_schema_perm, tables.is_managed_externally AS tables_is_managed_externally, tables.external_url AS tables_external_url, tables.table_name AS tables_table_name, tables.main_dttm_col AS tables_main_dttm_col, tables.database_id AS tables_database_id, tables.fetch_values_predicate AS tables_fetch_values_predicate, tables.schema AS tables_schema, tables.sql AS tables_sql, tables.is_sqllab_view AS tables_is_sqllab_view, tables.template_params AS tables_template_params, tables.extra AS tables_extra, tables.normalize_columns AS tables_normalize_columns, tables.created_by_fk AS tables_created_by_fk, tables.changed_by_fk AS tables_changed_by_fk, anon_1.slices_datasource_id AS anon_1_slices_datasource_id, anon_1.slices_datasource_type AS anon_1_slices_datasource_type
superset_app | FROM (SELECT DISTINCT slices.datasource_id AS slices_datasource_id, slices.datasource_type AS slices_datasource_type
superset_app | FROM slices, dashboard_slices
superset_app | WHERE %(param_1)s = dashboard_slices.dashboard_id AND slices.id = dashboard_slices.slice_id) AS anon_1 JOIN tables ON anon_1.slices_datasource_id = tables.id AND anon_1.slices_datasource_type = %(datasource_type_1)s
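To make the difference between the two first queries concrete, here is a minimal sketch using an in-memory SQLite database. The table names come from the logged queries; the column set is cut down, the nested join is flattened into two plain outer joins, and the helper names (`build_demo_db`, `count_rows`) are made up for illustration. It is not Superset code, only a demonstration that the joined query shape returns the dashboard row once per slice.

```python
import sqlite3


def build_demo_db(n_slices: int) -> sqlite3.Connection:
    """Create one dashboard linked to n_slices slices (simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dashboards (id INTEGER PRIMARY KEY, position_json TEXT);
        CREATE TABLE slices (id INTEGER PRIMARY KEY, slice_name TEXT);
        CREATE TABLE dashboard_slices (dashboard_id INTEGER, slice_id INTEGER);
        """
    )
    conn.execute("INSERT INTO dashboards VALUES (1, ?)", ("x" * 1000,))
    for slice_id in range(1, n_slices + 1):
        conn.execute(
            "INSERT INTO slices VALUES (?, ?)", (slice_id, f"chart {slice_id}")
        )
        conn.execute("INSERT INTO dashboard_slices VALUES (1, ?)", (slice_id,))
    return conn


# Shape of the query before the change: dashboard columns only,
# but outer-joined to the slices (flattened here for simplicity).
JOINED = """
    SELECT dashboards.id, dashboards.position_json
    FROM dashboards
    LEFT OUTER JOIN dashboard_slices AS ds ON dashboards.id = ds.dashboard_id
    LEFT OUTER JOIN slices ON slices.id = ds.slice_id
    WHERE dashboards.id = ?
"""

# Shape of the query after the change: no join to slices.
PLAIN = "SELECT id, position_json FROM dashboards WHERE id = ?"


def count_rows(conn: sqlite3.Connection, sql: str) -> int:
    """Return how many rows the database sends back for dashboard 1."""
    return len(conn.execute(sql, (1,)).fetchall())


conn = build_demo_db(n_slices=5)
joined_rows = count_rows(conn, JOINED)
plain_rows = count_rows(conn, PLAIN)
print(joined_rows, plain_rows)
```

With five slices, the joined shape sends back five copies of the dashboard columns and the plain shape sends back one. The real queries also join owners and roles, which this sketch leaves out.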
Hello! Thanks for the reply!
* fix(sqllab): reinstate "Force trino client async execution" (apache#25680) * fix: remove unnecessary redirect (apache#25679) (cherry picked from commit da42bf2) * fix(chore): dashboard requests to database equal the number of slices it has (apache#24709) (cherry picked from commit 75a7431) * fix: bump to FAB 4.3.9 remove CSP exception (apache#25712) (cherry picked from commit 8fb0c8d) * fix(horizontal filter label): show full tooltip with ellipsis (apache#25732) (cherry picked from commit e4173d9) * fix: Revert "fix(Charts): Set max row limit + removed the option to use an empty row limit value" (apache#25753) (cherry picked from commit e2fe967) * fix: dataset update uniqueness (apache#25756) (cherry picked from commit c7f8d11) * fix(sqllab): slow pop datasource query (apache#25741) (cherry picked from commit 2a2bc82) * fix: allow for backward compatible errors (apache#25640) * fix: DB-specific quoting in Jinja macro (apache#25779) (cherry picked from commit 5659c87) * fix: Revert "fix: Apply normalization to all dttm columns (apache#25147)" (apache#25801) * fix: Resolve issue apache#24195 (apache#25804) (cherry picked from commit 8737a8a) * fix(SQL field in edit dataset modal): display full sql query (apache#25768) (cherry picked from commit 1eba712) * fix(sqllab): infinite fetching status after results are landed (apache#25814) (cherry picked from commit 3f28eeb) * fix: Fires onChange when clearing all values of single select (apache#25853) (cherry picked from commit 8061d5c) * fix: the temporal x-axis results in a none time_range. 
(apache#25429) Co-authored-by: Elizabeth Thompson <eschutho@gmail.com> (cherry picked from commit ae619b1) * fix(table chart): Show Cell Bars correctly apache#25625 (apache#25707) (cherry picked from commit 916f7bc) * fix: remove `update_charts_owners` (apache#25843) * fix(charts): Time grain is None when dataset uses Jinja (apache#25842) (cherry picked from commit 7536dd1) * fix: Saving Mixed Chart with dashboard filter applied breaks adhoc_filter_b (apache#25877) (cherry picked from commit 268c1dc) * fix: database version field (apache#25898) (cherry picked from commit 06ffcd2) * fix: trino cursor (apache#25897) (cherry picked from commit cdb18e0) * chore: Updates CHANGELOG.md for 3.0.2 * fix(trino): allow impersonate_user flag to be imported (apache#25872) Co-authored-by: John Bodley <4567245+john-bodley@users.noreply.github.com> (cherry picked from commit 458be8c) * fix(table): Double percenting ad-hoc percentage metrics (apache#25857) (cherry picked from commit 784a478) * fix(sqllab): invalid sanitization on comparison symbol (apache#25903) (cherry picked from commit 581d3c7) * fix: update flask-caching to avoid breaking redis cache, solves apache#25339 (apache#25947) Co-authored-by: Ville Brofeldt <33317356+villebro@users.noreply.github.com> * fix: always denorm column value before querying values (apache#25919) * chore(colors): Updating Airbnb brand colors (apache#23619) (cherry picked from commit 6d8424c) * fix: naming denomalized to denormalized in helpers.py (apache#25973) (cherry picked from commit 5def416) * fix(helm): Restart all related deployments when bootstrap script changed (apache#25703) * fix(rls): Update text from tables to datasets in RLS modal (apache#25997) (cherry picked from commit 210f1f8) * fix: Make Select component fire onChange listener when a selection is pasted in (apache#25993) (cherry picked from commit 5fccf67) * fix(explore): redandant force param (apache#25985) (cherry picked from commit e7a1876) * chore: Optimize fetching 
samples logic (apache#25995) (cherry picked from commit 326ac4a) * fix(native filters): rendering performance improvement by reduce overrendering (apache#25901) (cherry picked from commit e1d73d5) * fix: update FAB to 4.3.10, Azure user info fix (apache#26037) (cherry picked from commit 628cd34) * chore: Updates CHANGELOG.md for 3.0.2 (rc2) --------- Co-authored-by: Rob Moore <giftig@users.noreply.github.com> Co-authored-by: Igor Khrol <igor.khrol@automattic.com> Co-authored-by: Stepan <66589759+Always-prog@users.noreply.github.com> Co-authored-by: Daniel Vaz Gaspar <danielvazgaspar@gmail.com> Co-authored-by: Ross Mabbett <92495987+rtexelm@users.noreply.github.com> Co-authored-by: Geido <60598000+geido@users.noreply.github.com> Co-authored-by: Beto Dealmeida <roberto@dealmeida.net> Co-authored-by: JUST.in DO IT <justin.park@airbnb.com> Co-authored-by: Elizabeth Thompson <eschutho@gmail.com> Co-authored-by: John Bodley <4567245+john-bodley@users.noreply.github.com> Co-authored-by: Michael S. Molina <70410625+michael-s-molina@users.noreply.github.com> Co-authored-by: mapledan <mapledan829@gmail.com> Co-authored-by: Arko <90512504+SA-Ark@users.noreply.github.com> Co-authored-by: Antonio Rivero <38889534+Antonio-RiveroMartnez@users.noreply.github.com> Co-authored-by: Kamil Gabryjelski <kamil.gabryjelski@gmail.com> Co-authored-by: Michael S. Molina <michael.s.molina@gmail.com> Co-authored-by: FGrobelny <150029280+FGrobelny@users.noreply.github.com> Co-authored-by: Giacomo Barone <46573388+ggbaro@users.noreply.github.com> Co-authored-by: Ville Brofeldt <33317356+villebro@users.noreply.github.com> Co-authored-by: Hugh A. Miles II <hughmil3s@gmail.com> Co-authored-by: josedev-union <70741025+josedev-union@users.noreply.github.com> Co-authored-by: yousoph <sophieyou12@gmail.com> Co-authored-by: Jack Fragassi <jfragassi98@gmail.com>
SUMMARY
Hello!
Superset 2.1.0 has an issue where the requests made to the database when fetching a dashboard are repeated once per slice on that dashboard. This is a significant problem: for example, if the dashboard is 1 MB in size and contains 100 slices, each user opening it generates 100 MB of requests. Once logs, endpoint checks, and so on are taken into account, this can reach 500-1000 MB per user!
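The estimate above is a simple multiplication, sketched below as a back-of-the-envelope calculation. The function name `estimated_transfer_mb` is made up for illustration, and the model (the dashboard payload repeated exactly once per slice) is an assumption taken from the description, not a measurement.

```python
def estimated_transfer_mb(dashboard_mb: float, n_slices: int) -> float:
    """Data transferred per dashboard load if the payload repeats once per slice."""
    return dashboard_mb * n_slices


# The example from the description: a 1 MB dashboard with 100 slices.
per_load_mb = estimated_transfer_mb(1.0, 100)
print(per_load_mb)  # 100.0
```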
I have fixed this issue by retrieving the dashboard by id, uuid, or slug, and checking access with the `raise_for_dashboard_access` function, which verifies access to the data sources.
TESTING INSTRUCTIONS
ADDITIONAL INFORMATION