Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: set default mysql isolation level to 'READ COMMITTED' #30174

Merged
merged 7 commits into from
Sep 10, 2024

Conversation

mistercrunch
Copy link
Member

@mistercrunch mistercrunch commented Sep 6, 2024

#28628 had the virtuous intent to align mysql and postgres to a consistent isolation_level (READ COMMITTED), which seems fitted for Superset. Though instead of doing this, and because sqlite doesn't support that specific one, it set the default to SERIALIZABLE which seems to exist by many engines.

Now SERIALIZABLE can create issues like this one: #30064 and upping the isolation level can have serious load implications for large deployments (or databases that are fitted "just-right" for their loads, like someone may use a tiny RDS for Superset)

Here I'm realizing we need dynamic defaults for isolation_level based on which engine is in used, which can't be done in config.py as we don't know yet what engine will be set by the admin at that time. So I thought the superset/initialization package might be the right place for this.

Open to other solutions, but I think this one works.

alternative considered

  • using DB_CONNECTION_MUTATOR, but that one applies only to analytics workloads, not the metadata db

#28628 had the virtuous intent to align mysql and postgres to a consistent isolation_level (READ COMMITTED), which seems fitted for Superset. Though instead of doing this, and because sqlite doesn't support that specific one, it set the default to SERIALIZABLE which seems to exist by many engines.

Here I'm realizing we need dynamic defaults for isolation_level based on which engine is in used, which can't be done in config.py as we don't know yet what engine will be set by the admin at that time. So I thought the superset/initialization package might be the right place for this.

Open to other solutions, but I think this one works.

* using DB_CONNECTION_MUTATOR, but that one applies only to analytics workloads, not the metadata db
@request-info request-info bot added the need:more-info Requires more information from author label Sep 6, 2024
@mistercrunch mistercrunch removed the need:more-info Requires more information from author label Sep 6, 2024
@apache apache deleted a comment from request-info bot Sep 6, 2024
@pull-request-size pull-request-size bot added size/M and removed size/S labels Sep 6, 2024
Comment on lines 498 to 500
if not isolation_level and self.config["SQLALCHEMY_DATABASE_URI"].startswith(
"mysql"
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda liked John's cleaner way of extracting the backend name:

backend = make_url(SQLALCHEMY_DATABASE_URI).get_backend_name()

Also, weren't we supposed to do READ COMMITTED by default for Postgres, too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I addressed comments, lover how pre-commit caught something around make_url not being safe to use so I found I can use make_url_safe

Co-authored-by: Ville Brofeldt <33317356+villebro@users.noreply.github.com>
@mistercrunch mistercrunch marked this pull request as ready for review September 6, 2024 19:45
@dosubot dosubot bot added data:connect:mysql Related to MySQL risk:breaking-change Issues or PRs that will introduce breaking changes labels Sep 6, 2024
@sadpandajoe sadpandajoe linked an issue Sep 6, 2024 that may be closed by this pull request
3 tasks
@sadpandajoe
Copy link
Member

Wondering since the original pr is a breaking change if we should pull it out of 4.1 and wait until 5.0 to incorporate it.

@mistercrunch
Copy link
Member Author

@villebro do you think that's mergeable?

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@@ -1 +1,2 @@
docker/**/*.sh text eol=lf
*.svg binary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated bycatch, I assume..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, Ill search for say a model name and the erd.svg muddies my git grep

@sadpandajoe sadpandajoe added the v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch label Sep 9, 2024
@mistercrunch mistercrunch merged commit 6baeb65 into master Sep 10, 2024
35 checks passed
sadpandajoe pushed a commit that referenced this pull request Sep 10, 2024
Co-authored-by: Ville Brofeldt <33317356+villebro@users.noreply.github.com>
(cherry picked from commit 6baeb65)
@mistercrunch mistercrunch deleted the fix_mysql_isolation branch September 11, 2024 22:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
data:connect:mysql Related to MySQL preset-io risk:breaking-change Issues or PRs that will introduce breaking changes size/M v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql_lab.get_sql_results encountering psycopg2.errors.SerializationFailure
5 participants