Make sqlalchemy use connection pooling for catalog db #710
Conversation
For the test failures, it looks like you are missing an import:

```
    engine = create_async_engine(
>       uri, echo=echo, json_serializer=json_serializer, poolclass=AsyncAdaptedQueuePool
    )
E   NameError: name 'AsyncAdaptedQueuePool' is not defined
```

For the linter issues:

```shell
pip install pre-commit
pre-commit install  # set up git hooks
pre-commit run --all-files
git commit -am "Fix linter issues."
```
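For reference, the missing name lives in `sqlalchemy.pool`. A minimal sketch of the fixed call, with placeholder arguments rather than tiled's actual `from_uri` code:

```python
from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy.pool import AsyncAdaptedQueuePool  # the missing import


def make_engine(uri: str, echo: bool = False):
    # Explicitly request the async-adapted queue pool instead of
    # letting SQLAlchemy pick the default pool class for the dialect.
    return create_async_engine(uri, echo=echo, poolclass=AsyncAdaptedQueuePool)
```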
Sorry, silly blunder.
I see some pytest failures. They look related, but it is not immediately obvious to me what broke.
Correction: I had not realized that asv was running against the version of tiled installed in my environment, rather than against the local repository I was working in. Now that I am performing proper benchmarks, where the change between pooling / not pooling is actually captured, the speedup from pooling looks significant:
edit: this improvement scales, here is a run for
The first run was too slow for the default timeout, and I had to increase it:
The test failures arose from trying to use a pool with an in-memory (`:memory:`) database:

```diff
diff --git a/tiled/catalog/adapter.py b/tiled/catalog/adapter.py
index e1c2f1af..b297155d 100644
--- a/tiled/catalog/adapter.py
+++ b/tiled/catalog/adapter.py
@@ -1293,7 +1293,7 @@ def from_uri(
         uri = f"sqlite+aiosqlite:///{uri}"
     parsed_url = make_url(uri)
-    if (parsed_url.get_dialect().name == "sqlite") and (
+    if False and (parsed_url.get_dialect().name == "sqlite") and (
         parsed_url.database != ":memory:"
     ):
         # For file-backed SQLite databases, connection pooling offers a
```
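The condition being toggled in that diff can be checked in isolation. A small sketch (the URIs are illustrative, not tiled's defaults):

```python
from sqlalchemy.engine import make_url

# In-memory SQLite: the dialect name drops the "+aiosqlite" driver suffix,
# and the database component is the literal ":memory:".
mem = make_url("sqlite+aiosqlite:///:memory:")
assert mem.get_dialect().name == "sqlite"
assert mem.database == ":memory:"

# File-backed SQLite: same dialect, but the database is a filesystem path,
# so the `parsed_url.database != ":memory:"` branch applies.
file_backed = make_url("sqlite+aiosqlite:////tmp/catalog.db")
assert file_backed.database == "/tmp/catalog.db"
```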
This PR results from the investigation done in #663, as well as a group coding session to implement a new benchmark for the catalog db (especially with input from @genematx, @Kezzsim, @danielballan).
I spent some extra time experimenting with asv to understand better what it does before putting up this PR. The new benchmark can be singled out by running:
```shell
asv run --environment existing:same --bench "time_repeated_write"
```
While some initial benchmarks seemed to show a modest performance improvement from connection pooling, I found (after running additional benchmarks) that this improvement was really just variance in the benchmark results. Instead, the benchmarks show that performance is ~identical with or without connection pooling. This at least means there is no apparent regression, which was our primary concern.
asv has several parameters that can adjust how benchmarks are taken, with the ability to customize things like warmup time, rounds per benchmark, samples per round, benchmark timeouts, etc. I played with these settings to see if they had any impact on our results, and it seemed the answer is "no".
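Concretely, those knobs are set as class attributes on the benchmark. A sketch using asv's documented benchmark attributes — the benchmark body here is a stand-in, not tiled's actual `time_repeated_write`:

```python
class TimeRepeatedWrite:
    # asv benchmark tuning attributes (all optional):
    warmup_time = 1.0  # seconds of untimed warmup before sampling
    rounds = 4         # interleaved rounds per benchmark
    repeat = 10        # samples collected per round
    number = 1         # function calls per sample
    timeout = 300      # per-benchmark timeout, in seconds

    def setup(self):
        # Runs before each round; excluded from the timing.
        self.data = list(range(1000))

    def time_repeated_write(self):
        # Placeholder workload; asv times methods prefixed with "time_".
        sum(self.data)
```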
It is worth pointing out that asv defaults to `timeit.default_timer` (wall-clock time via `time.perf_counter`), and so it should capture time spent outside the main thread. I was worried about this, since aiosqlite drops into a thread (per db connection).
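A quick stdlib check of that behavior: `timeit.default_timer` is `time.perf_counter`, a wall-clock timer, so blocking on work done in another thread (standing in here for aiosqlite's per-connection thread) does show up in the measurement:

```python
import threading
import time
import timeit

# On Python 3, timeit.default_timer is time.perf_counter (wall clock).
assert timeit.default_timer is time.perf_counter

start = timeit.default_timer()
# Worker thread standing in for aiosqlite's per-connection thread.
worker = threading.Thread(target=time.sleep, args=(0.2,))
worker.start()
worker.join()  # block until the "background" work finishes
elapsed = timeit.default_timer() - start

# The wait on the worker thread is included in the measured time.
assert elapsed >= 0.2
```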