You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When we do filters with related fields, the database is making queries considering the whole related table, instead of only the portion of data from the current tenant. This makes queries very slow when there're many tenants.
One solution is implicitly changing the queryset to filter by the related table tenant too.
SELECT"my_app_mymodel"."field1", "my_app_mymodel"."field2"FROM"my_app_mymodel"WHERE"my_app_mymodel"."tenant_id"= tenant_id
"my_app_mymodel"."my_related_field_id" NOT IN (
SELECT"my_app_myrelatedmodel"."field1", "my_app_myrelatedmodel"."field2"FROM"my_app_myrelatedmodel"WHERE"my_app_myrelatedmodel"."field1"= value1
) AND"my_app_mymodel"."my_related_field_id" NOT IN (
SELECT"my_app_myrelatedmodel"."field1", "my_app_myrelatedmodel"."field2"FROM"my_app_myrelatedmodel"WHERE"my_app_myrelatedmodel"."field2"= value2
);
If we use the same strategy as in filter the result would be something like
SELECT"my_app_mymodel"."field1", "my_app_mymodel"."field2"FROM"my_app_mymodel"WHERE"my_app_mymodel"."tenant_id"= tenant_id AND"my_app_mymodel"."my_related_field_id" NOT IN (
SELECT"my_app_myrelatedmodel"."field1", "my_app_myrelatedmodel"."field2"FROM"my_app_myrelatedmodel"WHERE"my_app_myrelatedmodel"."tenant_id"= tenant_id
) AND"my_app_mymodel"."my_related_field_id" NOT IN (
SELECT"my_app_myrelatedmodel"."field1", "my_app_myrelatedmodel"."field2"FROM"my_app_myrelatedmodel"WHERE"my_app_myrelatedmodel"."field1"= value1
) AND"my_app_mymodel"."my_related_field_id" NOT IN (
SELECT"my_app_myrelatedmodel"."field1", "my_app_myrelatedmodel"."field2"FROM"my_app_myrelatedmodel"WHERE"my_app_myrelatedmodel"."field2"= value2
);
For it to be optimized, the generated code should be something like this:
SELECT"my_app_mymodel"."field1", "my_app_mymodel"."field2"FROM"my_app_mymodel"WHERE"my_app_mymodel"."tenant_id"= tenant_id AND"my_app_mymodel"."my_related_field_id" NOT IN (
SELECT"my_app_myrelatedmodel"."field1", "my_app_myrelatedmodel"."field2"FROM"my_app_myrelatedmodel"WHERE"my_app_myrelatedmodel"."tenant_id"= tenant_id AND"my_app_myrelatedmodel"."field1"= value1
) AND"my_app_mymodel"."my_related_field_id" NOT IN (
SELECT"my_app_myrelatedmodel"."field1", "my_app_myrelatedmodel"."field2"FROM"my_app_myrelatedmodel"WHERE"my_app_myrelatedmodel"."tenant_id"= tenant_id AND"my_app_myrelatedmodel"."field2"= value2
);
The only way I could generate something that looks like the expected result was:
I might have some ideas how to get these queries to come out more optimised, but I'm not 100% sure if I've followed your reasoning correctly.
This definitely seems like something where a concrete example case would be very helpful. If I had a concrete example I could import in a python REPL and work with I could spend some time in the REPL trying the various Query tools in Django (Aggregation functions, Q objects, Prefetch objects, FilteredRelation objects, etc...) to try and find a way to optimise the query. However if I just made up some model classes to try things I won't be sure if I've created an appropriate example and really found a way to fix these query optimisation issues, or if my example is too simplistic and my possible improvements don't fully solve the problems you're talking about here.
@hugobessa Could you add one or two examples that produce this kind of non-optimal queries from the ORM into the included exampleproject Django project, and/or into the test cases for future use in regression testing the optimisations?
When we do filters with related fields, the database is making queries considering the whole related table, instead of only the portion of data from the current tenant. This makes queries very slow when there're many tenants.
One solution is implicitly changing the queryset to filter by the related table tenant too.
Eg.:
sql is something like:
If we add the related field tenant's filter like this we solve the problem
the sql with the fix should something like:
For excludes the situation is even worse because it always uses
NOT IN
operator and subqueries.Eg.:
sql is something like:
If we use the same strategy as in filter the result would be something like
For it to be optimized, the generated code should be something like this:
The only way I could generate something that looks like the expected result was:
The resulting sql looks like this:
The text was updated successfully, but these errors were encountered: