Problem with chained filters and Django ORM #745
Comments
Still not yet convinced this is a DF issue, rather than an ORM one. What do the Django people say about this? This behaviour is as old as the ORM itself, so it's come up before.
There's an easy way to check that. 🙂 Open a PR and see what breaks.
Also, what do the DB people say? If I make a double join, is that not a no-op? (Maybe it isn't. But if not, why not? And why is this a DF issue?)
If I'm not mistaken, this specifically refers to filtering across to-many relationships, which returns different results for
Nope - the second join is aliased as
There really isn't a trivial way to implement the desired behavior at the moment.
Couple of side notes:
I think that I've finally found the solution for this issue. The model QuerySet has a method called _next_is_sticky (https://kite.com/docs/python/django.db.models.query.QuerySet._next_is_sticky). Here is my pull request: #753
However, it does not pass the django-rest tests. Can someone who understands how it works help me?
As a quick win, it would be really great to adapt this behavior until a more consistent feature is available (as a config parameter to determine if it is desired to work with Current behavior makes it impossible to use this magnificent plugin in projects where it is necessary to filter across relations such as these, a real shame : (
I agree that the
Having a solution to this to-many chaining issue would greatly simplify my REST API that uses django-filter to filter results across many-to-many relationships. Please let me know how I can lend a hand.
I'm essentially -1 on making a change here. First off, I'm not going to add anything on the basis of it being "a quick win"; that way lies trouble later. Second, and more to the point, it seems to me you should be handling this with a custom filter and a custom field using a multi-widget, i.e. a single field expecting multiple values which is then used in a single I'd be interested in seeing what people come up with that way. No doubt there's something we can add which doesn't involve setting state flags on the core filtering behaviour.
For what it's worth, I've been making a lot of headway on the nested filtering issue. Basically, queries will go from

    Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)

to something like

    Blog.objects.filter(
        entry=Entry.objects.filter(
            headline__contains='Lennon',
            pub_date__year=2008,
        ),
    )

In addition to enabling the correct behavior for to-many relationships, this will allow us to enable request-based filtering on related queries, e.g. those entries that are currently published or are in the process of being authored by the user.

    def visible_entries(request):
        return Entry.objects.filter(Q(is_published=True) | Q(author=request.user))

    Blog.objects.filter(
        entry=visible_entries(request).filter(
            headline__contains='Lennon',
            pub_date__year=2008,
        ),
    )

That said, I'm beating my head against the wall in regards to some potential performance issues. Unsurprisingly, instantiating n
@rpkilby Push your branch. Let's see what you've got. 🙂 I don't want anything too complex. The point of DF is to avoid common boilerplate. There's nothing wrong in writing the filtering logic by hand in non-core cases. I much prefer that to magic no-one can debug.
Closing this as per #753
@carltongibson, I'm fairly close to pushing out a PR, but it's for the drf-filters project.
OK. Cool. Drop a note here when it’s ready. I’ll have a look.
Okay, a first pass is up (see: philipn/django-rest-framework-filters#197).
I'm also encountering this issue. What was wrong with the dictionary approach? It seems much better than mucking around with next_is_sticky or nested querysets. Edit: Ah, I think I understand the issue rpkilby is describing in the "couple of side notes" post. A lot of Django ORM features, like "distinct", are locked off by sticking to a kwargs dictionary or even a series of Q objects.
I'm also blocked by this issue. Including three or more queries of fields on a many-to-many relationship (e.g. Applying @eduardocalil's fix to the current version of
Hey, I also encountered some issues related to filtering on relationships. When I use 2 filters on a related field, I always expect the filters to be applied to the same table. I wasn't able to find a real-world example where Django's approach would be justified, and it seems this is problematic for many users. As @jacebrowning mentioned, sometimes it generates so many JOINs that it's not even possible to run the query (e.g. for a multi-word search term). This is still an issue in the Django admin's search when it uses relationships. There was even an attempt to change it quite a long time ago, but unfortunately it was reverted. Would it be possible to add this If this sounds reasonable I could prepare a PR with the adjustment + docs update and some tests.
Ok, so, the approach that would be worth looking at here is a multi-filter that wraps up the whole multi-widget and single filter call business. That’s the way to solve it, but folks don’t know how to configure that themselves. It would be a nice addition.
@carltongibson It's currently pretty difficult to set up because you need to subclass A If I create a filter with I think you basically need to implement all of the following:
I've written some helpers to automatically put the nested field's widgets onto a
This simplifies usage to:
It's not perfect, but it's a lot more usable. Note that I actually subclass I also considered writing a method that could transform a filterset into a single grouped filter. This approach is cool, but we'd need to split the filter kwarg generation logic out into a separate function on all the existing filters for it to really be worth it.
For anybody who's still looking for a solution to this issue, here's another simple alternative from my gist:

    from django.db.models import Q
    from django.db.models.constants import LOOKUP_SEP
    from django_filters import rest_framework as django_filters
    from django_filters.constants import EMPTY_VALUES


    class GroupFilterSet(django_filters.FilterSet):
        def filter_queryset(self, queryset):
            """
            Group the filters by the first join table to
            reduce inner join queries for performance.
            This would avoid filter chaining like:
            `Model.objects.filter(table_foo__id__in=[xx,xx]).filter(table_foo__name__in=[xx,xx])`
            Instead, it would be grouped as:
            `Model.objects.filter(table_foo__id__in=[xx,xx], table_foo__name__in=[xx,xx])`
            Inspired by discussion at:
            https://github.com/carltongibson/django-filter/issues/745
            https://github.com/carltongibson/django-filter/pull/1167
            """
            groups = {}
            is_distincted = False
            for name, value in self.form.cleaned_data.items():
                if value in EMPTY_VALUES:
                    continue
                f = self.filters[name]
                # Do not merge Qs for customized filter method due to complexity.
                if f._method or not f.__class__.filter == django_filters.Filter.filter:
                    queryset = self.filters[name].filter(queryset, value)
                    continue
                # Use the joined table name as key
                group_name = f.field_name.split(LOOKUP_SEP)[0]
                q = Q(**{LOOKUP_SEP.join([f.field_name, f.lookup_expr]): value})
                if f.exclude:
                    q = ~q
                # Check if there's any join query with the same table
                if group_name in groups:
                    groups[group_name] = groups[group_name] & q
                else:
                    groups[group_name] = q
                if f.distinct:
                    is_distincted = True
            for q in groups.values():
                queryset = queryset.filter(q)
            if is_distincted:
                queryset = queryset.distinct()
            return queryset
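The grouping step in the snippet above can be looked at in isolation. The following is only a rough sketch in plain Python, with no Django required: `group_lookups` is a hypothetical helper name, and the separator is hard-coded to the same "__" value that Django's LOOKUP_SEP holds. It shows how lookups sharing the same first path segment end up in one group, so that each group could be passed to a single filter() call.

```python
LOOKUP_SEP = "__"  # same value as django.db.models.constants.LOOKUP_SEP


def group_lookups(lookups):
    """Group lookup paths by their first segment (hypothetical helper).

    Lookups that start with the same relation name land in the same
    dict, so each group could be applied with one filter() call.
    """
    groups = {}
    for path, value in lookups.items():
        prefix = path.split(LOOKUP_SEP)[0]
        groups.setdefault(prefix, {})[path] = value
    return groups


grouped = group_lookups({
    "table_foo__id__in": [1, 2],
    "table_foo__name__in": ["a", "b"],
    "title__icontains": "x",
})
# Both table_foo lookups share one group; title gets its own.
print(grouped)
```

Grouping by the first segment is a simplification: lookups that traverse the same first relation but then diverge further down would still be combined into one group.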
I expected this as default django-filter behaviour. Thank you.
As described in issue #537, the Django ORM generates a duplicated INNER JOIN for each .filter() call. For example:
Model.objects.filter(table1__attr1=value1).filter(table1__attr2=value2)
The built query is:
But the correct query should be:
The first query may return unwanted or unexpected results.
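To make the difference in results concrete, here is a minimal pure-Python model of the two semantics for a to-many relation. It does not use the Django ORM; the `blogs` data and the helper names `chained` and `combined` are invented for illustration. With two separate joins, each condition may be satisfied by a different related row; with a single join, one related row has to satisfy both.

```python
# Toy data: each blog has many entries (a to-many relation).
blogs = {
    "Beatles Blog": [
        {"headline": "Lennon speaks", "year": 2007},
        {"headline": "Tour dates", "year": 2008},
    ],
    "Pop Music Blog": [
        {"headline": "Lennon remembered", "year": 2008},
    ],
}


def has_lennon(entry):
    return "Lennon" in entry["headline"]


def from_2008(entry):
    return entry["year"] == 2008


def chained(data):
    # Models .filter(a).filter(b): each condition gets its own join,
    # so different entries may satisfy each one.
    return sorted(
        name for name, entries in data.items()
        if any(has_lennon(e) for e in entries)
        and any(from_2008(e) for e in entries)
    )


def combined(data):
    # Models .filter(a, b): one join, so a single entry must satisfy both.
    return sorted(
        name for name, entries in data.items()
        if any(has_lennon(e) and from_2008(e) for e in entries)
    )


print(chained(blogs))   # ['Beatles Blog', 'Pop Music Blog']
print(combined(blogs))  # ['Pop Music Blog']
```

"Beatles Blog" matches only in the chained case, because its "Lennon" entry and its 2008 entry are two different rows.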
Reading the code, I found the method qs(self) in the class BaseFilterSet in the file filterset.py, and I noticed the following workflow:
qs = self.queryset.all()
qs = self.queryset.filter(field_n=value_n)
This is what causes the described problem.
The solution that I've developed for the chained filters problem is:
Instead of making queries for each field, all of them are stored in a dictionary and the queryset is filtered only one time using this dict.
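The idea can be sketched roughly as follows. This is not the code from the pull request: `RecordingQuerySet` is a hypothetical stand-in that only records the filter() calls it receives, and `filter_per_field` and `filter_once` are made-up names contrasting the per-field workflow with the single-dictionary one.

```python
class RecordingQuerySet:
    """Hypothetical stand-in for a queryset; it only records filter() calls."""

    def __init__(self, calls=()):
        self.calls = tuple(calls)

    def filter(self, **lookups):
        # Return a new object, as a real queryset would, with the call appended.
        return RecordingQuerySet(self.calls + (lookups,))


def filter_per_field(qs, cleaned_data):
    # Per-field workflow: one filter() call for every field.
    for name, value in cleaned_data.items():
        qs = qs.filter(**{name: value})
    return qs


def filter_once(qs, cleaned_data):
    # Dictionary workflow: collect every lookup first, then filter once.
    lookups = dict(cleaned_data)
    return qs.filter(**lookups) if lookups else qs


data = {"table1__attr1": "value1", "table1__attr2": "value2"}
per_field = filter_per_field(RecordingQuerySet(), data)
once = filter_once(RecordingQuerySet(), data)
print(len(per_field.calls))  # 2
print(len(once.calls))       # 1
```

A plain kwargs dictionary like this cannot express per-filter options such as exclude or distinct, which is one of the limitations raised elsewhere in this thread.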
What do you think about this solution? Would it break other parts of the code?