Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serializers many to many field not reflecting edits made in PUT/PATCH if prefetch_related used #2442

Closed
chrismcband opened this issue Jan 22, 2015 · 36 comments
Labels
Milestone

Comments

@chrismcband
Copy link

If you use a viewset with a queryset using prefetch_related, any update to those prefetched many to many fields is not reflected in the response to the client, although the changes have been saved to the database.

See below code for very simple example:

from django.conf.urls import patterns, include, url
from django.contrib import admin
from django.contrib.auth.models import User
from rest_framework import serializers, viewsets, routers


class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ('id', 'username', 'email', 'groups')


class UserViewSet(viewsets.ModelViewSet):
    queryset = User.objects.all().prefetch_related('groups')
    serializer_class = UserSerializer


router = routers.DefaultRouter()
router.register(r'users', UserViewSet)

urlpatterns = patterns('',
    url(r'^', include(router.urls)),
    url(r'^api-auth/', include('rest_framework.urls',
                               namespace='rest_framework')),
    url(r'^admin/', include(admin.site.urls)),
)

With the above code, assume there is an existing user with 2 groups assigned:

{
    "id": 1, 
    "username": "chris", 
    "email": "chris@example.com", 
    "groups": [
        1,
        2
    ]
}

Submitting a PUT to /users/1/ changing groups to [1], the response will still show groups as [1, 2], but if you then did a GET request for /users/1/ you would see the reflected changes.

You can see this behaviour using the browseable api also.

Removing prefetch_related from the queryset restores normal behaviour and edits are again reflected in the response.

This example is very trivial, but if people are using prefetch_related to try and optimise their database queries they may encounter this issue.

@tomchristie
Copy link
Member

Interesting - nice work tracking that down.
Sounds like it might be pretty gnarly to deal with.

@carltongibson
Copy link
Collaborator

There's a closed Django ticket that looks relevant.

This is not a bug, I'm afraid. The results of DB queries that have already been run are never updated automatically by the Django ORM, and that is by design. ...

Changing this would really require an identity mapper, and a very fundamental change to the way the Django ORM works.

@tomchristie
Copy link
Member

Nice find @carltongibson - I guess we may still want to consider if there's anything more graceful we can do here. This may just fall into documentation, as we're unlikely to want to do try to automagically determine when we hit this case, but I'm open to other proposals that can be demonstrated as feasible and not overly complex.

@carltongibson
Copy link
Collaborator

At first guess I'd suspect it'd hard to do better than "implement get_queryset and check request.method before using prefetch_related"

@tomchristie
Copy link
Member

Good call, yup. Of course this is only an issue on writable nested serializers, so something of an edge case.

@chrismcband
Copy link
Author

Documenting this would solve a lot of potential head scratching for others. The users can avoid this problem by just re-fetching the instance from the database. In the update method in the Serializer as long as they have something like below they will be fine.

def update(self, instance, validated_attrs):
    ...
    instance.save()
    instance = self.Meta.model.objects.get(id=instance.id)
    return instance

@tomchristie
Copy link
Member

Indeed - tho unclear if the extra query there would defeat the point of the prefetch related or not.

@chrismcband
Copy link
Author

True, for my case I was only using prefetch_related to speed up GET on large collections, with some nested serialization, an update on a single resource can do without it, at least in my case.

@pySilver
Copy link

my solution was to to override BaseSerializer (We are still on 2.x.x branch):

def save(self, **kwargs):
        """
        Due to DRF bug with M2M fields we refresh object state from database
        directly if object is models.Model type and it contains m2m fields

        See: https://github.com/tomchristie/django-rest-framework/issues/1556
        """
        self.object = super(BaseSerializer, self).save(**kwargs)

        if isinstance(self, serializers.ModelSerializer):
            model = self.Meta.model
            if model._meta.model._meta.local_many_to_many and self.object.pk:
                self.object = model.objects.get(pk=self.object.pk)
        return self.object

So it only hits database if model had m2m keys. Ideally it should be more complex check, but I was fine with this one due to aggresive caching.

@tomchristie
Copy link
Member

See also #2704, #2727 (and the closed #3028.)
Haven't assessed if both of those should be closed as duplicates of this or not.
Thoughts/review anyone?

@chrismcband
Copy link
Author

It looks like they're all duplicates, except #2704 which seems a slightly different issue. Your suggestion in #3028 of overriding get_queryset to do prefetch only on GET requests, would have worked for my case.

@tomchristie
Copy link
Member

On reflection I think that #2704 and #2727 are duplicates of each other (queryset reevaluation on list field / many=True fields) , but are not the same as this issue. (Output representation does not properly support updates if prefetch_replace has been used.)

@aidanlister
Copy link

This bug just bit us soooo hard! We've had two engineers on it for a week thinking it was a bug in the application not the endpoint.

Example solution:

class InspectionItemViewSet(viewsets.ModelViewSet):
    queryset = InspectionItem.objects.all()

    def get_queryset(self):
        qs = super(InspectionItemViewSet, self).get_queryset()
        if self.action == 'retrieve' or self.action == 'list':
            qs = qs.prefetch_related('tags', 'failures')
            if self.get_depth() >= 1:
                qs = qs.prefetch_related('contractor', 'type')
        return qs

@tomchristie
Copy link
Member

@aidanlister :-/ Prioritizing this as a result. Right now I suppose that the sensible thing to do is just to try to highlight select_related and prefetch_related in the documentation with a suitable warning.

Of course we'll still end up having folks getting hit with this (can't expect that everyone will always have read every last bit of the documentation always, all the time), but at least if we have it documented then it's more likely to be resolved more quickly.

@aidanlister
Copy link

Yeah, given it's not a DRF issue I think a warning in the documentation is probably the best you can do. Keep up the good work Tom!

@aaugustin
Copy link
Contributor

@tomchristie This is the issue we discussed on Tuesday.

@carltongibson
Copy link
Collaborator

@aaugustin @tomchristie — was there any breakthrough?

@aaugustin
Copy link
Contributor

My take on this is -- if a request handler makes changes to an object, it should refetch if from the database before serializing it in the response. Otherwise you're always going to fail on some edge cases like changes performed by database triggers.

@carltongibson
Copy link
Collaborator

OK. That ties in with the conclusion we reached (I think)

@carltongibson
Copy link
Collaborator

Thanks!

@stevelacey
Copy link

@pySilver's solution worked for me in DRF 3.1, just had to switch to setting self.instance rather than self.object (per 2.x.x)

class MyModelSerializer(ModelSerializer):
    def save(self, **kwargs):
        """
        Due to DRF bug with M2M fields we refresh object state from database
        directly if model contains m2m fields

        Avoids ModelSerializers returning old results for m2m relations
        when the queryset had calls to prefetch_related

        See: https://github.com/tomchristie/django-rest-framework/issues/1556
             https://github.com/tomchristie/django-rest-framework/issues/2442
        """
        self.instance = super(MyModelSerializer, self).save(**kwargs)

        model = self.Meta.model
        if model._meta.model._meta.local_many_to_many and self.instance.pk:
            self.instance = model.objects.get(pk=self.instance.pk)

        return self.instance

@manjitkumar
Copy link
Contributor

I am facing similar issue with djangorestframework==3.3.1
http://stackoverflow.com/questions/36397357/how-to-update-values-in-instance-when-have-a-custom-update-to-update-many-to

I have three models Player, Team, and Membership, Where Player and Team have many-to-many relationship using Membership as intermediary model.

class Player(models.Model):
    name = models.CharField(max_length=254)
    rating = models.FloatField(null=True)
    install_ts = models.DateTimeField(auto_now_add=True, blank=True)
    update_ts = models.DateTimeField(auto_now_add=True, blank=True)


class Team(models.Model):
    name = models.CharField(max_length=254)
    rating = models.FloatField(null=True)
    players = models.ManyToManyField(
            Player,
            through='Membership',
            through_fields=('team', 'player'))
    is_active = models.BooleanField(default=True)
    install_ts = models.DateTimeField(auto_now_add=True, blank=True)
    update_ts = models.DateTimeField(auto_now_add=True, blank=True)


class Membership(models.Model):
    team = models.ForeignKey('Team')
    player = models.ForeignKey('Player')
    #date_of_joining = models.DateTimeField()
    install_ts = models.DateTimeField(auto_now_add=True, blank=True)
    update_ts = models.DateTimeField(auto_now_add=True, blank=True)

Now I was required to update this membership using django rest framework. I tried update those using Writable nested serializers by writing a custom .update() of team serializer.

@transaction.atomic
def update(self, instance, validated_data):
    '''
    Cutomize the update function for the serializer to update the
    related_field values.
    '''

    if 'memberships' in validated_data:
        instance = self._update_membership(instance, validated_data)

        # remove memberships key from validated_data to use update method of
        # base serializer class to update model fields
        validated_data.pop('memberships', None)

    return super(TeamSerializer, self).update(instance, validated_data)


def _update_membership(self, instance, validated_data):
    '''
    Update membership data for a team.
    '''
    memberships = self.initial_data.get('memberships')
    if isinstance(membership, list) and len(memberships) >= 1:
        # make a set of incoming membership
        incoming_player_ids = set()

        try:
            for member in memberships:
                incoming_player_ids.add(member['id'])
        except:
            raise serializers.ValidationError(
                'id is required field in memberships objects.'
            )

        Membership.objects.filter(
            team_id=instance.id
        ).delete()

        # add merchant member mappings
        Membership.objects.bulk_create(
            [
                Membership(
                    team_id=instance.id,
                    player_id=player
                )
                for player in incoming_player_ids
            ]
        )
        return instance
    else:
        raise serializers.ValidationError(
                'memberships is not a list of objects'
            )

Now this works well for updating values in database for membership table. Only problem I am facing is that I am not able to update prefetched instance in memory, which on PATCH request to this API updates values in database but API response is showing outdated data.

@mosasiru
Copy link

I have encountered the save problem.
What' the status?

@tomchristie
Copy link
Member

Bumping the milestone on this to re-assess. Not at all clear what we can do, but needs looking at.

@tomchristie
Copy link
Member

So clearly we need to refetch in this case, but... can we determine if a prefetch_related has been applied to the queryset?

@tomchristie
Copy link
Member

@xordoquy
Copy link
Collaborator

?

@carltongibson
Copy link
Collaborator

Mists of time... I recall referencing a closed Django issue on this one... It said to the effect, "User needs to be aware, and prepared to refresh themselves if using prefetch_related"

(Would find issue but Current status is: Ripped front off of iPad whilst fixing screen, so I can't really 😬)

@tomchristie
Copy link
Member

@carltongibson - Yup, but we're doing this in the generic views, so the user isn't able to handle it themselves. We either need to see "don't use prefetch_related in XYZ situation" which is complicated, and users will anyways miss. Or we automatically handle the re-fetch on their behalf.

@tomchristie
Copy link
Member

Ripped front off of iPad whilst fixing screen

"fixing" :-/

@carltongibson
Copy link
Collaborator

carltongibson commented Oct 10, 2016

"Replacing" then :-)

Something like

we_need_to_do_a_refresh = hasattr(obj, '_prefetched_objects_cache')

Is that the thought?

@tomchristie
Copy link
Member

Yup, something along those lines.

@gkpo
Copy link

gkpo commented Oct 10, 2016

How can I achieve this manually in the meantime? Is the right place to handle this in the serializer or in the views?

@tomchristie
Copy link
Member

Simplest solution is don't use prefetch related on PUT/PATCH - you can handle that in get_queryset.

Alternatively write the view manually, and have a second call to get_object which is used for populating the response data.

@xordoquy
Copy link
Collaborator

@karesh The discussion group is the best place to take this discussion and other usage questions. Thanks!

@karesh
Copy link

karesh commented Oct 26, 2016

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests