-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
REV-1564: add user-metadata API #25450
Conversation
a1ff34d
to
06ef845
Compare
f705301
to
56b96a0
Compare
95cc38a
to
b3f797d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice improvement to the user metadata
we can test the access together in a few minutes
2d6ae5b
to
e7e6528
Compare
context['forum_roles'] = forum_roles | ||
context['partition_groups'] = user_partitions | ||
|
||
user_metadata = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to break this function into smaller functions to make it easier to follow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes- there is a downside though. Background context: the old code had three different code blocks involved in gathering the user-metadata:
- get_experiment_user_metadata_context (which we'd originally thought would be the whole story)
- get_base_experiment_metadata_context, called by the function above, which then adds authentication/FBE info
- then this other logic living in user_metadata.html which gathers together the pieces we want to dump into the DOM
This separation also resulted in a bit of an inconsistency: most views updated their context from #1, but the dashboard view was only using #3.
When I first ported over this code, I did add it as a separate function that the original one could call, to update the context with the user_metadata item. One clunky side-effect was that we had to add several more arguments: instead of being able to use course and user, we also needed to explicitly pass course_id and course_key. (There were apparently some mixins helping to access those fields on user_metadata.html, and they're available in get_experiment_user_metadata_context, but when you break out into a new function they were no longer defined). But worse than that, it felt differently clunky/confusing to have yet another function helping with the same purpose of just gathering/setting the user-metadata. Especially given the inconsistency and confusion noted above, I think it's better to try and have this logic together in one place.
The existing code is visually long, so I could see one day looking to consolidate/separate parts of it, but the purpose of this ticket is to use the existing code to recreate the user-metadata as an endpoint, so I think that sort of cleanup would be beyond the scope of this ticket, and would add change where this is currently working the way we want.
lms/djangoapps/experiments/utils.py
Outdated
@@ -264,46 +264,118 @@ def get_dashboard_course_info(user, dashboard_enrollments): | |||
|
|||
def get_experiment_user_metadata_context(course, user): | |||
""" | |||
Return a context dictionary with the keys used by the user_metadata.html. | |||
Return a context dictionary with the keys used for Optimizely experiments, exposed via user_metadata.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we be more descriptive with what the expected output should be? For example, listing the keys and a high level description of the expected information for each key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of these fields are super common in lms, but sometimes seeing sample values is the quickest way to understand the context, especially when some terms get repurposed incorrectly in the actual code (cough course key cough).
Since the comments mention that it's exposed via user_metadata.html, I've added a note with the command to view it, so anyone unfamiliar with user-metadata can see it first-hand. That way they can see the exact values. It may be beneficial down the road to have a descriptive glossary the metadata fields (and probably these lms terms in general), but I think that level of documentation-backfill is beyond the scope of the ticket at hand (to expose this existing data to an endpoint for our internal use).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking something like this so it's a little more in line with more modern commenting styles:
Retrieves user + course data used to enrich Optimizely experiments.
Args:
course - course_key
user - username of the currently logged in user
Returns:
{
has_non_audit_enrollments: {},
has_staff_access: boolean,
forum_roles: [],
partition_groups: [],
user_metadata: {}
}
With the return values' keys like this, the next person to look at this doesn't need to either trigger the function to figure out the top level keys or read through the entire function to do so. And if there are things that are confusing, it's might be worth documenting in the description so the next person doesn't need to go through the same troubles you did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I think my summary was confusing: I was actually suggesting that seeing actual example values gives faster/clearer context (to me at least) than saying an element will contain the "course key"; that in my travels with Matthew it sounds like some of these terms themselves have morphed and created confusion, as a developer getting to this part what I really want to know is "is this value the one that looks like the concatenated string? or the one with three parts?"
To be clear, I agree your format is the normal way to go about this (for normal data), but in this particular case I think the reader will actually get more definitive info from specific values than a short description. If there's developer confusion around these user-metadata fields it may be worth backfilling some of the function comments to describe/differentiate the items in this dictionary, but I think that's beyond the scope of this ticket.
try: | ||
user = get_user_by_username_or_email(username) | ||
except User.DoesNotExist: | ||
message = "Provided user is not found" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be more vague in the error messaging so that we don't give away information about which usernames have accounts and which don't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the intended purpose of Optimizely using this new view for the experiment, an alternate use case could be our internal troubleshooting, in which case knowing whether it was the course vs the user that got a 404 is useful. The user who reaches this code is someone logged in with an internal edX user account.
I think the only case where we'd want to suppress this information would be: a bad actor makes a (non-staff) account, finds this endpoint, and tries to look up another user. This user never reaches the code above; has_permissions is false in the IsStaffOrReadOnlyForSelf permission, so they already have a 403 Forbidden response before anything else happens.
call_args = [lookup_user.username, lookup_course.id] | ||
self.client.login(username=lookup_user.username, password=UserFactory._DEFAULT_PASSWORD) | ||
|
||
call_args_with_bogus_course = [lookup_user.username, 'course-v1:edX+DemoX+Demo_BOGUS'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎱 minor nit: I generally use some sort of random UUID generator or in this case, probably something along the lines of lookup_course.id + "foobar"
to decouple this course.id from whatever the UserFactory() sets for its course.id
, to guarantee that the course.id will be different. That way, if for any reason in the future the lookup_course.id
changes to course-v1:edX+DemoX+Demo_BOGUS
, I don't have to worry about the test failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea for how to be SURE: I remember a case at an old company where we had a candidate/client actually named John Tester, which had unintended consequences :) Updated!
self.client.login(username=staff_user.username, password=UserFactory._DEFAULT_PASSWORD) | ||
|
||
response = self.client.get(reverse('api_experiments:user_metadata', args=call_args)) | ||
self.assertEqual(response.status_code, 200) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth asserting that all the top level keys are returned as well, to make sure it doesn't return a 200 + blank page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure- added some assertions for the presence/values of the top items we expect
@@ -84,3 +91,26 @@ class ExperimentKeyValueViewSet(viewsets.ModelViewSet): | |||
permission_classes = (IsStaffOrReadOnly,) | |||
queryset = ExperimentKeyValue.objects.all() | |||
serializer_class = serializers.ExperimentKeyValueSerializer | |||
|
|||
|
|||
class UserMetaDataView(APIView): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed over Slack - might be worth seeing how other endpoints guard access (maybe there's already a permission / method / pre-access-hook / etc.) so we can follow the same pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The closest-looking permission had been IsStaffOrOwner, but that's used by views that take a viewsets.ModelViewSet object, whereas ours takes an APIView (so we don't have an 'action' attribute that the permission expects). The codebase had several different 'setups' for doing API calls needing different things. It's my understanding from @MatthewPiatetsky that using the ModelViewSet is more geared toward standard crud actions, so the API acts like a database query, returning the specified entry from the database. For this API though, we're doing some processing and getting data from different places, for a get request. So we could probably try to make it work for ModelViewSet but it makes more sense without. (Also IsStaffOrOwner gives permissions to create, while the permission we added was just for reading).
|
||
class IsStaffOrReadOnlyForSelf(BasePermission): | ||
""" | ||
Grants access to staff or to user reading info about their own user |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need staff access if Optimizely makes the request on behalf of the user and any user can access their own data via the new endpoint, including staff users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimizely itself won't need this, but my thinking is:
(a) this is helpful for testing and troubleshooting if we want to call directly to look at results for different users
(b) the neighboring permission class also is used by either a regular user (with one level of access) or admins (with increased access)
6273c0b
to
69649b6
Compare
…iece in the context, using logic formerly done in user_metatdata.html
…pdated tests accordingly
a7ee8cb
to
0e0c223
Compare
jenkins run js |
jenkins run py38 js |
EdX Release Notice: This PR has been deployed to the staging environment in preparation for a release to production. |
EdX Release Notice: This PR has been deployed to the production environment. |
Your PR has finished running tests. The following contexts failed:
|
Our Optimizely experiments make use of values saved to the user-metadata DOM object, which isn't available in the new courseware MFE. We need an endpoint we can call with a course and user, and get back the same user-metadata values typically available in that DOM object.
Changes in this PR to meet that goal:
Today the user-metadata fields are set in user_metadata.html and dumped into the page json. In this PR, I:
The next step when this is merged will be call this from Optimizely, which may involve authentication/permissions updates
Relevant ticket: https://openedx.atlassian.net/browse/REV-1564