-
Notifications
You must be signed in to change notification settings - Fork 17
Changed indexing structure and logic to use a user's program enrollments #811
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| @@ -1,54 +0,0 @@ | |||
| """ | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved these test cases to search/api_test.py since it's testing search/api.py functionality
search/tasks.py
Outdated
| Args: | ||
| users (iterable of User): | ||
| Iterable of Users | ||
| program_enrollments: Iterable of ProgramEnrollments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a type here for program_enrollments? It should go in parentheses after the name according to the sphinx-napoleon convention: http://edx.readthedocs.io/projects/edx-developer-guide/en/latest/style_guides/python_guidelines.html#docstrings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
dashboard/factories.py
Outdated
| # the given User and Program. Instead of creating a new record with the factory, we | ||
| # will create the necessary objects to trigger its creation. | ||
| user = kwargs['user'] = kwargs.get('user', UserFactory.create()) | ||
| program = kwargs['program'] = kwargs.get('program', ProgramFactory.create()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you assign a copy of the objects to kwargs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
part of an earlier refactor idea. i'll remove
|
I have some comments. I also think all the refactoring looks good, but you could have postponed at least some parts of it. |
|
i'm getting an error in searchkit. might be data-related. i'll look into it |
|
Searchkit works fine for me |
|
yep, i had bad data in the index |
dashboard/signals.py
Outdated
| program = instance.course_run.course.program | ||
| program_enrollment = ProgramEnrollment.objects.filter(user=user, program=program).first() | ||
| if not program_enrollment: | ||
| index_users.delay([user]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just get rid of index_users? It seems like if there is no ProgramEnrollment for that user, the index should not need to know about that user at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep in mind that this is looking for a ProgramEnrollment for a User and a Program. A User could have other ProgramEnrollments. your comment does point out an issue, though. i had this code in place to handle cases where the ProgramEnrollment had already been deleted by an earlier signal, but that in itself should trigger a reindexing. ill get rid of this and change up the enrollment signal as well
search/api_test.py
Outdated
|
|
||
| def test_user_add(self): | ||
| """ | ||
| Test that a newly created User is indexed properly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update the docstrings for each of these tests?
| with patch('search.api._index_program_enrolled_users_chunk', autospec=True, return_value=0) as index_chunk: | ||
| index_program_enrolled_users(program_enrollments, chunk_size=4) | ||
| assert index_chunk.call_count == 3 | ||
| index_chunk.assert_any_call(program_enrollments[0:4]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@noisecapella @giocalitri i refactored this test case (test_index_program_enrolled_users, formerly test_index_users). this was taking 2.5 sec with calls to ES and to the database, and all it was uniquely testing was that the 'index_' function could handle an iterable. let me know if you have a problem with this refactor
| """ | ||
| program_enrollments = ProgramEnrollment.objects.filter(user=user).select_related('user', 'program').all() | ||
| for program_enrollment in program_enrollments: | ||
| remove_program_enrolled_user(program_enrollment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the code coverage report it looks like this line is not triggered by tests. Can you add or adjust a test to cover this line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch. just added a test
|
the functionality looks good to me 👍 for my part of the review |
|
I'm happy too 👍 |
292f496 to
dbf41da
Compare
dbf41da to
663d872
Compare
What are the relevant tickets?
#778
What's this PR do?
Changes our indexing to index one record per user per program. A user enrolled in two programs, for example, will now have two records in the index. The enrollment/certificate/etc data for each of those records would only be those that are associated with the appropriate program.
Where should the reviewer start?
Probably search/api.py. This might be one that you want to call me over to co-review at some point
How should this be manually tested?
Load the realistic users if you haven't already, the run the recreate_index manage command. Once you've done that, you should be able to query ES and see the new indexing format