Skip to content

Conversation

@gsidebo
Copy link
Contributor

@gsidebo gsidebo commented Aug 5, 2016

What are the relevant tickets?

#778

What's this PR do?

Changes our indexing to index one record per user per program. A user enrolled in two programs, for example, will now have two records in the index. The enrollment/certificate/etc data for each of those records would only be those that are associated with the appropriate program.

Where should the reviewer start?

Probably search/api.py. This might be one that you want to call me over to co-review at some point

How should this be manually tested?

Load the realistic users if you haven't already, the run the recreate_index manage command. Once you've done that, you should be able to query ES and see the new indexing format

@bdero bdero temporarily deployed to micromasters-ci-pr-811 August 5, 2016 02:51 Inactive
@@ -1,54 +0,0 @@
"""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved these test cases to search/api_test.py since it's testing search/api.py functionality

@noisecapella noisecapella self-assigned this Aug 5, 2016
search/tasks.py Outdated
Args:
users (iterable of User):
Iterable of Users
program_enrollments: Iterable of ProgramEnrollments
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a type here for program_enrollments? It should go in parentheses after the name according to the sphinx-napoleon convention: http://edx.readthedocs.io/projects/edx-developer-guide/en/latest/style_guides/python_guidelines.html#docstrings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

@bdero bdero temporarily deployed to micromasters-ci-pr-811 August 5, 2016 17:27 Inactive
# the given User and Program. Instead of creating a new record with the factory, we
# will create the necessary objects to trigger its creation.
user = kwargs['user'] = kwargs.get('user', UserFactory.create())
program = kwargs['program'] = kwargs.get('program', ProgramFactory.create())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you assign a copy of the objects to kwargs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

part of an earlier refactor idea. i'll remove

@giocalitri
Copy link
Contributor

I have some comments.

I also think all the refactoring looks good, but you could have postponed at least some parts of it.

@giocalitri giocalitri self-assigned this Aug 5, 2016
@bdero bdero temporarily deployed to micromasters-ci-pr-811 August 5, 2016 18:56 Inactive
@bdero bdero temporarily deployed to micromasters-ci-pr-811 August 5, 2016 19:21 Inactive
@gsidebo
Copy link
Contributor Author

gsidebo commented Aug 5, 2016

i'm getting an error in searchkit. might be data-related. i'll look into it

@noisecapella
Copy link
Contributor

Searchkit works fine for me

@gsidebo
Copy link
Contributor Author

gsidebo commented Aug 5, 2016

yep, i had bad data in the index

program = instance.course_run.course.program
program_enrollment = ProgramEnrollment.objects.filter(user=user, program=program).first()
if not program_enrollment:
index_users.delay([user])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just get rid of index_users? It seems like if there is no ProgramEnrollment for that user, the index should not need to know about that user at all

Copy link
Contributor Author

@gsidebo gsidebo Aug 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep in mind that this is looking for a ProgramEnrollment for a User and a Program. A User could have other ProgramEnrollments. your comment does point out an issue, though. i had this code in place to handle cases where the ProgramEnrollment had already been deleted by an earlier signal, but that in itself should trigger a reindexing. ill get rid of this and change up the enrollment signal as well

@bdero bdero temporarily deployed to micromasters-ci-pr-811 August 5, 2016 20:25 Inactive
@bdero bdero temporarily deployed to micromasters-ci-pr-811 August 5, 2016 20:25 Inactive

def test_user_add(self):
"""
Test that a newly created User is indexed properly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the docstrings for each of these tests?

@bdero bdero temporarily deployed to micromasters-ci-pr-811 August 5, 2016 20:56 Inactive
@bdero bdero temporarily deployed to micromasters-ci-pr-811 August 5, 2016 20:58 Inactive
@bdero bdero temporarily deployed to micromasters-ci-pr-811 August 5, 2016 21:43 Inactive
with patch('search.api._index_program_enrolled_users_chunk', autospec=True, return_value=0) as index_chunk:
index_program_enrolled_users(program_enrollments, chunk_size=4)
assert index_chunk.call_count == 3
index_chunk.assert_any_call(program_enrollments[0:4])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@noisecapella @giocalitri i refactored this test case (test_index_program_enrolled_users, formerly test_index_users). this was taking 2.5 sec with calls to ES and to the database, and all it was uniquely testing was that the 'index_' function could handle an iterable. let me know if you have a problem with this refactor

@bdero bdero temporarily deployed to micromasters-ci-pr-811 August 8, 2016 14:45 Inactive
"""
program_enrollments = ProgramEnrollment.objects.filter(user=user).select_related('user', 'program').all()
for program_enrollment in program_enrollments:
remove_program_enrolled_user(program_enrollment)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the code coverage report it looks like this line is not triggered by tests. Can you add or adjust a test to cover this line?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. just added a test

@bdero bdero temporarily deployed to micromasters-ci-pr-811 August 8, 2016 15:09 Inactive
@giocalitri
Copy link
Contributor

the functionality looks good to me 👍 for my part of the review

@noisecapella
Copy link
Contributor

I'm happy too 👍

@gsidebo gsidebo force-pushed the 778_program_user_indexing branch from dbf41da to 663d872 Compare August 8, 2016 16:47
@bdero bdero requested a deployment to micromasters-ci-pr-811 August 8, 2016 16:47 Pending
@gsidebo gsidebo merged commit 78ccdc6 into master Aug 8, 2016
@gsidebo gsidebo deleted the 778_program_user_indexing branch August 8, 2016 16:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants