Enable Bulk User Creation for Larger Batches of Users #597
We require contributors to sign our Contributor License Agreement, and we don't have you on file. In order for us to review and merge your code, please complete this form and we'll get you added and review your contribution as soon as possible.
3f1625e to 783ea8e
This commit allows ProfileApiClient to call the new fast validation endpoint in Profile.
This is deliberately a serial queue.
This commit creates a new operation that validates a batch. It moves all of the validation error handling out of ::CreateBatch, leaving that operation quite simple.
This commit takes the validation and concurrency control out of create_students_job and puts it in school_students_controller. The idea is that instead of validating then committing one job of 50 students, we:
1. Validate the entire batch of N students quickly by calling SchoolStudents::ValidateBatch.
2. Split the batch into chunks of 50.
3. Enqueue them in the context of a GoodJob::Batch, ensuring atomic enqueue of all `N/50` jobs.
4. Control concurrency by not allowing creation of another Batch whose description field matches the school ID (this makes up for the fact that GoodJob::Batch does not have a concurrency key like Jobs do).
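The chunking step (step 2) can be sketched in plain Ruby. This is a minimal illustration and not the PR's code: `CHUNK_SIZE`, `chunk_students` and the sample student hashes are hypothetical names. In the real controller each chunk would presumably become one `CreateStudentsJob` enqueued inside a `GoodJob::Batch`.

```ruby
# Hypothetical sketch: split an already-validated batch into chunks of 50.
CHUNK_SIZE = 50

# Returns an array of arrays, each holding at most chunk_size students.
def chunk_students(students, chunk_size: CHUNK_SIZE)
  students.each_slice(chunk_size).to_a
end

# Sample data (made up for illustration).
students = Array.new(120) { |i| { username: "student-#{i}" } }
chunks = chunk_students(students)

# 120 students give three chunks, so the job count is N/50 rounded up.
puts chunks.map(&:size).inspect # prints [50, 50, 20]
```

Note that the last chunk may be smaller than 50, so a batch of N students yields N/50 jobs rounded up.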
Switching to `filter_map` ensures that a parameter list of `[]` gets through as `[]` and not `[nil]`.
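The general difference between `map` and `filter_map` can be shown with a small sketch. The code path that produced `[nil]` in the controller is not shown in this conversation, so the helper names and inputs below are hypothetical and only illustrate the behaviour of the two methods.

```ruby
# Hypothetical normalisers; names and inputs are illustrative only.

# With map, a nil element produces a nil result that stays in the list.
def normalise_with_map(items)
  items.map { |item| item&.transform_keys(&:to_sym) }
end

# With filter_map, nil (and false) results are dropped from the list.
def normalise_with_filter_map(items)
  items.filter_map { |item| item&.transform_keys(&:to_sym) }
end

puts normalise_with_map([nil]).inspect        # prints [nil]
puts normalise_with_filter_map([nil]).inspect # prints []
puts normalise_with_filter_map([]).inspect    # prints []
```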
The validation functionality has been refactored and raised to the controller level that calls CreateBatch, so the tests are at a higher level now.
Recent updates in Profile made a small number of changes to the structure of its errors and parameters. This commit brings editor-api up to date with them.
This commit moves the method SchoolStudentsController#batch_in_progress? to School#import_in_progress? and changes the batch identifier from "school_id:#{school.id}" to simply "#{school.id}". This allows us to expose the import_in_progress field through the API so that we can enable/disable front-end UI components to prevent user frustration.
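The check described above might look roughly like the following. This is a sketch under stated assumptions: `FakeBatch` is a hypothetical in-memory stand-in for a batch record, and the real `School#import_in_progress?` presumably queries GoodJob's batch table instead of an array.

```ruby
# Hypothetical stand-in for a batch record; not GoodJob's own class.
FakeBatch = Struct.new(:description, :finished_at, :discarded_at, keyword_init: true)

# True if any batch for this school has neither finished nor been discarded.
# The batch description is assumed to hold the plain school ID as a string.
def import_in_progress?(school_id, batches)
  batches.any? do |batch|
    batch.description == school_id.to_s &&
      batch.finished_at.nil? &&
      batch.discarded_at.nil?
  end
end

batches = [
  FakeBatch.new(description: "school-1", finished_at: Time.now),
  FakeBatch.new(description: "school-2")
]

puts import_in_progress?("school-1", batches) # prints false
puts import_in_progress?("school-2", batches) # prints true
```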
This change includes the state of current imports for a school in the API response.
This commit changes UserJob from having a 1:1 relationship with GoodJob::Job to tracking the ID of a GoodJob::Batch. This means that the UserJobsController had to change to track the state of a batch and emulate the status messages that a Job used to provide (e.g. running, queued, etc.). The front end can now poll for the status of this job.
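One way to emulate job-style status messages from batch state is sketched below. Everything here is an assumption for illustration: `BatchState`, its fields, the status strings and the order of the checks are hypothetical, and the controller in this PR may derive status differently.

```ruby
# Hypothetical snapshot of a batch's state; not GoodJob's own class.
BatchState = Struct.new(:finished_at, :discarded_at, :any_job_started, keyword_init: true)

# Maps batch state to a single status string that a front end could poll.
# The status names are assumptions, chosen to resemble job statuses.
def batch_status(batch)
  return "discarded" if batch.discarded_at
  return "finished" if batch.finished_at
  return "running" if batch.any_job_started

  "queued"
end

puts batch_status(BatchState.new)                        # prints queued
puts batch_status(BatchState.new(any_job_started: true)) # prints running
puts batch_status(BatchState.new(finished_at: Time.now)) # prints finished
```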
We were only decrypting students at validation and not creation. Fix.
783ea8e to 5204b95
(Rebased on latest main this morning to resolve conflicts.)
Pull Request Overview
This PR enables Code Editor to handle bulk user creation for larger batches by implementing a new validation endpoint and breaking large uploads into manageable chunks. The implementation validates the entire batch first, then processes it in chunks of 50 users using GoodJob batches for atomic enqueueing.
- Introduces pre-flight validation using a new Profile API endpoint to validate large batches without creating users
- Implements chunking strategy to split large batches into multiple jobs of 50 users each
- Replaces individual job tracking with batch-based tracking using GoodJob::Batch functionality
Reviewed Changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
lib/profile_api_client.rb | Adds new validate_school_students method for pre-flight validation |
lib/concepts/school_student/validate_batch.rb | New operation class for validating entire student batches |
lib/concepts/school_student/create_batch.rb | Simplified to remove validation logic, now focuses only on job creation |
app/controllers/api/school_students_controller.rb | Updated to validate entire batch first, then chunk and enqueue jobs |
app/models/school.rb | Adds import_in_progress? method to check for active batches |
app/jobs/create_students_job.rb | Simplified job with concurrency handling moved to batch level |
db/migrate/20250925135238_change_user_jobs_to_batch_id.rb | Migration to change user job tracking from individual jobs to batches |
class ChangeUserJobsToBatchId < ActiveRecord::Migration[7.2]
  def change
    add_column :user_jobs, :good_job_batch_id, :uuid
    remove_column :user_jobs, :good_job_id, :uuid
  end
end
What will the impact of this be on existing jobs in this table? Presumably we're okay with any jobs that are running when this is released failing?
I think if we omit the `remove_column` that would allow current jobs to keep working?
See: 0eb4a08
@patch0 Could we ask for your thoughts on this one?
This test suite checks that the ValidateBatch class correctly handles valid students, invalid batches and other unprocessable types of uploads, such as passwords that are not correctly encrypted.
This should allow existing jobs to continue working until completed.
15a5932 to 0eb4a08
Status
This user story is in Experience CS #1146 - Adding Students - Increase Upload Limit.
The overall goal of this PR is to allow Code Editor to ingest CSV uploads of new users much larger than the current limit of 50 that is imposed by Profile. However, it was also a goal NOT to change the 50-row limit in the `createStudents` endpoint in Profile.
Therefore the high-level strategy for solving this problem is:
This PR depends on the branch in PR #1948 in Profile and PR #628 in editor-standalone, which removes the 50-row limit check on the front end.
Points for consideration:
Please could reviewers focus on concurrency safety and any issues that could arise if a batch takes a long time to run?
What's changed?
- Updates `ProfileApiClient` to use the new `/preflight-student-upload` endpoint in Profile.
- Removes validation from `SchoolStudent::CreateBatch` and instead validates the entire upload using the new `SchoolStudents::ValidateBatch` operation.
- Uses `GoodJob::Batch` to group individual `CreateStudentsJob` jobs and ensure that the enqueue of the entire batch of 50-user creation jobs is atomic.
- Uses the `description` field of a `GoodJob::Batch` as a kind of concurrency key, since batches don't have concurrency keys. The controller will refuse to enqueue a new batch if there exists a batch whose description matches the school ID and which has not been completed or discarded.
Steps to perform after deploying to production
I don't believe deploying this PR will require any additional work.