Support ActiveRecord::Batches::BatchEnumerator collections #409
@etiennebarrie I've requested your review on this. CI obviously fails because the Job Iteration changes haven't shipped yet, but tests pass with the changes from Shopify/job-iteration#86. Hoping for another review on the Job Iteration PR from Job Patterns. In the meantime, I figured there was no reason to wait on getting feedback on this 😄
I'm not so sure about this one, that will cause an additional COUNT query for each iteration, even if we don't use the increment because the last ticker update was recent enough. Inside |
I think we're actually okay here - when we call I actually implemented it using just the batch size as you'd suggested initially, but then I moved away from that because the margin of error can be pretty big - if your batch size is 1000 and you have 1003 records to process, your final tick count will be 2000, which is weird. I think we should be good with using |
Hum, previously the last_record = records.last
update_from_record(last_record) if last_record or change Today it's one query to load the records (in job-iteration) + one query for updating them (in the app code). Ideally we'd move to one query to get the next cursor value + the app code. Hopefully even if it doesn't help much with the SQL, it helps by not loading so much stuff in memory? |
cc @Shopify/rails @Shopify/core-stewardship
Some more wording propositions, feel free to tweak. The rest is great 🎉
Co-authored-by: Étienne Barrié <etienne.barrie@shopify.com>
Thank you so much for this! 😃
Closes: #390, #401
This PR allows users to specify batch collections via the following API:

Users are expected to use `#in_batches`. We use `EnumeratorBuilder#active_record_on_batch_relations` under the hood to build an enumerator that yields `ActiveRecord::Relation`s - we get the ActiveRecord collection via `@relation` and the batch size via `@of` on the Batch Enumerator.

I've also tweaked `Ticker#tick` to take an increment, defaulting to 1. In `TaskJob#each_iteration`, I've tweaked the call to `#tick` to use `input.size` if input `is_a?(ActiveRecord::Relation)`. This makes it so that tick count is still based on the number of records, rather than the number of batches, if we're processing batches.
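
To make the tick arithmetic concrete, here is a minimal plain-Ruby sketch. It is not the gem's code: `SketchTicker` and `process_in_batches` are hypothetical stand-ins, arrays sliced with `each_slice` play the role of batch relations, and the real ticker's throttled persistence is left out. It only illustrates why ticking by the actual batch size, rather than by the configured batch size, keeps the final count equal to the number of records (the 1003-records, batch-size-1000 case from the discussion above).

```ruby
# Hypothetical stand-in for the ticker; only the increment behaviour is modelled.
class SketchTicker
  attr_reader :ticks

  def initialize
    @ticks = 0
  end

  # The increment defaults to 1, so callers that tick once per record are unchanged.
  def tick(increment = 1)
    @ticks += increment
  end
end

# Hypothetical helper: arrays stand in for batch relations here.
# Each batch is yielded for processing, then the ticker advances by the
# number of records actually in that batch.
def process_in_batches(records, batch_size, ticker)
  records.each_slice(batch_size) do |batch|
    yield batch if block_given?
    ticker.tick(batch.size)
  end
end

records = (1..1003).to_a

# Ticking by the real size of each batch: 1000 + 3.
by_actual_size = SketchTicker.new
process_in_batches(records, 1000, by_actual_size)

# Ticking by the configured batch size for every batch: 1000 + 1000.
by_configured_size = SketchTicker.new
records.each_slice(1000) { by_configured_size.tick(1000) }

puts by_actual_size.ticks      # 1003
puts by_configured_size.ticks  # 2000
```

In the sketch, `batch.size` is free because the batch is already an in-memory array. On a real relation, `size` may issue a COUNT query when the records are not loaded, which is the cost raised earlier in the review.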