Batch jobs #96

Closed
wants to merge 8 commits into from

grantr (Collaborator) commented Oct 7, 2014

This is a WIP on batch processing (#93).

Adds a BatchJob class that takes arrays of payloads. BatchJob implements each, which iterates over each payload's args. BatchJob can also take a single payload, in which case it wraps the payload in an array, making a batch of 1. The default batch size is 1, but jobs can change it the same way they change the queue name.

class SimpleBatchJob < Qu::BatchJob
  batch_size 50

  def perform
    request = []
    each do |arg1, arg2|
      request << [arg1, arg2]
    end
    submit_request(request)
  end

  def submit_request(request)
    # Send the whole batch in a single request (left unimplemented here).
  end
end

If this is a direction that works for other people, I'll keep working on other aspects like backend support and failure handling.
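
For readers following along, here is a minimal, self-contained sketch of how a base class with this interface could behave. It is not the code in this PR: SketchBatchJob, its constructor, and the payload shape (an array of argument arrays) are assumptions made purely for illustration.

```ruby
# Hypothetical stand-in for Qu::BatchJob; names and internals are assumptions.
class SketchBatchJob
  # Class-level setting. With an argument it sets the size,
  # without one it returns the current size (default 1).
  def self.batch_size(size = nil)
    @batch_size = size if size
    @batch_size || 1
  end

  # Assumption: a batch is an array of payloads, each an array of args.
  def initialize(payloads)
    @payloads = payloads
  end

  # Yield each payload's args to the caller's block.
  def each
    return enum_for(:each) unless block_given?
    @payloads.each { |args| yield(*args) }
  end
end

class SketchSimpleJob < SketchBatchJob
  batch_size 50

  attr_reader :submitted

  def perform
    request = []
    each { |arg1, arg2| request << [arg1, arg2] }
    submit_request(request)
  end

  # Records the request instead of sending it anywhere.
  def submit_request(request)
    @submitted = request
  end
end

job = SketchSimpleJob.new([[1, "a"], [2, "b"]])
job.perform
```

In this sketch the whole batch reaches submit_request in one call, which is the point of batching: per-payload work happens inside each, and the expensive step runs once.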

/cc @bkeepers @jnunemaker @mauricio

grantr added 4 commits October 6, 2014 13:38
Inherit from BatchJob instead of Job, and use each to process payloads:

class MyJob < Qu::BatchJob
  def perform
    do_something_before_processing
    each do |arg1, arg2|
      do_something_with_payload
    end
    do_something_after_processing
  end
end

Avoid overloading the push verb. Let push mean adding a job to the
queue, and append mean adding a payload to a batch job.

BatchJob shouldn't be the one assembling the batch. The payload should
already be a batch.

This allows a BatchJob to work the same with a single payload or batch
payload. A batch payload can be any object that responds to :each.
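
The duck-typing idea in this commit message can be sketched as follows. This is an illustration only: SketchPayload, as_batch, and collect_args are hypothetical names, and the real branch may normalize payloads differently.

```ruby
# Hypothetical single payload. A plain class is used on purpose:
# a Struct would itself respond to :each and be mistaken for a batch.
class SketchPayload
  attr_reader :args

  def initialize(*args)
    @args = args
  end
end

# Treat anything that responds to :each as a batch already;
# wrap a lone payload in an array to make a batch of one.
def as_batch(payload_or_batch)
  payload_or_batch.respond_to?(:each) ? payload_or_batch : [payload_or_batch]
end

# The consumer code is the same for single and batch payloads.
def collect_args(payload_or_batch)
  seen = []
  as_batch(payload_or_batch).each { |payload| seen << payload.args }
  seen
end

single = SketchPayload.new(1, 2)
batch  = [SketchPayload.new(1, 2), SketchPayload.new(3, 4)]
lazy   = batch.each # an Enumerator also responds to :each

collect_args(single)
collect_args(batch)
collect_args(lazy)
```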

jnunemaker (Collaborator) commented:

Is batch_size how many jobs to pull off the queue at a time? It almost feels like it controls pushing on to the queue as well, which confused me until I looked over the code. Maybe a more descriptive name for that would be better.

I've got a branch locally that would make this easier. It moves everything to be more around a Queue, rather than a bunch of queues, which simplifies payloads with different queues being pushed and/or popped (by simply removing the need for it).

I really need to get a release cut and then think hard about batch. I'm not sure exactly what I want, but I know I'll know it when I see it. Haha. Helpful, I know...

I think what you are doing makes sense, but I haven't thought through it enough to really know.

grantr (Collaborator, Author) commented Oct 9, 2014

> Is batch_size how many jobs to pull off the queue at a time?

Correct. It's how many payloads the BatchJob can accept at once. If you have a better naming scheme for this I'm all ears. I'm not a huge fan of the names here honestly.

I haven't had to deal with queues much in this branch, but in the batch-pop branch I came to a point where I needed a Queue object that would store queue-specific attributes so backends could get hints about when to use batch pop. Seeing your branch would be very helpful.

This branch is still WIP because I haven't worked out how batch processing will integrate with backends, and what abstractions make sense. I expect it will change a few more times before I'm happy with it.

I do like the user-facing interface though, with each as the primary way to iterate through batch payloads. I don't think that will change much. I may add helper methods that allow users to abort or fail(exception) specific payloads inside the loop.
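
One way such per-payload helpers could work is sketched below. Nothing here exists in the branch: SketchFailableBatch, its bookkeeping, and the use of catch/throw to leave the block early are all assumptions, shown only to make the idea concrete.

```ruby
# Hypothetical batch with per-payload abort/fail helpers.
class SketchFailableBatch
  attr_reader :completed, :failures

  def initialize(payloads)
    @payloads  = payloads
    @completed = [] # payloads whose block ran to the end
    @failures  = {} # payload => exception passed to fail
  end

  def each
    @payloads.each do |payload|
      @current = payload
      # catch returns :ok if the block finishes, or the thrown value otherwise.
      outcome = catch(:payload_done) do
        yield(*payload)
        :ok
      end
      @completed << payload if outcome == :ok
    end
  ensure
    @current = nil
  end

  # Stop processing the current payload without recording an error.
  # Note: this shadows Kernel#abort inside this class.
  def abort
    throw :payload_done, :aborted
  end

  # Record an exception for the current payload and move on to the next.
  # Note: this shadows Kernel#fail inside this class.
  def fail(exception)
    @failures[@current] = exception
    throw :payload_done, :failed
  end
end

batch = SketchFailableBatch.new([[1], [-1], [0], [2]])
batch.each do |n|
  batch.abort if n.zero?
  batch.fail(ArgumentError.new("negative: #{n}")) if n < 0
end
```

The design choice in this sketch is that a failed or aborted payload does not stop the rest of the batch; whether that is the right default depends on the backend and failure-handling work still to be done.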

Planning to work more on this tomorrow.

jnunemaker (Collaborator) commented:

> This branch is still WIP because I haven't worked out how batch processing will integrate with backends, and what abstractions make sense.

Totally get this. It's tough.

A BatchJob can't be appended to
It's not clear how this will work with backends, so just remove it for
now.

Instead of a Qu::BatchJob subclass, batch processing is now a
Qu::Job::BatchProcessing module. Include the module to get batch
processing primitives.

A PayloadBatch is a collection of payloads intended to be created by a
backend supporting batch pop.

For now, PayloadBatch executes #perform on all payloads in its batch.
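
The module-plus-batch shape described in these last commits could look roughly like the sketch below. It is a guess at the structure, not the code in the branch: SketchBatchProcessing, SketchJobPayload, SketchPayloadBatch, and SketchGreetJob are hypothetical names, and the payload is assumed to know its job class and args.

```ruby
# Hypothetical stand-in for Qu::Job::BatchProcessing.
module SketchBatchProcessing
  # Including the module adds the class-level batch_size setting.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def batch_size(size = nil)
      @batch_size = size if size
      @batch_size || 1
    end
  end
end

# Hypothetical payload: builds its job and runs it.
class SketchJobPayload
  def initialize(job_class, *args)
    @job_class = job_class
    @args = args
  end

  def perform
    @job_class.new(*@args).perform
  end
end

# Hypothetical batch, as a backend with batch pop might construct it.
class SketchPayloadBatch
  include Enumerable

  def initialize(payloads)
    @payloads = payloads
  end

  def each(&block)
    @payloads.each(&block)
  end

  # For now, simply perform every payload in the batch, in order.
  def perform
    map(&:perform)
  end
end

class SketchGreetJob
  include SketchBatchProcessing
  batch_size 10

  LOG = []

  def initialize(name)
    @name = name
  end

  def perform
    LOG << "hello #{@name}"
  end
end

batch = SketchPayloadBatch.new([
  SketchJobPayload.new(SketchGreetJob, "ann"),
  SketchJobPayload.new(SketchGreetJob, "bob")
])
batch.perform
```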

jnunemaker (Collaborator) commented:

Stale.

jnunemaker closed this May 1, 2021