Allow manual flushing of a batcher with flushBatch method #109
Conversation
@@ -501,6 +501,10 @@ def join[%s](%s): Future[(%s)] = join(Seq(%s)) map { _ => (%s) }""".format(
 * ...
 * }
 *
 * To force the batcher to immediately process all unprocessed requests:
 *
 * batcher.flushBatch
add parens for mutating no-arg methods
Done
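For context, a sketch of what the amended usage note in the scaladoc would read like once the parentheses are added (based on the diff above, not the verbatim final text):

/**
 * To force the batcher to immediately process all unprocessed requests:
 *
 *   batcher.flushBatch()
 */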
As far as I can tell, Batcher is a thin shim around BatchExecutor. Maybe we want to just expose BatchExecutor directly? Also, can you add some tests? We should maybe consider rewriting some of BatchExecutor too... some bits of it are pretty weird.
@iceberg901 do you want to continue with this PR?
@mosesn Yes I do, I'll start working on addressing your comments.
Awesome, let me know if you need any help.
@mosesn No, I don't think we want to return BatchExecutor. Returning a wrapper of some kind gives us the luxury of providing a clean, simple interface not clouded by scary implementation details. I assume that's why the original implementer(s) didn't return BatchExecutor in the first place. However, I think the current interface doesn't provide enough functionality, so I think the right answer is to return Batcher and to keep augmenting it if we ever want to expose additional functionality.
Force-pushed from 7f4b06f to f667193
Ok, I've addressed all your comments except the one about not returning a new type from Future.batched. Please let me know how you want to proceed there. And please give me feedback on the tests. Finally, all of my tests are green, but there seem to be some problems with your test setup in general - red tests, exceptions getting thrown, etc. Should I worry about this, or does it all get taken care of when you integrate?
Don't worry about the tests, it's a problem with a new feature of travis-ci that we're trying out. I agree with you that exposing BatchExecutor directly would be a mistake, but changing the return type of Future#batched also definitely breaks binary compatibility. With that said, we've relaxed how we feel about breaking binary compatibility in the last few months, so we can try merging it in and see how difficult it is.
import scala.collection.mutable
/** Provides a clean, lightweight interface for controlling a BatchExecutor |
This should be formatted like so:
/**
* Provides ....
* ...
*/
As you mentioned before, BatchExecutor is an internal implementation detail, so when giving folks instructions on how to use Batcher, we shouldn't ask them to understand BatchExecutor.
So passing the BatchExecutor as a constructor parameter was your idea to reduce the clutter of having to pass all of the BatchExecutor's constructor params through two consecutive constructor methods. And no one's creating Batchers directly, they're getting them from Future.batched.
So I guess I'm wondering what you think now: should we back off having the BatchExecutor as a parameter to the Batcher constructor, or should we just make the constructor private[util] and leave it at that?
No, I think BatchExecutor as a parameter to the Batcher constructor is still a good idea, but Future.batched should be the only way to get a Batcher. Let's make the constructor private[util]. 🚢
LGTM, thanks for bearing with me! I'll try to get more eyes on this and then we can merge it in.
Thanks! Nice combing through the details with you :)
executor: BatchExecutor[In, Out]
)(
implicit timer: Timer
) extends Function1[In, Future[Out]] { batcher =>
Can we extend In => Future[Out] instead?
I don't have a strong feeling about this. I think the reason I went with Function1 was that I was having trouble getting it to compile the other way. I've since found that I can get it to work only with parens:
) extends (In => Future[Out]) { batcher =>
@mosesn @vkostyukov please let me know if you want me to make this change
i think it's more idiomatic as @vkostyukov wrote it.
Ok, two votes for In => Future[Out], so I changed it.
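Pulling the threads above together (a BatchExecutor constructor parameter, a private[util] constructor, and extending In => Future[Out]), here is a rough sketch of the shape the class ends up with. The body is illustrative rather than copied from the final source; enqueue and flushNow are the names used elsewhere in this conversation.

// Sketch only; imagined as living inside com.twitter.util, where Future, Timer,
// and the private[util] BatchExecutor are already in scope.
class Batcher[In, Out] private[util] (
  executor: BatchExecutor[In, Out]
)(
  implicit timer: Timer
) extends (In => Future[Out]) {

  // Each call enqueues one request; the executor decides when a batch is dispatched.
  def apply(t: In): Future[Out] = executor.enqueue(t)

  /** Immediately processes all unprocessed requests. */
  def flushBatch(): Unit = executor.flushNow() // see the synchronization discussion below
}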
Looks good to me! @iceberg901 do you mind squashing the commits of this PR into one? You can use
@vkostyukov git review submit squashes the changes for us, so I don't think it should matter?
@mosesn cool! Nevermind then.
Please let me know if you want me to make the change on Batcher above. I take it I don't need to squash the commits? Anything else I need to do?
Could you make the change to the ChangeLog that Steve mentioned?
Sure, would you put it under New Features or API Changes? Or do I describe the flushBatch method under New Features and note the change of return type of Future.batched under API Changes?
Yeah, I think doing both is a good idea.
batcher(5)
batcher.flushBatch()

verify(f, times(1)).apply(Seq(1,2,3,4))
can you move the first verify (ln 334) above the flushBatch call? that would make the behavior clearer because as-is it looks like neither batch is dispatched until flushing.
done
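To make the requested reordering concrete, here is a hedged sketch of how the test might be structured, with the first verify moved above the flushBatch call. The mock setup, the MockTimer, and the Future.batched argument shape (a size threshold of 4) are assumptions for illustration; only the verify calls and flushBatch itself come from the diff.

import org.mockito.Mockito.{mock, times, verify, when}
import com.twitter.util.{Future, MockTimer}

implicit val timer: MockTimer = new MockTimer

// Hypothetical mocked batch function handed to Future.batched.
val f = mock(classOf[Seq[Int] => Future[Seq[Int]]])
when(f(Seq(1, 2, 3, 4))).thenReturn(Future.value(Seq(1, 2, 3, 4)))
when(f(Seq(5))).thenReturn(Future.value(Seq(5)))

val batcher = Future.batched(4)(f)

batcher(1); batcher(2); batcher(3); batcher(4)
verify(f, times(1)).apply(Seq(1, 2, 3, 4)) // verified before flushing, per the review comment

batcher(5)           // below the size threshold, so not dispatched yet
batcher.flushBatch() // force the partial batch out immediately
verify(f, times(1)).apply(Seq(5))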
@mosesn Changelog updated, what else?
this LGTM
def apply(t: In): Future[Out] = executor.enqueue(t)

/** Immediately processes all unprocessed requests */
def flushBatch(): Unit = {
if there are guarantees regarding ordering and such, they should be documented. it's not obvious from reading the diff why you synchronize on executor here.
and if there are guarantees, what prevents other callers from using the executor incorrectly?
- I did not implement the BatchExecutor, but I make no assumptions about order of execution of the individual requests within a batch. Since there is a one-to-one correlation between the requests I submit and the Futures returned, I as a client can at least decide what order I want to process the results in, regardless of what order they complete.
- As far as why I synchronize, the inline documentation for BatchExecutor tells me I must do so when calling .flushBatch() (BatchExecutor.scala line 107).
- Finally, how do we prevent other callers from using the executor incorrectly: by making BatchExecutor private[util] (this was done already by whoever implemented BatchExecutor) and only providing access to it through the Batcher interface (which is what I am adding).
If there is some part of what I've just explained here that you would like me to document in the code, please let me know.
thanks for the explanation, i'm new to BatchExecutor — that's a funky interface.
i think a comment in your code as to why you synchronize there would be quite helpful for future maintainers.
Here's how I decided to address this:
I moved the synchronization inside a new method in the BatchExecutor called .flushNow(). So, Batcher.flushBatch() just calls BatchExecutor.flushNow()
This is good IMO because:
- A person new to the code will find all synchronization logic inside of one file, BatchExecutor.scala, and they will easily be able to see that all calls to BatchExecutor.flushBatch are wrapped in synchronize blocks.
- If that person still doesn't understand what's going on, the comments explaining that synchronization is required are in the same file, not a different file.
- Batcher.scala stays simple
Does this address your concerns?
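To make the resolution concrete, here is a deliberately toy sketch of the delegation pattern described above. The classes and bodies are invented for illustration; only the flushBatch/flushNow names and the idea that all locking stays inside BatchExecutor come from this discussion.

// Toy stand-ins, not the real com.twitter.util classes.
class ToyBatchExecutor {
  // Stand-in for the method whose docs require callers to hold the executor's lock.
  private def flushBatch(): Unit = println("dispatching all pending requests")

  // All synchronization (and the comment explaining why) stays in this one file.
  def flushNow(): Unit = synchronized { flushBatch() }
}

class ToyBatcher(executor: ToyBatchExecutor) {
  // The public-facing method simply delegates; no locking details leak out.
  def flushBatch(): Unit = executor.flushNow()
}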
LGTM
lgtm too, thanks for your patience working with us. i'll try to get this patch rolling internally today.
Ok, this just got merged locally and will show up on the develop branch soon, once we get some issues on our side sorted out. Thanks again for the patch and your patience.
Great, thanks! For future reference, should I open PRs against the develop branch?
Yes, but this PR predates the … Edit: I just checked the CONTRIBUTING.md file, and it looks like we still ask people to make a PR against master. I'll fix it.
🍨 this hit develop woo 616479c
Awesome, thanks! Do you have a general timetable for your next release?
Not yet, but we'll send out an email on the finaglers listserv when we do it.
Implements part of a TODO item listed in the comments in BatchExecutor.scala.
Future.batched now returns an instance of the new Batcher class, which adds a flushBatch method to allow manual flushing/execution of the remaining items in the batcher.
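For readers skimming this PR later, a short usage sketch under the new return type. The size threshold, the lookupMany function, and the JavaTimer setup are illustrative assumptions; flushBatch itself is what this change adds.

import com.twitter.util.{Future, JavaTimer, Timer}

implicit val timer: Timer = new JavaTimer(true)

// Hypothetical backend call that resolves many keys in one round trip.
def lookupMany(keys: Seq[String]): Future[Seq[String]] =
  Future.value(keys.map(_.toUpperCase))

// Future.batched now hands back a Batcher instead of a bare In => Future[Out].
val batcher = Future.batched(10)(lookupMany)

val a = batcher("foo") // queued until the batch fills or it is flushed
val b = batcher("bar") // still below the size threshold

batcher.flushBatch()   // new in this PR: dispatch whatever is queued right now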