
Allow manual flushing of a batcher with flushBatch method #109

Closed

Conversation

iceberg901
Contributor

Implements part of a TODO item listed in comments in BatchExecutor.scala

Future.batched now returns an instance of class Batcher, which has the added method flushBatch to allow manual flushing/execution of remaining items in the batcher.
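The new behavior can be illustrated with a minimal, self-contained sketch. This is not the twitter/util implementation; `SimpleBatcher`, `sizeThreshold`, and the batch function here are hypothetical stand-ins that only show the apply/flushBatch contract described above:

```scala
import scala.collection.mutable

// Hypothetical stand-in for the real Batcher: buffers inputs and hands
// them to `f` either when `sizeThreshold` is reached or when flushBatch()
// is called manually.
class SimpleBatcher[In, Out](sizeThreshold: Int)(f: Seq[In] => Seq[Out]) {
  private val buf = mutable.ArrayBuffer.empty[In]
  private val results = mutable.ArrayBuffer.empty[Out]

  def apply(item: In): Unit = synchronized {
    buf += item
    if (buf.size >= sizeThreshold) flushBatch()
  }

  // Immediately processes all unprocessed requests, like Batcher.flushBatch()
  def flushBatch(): Unit = synchronized {
    if (buf.nonEmpty) {
      results ++= f(buf.toList)
      buf.clear()
    }
  }

  def completed: Seq[Out] = synchronized(results.toList)
}

val batcher = new SimpleBatcher[Int, Int](sizeThreshold = 4)(_.map(_ * 2))
(1 to 5).foreach(batcher(_)) // 1..4 dispatched automatically at the threshold
batcher.flushBatch()         // 5 dispatched by the manual flush
println(batcher.completed.mkString(", ")) // 2, 4, 6, 8, 10
```

As in the real Batcher, items are dispatched automatically once a full batch accumulates, and flushBatch() forces out whatever remains.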

@@ -501,6 +501,10 @@ def join[%s](%s): Future[(%s)] = join(Seq(%s)) map { _ => (%s) }""".format(
* ...
* }
*
* To force the batcher to immediately process all unprocessed requests:
*
* batcher.flushBatch
Contributor

add parens for mutating no-arg methods

Contributor Author

Done

@mosesn
Contributor

mosesn commented Sep 4, 2014

As far as I can tell, Batcher is a thin shim around BatchExecutor. Maybe we want to just expose BatchExecutor directly? Also, can you add some tests?

We should maybe consider rewriting some of BatchExecutor too... some bits of it are pretty weird.

@mosesn
Contributor

mosesn commented Feb 3, 2015

@iceberg901 do you want to continue with this PR?

@iceberg901
Contributor Author

@mosesn Yes I do, I'll start working on addressing your comments.

@mosesn
Contributor

mosesn commented Feb 8, 2015

Awesome, let me know if you need any help.

@iceberg901
Contributor Author

@mosesn No, I don't think we want to return BatchExecutor. Returning a wrapper of some kind gives us the luxury of providing a clean, simple interface not clouded by scary implementation details. I assume that's why the original implementer(s) didn't return BatchExecutor in the first place. However, I think the current interface doesn't provide enough functionality, so I think the right answer is to return Batcher and to keep augmenting it if ever we want to expose additional functionality.

@iceberg901 iceberg901 force-pushed the batched-futures-manual-flush branch from 7f4b06f to f667193 Compare February 9, 2015 10:30
@iceberg901
Contributor Author

Ok, I've addressed all your comments except the one about not returning a new type from Future.batched. Please let me know how you want to proceed there. And please give me feedback on the tests. Finally, all of my tests are green, but there seem to be some problems with your test setup in general - red tests, exceptions getting thrown, etc. Should I worry about this, or does it all get taken care of when you integrate?

@mosesn
Contributor

mosesn commented Feb 9, 2015

Don't worry about the tests, it's something wrong with a new feature of travis-ci that we're trying out.

I agree with you that exposing BatchExecutor directly would be a mistake, but changing the return type of Future#batched also definitely breaks binary compatibility. With that said, we've relaxed how we feel about breaking binary compatibility in the last few months, so we can try merging it in and see how difficult it is.


import scala.collection.mutable

/** Provides a clean, lightweight interface for controlling a BatchExecutor
Contributor

This should be formatted like so:

/**
  * Provides ....
  * ...
  */

Contributor

As you mentioned before, BatchExecutor is an internal implementation detail, so when giving folks instructions in how to use Batcher, we shouldn't ask them to understand BatchExecutor.

Contributor Author

So passing the BatchExecutor as a constructor parameter was your idea to reduce the clutter of having to pass all of the BatchExecutor's constructor params through two consecutive constructor methods. And no one's creating Batchers directly, they're getting them from Future.batched.

So I guess I'm wondering what you think now: should we back off having the BatchExecutor as a parameter to the Batcher constructor, or should we just make the constructor private[util] and leave it at that?

Contributor

No, I think BatchExecutor as a parameter to the Batcher constructor is still a good idea, but Future.batched should be the only way to get a Batcher. Let's make the constructor private[util]. 🚢

@mosesn
Contributor

mosesn commented Feb 10, 2015

LGTM, thanks for bearing with me! I'll try to get more eyes on this and then we can merge it in.

@iceberg901
Contributor Author

Thanks! Nice combing through the details with you :)

executor: BatchExecutor[In, Out]
)(
implicit timer: Timer
) extends Function1[In, Future[Out]] { batcher =>
Contributor

Can we extend In => Future[Out] instead?

Contributor Author

I don't have a strong feeling about this. I think the reason I went with Function1 was that I was having trouble getting it to compile the other way. I've since found that I can get it to work only with parens:

) extends (In => Future[Out]) { batcher =>

@mosesn @vkostyukov please let me know if you want me to make this change

Contributor

i think it's more idiomatic as @vkostyukov wrote it.

Contributor Author

Ok two votes for In => Future[Out], so I changed it.
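The idiom agreed on here (extending the function type In => Future[Out] directly, with the required parentheses around the type) can be sketched with a small, self-contained example. EchoBatcher is a hypothetical stand-in, and the standard library's scala.concurrent.Future substitutes for Twitter's Future:

```scala
import scala.concurrent.Future

// Extending `In => Future[Out]` (sugar for Function1[In, Future[Out]])
// lets instances be applied like plain functions: echo(5).
// Note the parentheses around the function type are required here.
class EchoBatcher[In] extends (In => Future[In]) {
  def apply(t: In): Future[In] = Future.successful(t)
}

val echo = new EchoBatcher[Int]
val fut: Future[Int] = echo(5) // applied like a function, not echo.apply(5)
```

The two spellings compile to the same thing; the function-type form just reads more idiomatically.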

@vkostyukov
Contributor

Looks good to me! @iceberg901 do you mind squashing the commits of this PR into one? You can use git rebase -i and then git push --force origin your-branch.

@mosesn
Contributor

mosesn commented Feb 10, 2015

@vkostyukov git review submit squashes the changes for us, so I don't think it should matter?

@vkostyukov
Contributor

@mosesn cool! Nevermind then.

@iceberg901
Contributor Author

Please let me know if you want me to make the change on Batcher above. I take it I don't need to squash the commits? Anything else I need to do?

@mosesn
Contributor

mosesn commented Feb 11, 2015

Could you make the change to the ChangeLog that Steve mentioned?

@iceberg901
Contributor Author

Sure, would you put it under New Features or API Changes? Or do I describe the flushBatch method under New Features and note the change of return type of Future.batched under API Changes?

@mosesn
Contributor

mosesn commented Feb 11, 2015

Yeah, I think both is a good idea.

batcher(5)
batcher.flushBatch()

verify(f, times(1)).apply(Seq(1,2,3,4))
Contributor

can you move the first verify (ln 334) above the flushBatch call? that would make the behavior clearer because as-is it looks like neither batch is dispatched until flushing.

Contributor Author

done

@iceberg901
Contributor Author

@mosesn Changelog updated, what else?

@dschobel
Contributor

this LGTM

def apply(t: In): Future[Out] = executor.enqueue(t)

/** Immediately processes all unprocessed requests */
def flushBatch(): Unit = {
Contributor

if there are guarantees regarding ordering and such, they should be documented. it's not obvious from reading the diff why you synchronize on executor here.

and if there are guarantees, what prevents other callers from using the executor incorrectly?

Contributor Author

  • I did not implement the BatchExecutor, but I make no assumptions about order of execution of the individual requests within a batch. Since there is a one-to-one correspondence between the requests I submit and the Futures returned, I as a client can at least decide what order I want to process the results in, regardless of what order they complete.
  • As far as why I synchronize, the inline documentation for BatchExecutor tells me I must do so when calling .flushBatch() (BatchExecutor.scala line 107).
  • Finally, how do we prevent other callers from using the executor incorrectly: by making BatchExecutor private[util] (this was done already by whoever implemented BatchExecutor) and only providing access to it through the Batcher interface (which is what I am adding).

If there is some part of what I've just explained here that you would like me to document in the code, please let me know.

Contributor

thanks for the explanation, i'm new to BatchExecutor — that's a funky interface.

i think a comment in your code as to why you synchronize there would be quite helpful for future maintainers.

Contributor Author

Here's how I decided to address this:

I moved the synchronization inside a new method in the BatchExecutor called .flushNow(). So, Batcher.flushBatch() just calls BatchExecutor.flushNow()

This is good IMO because:

  • A person new to the code will find all synchronization logic inside one file, BatchExecutor.scala, and can easily see that all calls to BatchExecutor.flushBatch are wrapped in synchronized blocks.
  • If that person still doesn't understand what's going on, the comments explaining why synchronization is required are in the same file, not in a different file.
  • Batcher.scala stays simple.

Does this address your concerns?
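The refactor described above can be sketched as follows. MiniExecutor and MiniBatcher are hypothetical stand-ins, not the actual twitter/util classes; the point is only that all locking lives in the executor's flushNow() while the wrapper stays a thin delegate:

```scala
import scala.collection.mutable

// Hypothetical executor: all synchronization logic lives here, in one place.
class MiniExecutor[In, Out](f: Seq[In] => Seq[Out]) {
  private val pending = mutable.ArrayBuffer.empty[In]
  val done = mutable.ArrayBuffer.empty[Out]

  def enqueue(t: In): Unit = synchronized { pending += t }

  // Callers never synchronize themselves; flushNow does it for them,
  // mirroring how Batcher.flushBatch() just calls BatchExecutor.flushNow().
  def flushNow(): Unit = synchronized {
    if (pending.nonEmpty) {
      done ++= f(pending.toList)
      pending.clear()
    }
  }
}

// Hypothetical wrapper: stays simple, with no locking logic of its own.
class MiniBatcher[In, Out](executor: MiniExecutor[In, Out]) {
  def apply(t: In): Unit = executor.enqueue(t)
  def flushBatch(): Unit = executor.flushNow()
}

val exec = new MiniExecutor[Int, String](_.map(_.toString))
val batcher = new MiniBatcher(exec)
batcher(1); batcher(2)
batcher.flushBatch()
println(exec.done.mkString(",")) // 1,2
```

Keeping the lock entirely inside the executor means a future maintainer reading only the wrapper never has to reason about the executor's synchronization contract.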

@mosesn
Contributor

mosesn commented Feb 19, 2015

LGTM

@kevinoliver
Contributor

lgtm too, thanks for your patience working with us. i'll try to get this patch rolling internally today.

@kevinoliver
Contributor

Ok, this just got merged locally and will show up on the develop branch soon, once we get some issues on our side sorted out. Thanks again for the patch and your patience.

@iceberg901
Contributor Author

Great thanks! For future reference, should I open PRs against the develop branch?

@mosesn
Contributor

mosesn commented Feb 20, 2015

Yes, but this PR predates the develop branch so no worries! 👍

Edit: I just checked the CONTRIBUTING.md file, and looks like we still ask people to make a PR against master. I'll fix it.

@mosesn
Contributor

mosesn commented Mar 3, 2015

🍨 this hit develop woo 616479c

@mosesn mosesn closed this Mar 3, 2015
@iceberg901
Contributor Author

Awesome, thanks! Do you have a general timetable for your next release?

@mosesn
Contributor

mosesn commented Mar 4, 2015

Not yet, but we'll send out an email on the finaglers listserv when we do it.
