
Cap scaling increase/decrease by X% of existing parallelism #1422

Open
billonahill opened this issue Sep 23, 2016 · 8 comments

Comments

@billonahill
Contributor

While testing, I almost increased parallelism by an order of magnitude more than intended because of a typo in my command. We should put some guardrails in place to limit how much we can scale in one shot.

Relates to #1292.
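One possible shape for such a guardrail, as a minimal sketch: cap the per-update change at a configurable percentage of the current parallelism and require an explicit override to go beyond it. The class and method names below are illustrative, not existing Heron APIs.

```java
// Hypothetical guardrail: reject a parallelism change that differs from the
// current value by more than a configured fraction. Illustrative only.
public final class ScalingGuardrail {
  private final double maxChangeFraction; // e.g. 0.5 allows at most a +/-50% change per update

  public ScalingGuardrail(double maxChangeFraction) {
    this.maxChangeFraction = maxChangeFraction;
  }

  /** Throws if the requested parallelism differs from the current one by more than the cap. */
  public void validate(String component, int currentParallelism, int requestedParallelism) {
    int allowedDelta = (int) Math.ceil(currentParallelism * maxChangeFraction);
    int delta = Math.abs(requestedParallelism - currentParallelism);
    if (delta > allowedDelta) {
      throw new IllegalArgumentException(String.format(
          "Refusing to change parallelism of %s from %d to %d: delta %d exceeds cap of %d (%.0f%%); "
              + "pass an explicit override to force the change.",
          component, currentParallelism, requestedParallelism, delta, allowedDelta,
          maxChangeFraction * 100));
    }
  }
}
```

A typo like the one above would then fail fast unless the user explicitly acknowledged the large change.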

@wangli1426
Contributor

Hi @billonahill,
This feature is very interesting. I have done something similar on Apache Storm: adjusting the parallelism of a bolt according to its instantaneous workload. If you are interested, we can discuss the details further.

Thanks.

@billonahill
Contributor Author

@wangli1426 this ticket is related to a user manually issuing a parallelism change using the heron update command. Are you referring to auto-scaling functionality, where component parallelism is dynamically adjusted based on current load? I'd be interested in learning the approach used to algorithmically determine that a component should be scaled up or down. This is something we plan to tackle.

@wangli1426
Contributor

wangli1426 commented Oct 17, 2016

Hi @billonahill,

Yes, I was referring to the auto-scaling functionality, which consists of two parts: (1) the mechanism for scaling operators and (2) the algorithm for determining the optimal parallelism of each operator.

For (1), to support scaling of stateful operators, we model the operator state as key-value pairs, and scaling is done by re-partitioning that state. I will send you our paper on run-time operator scaling if you are interested. I am wondering how you plan to deal with operator state when scaling up or down. Will you store the operator state in a persistent store before scaling? Will you consider live scaling without deactivating the current topology?
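As a rough illustration of what key-based re-partitioning can look like (hash routing assumed; this simplified sketch is not the exact scheme from the paper):

```java
// Simplified sketch: redistribute key-value operator state across a new number
// of instances by re-hashing each key. Illustrative only; real live scaling
// also has to coordinate in-flight tuples and state migration.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class StateRepartitioner {
  /** Moves every (key, value) pair to the instance that owns the key under the new parallelism. */
  public static <K, V> List<Map<K, V>> repartition(List<Map<K, V>> oldPartitions, int newParallelism) {
    List<Map<K, V>> newPartitions = new ArrayList<>(newParallelism);
    for (int i = 0; i < newParallelism; i++) {
      newPartitions.add(new HashMap<>());
    }
    for (Map<K, V> partition : oldPartitions) {
      for (Map.Entry<K, V> entry : partition.entrySet()) {
        int owner = Math.floorMod(entry.getKey().hashCode(), newParallelism);
        newPartitions.get(owner).put(entry.getKey(), entry.getValue());
      }
    }
    return newPartitions;
  }
}
```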

For (2), we have a paper that uses queueing theory to determine the optimal parallelism of each operator given a user-defined tuple-processing latency constraint. The first step of the algorithm is to reason about the minimal parallelism each operator needs to avoid becoming the performance bottleneck. This can be computed as $\lambda/u$, where $\lambda$ is the arrival rate and $u$ is the processing rate of a single operator instance.
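For illustration, a small sketch of that first step, assuming $u$ is measured in the same units as $\lambda$ (e.g. tuples per second):

```java
// Illustrative only: the fewest instances needed so the operator does not
// become the bottleneck, i.e. ceil(lambda / u).
public final class ParallelismEstimator {
  public static int minimalParallelism(double arrivalRate, double perInstanceProcessingRate) {
    if (perInstanceProcessingRate <= 0) {
      throw new IllegalArgumentException("processing rate must be positive");
    }
    return (int) Math.ceil(arrivalRate / perInstanceProcessingRate);
  }

  public static void main(String[] args) {
    // Example: 12,000 tuples/s arriving, 2,500 tuples/s per instance -> 5 instances.
    System.out.println(minimalParallelism(12000, 2500));
  }
}
```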

I have implemented this auto-scaling functionality on top of Storm, and I am willing to contribute it to Heron.

Thanks.
Li

@kramasamy
Contributor

@wangli1426 - this is definitely interesting. @billonahill and @avflor - can you please take a look at this proposal?

@avflor
Contributor

avflor commented Oct 17, 2016

@kramasamy Sure, I can definitely read the paper and provide feedback.
@wangli1426 Thanks for pointing us to the paper.

@wangli1426
Contributor

@avflor You are welcome.

@billonahill
Contributor Author

Thanks @wangli1426. For (1) we currently do not provide guarantees about local state during scaling events. This is something we'd like to tackle, though, as part of a general effort to provide stateful durability; it would be useful during scaling but also during routine failures.

For (2) we'll certainly check out that paper. Let's move further discussion on that topic to #1389, which tracks the auto-scaling algorithmic work.

@wangli1426
Contributor

Hi @billonahill @kramasamy @avflor,

I proposed a new operator for live scaling in #1499; please review.

Thanks.
