Support topology deactivation period before scaling #1390

billonahill · 2016-09-14T21:40:28Z

This is a simple way to ensure stateful data correctness for many topologies, via @pankajgupta.

The approach used for scaling at Twitter currently is the following:

1. Deactivate the topology for X seconds, where X is greater than the cache interval. This stops spouts from emitting new tuples and lets time-based caches drain.
2. Redeploy the topology as activated. With fields grouping, new tuples are emitted and processed by the correct instances. Caches start out empty (i.e. drained) and therefore correct.
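The two steps above can be sketched roughly as follows. This is only an illustration of the ordering, not Heron's implementation: `scale_with_deactivation`, `deactivate`, `redeploy_activated` and `cache_interval_secs` are hypothetical names, and the callables stand in for whatever actually deactivates and redeploys the topology.

```python
import time
from typing import Callable


def scale_with_deactivation(
    deactivate: Callable[[], None],          # hypothetical: deactivates the topology
    redeploy_activated: Callable[[], None],  # hypothetical: redeploys it as activated
    deactivation_secs: float,                # X in the description above
    cache_interval_secs: float,              # interval of the time-based caches
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Deactivate, wait for caches to drain, then redeploy as activated."""
    # X must exceed the cache interval, otherwise caches may not have drained.
    if deactivation_secs <= cache_interval_secs:
        raise ValueError("deactivation period must exceed the cache interval")

    deactivate()              # spouts stop emitting new tuples
    sleep(deactivation_secs)  # time-based caches drain during this window
    redeploy_activated()      # tuples now route to the correct instances
```

Passing `sleep` as a parameter is only there so the ordering can be exercised without actually waiting.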

A similar feature should be built into the scaling event. The topology update command would take an optional parameter that deactivates the topology for X seconds before the scaling event occurs. A default deactivation interval could also be provided in the topology config, which would allow frameworks built on top of Heron to inject this setting based on their configured cache settings. Users would then no longer need to set the interval correctly themselves.
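One way the proposed precedence could work is sketched below, assuming an explicit command parameter overrides the topology-config default and that no deactivation happens when neither is set. The config key and function name are placeholders for illustration, not actual Heron settings.

```python
from typing import Any, Mapping, Optional

# Placeholder key, for illustration only; not an actual Heron config name.
DEACTIVATION_SECS_KEY = "example.update.deactivate.wait.secs"


def resolve_deactivation_secs(
    cli_value: Optional[int],
    topology_config: Mapping[str, Any],
) -> int:
    """Pick the deactivation period to apply before a scaling event."""
    if cli_value is not None:
        # An explicit parameter on the update command wins.
        secs = cli_value
    else:
        # Otherwise use the default a framework may have injected into the
        # topology config; 0 means "do not deactivate".
        secs = int(topology_config.get(DEACTIVATION_SECS_KEY, 0))
    if secs < 0:
        raise ValueError("deactivation period must not be negative")
    return secs
```

With this ordering, a framework that knows its cache interval can set the default once, and an operator can still override it for a single update.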

Related to #1292.

billonahill added this to the 0.14.4 milestone Sep 14, 2016

billonahill self-assigned this Sep 14, 2016

billonahill added the new feature label Sep 16, 2016

billonahill mentioned this issue Sep 21, 2016

Deactivate topology before scaling #1412

Merged

billonahill closed this as completed in #1412 Sep 30, 2016
