This repository has been archived by the owner on Mar 3, 2023. It is now read-only.
Deactivating a topology for a short interval before scaling it is a simple way to ensure stateful data correctness for many topologies, via @pankajgupta.
The approach used for scaling at Twitter currently is the following:
Deactivate the topology for X seconds, where X is greater than the cache interval. This stops spouts from emitting new tuples and lets time-based caches drain.
Redeploy the topology as activated. New tuples are emitted and, when fields grouping is used, processed by the correct instances. Caches start out empty (i.e. drained) and are therefore correct.
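The two steps above can be sketched as a small driver. This is only an illustration: `scale_with_drain`, `deactivate`, `redeploy_activated` and `margin_secs` are hypothetical names, not part of Heron, and the two callables stand in for whatever mechanism actually deactivates and redeploys the topology.

```python
import time
from typing import Callable


def scale_with_drain(
    deactivate: Callable[[], None],
    redeploy_activated: Callable[[], None],
    cache_interval_secs: float,
    margin_secs: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Deactivate, wait longer than the cache interval, then redeploy.

    All parameters are hypothetical; the callables are supplied by the caller.
    Returns the number of seconds waited (X in the description above).
    """
    if cache_interval_secs < 0:
        raise ValueError("cache_interval_secs must be non-negative")
    if margin_secs <= 0:
        # X must be strictly greater than the cache interval.
        raise ValueError("margin_secs must be positive")

    wait_secs = cache_interval_secs + margin_secs

    deactivate()          # spouts stop emitting new tuples
    sleep(wait_secs)      # time-based caches drain during this window
    redeploy_activated()  # scaled topology starts with empty caches
    return wait_secs
```

The `sleep` parameter is injectable so the ordering can be checked without actually waiting.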
A similar feature should be built into the scaling event. The topology update command would take an optional parameter indicating that the topology should be deactivated for X seconds before the scaling event occurs. A default deactivation interval could also be provided in the topology config, which would allow frameworks built on top of Heron to inject this setting based on their configured cache settings. Users would then no longer need to choose the interval correctly themselves.
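One way the interval could be resolved is sketched below. The config key `example.update.deactivate.secs` and the function name are made up for illustration, and letting the command-line value override the config default is an assumption rather than something this proposal specifies.

```python
from typing import Mapping, Optional

# Hypothetical config key; not an existing Heron setting.
DEACTIVATE_SECS_KEY = "example.update.deactivate.secs"


def resolve_deactivation_secs(
    cli_value: Optional[float],
    topology_config: Mapping[str, object],
) -> Optional[float]:
    """Pick the deactivation interval for an update.

    Assumed precedence: explicit command parameter, then the topology
    config default, else None (no deactivation before scaling).
    """
    if cli_value is not None:
        chosen = float(cli_value)
    elif DEACTIVATE_SECS_KEY in topology_config:
        # Config values are often strings, so convert explicitly.
        chosen = float(str(topology_config[DEACTIVATE_SECS_KEY]))
    else:
        return None

    if chosen < 0:
        raise ValueError("deactivation interval must be non-negative")
    return chosen
```

A framework built on top of Heron would populate the config entry from its own cache settings, so the update command could be run without the parameter.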
Related to #1292.