Clean up code relating to decision_policy and forecast_policy. #3957

Open · wants to merge 6 commits into master
29 changes: 15 additions & 14 deletions docs/source/autoscaling.rst
@@ -67,8 +67,11 @@ The currently available metrics providers are:

:uwsgi:
With the ``uwsgi`` metrics provider, Paasta will configure your pods to be scraped from your uWSGI master via its `stats server <http://uwsgi-docs.readthedocs.io/en/latest/StatsServer.html>`_.
Setpoint refers to the worker utilization, which is the percentage of workers that are busy.
We currently only support uwsgi stats on port 8889, and Prometheus will attempt to scrape that port.

You can specify ``moving_average_window_seconds`` (default ``1800``, or 30 minutes) to adjust how long of a time period your worker utilization is averaged over: set a smaller value to autoscale more quickly, or set a larger value to ignore spikes.

.. note::

If you have configured your service to use a stats port other than the default (8889), PaaSTA will not scale your service correctly!
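As an illustrative sketch, a yelpsoa-configs autoscaling block using the ``uwsgi`` metrics provider might look like the following (the instance name and values are hypothetical; field names follow this page):

```yaml
# kubernetes-<cluster>.yaml -- hypothetical instance named "main"
main:
  min_instances: 2
  max_instances: 10
  autoscaling:
    metrics_providers:
      - type: uwsgi
        setpoint: 0.8                      # target worker utilization (80% of workers busy)
        moving_average_window_seconds: 600 # smaller than the 1800s default -> reacts faster
```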
@@ -78,15 +81,21 @@ The currently available metrics providers are:
With the ``gunicorn`` metrics provider, Paasta will configure your pods to run an additional container with the `statsd_exporter <https://github.com/prometheus/statsd_exporter>`_ image.
This sidecar will listen on port 9117 and receive stats from the gunicorn service. The ``statsd_exporter`` will translate the stats into Prometheus format, which Prometheus will scrape.

You can specify ``moving_average_window_seconds`` (default ``1800``, or 30 minutes) to adjust how long of a time period your worker utilization is averaged over: set a smaller value to autoscale more quickly, or set a larger value to ignore spikes.

:active-requests:
With the ``active-requests`` metrics provider, Paasta will use Envoy metrics to scale your service based on the amount
of incoming traffic. Note that, instead of using ``setpoint``, the active requests provider looks at the
``desired_active_requests_per_replica`` field of the autoscaling configuration to determine how to scale.

You can specify ``moving_average_window_seconds`` (default ``1800``, or 30 minutes) to adjust how long of a time period the number of active requests is averaged over: set a smaller value to autoscale more quickly, or set a larger value to ignore spikes.
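Since this provider scales on request volume rather than a ``setpoint``, a hedged sketch of its configuration (instance name and values hypothetical) would be:

```yaml
main:
  autoscaling:
    metrics_providers:
      - type: active-requests
        desired_active_requests_per_replica: 5  # replaces setpoint for this provider
        moving_average_window_seconds: 900      # average request counts over 15 minutes
```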

:piscina:
This metrics provider is only valid for the Yelp-internal server-side-rendering (SSR) service. With the ``piscina``
metrics provider, Paasta will scale your SSR instance based on how many Piscina workers are busy.

You can specify ``moving_average_window_seconds`` (default ``1800``, or 30 minutes) to adjust how long of a time period your worker utilization is averaged over: set a smaller value to autoscale more quickly, or set a larger value to ignore spikes.

:arbitrary_promql:
The ``arbitrary_promql`` metrics provider allows you to specify any Prometheus query you want using the `Prometheus
query language (promql) <https://prometheus.io/docs/prometheus/latest/querying/basics/>`_. The autoscaler will attempt
@@ -99,25 +108,17 @@ The currently available metrics providers are:
Decision policies
^^^^^^^^^^^^^^^^^

The currently available decision policies are:

:proportional:
(This is the default policy.)
Uses a simple proportional model to decide the correct number of instances
to scale to, i.e. if load is 110% of the setpoint, scales up by 10%.

Extra parameters:

:moving_average_window_seconds:
The number of seconds to load data points over in order to calculate the average.
Defaults to 1800s (30m).
Currently, this is only supported for ``metrics_provider: uwsgi``.
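The proportional model above amounts to scaling the replica count by the ratio of observed load to the setpoint. A minimal sketch of the idea (not PaaSTA's actual implementation):

```python
import math


def proportional_replicas(current_replicas: int, utilization: float, setpoint: float) -> int:
    """Scale replicas proportionally to utilization/setpoint, rounding up.

    E.g. if load is 110% of the setpoint, the desired replica count is
    10% higher than the current one, as described above.
    """
    return math.ceil(current_replicas * utilization / setpoint)


# 10 replicas at 0.75 utilization with a 0.5 setpoint: load is 150% of
# the setpoint, so scale up by 50%.
print(proportional_replicas(10, 0.75, 0.5))  # -> 15
```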

:bespoke:
Allows a service author to implement their own autoscaling.
This policy results in no HPA being configured.
An external process should periodically decide how many replicas this service needs to run, and use the Paasta API to tell Paasta to scale.
See the :ref:`How to create a custom (bespoke) autoscaling method` section for details.
This is most commonly used by the Kew autoscaler.
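A bespoke autoscaler is simply an external control loop that computes a replica count and pushes it via the Paasta API. A hypothetical sketch of the decision step (the queue-based policy here is invented for illustration; real logic is up to the service author):

```python
import math


def decide_replicas(queue_depth: int, target_per_replica: int,
                    min_replicas: int, max_replicas: int) -> int:
    """Hypothetical bespoke policy: one replica per `target_per_replica`
    queued items, clamped to the instance bounds."""
    desired = math.ceil(queue_depth / target_per_replica) if queue_depth else min_replicas
    return max(min_replicas, min(max_replicas, desired))


# An external process would periodically compute this and tell Paasta to scale.
print(decide_replicas(queue_depth=250, target_per_replica=100,
                      min_replicas=2, max_replicas=20))  # -> 3
```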

:Anything other value:
Comment (Member): was this meant to be ``Any other value:``?
Reply (Member Author): yes
Suggested change:
-    :Anything other value:
+    :Any other value:
The default autoscaling method.
Paasta will configure a Kubernetes HPA to scale the service based on the metrics providers and setpoints.


Using multiple metrics providers
--------------------------------

This file was deleted.

1 change: 0 additions & 1 deletion docs/source/generated/paasta_tools.autoscaling.rst
@@ -7,7 +7,6 @@ Submodules
.. toctree::

paasta_tools.autoscaling.autoscaling_service_lib
paasta_tools.autoscaling.forecasting
paasta_tools.autoscaling.max_all_k8s_services
paasta_tools.autoscaling.pause_service_autoscaler
paasta_tools.autoscaling.utils
5 changes: 3 additions & 2 deletions docs/source/yelpsoa_configs.rst
@@ -393,8 +393,9 @@ instance MAY have:
* ``type``: Which method the autoscaler will use to determine a service's utilization.
Should be ``cpu``, ``uwsgi``, ``active-requests``, ``piscina``, ``gunicorn``, or ``arbitrary_promql``.

* ``decision_policy``: Which method the autoscaler will use to determine when to autoscale a service.
Should be ``proportional`` or ``bespoke``.
* ``decision_policy``: If you want to autoscale with a separate system which calls ``paasta autoscale`` (such as the Kew autoscaler), rather than an HPA, then set this to ``bespoke`` on the first ``metrics_provider`` entry.
Otherwise, leave this unset.
(It does not make sense to have more than one ``metrics_provider`` with ``decision_policy: bespoke``.)

* ``setpoint``: The target utilization (as measured by your ``metrics_provider``) that the autoscaler will try to achieve.
Default value is 0.8.
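Putting the fields above together, a sketch of an autoscaling block that defers to an external autoscaler (instance name and values are hypothetical):

```yaml
main:
  min_instances: 1
  max_instances: 50
  autoscaling:
    metrics_providers:
      - type: uwsgi
        setpoint: 0.8
        decision_policy: bespoke  # no HPA; an external caller of `paasta autoscale` drives scaling
```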
106 changes: 0 additions & 106 deletions paasta_tools/autoscaling/forecasting.py

This file was deleted.

19 changes: 0 additions & 19 deletions paasta_tools/autoscaling/utils.py
@@ -12,35 +12,16 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from collections import defaultdict
from typing import Callable
from typing import Dict
from typing import List
from typing import Optional
from typing import TypedDict


_autoscaling_components: Dict[str, Dict[str, Callable]] = defaultdict(dict)


def register_autoscaling_component(name, method_type):
def outer(autoscaling_method):
_autoscaling_components[method_type][name] = autoscaling_method
return autoscaling_method

return outer


def get_autoscaling_component(name, method_type):
return _autoscaling_components[method_type][name]


class MetricsProviderDict(TypedDict, total=False):
type: str
decision_policy: str
setpoint: float
desired_active_requests_per_replica: int
forecast_policy: Optional[str]
moving_average_window_seconds: Optional[int]
use_resource_metrics: bool
prometheus_adapter_config: Optional[dict]
6 changes: 0 additions & 6 deletions paasta_tools/cli/schemas/autoscaling_schema.json
@@ -25,12 +25,6 @@
"max_instances_alert_threshold": {
"type": "number"
},
"forecast_policy": {
"enum": [
"moving_average",
"current"
]
},
"moving_average_window_seconds": {
"type": "integer"
},
5 changes: 4 additions & 1 deletion paasta_tools/kubernetes_tools.py
@@ -888,7 +888,10 @@ def get_autoscaling_metric_spec(
return None

autoscaling_params = self.get_autoscaling_params()
if autoscaling_params["metrics_providers"][0]["decision_policy"] == "bespoke":
if (
autoscaling_params["metrics_providers"][0].get("decision_policy", "")
== "bespoke"
):
return None

min_replicas = self.get_min_instances()
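The switch from ``["decision_policy"]`` to ``.get("decision_policy", "")`` matters because this PR stops injecting a default ``decision_policy``, so the key may now be absent. A standalone illustration (the provider dict is invented for the example):

```python
# A provider entry as it may appear after this PR: no "decision_policy"
# key is injected by default anymore.
provider = {"type": "uwsgi", "setpoint": 0.8}

# Old check: provider["decision_policy"] would raise KeyError on this dict.
# New check: a missing key is simply treated as "not bespoke".
is_bespoke = provider.get("decision_policy", "") == "bespoke"
print(is_bespoke)  # -> False
```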
3 changes: 2 additions & 1 deletion paasta_tools/long_running_service_tools.py
@@ -37,6 +37,8 @@

DEFAULT_AUTOSCALING_SETPOINT = 0.8
DEFAULT_DESIRED_ACTIVE_REQUESTS_PER_REPLICA = 1

# If you change any of these, make sure to update docs/source/autoscaling.rst
DEFAULT_ACTIVE_REQUESTS_AUTOSCALING_MOVING_AVERAGE_WINDOW = 1800
DEFAULT_UWSGI_AUTOSCALING_MOVING_AVERAGE_WINDOW = 1800
DEFAULT_PISCINA_AUTOSCALING_MOVING_AVERAGE_WINDOW = 1800
@@ -345,7 +347,6 @@ def limit_instance_count(self, instances: int) -> int:
def get_autoscaling_params(self) -> AutoscalingParamsDict:
default_provider_params: MetricsProviderDict = {
"type": METRICS_PROVIDER_CPU,
"decision_policy": "proportional",
"setpoint": DEFAULT_AUTOSCALING_SETPOINT,
}

72 changes: 0 additions & 72 deletions tests/autoscaling/test_forecasting.py

This file was deleted.

2 changes: 0 additions & 2 deletions tests/test_kubernetes_tools.py
@@ -2402,7 +2402,6 @@ def test_get_autoscaling_metric_spec_uwsgi_prometheus(
{
"type": METRICS_PROVIDER_UWSGI,
"setpoint": 0.4,
"forecast_policy": "moving_average",
"moving_average_window_seconds": 300,
}
]
@@ -2486,7 +2485,6 @@ def test_get_autoscaling_metric_spec_gunicorn_prometheus(
{
"type": METRICS_PROVIDER_GUNICORN,
"setpoint": 0.5,
"forecast_policy": "moving_average",
"moving_average_window_seconds": 300,
}
]