New rate_limit plugin for simple resource limitations (#7623)

* New rate_limit plugin for simple resource limitations This current version has only one limiter, which implements a basic active_in limitation. However, at least for now, these connections that are queued will still count against the active connection metrics and limits that are in the core. I'm open for a refactoring of this, if or when we want to have different types of limiters. Similar to the policies for the cache_promote plugin. (cherry picked from commit a8e98d6)
apache · Apr 9, 2021 · d540ecc · d540ecc
1 parent d33db23
commit d540ecc
Show file tree

Hide file tree

Showing 7 changed files with 617 additions and 0 deletions.
diff --git a/doc/admin-guide/plugins/index.en.rst b/doc/admin-guide/plugins/index.en.rst
@@ -164,6 +164,7 @@ directory of the |TS| source tree. Experimental plugins can be compiled by passi
    MP4 <mp4.en>
    Multiplexer <multiplexer.en>
    MySQL Remap <mysql_remap.en>
+   Rate Limit <rate_limit.en>
    Signed URLs <url_sig.en>
    Slice <slice.en>
    SSL Headers <sslheaders.en>
@@ -228,6 +229,9 @@ directory of the |TS| source tree. Experimental plugins can be compiled by passi
 :doc:`Prefetch <prefetch.en>`
    Pre-fetch objects based on the requested URL path pattern.
 
+:doc:`Rate Limit <rate_limit.en>`
+   Simple transaction rate limiting.
+
 :doc:`Remap Purge <remap_purge.en>`
    This remap plugin allows the administrator to easily setup remotely
    controlled ``PURGE`` for the content of an entire remap rule.

diff --git a/doc/admin-guide/plugins/rate_limit.en.rst b/doc/admin-guide/plugins/rate_limit.en.rst
@@ -0,0 +1,115 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+
+.. _admin-plugins-rate-limit:
+
+Rate Limit Plugin
+********************
+
+The :program:`rate_limit` plugin provides basic mechanism for how much
+traffic a particular service (remap rule) is allowed to do. Currently,
+the only implementation is a limit on how many active client transactions
+a service can have. However, it would be easy to refactor this plugin to
+allow for adding new limiter policies later on.
+
+The limit counters and queues are per remap rule only, i.e. there is
+(currently) no way to group transaction limits from different remap rules
+into a single rate limiter.
+
+All configuration is done via :file:`remap.config`, and the following options
+are available:
+
+.. program:: rate-limit
+
+.. option:: --limit
+
+   The maximum number of active client transactions.
+
+.. option:: --queue
+
+   When the limit (above) has been reached, all new transactions are placed
+   on a FIFO queue. This option (optional) sets an upper bound on how many
+   queued transactions we will allow. When this threshold is reached, all
+   additional transactions are immediately served with an error message.
+
+   The queue is effectively disabled if this is set to `0`, which implies
+   that when the transaction limit is reached, we immediately start serving
+   error responses.
+
+   The default queue size is `UINT_MAX`, which is essentially unlimited.
+
+.. option:: --error
+
+   An optional HTTP status error code, to be used together with the
+   :option:`--queue` option above. The default is `429`.
+
+.. option:: --retry
+
+   An optional retry-after value, which if set will cause rejected (e.g. `429`)
+   responses to also include a header `Retry-After`.
+
+.. option:: --header
+
+   This is an optional HTTP header name, which will be added to the client
+   request header IF the transaction was delayed (queued). The value of the
+   header is the delay, in milliseconds. This can be useful to for example
+   log the delays for later analysis.
+
+   It is recommended that an `@` header is used here, e.g. `@RateLimit-Delay`,
+   since this header will not leave the ATS server instance.
+
+.. option:: --maxage
+
+   An optional `max-age` for how long a transaction can sit in the delay queue.
+   The value (default 0) is the age in milliseconds.
+
+Examples
+--------
+
+This example shows a simple rate limiting of `128` concurrently active client
+transactions, with a maximum queue size of `256`. The default of HTTP status
+code `429` is used when queue is full.
+
+    map http://cdn.example.com/ http://some-server.example.com \
+      @plugin=rate_limit.so @pparam=--limit=128 @pparam=--queue=256
+
+
+This example would put a hard transaction (in) limit to 256, with no backoff
+queue, and add a header with the transaction delay if it was queued:
+
+    map http://cdn.example.com/ http://some-server.example.com \
+      @plugin=rate_limit.so @pparam=--limit=256 @pparam=--queue=0 \
+      @pparam=--header=@RateLimit-Delay
+
+This final example will limit the active transaction, queue size, and also
+add a `Retry-After` header once the queue is full and we return a `429` error:
+
+    map http://cdn.example.com/ http://some-server.example.com \
+      @plugin=rate_limit.so @pparam=--limit=256 @pparam=--queue=1024 \
+      @pparam=--retry=3600 @pparam=--header=@RateLimit-Delay
+
+In this case, the response would look like this when the queue is full:
+
+    HTTP/1.1 429 Too Many Requests
+    Date: Fri, 26 Mar 2021 22:42:38 GMT
+    Connection: keep-alive
+    Server: ATS/10.0.0
+    Cache-Control: no-store
+    Content-Type: text/html
+    Content-Language: en
+    Retry-After: 3600
+    Content-Length: 207
diff --git a/plugins/Makefile.am b/plugins/Makefile.am
@@ -79,6 +79,7 @@ include experimental/memory_profile/Makefile.inc
 include experimental/metalink/Makefile.inc
 include experimental/money_trace/Makefile.inc
 include experimental/mp4/Makefile.inc
+include experimental/rate_limit/Makefile.inc
 include experimental/redo_cache_lookup/Makefile.inc
 include experimental/remap_stats/Makefile.inc
 include experimental/slice/Makefile.inc

diff --git a/plugins/experimental/rate_limit/Makefile.inc b/plugins/experimental/rate_limit/Makefile.inc
@@ -0,0 +1,21 @@
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+
+pkglib_LTLIBRARIES += experimental/rate_limit/rate_limit.la
+
+experimental_rate_limit_rate_limit_la_SOURCES = \
+  experimental/rate_limit/rate_limit.cc \
+  experimental/rate_limit/limiter.cc
diff --git a/plugins/experimental/rate_limit/limiter.cc b/plugins/experimental/rate_limit/limiter.cc
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+#include "limiter.h"
+
+///////////////////////////////////////////////////////////////////////////////
+// This is the continuation that gets scheduled periodically to process the
+// deque of waiting TXNs.
+//
+int
+RateLimiter::queue_process_cont(TSCont cont, TSEvent event, void *edata)
+{
+  RateLimiter *limiter = static_cast<RateLimiter *>(TSContDataGet(cont));
+  QueueTime now        = std::chrono::system_clock::now(); // Only do this once per "loop"
+
+  // Try to enable some queued txns (if any) if there are slots available
+  while (limiter->size() > 0 && limiter->reserve()) {
+    auto [txnp, contp, start_time]  = limiter->pop();
+    std::chrono::microseconds delay = std::chrono::duration_cast<std::chrono::milliseconds>(now - start_time);
+
+    limiter->delayHeader(txnp, delay);
+    TSDebug(PLUGIN_NAME, "Enabling queued txn after %ldms", static_cast<long>(delay.count()));
+    // Since this was a delayed transaction, we need to add the TXN_CLOSE hook to free the slot when done
+    TSHttpTxnHookAdd(txnp, TS_HTTP_TXN_CLOSE_HOOK, contp);
+    TSHttpTxnReenable(txnp, TS_EVENT_HTTP_CONTINUE);
+  }
+
+  // Kill any queued txns if they are too old
+  if (limiter->max_age > std::chrono::milliseconds::zero() && limiter->size() > 0) {
+    now = std::chrono::system_clock::now(); // Update the "now", for some extra accuracy
+
+    while (limiter->size() > 0 && limiter->hasOldTxn(now)) {
+      // The oldest object on the queue is too old on the queue, so "kill" it.
+      auto [txnp, contp, start_time] = limiter->pop();
+      std::chrono::milliseconds age  = std::chrono::duration_cast<std::chrono::milliseconds>(now - start_time);
+
+      limiter->delayHeader(txnp, age);
+      TSDebug(PLUGIN_NAME, "Queued TXN is too old (%ldms), erroring out", static_cast<long>(age.count()));
+      TSHttpTxnStatusSet(txnp, static_cast<TSHttpStatus>(limiter->error));
+      TSHttpTxnHookAdd(txnp, TS_HTTP_SEND_RESPONSE_HDR_HOOK, contp);
+      TSHttpTxnReenable(txnp, TS_EVENT_HTTP_ERROR);
+    }
+  }
+
+  return TS_EVENT_NONE;
+}
+
+///////////////////////////////////////////////////////////////////////////////
+// The main rate limiting continuation.
+//
+int
+RateLimiter::rate_limit_cont(TSCont cont, TSEvent event, void *edata)
+{
+  RateLimiter *limiter = static_cast<RateLimiter *>(TSContDataGet(cont));
+
+  switch (event) {
+  case TS_EVENT_HTTP_TXN_CLOSE:
+    limiter->release();
+    TSContDestroy(cont); // We are done with this continuation now
+    TSHttpTxnReenable(static_cast<TSHttpTxn>(edata), TS_EVENT_HTTP_CONTINUE);
+    return TS_EVENT_CONTINUE;
+    break;
+
+  case TS_EVENT_HTTP_POST_REMAP:
+    limiter->push(static_cast<TSHttpTxn>(edata), cont);
+    return TS_EVENT_NONE;
+    break;
+
+  case TS_EVENT_HTTP_SEND_RESPONSE_HDR: // This is only applicable when we set an error in remap
+    limiter->retryAfter(static_cast<TSHttpTxn>(edata), limiter->retry);
+    TSContDestroy(cont); // We are done with this continuation now
+    TSHttpTxnReenable(static_cast<TSHttpTxn>(edata), TS_EVENT_HTTP_CONTINUE);
+    return TS_EVENT_CONTINUE;
+    break;
+
+  default:
+    TSDebug(PLUGIN_NAME, "Unknown event %d", static_cast<int>(event));
+    TSError("Unknown event in %s", PLUGIN_NAME);
+    break;
+  }
+  return TS_EVENT_NONE;
+}
+
+///////////////////////////////////////////////////////////////////////////////
+// Setup the continuous queue processing continuation
+//
+void
+RateLimiter::setupQueueCont()
+{
+  _queue_cont = TSContCreate(queue_process_cont, TSMutexCreate());
+  TSReleaseAssert(_queue_cont);
+  TSContDataSet(_queue_cont, this);
+  _action = TSContScheduleEveryOnPool(_queue_cont, QUEUE_DELAY_TIME.count(), TS_THREAD_POOL_TASK);
+}
+
+///////////////////////////////////////////////////////////////////////////////
+// Add a header with the delay imposed on this transaction. This can be used
+// for logging, and other types of metrics.
+//
+void
+RateLimiter::delayHeader(TSHttpTxn txnp, std::chrono::microseconds delay) const
+{
+  if (header.size() > 0) {
+    TSMLoc hdr_loc   = nullptr;
+    TSMBuffer bufp   = nullptr;
+    TSMLoc field_loc = nullptr;
+
+    if (TS_SUCCESS == TSHttpTxnClientReqGet(txnp, &bufp, &hdr_loc)) {
+      if (TS_SUCCESS == TSMimeHdrFieldCreateNamed(bufp, hdr_loc, header.c_str(), header.size(), &field_loc)) {
+        if (TS_SUCCESS == TSMimeHdrFieldValueIntSet(bufp, hdr_loc, field_loc, -1, static_cast<int>(delay.count()))) {
+          TSMimeHdrFieldAppend(bufp, hdr_loc, field_loc);
+        }
+        TSHandleMLocRelease(bufp, hdr_loc, field_loc);
+      }
+      TSHandleMLocRelease(bufp, TS_NULL_MLOC, hdr_loc);
+    }
+  }
+}
+
+///////////////////////////////////////////////////////////////////////////////
+// Add a header with the delay imposed on this transaction. This can be used
+// for logging, and other types of metrics.
+//
+void
+RateLimiter::retryAfter(TSHttpTxn txnp, unsigned retry) const
+{
+  if (retry > 0) {
+    TSMLoc hdr_loc   = nullptr;
+    TSMBuffer bufp   = nullptr;
+    TSMLoc field_loc = nullptr;
+
+    if (TS_SUCCESS == TSHttpTxnClientRespGet(txnp, &bufp, &hdr_loc)) {
+      if (TS_SUCCESS == TSMimeHdrFieldCreateNamed(bufp, hdr_loc, "Retry-After", 11, &field_loc)) {
+        if (TS_SUCCESS == TSMimeHdrFieldValueIntSet(bufp, hdr_loc, field_loc, -1, retry)) {
+          TSDebug(PLUGIN_NAME, "Added a Retry-After: %u", retry);
+          TSMimeHdrFieldAppend(bufp, hdr_loc, field_loc);
+        }
+        TSHandleMLocRelease(bufp, hdr_loc, field_loc);
+      }
+      TSHandleMLocRelease(bufp, TS_NULL_MLOC, hdr_loc);
+    }
+  }
+}