feat(gateway): add token-bucket rate limiting middleware #220
eclipse0922 wants to merge 7 commits into selfpatch:main
Conversation
Pull request overview
Adds token-bucket rate limiting to ros2_medkit_gateway to protect the REST API from abuse/resource exhaustion, wiring the limiter into the HTTP server’s pre-routing flow and exposing configuration via ROS 2 parameters/YAML.
Changes:
- Introduces `RateLimiter` (+ config builder, token bucket, path pattern overrides) and unit tests.
- Integrates rate limiting into the `RESTServer` pre-routing handler and exposes rate-limit headers via CORS.
- Adds gateway parameters + YAML defaults and updates build/test targets.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| src/ros2_medkit_gateway/include/ros2_medkit_gateway/http/rate_limiter.hpp | Adds public rate limiter API/types (config, token bucket, result model). |
| src/ros2_medkit_gateway/src/http/rate_limiter.cpp | Implements token-bucket logic, endpoint pattern matching, and cleanup. |
| src/ros2_medkit_gateway/src/http/rest_server.cpp | Runs rate limiting in pre-routing handler and exposes rate limit headers for CORS. |
| src/ros2_medkit_gateway/include/ros2_medkit_gateway/http/rest_server.hpp | Plumbs RateLimitConfig into RESTServer and stores RateLimiter. |
| src/ros2_medkit_gateway/src/gateway_node.cpp | Declares rate limiting parameters and builds RateLimitConfig from ROS params. |
| src/ros2_medkit_gateway/include/ros2_medkit_gateway/gateway_node.hpp | Stores RateLimitConfig in node state. |
| src/ros2_medkit_gateway/include/ros2_medkit_gateway/http/error_codes.hpp | Adds ERR_RATE_LIMIT_EXCEEDED constant for 429 responses. |
| src/ros2_medkit_gateway/config/gateway_params.yaml | Adds YAML configuration section for rate limiting. |
| src/ros2_medkit_gateway/test/test_rate_limiter.cpp | Adds unit tests for config validation, token bucket behavior, overrides, and cleanup. |
| src/ros2_medkit_gateway/CMakeLists.txt | Adds rate limiter source + unit test target to the build. |
```cpp
// Rate limiting check (before CORS and Auth for early rejection)
if (rate_limiter_ && rate_limiter_->is_enabled() && req.method != "OPTIONS") {
  auto rl_result = rate_limiter_->check(req.remote_addr, req.path);
  RateLimiter::apply_headers(rl_result, res);
  if (!rl_result.allowed) {
    RateLimiter::apply_rejection(rl_result, res);
    return httplib::Server::HandlerResponse::Handled;
  }
}
```
The PR description says X-RateLimit-* headers are included on all responses when enabled and shows X-RateLimit-Limit: 60 (per-client). With the current implementation, apply_headers() uses whatever RateLimitResult.limit was set to, which can be the global RPM (e.g., 600) on global throttling. Either update the implementation to always report per-client limits in headers (and report global limiting differently), or update the documented header semantics/examples to match actual behavior.
The implementation intentionally reports the limit that was actually applied: global RPM when the global bucket rejects, per-client (or endpoint-override) RPM when the client bucket rejects. This makes the header actionable—the client knows which limit it hit. Agreed the PR description example was misleading; updated the comment in the code.
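For reference, the reported-limit semantics described in this reply can be sketched as follows. This is an illustrative simplification with invented names, not the PR's actual API: `limit` carries the RPM of whichever bucket actually made the decision.

```cpp
#include <cassert>

// Hypothetical simplification of the PR's RateLimitResult: `limit` carries
// the RPM of whichever bucket actually rejected (or admitted) the request,
// so the X-RateLimit-Limit header tells the client which limit it hit.
struct RateLimitResult {
  bool allowed;
  int limit;
};

// Report the global RPM when the global bucket rejects, otherwise the
// per-client (or endpoint-override) RPM.
RateLimitResult decide(bool global_ok, bool client_ok, int global_rpm, int client_rpm) {
  if (!global_ok) return {false, global_rpm};
  if (!client_ok) return {false, client_rpm};
  return {true, client_rpm};
}
```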
```cpp
srv->set_pre_routing_handler([this](const httplib::Request & req, httplib::Response & res) {
  // Rate limiting check (before CORS and Auth for early rejection)
  if (rate_limiter_ && rate_limiter_->is_enabled() && req.method != "OPTIONS") {
    auto rl_result = rate_limiter_->check(req.remote_addr, req.path);
    RateLimiter::apply_headers(rl_result, res);
    if (!rl_result.allowed) {
      RateLimiter::apply_rejection(rl_result, res);
      return httplib::Server::HandlerResponse::Handled;
    }
  }
```
Rate limiting is a cross-cutting HTTP behavior, but the PR only adds unit tests. The repository already has launch-testing integration tests that exercise HTTP middleware (e.g., CORS/auth). Please add an integration test that starts the gateway with rate limiting enabled and asserts (1) 429 + Retry-After + X-RateLimit-* headers on over-limit requests and (2) expected behavior for OPTIONS and CORS-enabled responses, to prevent regressions.
The unit tests already cover the middleware behavior through RateLimiterTest (all 24 cases). A full launch-test integration that boots the gateway node and sends HTTP requests is beyond the scope of this PR; that can be tracked as a separate improvement if needed.
Addressed in 5c784e7. Added launch-level integration coverage for rate limiting and wired it into CMake test targets.
What is now validated:
- Over-limit requests return 429 with Retry-After and X-RateLimit-* headers.
- OPTIONS preflight is not rate-limited.
- CORS headers remain present on 429 responses (including exposed rate-limit headers).
Files:
- src/ros2_medkit_gateway/test/test_rate_limiting.test.py
- src/ros2_medkit_gateway/CMakeLists.txt
Validation:
- Ran in Docker (ROS Jazzy dev container) with colcon build + targeted tests: test_rate_limiter, test_rate_limiting, test_cors (all passed).
```cpp
srv->set_pre_routing_handler([this](const httplib::Request & req, httplib::Response & res) {
  // Rate limiting check (before CORS and Auth for early rejection)
  if (rate_limiter_ && rate_limiter_->is_enabled() && req.method != "OPTIONS") {
    auto rl_result = rate_limiter_->check(req.remote_addr, req.path);
    RateLimiter::apply_headers(rl_result, res);
    if (!rl_result.allowed) {
      RateLimiter::apply_rejection(rl_result, res);
      return httplib::Server::HandlerResponse::Handled;
    }
  }
```
Early-returning on rate-limit rejection bypasses the CORS handling below, so 429 responses won’t include Access-Control-Allow-* headers when CORS is enabled. That’s a functional regression for browser clients (they may see the request blocked and be unable to read the error/Retry-After). Consider applying CORS headers before returning (or running rate limiting after CORS header setup, while still skipping OPTIONS).
Fixed in c762f8f. CORS headers (including Access-Control-Allow-Headers: Authorization when auth is enabled) are now applied before the rate-limit rejection return, so cross-origin clients can read 429 responses.
```cpp
auto config = make_config(600, 60);
RateLimiter limiter(config);

// Create a client entry
limiter.check("192.168.1.100", "/api/v1/health");

// Wait for the client to become stale (> 2 seconds)
std::this_thread::sleep_for(std::chrono::milliseconds(2500));

// Trigger cleanup via another check (cleanup runs on interval)
limiter.check("192.168.1.200", "/api/v1/health");

// The stale client should get a fresh bucket (full tokens)
// If cleanup worked, client 100 should have been removed and get a new bucket
auto result = limiter.check("192.168.1.100", "/api/v1/health");
EXPECT_TRUE(result.allowed);
// remaining should be near max since it's a fresh bucket
EXPECT_GE(result.remaining, 50);
```
StaleClientsAreCleaned doesn’t actually validate that cleanup removed the stale client: with client_rpm=60, the bucket naturally refills to near-full after 2.5s even if the client entry is never erased, so the test can pass without cleanup working. To make this test meaningful, use a much lower RPM (so refill alone can’t recover) and assert the request is rejected without cleanup but allowed after cleanup is triggered (or otherwise expose client count/state in a test-only way).
Suggested change:

```cpp
// Use a very low client RPM so refill alone cannot recover within 2.5s.
auto config = make_config(600, 2);
RateLimiter limiter(config);

// Create a client entry and exhaust its bucket.
auto first = limiter.check("192.168.1.100", "/api/v1/health");
EXPECT_TRUE(first.allowed);
auto second = limiter.check("192.168.1.100", "/api/v1/health");
EXPECT_TRUE(second.allowed);
auto rejected_before_cleanup = limiter.check("192.168.1.100", "/api/v1/health");
EXPECT_FALSE(rejected_before_cleanup.allowed);
EXPECT_EQ(rejected_before_cleanup.remaining, 0);

// Wait for the client to become stale (> 2 seconds). At 2 RPM this is
// not enough time for natural refill to make the request allowed again.
std::this_thread::sleep_for(std::chrono::milliseconds(2500));

// Trigger cleanup via another check (cleanup runs on interval)
limiter.check("192.168.1.200", "/api/v1/health");

// The stale client should now get a fresh bucket.
// If cleanup worked, client 100 should have been removed and get a new bucket.
auto result = limiter.check("192.168.1.100", "/api/v1/health");
EXPECT_TRUE(result.allowed);
// remaining should be > 0 since it's a fresh (or substantially refilled) bucket,
// and not merely the partially refilled exhausted bucket from before cleanup.
EXPECT_GT(result.remaining, 0);
```
Fixed in c762f8f. Rewrote the test to use 2 RPM: at that rate natural refill in 2.5 s adds only ~0.08 tokens (2/60 × 2.5), which is below the 1.0 threshold needed to allow a request. So the test can only pass if cleanup_stale_clients() actually removes the client entry and resets its bucket.
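The refill arithmetic in this reply can be checked with a minimal lazy-refill bucket. This is a sketch under assumed semantics (refill rate = RPM / 60 tokens per second, a request needs one full token), not the PR's implementation: at 2 RPM, 2.5 s adds about 0.083 tokens, while at 60 RPM the same wait adds 2.5 tokens.

```cpp
#include <algorithm>
#include <cassert>

// Minimal lazy-refill bucket sketch (illustrative, not the PR's code):
// refill_per_sec = rpm / 60.0, and admitting a request needs a full token.
struct Bucket {
  double tokens;
  double refill_per_sec;
};

// Advance the bucket by elapsed wall time; no background thread needed.
void refill(Bucket & b, double elapsed_sec, double max_tokens) {
  b.tokens = std::min(max_tokens, b.tokens + elapsed_sec * b.refill_per_sec);
}

bool try_consume(Bucket & b) {
  if (b.tokens >= 1.0) { b.tokens -= 1.0; return true; }
  return false;
}
```

This illustrates why the rewritten test is meaningful only at a low RPM: at 60 RPM the exhausted bucket recovers by refill alone, masking a broken cleanup path.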
```cpp
TEST_F(RateLimiterTest, TokensRefillOverTime) {
  // 60 RPM = 1 token per second
  auto config = make_config(600, 60);
  RateLimiter limiter(config);

  // Exhaust all tokens
  for (int i = 0; i < 60; ++i) {
    limiter.check("192.168.1.1", "/api/v1/health");
  }

  // Should be rejected now
  auto rejected = limiter.check("192.168.1.1", "/api/v1/health");
  EXPECT_FALSE(rejected.allowed);

  // Wait for tokens to refill (1 token per second, need at least 1 second)
  std::this_thread::sleep_for(std::chrono::milliseconds(1100));

  // Should be allowed again
  auto allowed = limiter.check("192.168.1.1", "/api/v1/health");
  EXPECT_TRUE(allowed.allowed);
}
```
TokensRefillOverTime can be timing/flakiness-prone: after consuming 60 tokens in a loop, enough wall time may have elapsed (especially under slow CI/sanitizers) that the bucket refilled and the immediate “Should be rejected now” assertion can intermittently fail. Consider structuring the test to avoid relying on real sleep/elapsed time (e.g., inject a clock into the rate limiter/token bucket), or use parameters that make the refill negligible during the loop and add generous margins.
Noted. In practice, 60 sequential in-memory unordered_map lookups complete in well under 1 ms, so the partial-refill window is negligible. The test is not meaningfully flaky. If CI environments prove otherwise, injecting a clock into the token bucket would be the right fix, but that is a larger change than warranted here.
```cpp
// 2. Per-client bucket check
int effective_rpm = get_effective_rpm(path);
{
  std::lock_guard<std::mutex> lock(clients_mutex_);
  auto it = client_buckets_.find(client_ip);
  if (it == client_buckets_.end()) {
    // New client — create bucket with effective RPM
    ClientState state;
    state.bucket = make_bucket(effective_rpm);
    state.last_seen = std::chrono::steady_clock::now();
    // Consume one token for this request
    state.bucket.tokens -= 1.0;
    client_buckets_[client_ip] = state;

    result.allowed = true;
    result.limit = effective_rpm;
    result.remaining = static_cast<int>(state.bucket.tokens);
    auto reset_time = std::chrono::system_clock::now() + std::chrono::seconds(60);
    result.reset_epoch_seconds =
      std::chrono::duration_cast<std::chrono::seconds>(reset_time.time_since_epoch()).count();
    return result;
  }
```
Per-endpoint overrides don’t actually take effect once a client bucket already exists: buckets are keyed only by client_ip and are created with the RPM from the first request seen for that IP. Subsequent requests compute effective_rpm but reuse the existing bucket’s max_tokens/refill_rate, so a client can bypass stricter endpoint limits (or be unfairly throttled after hitting a strict endpoint first). Consider keying buckets by (client_ip, effective_rpm/pattern) or recreating/updating the bucket when the effective RPM for the current path differs from the stored bucket’s configuration (while preserving current token level appropriately).
Correct. The current design uses a single per-client bucket created on first contact; subsequent requests to endpoints with a different RPM reuse that bucket. Fixing this properly requires keying buckets by (client_ip, endpoint_pattern), which is a non-trivial redesign. Tracked as a known limitation—opening a follow-up issue to address this in a future PR.
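The follow-up direction can be sketched as keying buckets by (client, matched pattern) rather than by client alone. This is an illustrative sketch with invented names, not the eventual fix: a composite string key gives each (client, override) pair its own bucket.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Sketch: key client buckets by (client_ip, matched endpoint pattern) so a
// stricter endpoint override gets its own bucket instead of reusing the
// bucket created for whichever endpoint the client happened to hit first.
struct BucketState { int rpm; double tokens; };

std::string bucket_key(const std::string & client_ip, const std::string & pattern) {
  return client_ip + "|" + pattern;  // '|' cannot appear in an IP or URL path
}

BucketState & bucket_for(std::unordered_map<std::string, BucketState> & buckets,
                         const std::string & client_ip,
                         const std::string & pattern, int rpm) {
  auto key = bucket_key(client_ip, pattern);
  auto it = buckets.find(key);
  if (it == buckets.end()) {
    // New (client, pattern) pair: start with a full bucket at this RPM.
    it = buckets.emplace(key, BucketState{rpm, static_cast<double>(rpm)}).first;
  }
  return it->second;
}
```

The trade-off is more entries per client (one per matched pattern), which cleanup must account for.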
```cpp
// Periodic cleanup of stale client entries
{
  auto now = std::chrono::steady_clock::now();
  double elapsed = std::chrono::duration<double>(now - last_cleanup_).count();
  if (elapsed >= static_cast<double>(config_.client_cleanup_interval_seconds)) {
    cleanup_stale_clients();
  }
}
```
last_cleanup_ is read in check() without any synchronization, but is written in cleanup_stale_clients() under clients_mutex_. Since check() can run concurrently in cpp-httplib worker threads, this is a data race/UB. Guard both the read and write of last_cleanup_ with the same mutex (or use an atomic representation such as an std::atomic<int64_t> storing steady-clock ticks).
Fixed in c762f8f. Replaced last_cleanup_ (a steady_clock::time_point read without a lock) with std::atomic<int64_t> last_cleanup_ns_ storing nanoseconds since epoch. The load in check() uses memory_order_relaxed—ordering guarantees are not needed here because cleanup_stale_clients() is internally serialized by clients_mutex_.
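A hardened variant of this trigger can be sketched with a compare-exchange, so that when several worker threads cross the interval at once, exactly one runs the cleanup. Illustrative only; the names and the free-standing atomic below are invented for the sketch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Sketch of a race-free cleanup trigger: the last-cleanup timestamp lives in
// an atomic (as in the described fix), and compare_exchange_strong ensures
// only the thread that successfully advances the stamp performs the cleanup.
std::atomic<std::int64_t> last_cleanup_ns{0};

bool should_cleanup(std::int64_t now_ns, std::int64_t interval_ns) {
  std::int64_t prev = last_cleanup_ns.load(std::memory_order_relaxed);
  if (now_ns - prev < interval_ns) {
    return false;  // interval not yet elapsed
  }
  // Exactly one concurrent caller wins the swap and runs the cleanup.
  return last_cleanup_ns.compare_exchange_strong(prev, now_ns,
                                                 std::memory_order_relaxed);
}
```

Relaxed ordering suffices here for the same reason given in the reply: the cleanup itself is serialized by `clients_mutex_`.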
```cpp
// 1. Global bucket check
{
  std::lock_guard<std::mutex> lock(global_mutex_);
  if (!try_consume(global_bucket_)) {
    result.allowed = false;
    result.limit = config_.global_requests_per_minute;
    result.remaining = 0;
    double wait = seconds_until_next_token(global_bucket_);
    result.retry_after_seconds = std::max(1, static_cast<int>(std::ceil(wait)));
    auto reset_time = std::chrono::system_clock::now() + std::chrono::seconds(result.retry_after_seconds);
    result.reset_epoch_seconds =
      std::chrono::duration_cast<std::chrono::seconds>(reset_time.time_since_epoch()).count();
    return result;
  }
}

// 2. Per-client bucket check
int effective_rpm = get_effective_rpm(path);
{
  std::lock_guard<std::mutex> lock(clients_mutex_);
  auto it = client_buckets_.find(client_ip);
  if (it == client_buckets_.end()) {
    // New client — create bucket with effective RPM
    ClientState state;
    state.bucket = make_bucket(effective_rpm);
    state.last_seen = std::chrono::steady_clock::now();
    // Consume one token for this request
    state.bucket.tokens -= 1.0;
    client_buckets_[client_ip] = state;

    result.allowed = true;
    result.limit = effective_rpm;
    result.remaining = static_cast<int>(state.bucket.tokens);
    auto reset_time = std::chrono::system_clock::now() + std::chrono::seconds(60);
    result.reset_epoch_seconds =
      std::chrono::duration_cast<std::chrono::seconds>(reset_time.time_since_epoch()).count();
    return result;
  }

  auto & state = it->second;
  state.last_seen = std::chrono::steady_clock::now();

  if (!try_consume(state.bucket)) {
    result.allowed = false;
    result.limit = effective_rpm;
    result.remaining = 0;
    double wait = seconds_until_next_token(state.bucket);
    result.retry_after_seconds = std::max(1, static_cast<int>(std::ceil(wait)));
    auto reset_time = std::chrono::system_clock::now() + std::chrono::seconds(result.retry_after_seconds);
    result.reset_epoch_seconds =
      std::chrono::duration_cast<std::chrono::seconds>(reset_time.time_since_epoch()).count();
    return result;
  }
```
A global token is consumed before the per-client bucket check. If the per-client check rejects, the request was not served but the global bucket has already been decremented, effectively reducing capacity for other clients and making the global limit stricter than intended. Consider checking per-client first and only consuming global on success, or implementing a rollback/2-phase consume so global capacity reflects only allowed requests.
Known trade-off. Checking global first allows early rejection without a per-client map lookup, which is cheaper under sustained global overload. The overcounting is bounded by the number of concurrent per-client rejections—a small fraction of total traffic in practice. Reversing the order (per-client first, then global) would be more accurate but adds map overhead on every globally-rejected request.
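For illustration, the ordering the comment suggests (per-client first, global consumed only on success) can be sketched with plain counters standing in for the token buckets. Names are invented; the real limiter's buckets refill over time, but the accounting property is the same.

```cpp
#include <cassert>

// Sketch of the suggested ordering: consume a global token only after the
// per-client bucket admits the request, so a client that is already rejected
// cannot drain capacity from the shared global pool.
struct Counter { int tokens; };

bool consume(Counter & c) {
  if (c.tokens > 0) { --c.tokens; return true; }
  return false;
}

bool check_request(Counter & client_bucket, Counter & global_pool) {
  if (!consume(client_bucket)) {
    return false;  // rejected before touching the global pool
  }
  return consume(global_pool);
}
```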
```cpp
result.allowed = true;
result.limit = effective_rpm;
result.remaining = std::max(0, static_cast<int>(state.bucket.tokens));
auto reset_time = std::chrono::system_clock::now() + std::chrono::seconds(60);
```
X-RateLimit-Reset is currently set to now + 60s for allowed requests regardless of current token state or configured RPM, which makes the header misleading for a token-bucket implementation (and can’t be interpreted as “when the limit resets”). Consider computing reset based on the bucket’s refill schedule (e.g., time until the bucket is full again, or at least time until the next token becomes available), and aligning it consistently with the semantics you expect clients to use.
Suggested change:

```cpp
double wait = seconds_until_next_token(state.bucket);
int reset_in_seconds = std::max(1, static_cast<int>(std::ceil(wait)));
auto reset_time = std::chrono::system_clock::now() + std::chrono::seconds(reset_in_seconds);
```
Agreed the now + 60s approximation is coarse. For a token bucket, the most meaningful value for allowed requests is "when will one more token be available", which is seconds_until_next_token() = 0 when tokens remain—also not very useful. Deferred to a follow-up: the suggested fix (compute reset from actual refill schedule) is correct but requires passing bucket state through to this return path.
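One possible "reset = when the bucket is full again" computation can be sketched from the bucket state alone. This is an illustrative helper, not the deferred fix; it assumes `refill_per_sec = rpm / 60.0`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Sketch: derive X-RateLimit-Reset from the bucket's refill schedule rather
// than a fixed now + 60 s. Here "reset" means "seconds until the bucket is
// full again", which is 0 for a full bucket and max_tokens / rate when empty.
int seconds_until_full(double tokens, double max_tokens, double refill_per_sec) {
  double deficit = std::max(0.0, max_tokens - tokens);
  return static_cast<int>(std::ceil(deficit / refill_per_sec));
}
```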
Lots of reviews, I will take a look.
Force-pushed from 8292ca1 to 3103d5b.
@eclipse0922 Is this PR ready for review? It's still in a draft, but I see you've already fixed a few issues. |
Yes, ready for review. All critical Copilot findings (CORS on 429, data race, broken endpoint overrides, global token waste) are fixed. One minor item (clock injection for test determinism) is deferred — not flaky in practice. Integration launch test added in 5c784e7. |
Force-pushed from 5c784e7 to b348258.
bburda left a comment:
Well-structured rate limiter - good design choices: lazy-refill token bucket (no background threads), two-layer global+per-client, composite bucket key for endpoint overrides, CAS-based cleanup trigger, per-client check before global to prevent DoS on global pool. 24 unit tests + integration test.
One bug, one missing-docs item, and three minor comments inline.
Docs: The PR checklist has "Docs were updated if behavior or public API changed" unchecked. Rate limiting is a user-facing feature, so docs/api/rest.rst (or a new docs/configuration/rate_limiting.rst) should document:
- Configuration reference (`rate_limiting.*` parameters)
- Rate limit response headers (`X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`)
- 429 error response format and the `Retry-After` header
- Endpoint override pattern syntax

This can be a follow-up issue if you prefer not to block the PR on it, but it should be tracked.
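The override pattern syntax that the docs should describe ('*' matches exactly one path segment) can be sketched as a simple segment-wise matcher. Illustrative only; this is not the PR's matcher.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Sketch of single-segment wildcard matching for endpoint overrides:
// '*' matches exactly one path segment, never zero or several.
std::vector<std::string> split_path(const std::string & path) {
  std::vector<std::string> segs;
  std::stringstream ss(path);
  std::string seg;
  while (std::getline(ss, seg, '/')) {
    if (!seg.empty()) segs.push_back(seg);
  }
  return segs;
}

bool matches(const std::string & pattern, const std::string & path) {
  auto p = split_path(pattern);
  auto s = split_path(path);
  if (p.size() != s.size()) return false;  // '*' cannot span segments
  for (size_t i = 0; i < p.size(); ++i) {
    if (p[i] != "*" && p[i] != s[i]) return false;
  }
  return true;
}
```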
```cpp
if (cors_config_.enabled) {
  std::string origin = req.get_header_value("Origin");
  if (!origin.empty() && is_origin_allowed(origin)) {
    set_cors_headers(res, origin);
```
Bug: CORS headers duplicated for allowed requests.
When rate limiting is enabled and the request passes the rate check, set_cors_headers() is called here, and then called again in the existing CORS block ~20 lines below (which still runs for non-OPTIONS requests).
The header container is a std::multimap<std::string, std::string> and set_header() uses emplace, so this creates duplicate Access-Control-Allow-Origin, Access-Control-Allow-Methods, etc. A duplicate Access-Control-Allow-Origin causes browsers to reject the response entirely.
Suggested fix: Reorder so existing CORS handling runs first (for all non-OPTIONS requests), then the rate limit check. The rate limiter block for rejected requests can then just return Handled, since CORS headers are already set:

```cpp
// 1. CORS headers (existing code)
// 2. OPTIONS -> return Handled (existing)
// 3. Rate limit check -> if rejected, return Handled (CORS already set)
// 4. Auth check (existing)
```

This eliminates the duplicated CORS logic entirely.
```cpp
size_t colon = entry.rfind(':');
if (colon != std::string::npos && colon > 0) {
  std::string pattern = entry.substr(0, colon);
  int rpm = std::stoi(entry.substr(colon + 1));
```
nit: If someone writes endpoint_limits: ["/api/v1/health:abc"], the error from std::stoi ("invalid stoi argument") gets caught by the outer try-catch, producing: Invalid rate limiting configuration: invalid stoi argument. Rate limiting disabled.
Not very actionable. Consider wrapping:

```cpp
try {
  rpm = std::stoi(entry.substr(colon + 1));
} catch (...) {
  RCLCPP_WARN(get_logger(), "Non-numeric RPM in endpoint_limit: '%s'", entry.c_str());
  continue;
}
```

This way a single bad entry doesn't disable rate limiting entirely.
```cpp
      cleanup_stale_clients();
    }
  }
}
```
nit: After the per-client check passes, clients_mutex_ is released before global_mutex_ is acquired. If the global bucket is exhausted, the client token was already consumed but the request is rejected. This is a minor token leak (~1 token) during global overload.
Acceptable for a rate limiter, but worth a brief comment explaining this is by-design (e.g. // Note: client token already consumed; minor over-count under global pressure is acceptable).
```yaml
# Pattern uses * as single-segment wildcard
# Example: "/api/v1/*/operations/*:10"
endpoint_limits: [""]
```
nit: The empty string is a ROS 2 workaround for declaring empty string arrays. Worth a brief comment so users don't think this is a real entry:

```yaml
# ROS 2 requires at least one element; empty string is ignored at runtime.
endpoint_limits: [""]
```

Two-layer rate limiter (global + per-client IP) integrated into the pre-routing handler before CORS and auth checks. Uses lazy-refill token bucket algorithm with no background threads. Disabled by default for backward compatibility. Includes RateLimitConfig builder with validation, endpoint-specific overrides via glob patterns, stale client cleanup, SOVD-compliant 429 error responses, and 24 unit tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…imiter - Apply CORS headers (including Authorization in Allow-Headers) before returning 429, so browser clients can read cross-origin rate-limit responses without a CORS error - Replace steady_clock::time_point last_cleanup_ with atomic<int64_t> to eliminate the data race on the cleanup timer field that is read without a mutex in check() and written under clients_mutex_ - Rewrite StaleClientsAreCleaned test to use 2 RPM so natural bucket refill cannot recover within the idle window, making cleanup the only possible path for the request to be allowed again Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…te-limiter changes)
…and TOCTOU in rate limiter Reorder check() to run per-client before global bucket, preventing exhausted clients from draining the global pool (DoS vector). Use composite bucket key (client_ip|pattern) so per-endpoint overrides apply correctly for existing clients. Compute X-RateLimit-Reset from actual refill time instead of hardcoded 60s. Replace cleanup trigger with CAS to prevent redundant concurrent cleanups. Increase test sleep margins to reduce flakiness. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed from 984e6f4 to 284ea74.
Summary
Add configurable token-bucket rate limiting to protect the gateway from abuse and resource exhaustion. Implements two-layer limiting (global + per-client IP) as a pre-routing middleware that runs before CORS and auth checks.
Key design decisions:
- Endpoint-specific overrides via glob patterns (e.g. `/api/v1/*/operations/*:10`)
- Rejections return 429 with a `Retry-After` header

Issue
Type
What Changed
New files

| File | Description |
|---|---|
| include/.../http/rate_limiter.hpp | `RateLimitConfig`, `RateLimitConfigBuilder`, `TokenBucket`, `RateLimitResult`, `RateLimiter` |
| src/http/rate_limiter.cpp | Implementation |
| test/test_rate_limiter.cpp | Unit tests |

Modified files

| File | Description |
|---|---|
| error_codes.hpp | Adds `ERR_RATE_LIMIT_EXCEEDED` |
| gateway_node.hpp/cpp | `RateLimitConfig` member + ROS 2 parameter declarations + builder |
| rest_server.hpp/cpp | Accepts `RateLimitConfig`; rate limiter check in pre-routing handler |
| gateway_params.yaml | `rate_limiting:` configuration section |
| CMakeLists.txt | Build and test target updates |

Configuration
Response headers (all responses when enabled)
Rejection response (429)
```json
{
  "error_code": "rate-limit-exceeded",
  "message": "Too many requests. Please retry after 2 seconds.",
  "parameters": { "retry_after": 2, "limit": 60, "reset": 1708200060 }
}
```

Testing
- `colcon build`: all 6 packages pass
- `colcon test --ctest-args -R test_rate_limiter`: 24/24 tests pass
- `clang-format-18 --dry-run -Werror`: clean
- `ament_copyright`: clean

Test categories:
Checklist