Skip to content

[Feat]:Add per-agent rate limiting middleware with configurable limits (closes #434)#435

Open
janhavitupe wants to merge 2 commits intoGetBindu:mainfrom
janhavitupe:feature/rate-limiting
Open

[Feat]:Add per-agent rate limiting middleware with configurable limits (closes #434)#435
janhavitupe wants to merge 2 commits intoGetBindu:mainfrom
janhavitupe:feature/rate-limiting

Conversation

@janhavitupe
Copy link
Copy Markdown

@janhavitupe janhavitupe commented Apr 3, 2026

Summary

This PR introduces a per-agent rate limiting middleware to prevent abuse, ensure fair usage, and improve system stability under load.

Changes

  • Added rate limiting middleware using a token bucket/sliding window approach
  • Enforces request limits per agent
  • Configurable limits via settings
  • Integrated middleware into server application layer
  • Added unit tests for rate limiting behavior

Behavior

  • Requests exceeding the configured limit are rejected
  • Limits are applied independently per agent
  • Designed to be lightweight and non-blocking

Why this change?

Currently, the system allows unbounded requests per agent, which can lead to resource exhaustion and unfair usage. This PR introduces a foundational mechanism to control request flow and improve reliability.

Notes

  • Implementation is in-memory (can be extended to Redis in future)
  • Focused on correctness and minimal overhead

(closes #434)

Summary by CodeRabbit

  • New Features
    • Added configurable rate-limiting for API endpoints (requests-per-window, window duration, exempt paths, trust/proxy handling).
    • Rate limiting can be enabled and tuned via environment-based settings; sensible defaults include exemptions for health/metrics and agent routes.
  • Tests
    • Added unit and integration tests validating sliding-window counting, enforcement (HTTP 429 + Retry-After), and exempt-path behavior.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 3, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 78c36736-d449-4c64-a094-e0140d6f68e3

📥 Commits

Reviewing files that changed from the base of the PR and between 5e46a4d and 22faec1.

📒 Files selected for processing (1)
  • bindu/server/applications.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • bindu/server/applications.py

📝 Walkthrough

Walkthrough

This PR adds a configurable sliding-window rate-limiting middleware, settings to control it, wiring to install the middleware during app setup, tests for behavior and edge cases, and a tiny import-order tweak in a DID extension file.

Changes

Cohort / File(s) Summary
Configuration & Settings
bindu/settings.py
Added RateLimitSettings model (enabled, requests_per_window, window_seconds, exempt_paths) and integrated it as Settings.rate_limit.
Middleware Implementation
bindu/server/middleware/rate_limit.py, bindu/server/middleware/__init__.py
New RateLimitMiddleware (uses internal _SlidingWindowCounter) implemented and exported from middleware package.
Application Integration
bindu/server/applications.py, bindu/penguin/bindufy.py
BinduApplication accepts optional rate_limit_config and conditionally installs RateLimitMiddleware; _bindufy_core() passes validated rate_limit config.
Tests
tests/unit/server/middleware/test_rate_limit.py
New unit and integration tests for sliding-window behavior, eviction/boundary cases, HTTP 429 payload and Retry-After, and exempt-path bypass.
Minor Fixes
bindu/extensions/did/did_agent_extension.py
Reordered from __future__ import annotations to appear before standard library imports.

Sequence Diagram

sequenceDiagram
    participant Client
    participant RateLimit as RateLimitMiddleware
    participant Counter as SlidingWindowCounter
    participant App as BinduApplication

    Client->>RateLimit: HTTP Request
    RateLimit->>RateLimit: Check exempt_paths
    alt Exempt path
        RateLimit->>App: call_next()
        App-->>Client: Response
    else Non-exempt
        RateLimit->>RateLimit: Extract client IP
        RateLimit->>Counter: increment(client_ip)
        Counter->>Counter: Evict old timestamps
        Counter-->>RateLimit: current_count
        alt current_count > requests_per_window
            RateLimit-->>Client: 429 JSONResponse (Retry-After header)
        else
            RateLimit->>App: call_next()
            App-->>Client: Response
        end
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 I hopped in code to count each tread,
Windows slide and timestamps spread,
When too many footsteps spur the rain,
I sprinkle "429" to slow the train,
Wait, retry, then bounce back again.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 38.89% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive The description addresses the problem, explains why it matters, and summarizes changes, but lacks complete sections from the template including verification steps, evidence, and detailed security assessment. Add missing template sections: verification steps with test environment details, evidence of passing tests, human verification scenarios, security impact assessment, and compatibility/failure recovery information.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding per-agent rate limiting middleware with configurable limits, directly matching the primary objective of the PR.
Linked Issues check ✅ Passed The implementation fully addresses all key requirements from issue #434: per-agent rate limiting via sliding-window middleware, configurable limits, HTTP 429 rejection, environment variable configuration, thread-safe design, and in-memory counters with extensibility.
Out of Scope Changes check ✅ Passed All changes are directly related to rate limiting requirements; minor import reordering in did_agent_extension.py is incidental and does not introduce out-of-scope functionality.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@janhavitupe janhavitupe changed the title [Feat]:Add per-agent rate limiting middleware with configurable limits [Feat]:Add per-agent rate limiting middleware with configurable limits (closes #434) Apr 3, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
bindu/server/middleware/rate_limit.py (1)

33-40: Duplicate default values with RateLimitSettings.

_DEFAULT_EXEMPT_PATHS here duplicates the defaults in bindu/settings.py RateLimitSettings.exempt_paths. These could diverge if one is updated without the other.

Consider importing from settings or using a single source of truth.

♻️ Option: Reference settings for defaults
-_DEFAULT_EXEMPT_PATHS = frozenset(
-    {
-        "/health",
-        "/healthz",
-        "/metrics",
-        "/.well-known/agent.json",
-    }
-)
+from bindu.settings import app_settings
+
+_DEFAULT_EXEMPT_PATHS = frozenset(app_settings.rate_limit.exempt_paths)

Note: This creates a module-level dependency on settings, which may need consideration for testing. Alternatively, document that the middleware default should match RateLimitSettings defaults.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/server/middleware/rate_limit.py` around lines 33 - 40,
_DEFAULT_EXEMPT_PATHS duplicates the default exempt paths defined on
RateLimitSettings.exempt_paths, risking divergence; replace the hardcoded
frozenset by referencing the single source of truth
(RateLimitSettings.exempt_paths) or import that default from the settings module
and use it to initialize the middleware default, updating any usage of
_DEFAULT_EXEMPT_PATHS accordingly and adding a short comment about the new
module-level dependency for tests.
bindu/server/applications.py (1)

588-608: Rate limiting middleware wiring looks correct.

The configuration resolution with fallbacks to app_settings.rate_limit.* is well-implemented. A few observations:

  1. Middleware order: Rate limiting is added after CORS but before X402/auth middleware. This is reasonable—rate limiting should happen early to reject excessive requests before expensive auth/payment checks.

  2. Logging: Consider adding a log statement when rate limiting is enabled (similar to CORS on line 575) to aid debugging.

📝 Optional: Add logging when rate limiting is enabled
         if rl_enabled:
             from .middleware import RateLimitMiddleware
 
+            logger.info(
+                f"Rate limiting middleware enabled: {rl_requests} req/{rl_window}s"
+            )
             rl_requests = _rl.get(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/server/applications.py` around lines 588 - 608, Add a log message when
rate limiting is enabled similar to the CORS log: after computing rl_enabled and
before appending to middleware_list, emit a debug/info via the same logger used
for CORS (e.g., processLogger or the module logger) indicating rate limiting is
enabled and include the resolved values rl_requests, rl_window, and rl_exempt;
this should be done in the block that references RateLimitMiddleware and uses
rl_enabled, rl_requests, rl_window, and rl_exempt so operators can see the
effective rate-limit configuration.
bindu/settings.py (1)

965-972: Consider adding validation for positive values.

requests_per_window and window_seconds should be positive integers. A zero or negative value would break the rate limiting logic (e.g., window_seconds=0 causes immediate eviction of all timestamps).

♻️ Proposed fix to add validation
     requests_per_window: int = Field(
         default=60,
+        gt=0,
         description="Maximum requests allowed per client per window",
     )
     window_seconds: int = Field(
         default=60,
+        gt=0,
         description="Rolling window size in seconds",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/settings.py` around lines 965 - 972, The two Pydantic fields
requests_per_window and window_seconds must be validated to be positive
integers; update their definitions to enforce gt=0 (e.g., use conint(gt=0) or
Field(..., gt=0)) or add a Pydantic `@validator` on the model that raises a
ValueError when either value is <= 0, and include a clear message referencing
the offending field (requests_per_window / window_seconds) so invalid configs
are rejected early.
tests/unit/server/middleware/test_rate_limit.py (1)

68-105: Solid integration test coverage.

Tests cover the core scenarios: under limit, over limit, Retry-After header, and exempt paths.

Consider adding tests for:

  • X-Forwarded-For header parsing (multiple IPs, whitespace handling)
  • Client with no IP (request.client is None)
🧪 Optional: Additional test cases for IP extraction
def test_uses_x_forwarded_for_header(self):
    app = _make_app(requests_per_window=1, window_seconds=60)
    client = TestClient(app, raise_server_exceptions=True)
    # First request from "real" IP
    client.get("/", headers={"X-Forwarded-For": "1.2.3.4, 5.6.7.8"})
    # Second request from different forwarded IP should be allowed
    resp = client.get("/", headers={"X-Forwarded-For": "9.10.11.12"})
    assert resp.status_code == 200
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/server/middleware/test_rate_limit.py` around lines 68 - 105, Add
unit tests in TestRateLimitMiddleware to cover X-Forwarded-For parsing and
request.client being None: create tests that call _make_app and TestClient with
headers={"X-Forwarded-For": "1.2.3.4, 5.6.7.8"} (and variations with whitespace
and multiple IPs) asserting the middleware uses the first IP so subsequent
requests from different forwarded IPs behave independently, and add a test that
simulates request.client == None (e.g., by building a test ASGI scope or using
TestClient override) to ensure the rate limiter falls back safely (no exception)
and enforces limits per behavior; reference TestRateLimitMiddleware, _make_app,
and TestClient when adding these tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bindu/server/middleware/rate_limit.py`:
- Around line 88-89: The _counters dict in the rate limiter grows unbounded; add
a cleanup routine to remove stale per-IP _SlidingWindowCounter entries and call
it periodically to prevent memory growth. Implement a method (e.g.,
_cleanup_stale_counters(now: float)) that acquires self._lock, computes a stale
cutoff like now - (self._window * 2), iterates self._counters items and deletes
keys whose counter._timestamps last entry is older than the cutoff (or counters
with empty timestamps), and then invoke this cleanup from the main request path
periodically (e.g., every N requests) or via a lightweight background task to
prune inactive IPs.
- Around line 106-113: The _client_ip function currently trusts the
X-Forwarded-For header which can be spoofed; add a configurable safeguard by
introducing a trust_proxy boolean (e.g., in the rate limiter class constructor)
and only read X-Forwarded-For when trust_proxy is true, otherwise fallback to
request.client.host; update docs/comments to require a trusted reverse proxy
that strips client-supplied X-Forwarded-For and recommend using Starlette’s
ProxyHeadersMiddleware upstream to validate/properly populate forwarded headers.

---

Nitpick comments:
In `@bindu/server/applications.py`:
- Around line 588-608: Add a log message when rate limiting is enabled similar
to the CORS log: after computing rl_enabled and before appending to
middleware_list, emit a debug/info via the same logger used for CORS (e.g.,
processLogger or the module logger) indicating rate limiting is enabled and
include the resolved values rl_requests, rl_window, and rl_exempt; this should
be done in the block that references RateLimitMiddleware and uses rl_enabled,
rl_requests, rl_window, and rl_exempt so operators can see the effective
rate-limit configuration.

In `@bindu/server/middleware/rate_limit.py`:
- Around line 33-40: _DEFAULT_EXEMPT_PATHS duplicates the default exempt paths
defined on RateLimitSettings.exempt_paths, risking divergence; replace the
hardcoded frozenset by referencing the single source of truth
(RateLimitSettings.exempt_paths) or import that default from the settings module
and use it to initialize the middleware default, updating any usage of
_DEFAULT_EXEMPT_PATHS accordingly and adding a short comment about the new
module-level dependency for tests.

In `@bindu/settings.py`:
- Around line 965-972: The two Pydantic fields requests_per_window and
window_seconds must be validated to be positive integers; update their
definitions to enforce gt=0 (e.g., use conint(gt=0) or Field(..., gt=0)) or add
a Pydantic `@validator` on the model that raises a ValueError when either value is
<= 0, and include a clear message referencing the offending field
(requests_per_window / window_seconds) so invalid configs are rejected early.

In `@tests/unit/server/middleware/test_rate_limit.py`:
- Around line 68-105: Add unit tests in TestRateLimitMiddleware to cover
X-Forwarded-For parsing and request.client being None: create tests that call
_make_app and TestClient with headers={"X-Forwarded-For": "1.2.3.4, 5.6.7.8"}
(and variations with whitespace and multiple IPs) asserting the middleware uses
the first IP so subsequent requests from different forwarded IPs behave
independently, and add a test that simulates request.client == None (e.g., by
building a test ASGI scope or using TestClient override) to ensure the rate
limiter falls back safely (no exception) and enforces limits per behavior;
reference TestRateLimitMiddleware, _make_app, and TestClient when adding these
tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b5802a8c-c156-465c-9a16-8e71d7b5df82

📥 Commits

Reviewing files that changed from the base of the PR and between 45656ee and 5e46a4d.

📒 Files selected for processing (7)
  • bindu/extensions/did/did_agent_extension.py
  • bindu/penguin/bindufy.py
  • bindu/server/applications.py
  • bindu/server/middleware/__init__.py
  • bindu/server/middleware/rate_limit.py
  • bindu/settings.py
  • tests/unit/server/middleware/test_rate_limit.py

Comment on lines +88 to +89
self._counters: dict[str, _SlidingWindowCounter] = {}
self._lock = threading.Lock()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Memory growth: _counters dict is never pruned.

The _counters dictionary grows unbounded as new client IPs are seen. In production with many unique IPs (or spoofed IPs), this will consume increasing memory over time.

Consider adding periodic cleanup of stale counters (e.g., counters with no requests in the last N windows).

🧹 Suggested approach: Periodic cleanup of stale counters
def _cleanup_stale_counters(self, now: float) -> None:
    """Remove counters that have been inactive for 2x the window period."""
    stale_cutoff = now - (self._window * 2)
    with self._lock:
        stale_keys = [
            ip for ip, counter in self._counters.items()
            if counter._timestamps and counter._timestamps[-1] < stale_cutoff
        ]
        for ip in stale_keys:
            del self._counters[ip]

Call this periodically (e.g., every 100 requests or via a background task).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/server/middleware/rate_limit.py` around lines 88 - 89, The _counters
dict in the rate limiter grows unbounded; add a cleanup routine to remove stale
per-IP _SlidingWindowCounter entries and call it periodically to prevent memory
growth. Implement a method (e.g., _cleanup_stale_counters(now: float)) that
acquires self._lock, computes a stale cutoff like now - (self._window * 2),
iterates self._counters items and deletes keys whose counter._timestamps last
entry is older than the cutoff (or counters with empty timestamps), and then
invoke this cleanup from the main request path periodically (e.g., every N
requests) or via a lightweight background task to prune inactive IPs.

Comment on lines +106 to +113
def _client_ip(self, request: Request) -> str:
"""Extract client IP, respecting X-Forwarded-For for proxied requests."""
forwarded_for = request.headers.get("x-forwarded-for")
if forwarded_for:
return forwarded_for.split(",")[0].strip()
if request.client:
return request.client.host
return "unknown"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Security: X-Forwarded-For header can be spoofed to bypass rate limiting.

Without a trusted proxy layer that strips/validates X-Forwarded-For, malicious clients can set arbitrary values to evade per-IP rate limits. Each spoofed IP gets its own counter.

Consider:

  1. Document the requirement for a trusted reverse proxy (nginx, Traefik, etc.) that sets X-Forwarded-For correctly and strips client-provided values.
  2. Add a configuration option like trust_proxy: bool to control whether to use X-Forwarded-For.
  3. Use Starlette's ProxyHeadersMiddleware upstream to handle this correctly.
🛡️ Option: Add trust_proxy configuration
     def __init__(
         self,
         app: Any,
         requests_per_window: int,
         window_seconds: int,
         exempt_paths: frozenset[str] = _DEFAULT_EXEMPT_PATHS,
+        trust_proxy: bool = False,
     ) -> None:
         super().__init__(app)
         self._limit = requests_per_window
         self._window = window_seconds
         self._exempt = exempt_paths
+        self._trust_proxy = trust_proxy
         self._counters: dict[str, _SlidingWindowCounter] = {}
         self._lock = threading.Lock()
 
     def _client_ip(self, request: Request) -> str:
         """Extract client IP, respecting X-Forwarded-For only if trusted."""
-        forwarded_for = request.headers.get("x-forwarded-for")
-        if forwarded_for:
-            return forwarded_for.split(",")[0].strip()
+        if self._trust_proxy:
+            forwarded_for = request.headers.get("x-forwarded-for")
+            if forwarded_for:
+                return forwarded_for.split(",")[0].strip()
         if request.client:
             return request.client.host
         return "unknown"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bindu/server/middleware/rate_limit.py` around lines 106 - 113, The _client_ip
function currently trusts the X-Forwarded-For header which can be spoofed; add a
configurable safeguard by introducing a trust_proxy boolean (e.g., in the rate
limiter class constructor) and only read X-Forwarded-For when trust_proxy is
true, otherwise fallback to request.client.host; update docs/comments to require
a trusted reverse proxy that strips client-supplied X-Forwarded-For and
recommend using Starlette’s ProxyHeadersMiddleware upstream to validate/properly
populate forwarded headers.

- Add trust_proxy config to prevent X-Forwarded-For spoofing (default: false)
  * Only trusts X-Forwarded-For when behind validated reverse proxy
  * Falls back to request.client.host when trust_proxy is disabled
  * Documented in RateLimitSettings and RateLimitMiddleware

- Add periodic cleanup of stale counters to prevent unbounded memory growth
  * Removes counters inactive for 2x window period
  * Configurable via request_cleanup_interval (default: 100 requests)
  * Runs every N requests to minimize overhead

- Add comprehensive test coverage for both features
  * test_cleanup_stale_counters: verifies stale counter removal
  * test_trust_proxy_*: three tests for proxy header handling

Fixes potential security issue of header spoofing and memory leak from
accumulating per-IP counters with many unique/spoofed IPs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Add per-agent rate limiting middleware to prevent abuse and ensure fair usage

1 participant