Skip to content

feat: auto-detect backend from environment variables#88

Open
27Bslash6 wants to merge 3 commits intomainfrom
feat/auto-detect-backend-from-env
Open

feat: auto-detect backend from environment variables#88
27Bslash6 wants to merge 3 commits intomainfrom
feat/auto-detect-backend-from-env

Conversation

@27Bslash6
Copy link
Copy Markdown
Contributor

@27Bslash6 27Bslash6 commented Apr 6, 2026

Summary

Closes #87

  • DefaultBackendProvider now auto-detects the cache backend from CACHEKIT_-prefixed environment variables instead of always defaulting to Redis
  • Raises ConfigurationError when multiple CACHEKIT_-prefixed backend vars are set concurrently (ambiguous config)
  • Non-prefixed REDIS_URL never conflicts — it's a 12-factor fallback that yields to any explicit CACHEKIT_ var

Priority chain: CACHEKIT_API_KEYCACHEKIT_REDIS_URLCACHEKIT_MEMCACHED_SERVERSCACHEKIT_FILE_CACHE_DIRREDIS_URL → None (L1-only)

Unchanged: explicit backend= parameter and set_default_backend() always take precedence over env-var detection.

Test plan

  • 15 new unit tests covering all detection paths, conflict errors, fallback, caching, and error messages
  • 1341 existing unit tests still pass
  • Ruff lint + format clean
  • basedpyright clean
  • Pre-commit hooks (including detect-secrets) pass

Summary by CodeRabbit

  • New Features

    • Environment-driven cache backend auto-detection with lazy one-time resolution and cached result.
    • Async operations automatically fall back to L1-only when no L2 backend is available.
  • Bug Fixes

    • Sync mode now reports a clear permanent backend error when no backend is configured.
    • Detects and errors on ambiguous / multiple backend configurations.
  • Tests

    • Expanded tests covering auto-detection, env-var mappings, fallbacks, reuse, conflict errors, and L1-only behavior.
  • Chores

    • CI workflow steps updated to install and select Rust toolchains/tooling via direct commands.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 6, 2026

📝 Walkthrough

Walkthrough

DefaultBackendProvider now auto-detects the L2 backend from CACHEKIT_* or REDIS_URL env vars, can return None for L1-only mode, and raises ConfigurationError on ambiguous backend env combinations. Wrapper behavior changed: sync fails fast with a permanent BackendError when no L2, async executes original function (L1-only). .secrets.baseline updated.

Changes

Cohort / File(s) Summary
Secrets Baseline
\.secrets\.baseline
Added detect_secrets.filters.common.is_baseline_file filter for the baseline filename, added an unverified Secret Keyword baseline entry for src/cachekit/backends/provider.py:139, and updated generated_at.
Backend Provider Auto-Detection
src/cachekit/backends/provider.py
Refactored DefaultBackendProvider for lazy env-driven resolution via _resolve_provider(), supports CachekitIO/Redis/Memcached/File selection, treats REDIS_URL as Redis fallback, enforces ambiguity errors (raises ConfigurationError when multiple backend envs set), caches resolution, may return None for L1-only, and introduced _StaticBackendProvider plus per-backend factory helpers.
Wrapper Control Flow
src/cachekit/decorators/wrapper.py
Sync wrapper now raises BackendError(..., PERMANENT) when L2 backend resolution yields None. Async wrapper now returns early and runs the original coroutine (skips L2) when backend is None.
Provider Tests
tests/unit/backends/test_provider.py
Reworked tests to assert lazy resolution state and added TestDefaultBackendProviderAutoDetect covering single-env detection (CachekitIO/Redis/Memcached/File), REDIS_URL fallback, L1-only (None) behavior, cached resolution, and conflict cases raising ConfigurationError. Added necessary imports and assertions.
L1-only Test Adjustment
tests/unit/test_l1_only_mode.py
Wraps CACHEKIT_MASTER_KEY env var setup/teardown around @cache.secure(..., backend=None) to ensure required config is present while asserting L1-only behavior.
Workflows (Rust toolchain & tools)
.github/workflows/...security-*.yml, .github/workflows/fuzz-smoke.yml
Replaced dtolnay/rust-toolchain and some third-party action usages with inline rustup toolchain install/default steps; replaced Kani action with explicit install and cargo kani invocation in workflow.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client Code
    participant Provider as DefaultBackendProvider
    participant Env as Env Vars
    participant Resolver as _resolve_provider()
    participant Factory as Backend Factories

    Client->>Provider: get_backend()
    alt first call (_resolved = False)
        Provider->>Resolver: _resolve_provider()
        Resolver->>Env: inspect CACHEKIT_API_KEY, CACHEKIT_REDIS_URL, REDIS_URL, CACHEKIT_MEMCACHED_SERVERS, CACHEKIT_FILE_CACHE_DIR
        alt CACHEKIT_API_KEY set
            Resolver->>Factory: create CachekitIOBackend
            Factory-->>Resolver: CachekitIO instance
        else if CACHEKIT_REDIS_URL or REDIS_URL set
            Resolver->>Factory: create RedisBackend
            Factory-->>Resolver: Redis instance
        else if CACHEKIT_MEMCACHED_SERVERS set
            Resolver->>Factory: create MemcachedBackend
            Factory-->>Resolver: Memcached instance
        else if CACHEKIT_FILE_CACHE_DIR set
            Resolver->>Factory: create FileBackend
            Factory-->>Resolver: File instance
        else no backend vars
            Resolver-->>Provider: None
        end
        alt multiple conflicting vars
            Resolver-->>Provider: raise ConfigurationError
        end
        Provider->>Provider: cache result, set _resolved = True
    else cached (_resolved = True)
        Provider->>Provider: return cached backend or None
    end
    Provider-->>Client: backend instance or None
Loading
sequenceDiagram
    participant User as User Code
    participant Sync as sync_wrapper
    participant Async as async_wrapper
    participant L1 as L1 Cache
    participant L2 as L2 Backend / Provider

    User->>Sync: call decorated function
    Sync->>L2: resolve backend (get_backend)
    alt backend is None (L1-only)
        Sync->>Sync: raise BackendError(PERMANENT)
        Sync-->>User: error
    else backend resolved
        Sync->>L1: check L1
        alt L1 hit
            L1-->>User: return value
        else L1 miss
            Sync->>L2: query L2
            L2-->>Sync: value
            Sync->>L1: store
            Sync-->>User: return value
        end
    end

    User->>Async: call decorated coroutine
    Async->>L2: resolve backend (get_backend)
    alt backend is None (L1-only)
        Async->>L1: check L1
        alt L1 hit
            L1-->>User: return value
        else L1 miss
            Async->>User: execute original coroutine (skip L2)
            User-->>Async: result
            Async->>L1: store
            Async-->>User: return value
        end
    else backend resolved
        Async->>L1: check L1
        alt L1 hit
            L1-->>User: return value
        else L1 miss
            Async->>L2: query L2
            L2-->>Async: value
            Async->>L1: store
            Async-->>User: return value
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I sniffed the env vars, one by one,
CACHEKIT whispers, Redis hums a drum,
If rules collide I thump my paw,
Async skips, sync raises awe—hop hurrah! 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main feature: auto-detecting cache backend from environment variables, which is the core objective of this PR.
Description check ✅ Passed The PR description covers summary, motivation via issue closure, test results, and mentions commit topics, but misses explicit Type of Change and Documentation Validation checkboxes from the template.
Linked Issues check ✅ Passed The PR fully implements the auto-detection logic per issue #87: environment variable priority chain, ConfigurationError for ambiguous configs, and precedence rules all addressed in code and tests.
Out of Scope Changes check ✅ Passed Workflow file updates (GitHub Actions Rust toolchain replacements) are tangential to the auto-detection feature but are justified as pre-commit hook improvements and tooling modernization, not core out-of-scope additions.
Docstring Coverage ✅ Passed Docstring coverage is 80.65% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/auto-detect-backend-from-env

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 6, 2026

Codecov Report

❌ Patch coverage is 92.42424% with 5 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/cachekit/decorators/wrapper.py 0.00% 2 Missing and 2 partials ⚠️
src/cachekit/backends/provider.py 98.38% 0 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/cachekit/backends/provider.py (1)

149-164: Minor thread-safety consideration for concurrent initialization.

The lazy initialization pattern (if not self._resolved) is not thread-safe. Concurrent threads could race and both enter _resolve_provider(). Since _resolve_provider() is idempotent (reads env vars, creates provider), this is benign—at worst, duplicate providers are created momentarily.

If strict single-initialization is required in the future, consider adding a lock:

♻️ Optional: thread-safe initialization
+import threading
+
 class DefaultBackendProvider(BackendProviderInterface):
+    _init_lock = threading.Lock()
+
     def get_backend(self):
-        if not self._resolved:
-            self._provider = self._resolve_provider()
-            self._resolved = True
+        if not self._resolved:
+            with self._init_lock:
+                if not self._resolved:  # double-check
+                    self._provider = self._resolve_provider()
+                    self._resolved = True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cachekit/backends/provider.py` around lines 149 - 164, The lazy init in
get_backend() can race since it checks self._resolved and calls
_resolve_provider() without synchronization; to make initialization thread-safe,
add a lock (e.g., self._init_lock = threading.Lock()) and wrap the resolve
sequence in get_backend() with the lock so only one thread runs
self._resolve_provider() and sets self._provider/self._resolved, leaving other
threads to read the already-set self._provider; update any constructor (or class
init) to create the lock and ensure get_backend() uses it around the if not
self._resolved / self._resolve_provider() block.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/cachekit/backends/provider.py`:
- Around line 149-164: The lazy init in get_backend() can race since it checks
self._resolved and calls _resolve_provider() without synchronization; to make
initialization thread-safe, add a lock (e.g., self._init_lock =
threading.Lock()) and wrap the resolve sequence in get_backend() with the lock
so only one thread runs self._resolve_provider() and sets
self._provider/self._resolved, leaving other threads to read the already-set
self._provider; update any constructor (or class init) to create the lock and
ensure get_backend() uses it around the if not self._resolved /
self._resolve_provider() block.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7641c536-c1e2-402b-a7a5-6a7313b97b12

📥 Commits

Reviewing files that changed from the base of the PR and between 575c047 and b603be2.

📒 Files selected for processing (4)
  • .secrets.baseline
  • src/cachekit/backends/provider.py
  • src/cachekit/decorators/wrapper.py
  • tests/unit/backends/test_provider.py

@27Bslash6 27Bslash6 force-pushed the feat/auto-detect-backend-from-env branch from b603be2 to 5dd48df Compare April 6, 2026 22:49
DefaultBackendProvider now resolves the cache backend from CACHEKIT_-prefixed
environment variables instead of always defaulting to Redis. Raises
ConfigurationError when multiple CACHEKIT_ backend vars are set concurrently.

Priority: CACHEKIT_API_KEY > CACHEKIT_REDIS_URL > CACHEKIT_MEMCACHED_SERVERS >
CACHEKIT_FILE_CACHE_DIR > REDIS_URL (12-factor fallback) > None (L1-only).

Non-prefixed REDIS_URL never conflicts with CACHEKIT_ vars.
@27Bslash6 27Bslash6 force-pushed the feat/auto-detect-backend-from-env branch from 5dd48df to 31fc1c1 Compare April 8, 2026 10:49
validate_encryption_config() checks the env var independently of the
inline master_key param passed to @cache.secure. The test was flaky
(68% fail rate on main) because it depended on whether CACHEKIT_MASTER_KEY
happened to be set from a prior test's singleton cache.
The model-checking/kani-github-action composite action internally uses
unpinned dtolnay/rust-toolchain@stable and actions/checkout@v3, which
violates the org's SHA-pinning policy and blocks Security Deep runs.

Replace all dtolnay/rust-toolchain uses with plain rustup commands
(already the pattern in fuzzing/miri/sanitizer jobs) and replace the
Kani composite action with cargo install kani-verifier + cargo kani.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
tests/unit/test_l1_only_mode.py (1)

236-253: Consider using pytest's monkeypatch fixture for env var management.

The manual os.environ save/restore via try/finally is correct, but pytest's monkeypatch fixture handles this more concisely and is consistent with idiomatic pytest usage (auto-restores on teardown, including on exceptions raised outside the try).

♻️ Proposed refactor
-    def test_intent_presets_with_backend_none(self):
+    def test_intent_presets_with_backend_none(self, monkeypatch):
@@
             # Test `@cache.secure`(master_key="...", backend=None)
             # validate_encryption_config() checks CACHEKIT_MASTER_KEY env var
             # independently of the inline master_key param, so we must set it.
             secure_call_count = 0
-            old_key = os.environ.get("CACHEKIT_MASTER_KEY")
-            os.environ["CACHEKIT_MASTER_KEY"] = "a" * 64
-            try:
-
-                `@cache.secure`(master_key="a" * 64, backend=None)
-                def secure_func() -> str:
-                    nonlocal secure_call_count
-                    secure_call_count += 1
-                    return "secure"
-
-                assert secure_func() == "secure"
-                assert secure_func() == "secure"
-                assert secure_call_count == 1, f"@cache.secure L1 miss - called {secure_call_count} times"
-            finally:
-                if old_key is None:
-                    os.environ.pop("CACHEKIT_MASTER_KEY", None)
-                else:
-                    os.environ["CACHEKIT_MASTER_KEY"] = old_key
+            monkeypatch.setenv("CACHEKIT_MASTER_KEY", "a" * 64)
+
+            `@cache.secure`(master_key="a" * 64, backend=None)
+            def secure_func() -> str:
+                nonlocal secure_call_count
+                secure_call_count += 1
+                return "secure"
+
+            assert secure_func() == "secure"
+            assert secure_func() == "secure"
+            assert secure_call_count == 1, f"@cache.secure L1 miss - called {secure_call_count} times"

With this change, the import os added at line 17 can also be dropped if it's not used elsewhere in the file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/test_l1_only_mode.py` around lines 236 - 253, Replace manual
os.environ save/restore with pytest's monkeypatch: in the test that sets
CACHEKIT_MASTER_KEY and defines secure_func (the block using
os.environ.get("CACHEKIT_MASTER_KEY"), os.environ["CACHEKIT_MASTER_KEY"]="a"*64,
and the try/finally), use the monkeypatch fixture to call
monkeypatch.setenv("CACHEKIT_MASTER_KEY", "a"*64) before defining
`@cache.secure-decorated` secure_func and remove the try/finally cleanup; also
remove the now-unused import os at the top of the file if it is no longer
referenced.
.github/workflows/security-deep.yml (1)

24-27: Pin the Kani verifier version for reproducible security runs.

cargo install --locked kani-verifier installs the latest published kani-verifier at workflow runtime. Since this is a scheduled daily security gate, pin a known-good Kani version and update it intentionally to avoid silent verifier changes.

♻️ Example adjustment
     - name: Install Kani verifier
       run: |
-        cargo install --locked kani-verifier
+        cargo install --locked kani-verifier --version "$KANI_VERSION"
         cargo kani setup

Define KANI_VERSION at the workflow or job level with the version validated for this repo.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/security-deep.yml around lines 24 - 27, The workflow
currently installs the latest kani-verifier at runtime; define and use a pinned
KANI_VERSION variable and update the "Install Kani verifier" step to install
that specific version (e.g., use KANI_VERSION when invoking cargo install
--locked kani-verifier) and run cargo kani setup afterwards; add KANI_VERSION at
workflow or job level so future updates are intentional and reproducible.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.github/workflows/security-deep.yml:
- Around line 24-27: The workflow currently installs the latest kani-verifier at
runtime; define and use a pinned KANI_VERSION variable and update the "Install
Kani verifier" step to install that specific version (e.g., use KANI_VERSION
when invoking cargo install --locked kani-verifier) and run cargo kani setup
afterwards; add KANI_VERSION at workflow or job level so future updates are
intentional and reproducible.

In `@tests/unit/test_l1_only_mode.py`:
- Around line 236-253: Replace manual os.environ save/restore with pytest's
monkeypatch: in the test that sets CACHEKIT_MASTER_KEY and defines secure_func
(the block using os.environ.get("CACHEKIT_MASTER_KEY"),
os.environ["CACHEKIT_MASTER_KEY"]="a"*64, and the try/finally), use the
monkeypatch fixture to call monkeypatch.setenv("CACHEKIT_MASTER_KEY", "a"*64)
before defining `@cache.secure-decorated` secure_func and remove the try/finally
cleanup; also remove the now-unused import os at the top of the file if it is no
longer referenced.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b5ebb031-57aa-4d50-a997-198c555d0202

📥 Commits

Reviewing files that changed from the base of the PR and between 31fc1c1 and c3e5032.

📒 Files selected for processing (5)
  • .github/workflows/fuzz-smoke.yml
  • .github/workflows/security-deep.yml
  • .github/workflows/security-fast.yml
  • .github/workflows/security-medium.yml
  • tests/unit/test_l1_only_mode.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: auto-detect backend from environment variables in DefaultBackendProvider

1 participant