feat(session-server): multi-backend vanilla + per-request observability by rmfan · Pull Request #33 · LLM360/miles

rmfan · 2026-06-05T23:52:33Z

Summary

N independent vanilla session-server processes on the SGLang gateway node + callee-side random pick per session + structured per-request observability with timing buckets + per-worker stats heartbeat.

Scope-change note: the lock-restoration + DELETE cancellation channel + SessionStateConflictError 409 surface that originally shipped with this PR have been split out to #38 per offline review. This PR is now strictly multi-backend + logging; it does NOT change the lock holding pattern.

What's in this PR

Multi-backend orchestration

miles/ray/rollout.py — spawn N session-server subprocesses on the gateway node
miles/utils/arguments.py — session_server_port arg threading
miles/rollout/generate_utils/openai_endpoint_utils.py — callee-side random pick per session
miles/rollout/session/session_server.py — _stats_logger_loop heartbeat (every 30s; reqs_total / reqs_since_last / inflight_now / turns_completed / rss_mb / vms_mb); pid=<n> prefix on every record in each subprocess via logging.basicConfig(force=True)

Per-request observability (`sessions.py`)

chat_start / chat_done structured logs with req_id (uuid8) and timing buckets: lock_wait_ms, tokenize_in_ms, proxy_elapsed_ms, tokenize_out_ms, total_ms, inflight_now, prompt_tokens, completion_tokens. One INFO log per request, success or short-circuit.
Per-worker _stats (reqs_total, turns_completed, inflight) + _worker_stats registry keyed by session_server_port.
Enriched state_changed_during_proxy warning with worker_port, inflight_chat_count, caller_request_id (x-request-id), proxy_elapsed_ms — grep-correlatable with Layer-1 client retries.
prepare_pretokenized + update_pretokenized_state run in asyncio.to_thread so the event loop isn't blocked by sync TITO tokenizer calls (~40% off p99 in microbench).
proxy_transport_error warning includes url + elapsed_ms.

Dependencies

psutil hard-pinned in requirements.txt (consumed by the stats heartbeat; no graceful fallback — we want memory metrics in all environments). /cc @nightlessbaron

What's NOT in this PR

Moved to #38:

Restoring session.lock across Phase 1+2+3 (currently split-lock on prod and unchanged here)
DELETE preemption via session.current_proxy_task.cancel() cancellation channel
SessionStateConflictError (409) defense-in-depth for the Phase-3 commit guard
current_proxy_task field on LinearTrajectory
Two race-tests updates that encode the lock-restored DELETE-preempts-chat contract

Test plan

Race tests baseline preserved: 5 pass / 4 fail (same as prod). The 4 failures are the documented get_or_create_session auto-create-after-delete behaviour, unrelated to this PR.
CI green
Multi-backend stub bench (already posted: 12× peak throughput at wc=8 vs wc=1 in the comment above)

Observable behavior changes

None compared to prod's chat-flow contract — same split-lock structure, same DELETE-waits-for-lock semantics. The only new public surface is structured logging.

🤖 Generated with Claude Code

rmfan · 2026-06-06T00:29:08Z

Pushed ca684017 adding logging.info-only observability (no Prometheus). Five additions:

chat_start (sessions.py) — handler entry, includes worker_port, session_id, req_id (uuid8), messages_len, inflight_before.
chat_done (sessions.py) — handler exit (finally, so fires on success/410/404/upstream-error), includes lock_wait_ms, tokenize_in_ms, proxy_elapsed_ms, tokenize_out_ms, total_ms, inflight_now, prompt_tokens, completion_tokens. req_id matches the chat_start so they grep-pair.
Per-worker stats heartbeat (session_server.py) — 30s asyncio background task started via FastAPI on_event("startup"), cancelled on shutdown. Logs reqs_total / reqs_since_last / inflight_now / turns_completed + rss_mb / vms_mb via psutil. psutil import is lazy with graceful fallback so N=1 envs without it still work.
state_changed_during_proxy — enriched the defensive race-detection branch (the diagnostic-gap that bit the cursor-mismatch RCA) with inflight_chat_count, caller_request_id (x-request-id), worker_port, proxy_elapsed_ms. The Phase 1+2+3 lock-held design closes the race window, but the assert is kept; if a future split-lock regression reopens it, this log now correlates server-side races with Layer-1 client retries.
proxy_transport_error — added url and elapsed_ms to the existing httpx.TransportError warning.

Sample chat_done:

[session-server] chat_done worker_port=None session_id=1f353857... req_id=c3d2dd4a lock_wait_ms=0.0 tokenize_in_ms=0.9 proxy_elapsed_ms=111.8 tokenize_out_ms=0.8 total_ms=115.5 inflight_now=7 prompt_tokens=-1 completion_tokens=2

(prompt_tokens=-1 is the upstream-not-returning case from the test mock; in production SGLang populates it.)

Tests: pytest tests/fast/router/test_session_race_conditions.py -v --noconftest — same 5 pass / 4 fail as pre-existing (the 4 failures are the documented get_or_create_session auto-create-after-delete behaviour, unrelated to this PR). Import-clean: from miles.rollout.session import session_server → OK.

Diff stat (~250 LoC, all in two files):

miles/rollout/session/session_server.py | 117 +++++++++++++++++++++-
miles/rollout/session/sessions.py       | 137 +++++++++++++++++++++++++++-
2 files changed, 250 insertions(+), 4 deletions(-)

Staying as draft.

rmfan · 2026-06-06T01:00:17Z

running end-to-end tests on it now

rmfan · 2026-06-06T01:47:05Z

Bench results (multibackend stub rig) + logging validation

1. Bench results

Rig: scripts/orchestrate_multibackend.py against STUB_LATENCY_SEC=0.5, STUB_WORKERS=4, DURATION_SEC=45, WARMUP_SEC=5, TURNS=6, LOADGEN_PROCS=8. Numbers below are from summary_multibackend.csv.

worker count (wc)	N	tput (r/s)	cli p50	cli p99	srv p50	srv p99
1	100	66.0	1.334	5.366	1.238	5.312
1	200	53.9	4.512	6.666	4.186	6.611
1	400	66.3	6.139	9.980	5.579	9.327
1	800	60.9	8.631	29.795	7.758	29.317
1	1600	62.2	8.481	31.992	7.349	30.957
2	100	204.8	0.515	0.682	0.513	0.670
2	200	109.6	0.605	6.006	0.565	5.867
2	400	106.7	4.567	6.693	4.162	6.422
2	800	136.7	6.122	11.727	5.578	10.106
2	1600	135.6	8.205	29.799	7.481	29.000
4	100	207.8	0.510	0.575	0.508	0.566
4	200	406.6	0.517	0.691	0.515	0.678
4	400	225.0	1.132	6.140	0.981	5.970
4	800	202.3	4.783	6.605	4.360	6.280
4	1600	261.7	6.291	10.909	5.740	10.512
8	100	208.4	0.510	0.560	0.508	0.558
8	200	415.2	0.511	0.566	0.509	0.562
8	400	795.1	0.527	0.725	0.522	0.707
8	800	444.1	1.091	6.054	0.931	5.848
8	1600	415.6	4.556	6.760	4.122	6.504

Peak throughput per backend count:

wc	peak r/s	at concurrency	speedup vs wc=1
1	66	—	1.0x
2	205	100	3.1x
4	407	200	6.2x
8	795	400	12.0x

Throughput scales near-linearly with backend count once the loadgen-side concurrency is high enough to keep all backends fed. The drop-off after the peak (e.g. wc=8 falling from 795 -> 444 at N=800) is the stub's STUB_WORKERS=4 per backend saturating and triggering tail-latency / ReadError timeouts (see logging section).

Comparison vs PR #31's router rig at wc=4, in-flight=400:

path	r/s
through router (GIL-bound)	~51
sticky direct-to-backend (bypass)	~284
multibackend (this PR)	225

Multibackend sits much closer to the bypass path than the router path. The remaining gap to 284 is plausibly loadgen-side overhead (8 loadgen procs here vs 4 in PR #31) — to verify, re-run with LOADGEN_PROCS=4. This is consistent with the router being the GIL bottleneck PR #31 identified; removing it via multi-backend recovers most of the bypass-path throughput. Stated as a hypothesis pending an apples-to-apples loadgen check.

2. Logging validation (commit `ca684017`)

Source: ~/multibacked_logs/multibackend/multibackend_wc{1,2,4,8}.log.

file	lines	chat_start	chat_done	stats	proxy_transport_error
wc1	124,983	16,349	16,349	10	24
wc2	250,551	34,654	34,654	20	31
wc4	448,353	62,718	62,718	40	37
wc8	763,233	107,844	107,844	80	51

chat_done counts are 6-18% above the client-counted totals from summary_multibackend.csv (~13.9k / 31.2k / 58.7k / 102.5k), which matches the expected n_server > n_client gap from mid-window cutoffs.

worker_port populated correctly — distinct ports per file matches backend count:

wc1: {8910} (1)
wc2: {8911, 8912} (2)
wc4: {8910, 8913, 8914, 8915} (4)
wc8: {8911, 8912, 8916, 8917, 8918, 8919, 8920, 8921} (8)
Zero occurrences of worker_port=None across all files.

stats cadence: exactly 10 stats lines per backend per file (10 / 20 / 40 / 80 = N × 10). Matches ~5-min bench duration at 30s interval. rss_mb / vms_mb populated — psutil import did not silently fail.

Sample chat_done (verbatim, wc=8):

2026-06-06 01:31:41,332 INFO miles.rollout.session.sessions [session-server] chat_done worker_port=8912 session_id=0fc8977e42474e60b613dedf48633aca req_id=7bc6c541 lock_wait_ms=0.0 tokenize_in_ms=2.8 proxy_elapsed_ms=485.2 tokenize_out_ms=3.7 total_ms=494.2 inflight_now=60 prompt_tokens=1700 completion_tokens=200

lock_wait_ms=0.0 (no contention in steady state — lock-restore working)
proxy_elapsed_ms=485.2 (≈ stub's 0.5s, as expected)
tokenize_in_ms + tokenize_out_ms < 10ms each
inflight_now integer

Sample stats (verbatim, wc=4):

2026-06-06 01:29:06,041 INFO miles.rollout.session.session_server [session-server] stats worker_port=8913 reqs_total=15874 reqs_since_last=1911 inflight_now=332 turns_completed=15535 rss_mb=840 vms_mb=17618

All fields populated. RSS grew from 534 MB at start to 840 MB after ~15k requests — bounded.

No state changed during proxy warnings observed across any file (0 / 0 / 0 / 0). Confirms the lock-restore via cancellation channel closes the race under load (≤ 1600 concurrent, 8 backends, ~108k requests).

No tracebacks or unhandled exceptions. The only WARNING-level lines are proxy_transport_error (24 / 31 / 37 / 51 occurrences) — all error_type=ReadError against the stub at http://127.0.0.1:8801. These appear in clusters during the higher-N rows where the stub's 4-worker pool saturates and connections back up; the enrichment fields (url, elapsed_ms) added by this PR are populated. On those requests chat_done correctly emits prompt_tokens=-1 completion_tokens=-1 sentinels (Phase 3 didn't run). Behavior matches the design.

3. Operational guide

Rule of thumb: pick wc (backend count) so that concurrency / wc ≈ 50-100. At that ratio the rig sits at peak throughput with sub-second p99 (wc=2 @ 100, wc=4 @ 200, wc=8 @ 400 all land here).
Below the ratio (under-filled): per-backend pool is idle → throughput leaves money on the table (e.g. wc=8 @ 100 only hits 208 r/s).
Above the ratio (over-filled): per-backend STUB_WORKERS=4 saturates → tail latency blows up and proxy_transport_error starts firing (e.g. wc=4 @ 1600: p99=10.5s, 37 ReadErrors in the log).
Caveat — stub vs real SGLang: the stub has constant 0.5s latency. Real SGLang has a heavy-tailed distribution, so production p99 will be larger than these numbers under-state. The relative shape (linear scaling with wc, the concurrency-per-backend sweet spot) should carry over; the absolute p99 ceilings will not.

Files

Bench rig: scripts/orchestrate_multibackend.py, scripts/run_multibackend_bench.sh
Logging surfaces: miles/rollout/session/sessions.py, miles/rollout/session/session_server.py
Logs validated: ~/multibacked_logs/multibackend/multibackend_wc{1,2,4,8}.log (187 files, ~1.6M lines)

@odp

Addresses two review comments on PR #33: 1. @odp comment on session_server.py:130 — psutil is now a hard requirement (added to requirements.txt). Dropped the try/except import and the `_proc is None` fallback branches in _stats_logger_loop; stats heartbeat is single-path with rss_mb / vms_mb always populated. /cc @nightlessbaron 2. @odp comment on session_server.py:182 — every log record in each session-server subprocess now carries pid=<n>, via a one-shot logging.basicConfig(..., format="... pid=%(process)d ...", force=True) at the top of run_session_server. Covers every logger.info in the module (handlers, stats heartbeat, race-detection warning) with no per-call-site edits. force=True overrides any handler config inherited from the parent (e.g. ray-set handlers). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…via cancellation channel PR #31 split session.lock around the SGLang proxy call to keep DELETE non-blocking while a chat was in flight. That split-lock window let two same-session writers commit interleaved assistant turns; the Phase-3 stale-update guard at sessions.py:241 silently dropped the second commit via `logger.warning(... skipping state update); return backend.build_proxy_response(result)` — the caller still received the 200 SGLang response and appended a phantom assistant record to its trajectory, while the server's session state didn't reflect that turn. The next chat then hit a cursor mismatch. Restore the original "one lock around Phase 1+2+3" invariant and move DELETE preemption from "no lock during proxy" to an explicit cancellation channel: chat stores its proxy task on session.current_proxy_task, DELETE cancels it (after setting session.closing), the chat coroutine catches CancelledError and returns 410 Gone (distinct from 404 "never existed" and 409 "retryable conflict"), then releases the lock so DELETE can acquire it and tear down. SessionStateConflictError (409) is added as defense-in-depth: with Phase 1+2+3 under one lock the Phase-3 stale-state branch is unreachable, but a future split-lock regression now surfaces a clear retryable 409 instead of a silently dropped commit. Two tests in test_session_race_conditions.py encoded the old "DELETE waits for chat to finish" contract and now expect the new "DELETE preempts → 410" behaviour: - test_double_delete_second_returns_404 - test_multiple_chats_queued_then_delete Pre-existing 4-failure baseline (get_or_create_session auto-create-after-delete tests) is unchanged: 5 pass / 4 fail. Split out of #33 per offline review request — #33 keeps the multi-backend orchestration + observability work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Spawn N independent vanilla session-server processes on the SGLang gateway node; callee-side random pick per session. Adds structured per-request observability (chat_start / chat_done with timing buckets + token counts) and a per-worker stats heartbeat for memory/throughput. This PR is the safe (multi-backend + logging) subset of the original PR #33 design. The lock-restoration + DELETE cancellation channel + SessionStateConflictError 409 surface — which originally shipped in this PR — have been split out to a follow-up so they can be reviewed and landed independently. See the lock-restore PR for that work. Session-server changes: - Spawn N independent backends + callee-side pick (miles/ray/rollout.py, miles/utils/arguments.py) - chat_start / chat_done structured logs with req_id (uuid8) and timing buckets (lock_wait_ms, tokenize_in_ms, proxy_elapsed_ms, tokenize_out_ms, total_ms, inflight_now, prompt_tokens, completion_tokens). One INFO log per request, success or short-circuit. - Per-worker _stats (reqs_total, turns_completed, inflight) + _worker_stats registry keyed by session_server_port. Background _stats_logger_loop in session_server.py emits 30s heartbeat with rss_mb / vms_mb via psutil. - pid=<n> prefix on every record in each subprocess via logging.basicConfig in run_session_server (force=True overrides ray-set handlers). - Enriched state_changed_during_proxy warning with worker_port, inflight_chat_count, caller_request_id (x-request-id), proxy_elapsed_ms — grep-correlatable with Layer-1 client retries. - prepare_pretokenized + update_pretokenized_state run in asyncio.to_thread so the event loop isn't blocked by sync TITO tokenizer calls (independent perf win, ~40% off p99 in microbench). - proxy_transport_error warning includes url + elapsed_ms. Dependencies: - psutil hard-pinned in requirements.txt (consumed by the stats heartbeat; no graceful fallback — we want memory metrics in all environments). Race-tests baseline preserved: 5 pass / 4 fail (same as prod). The 4 failures are the documented get_or_create_session auto-create-after-delete behaviour, unrelated to this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Apply formatter fixes flagged by CI's pre-commit run. Three files inside this PR's scope had style-fixes (linear_trajectory.py, test_session_race_conditions.py, plus an unused-import sweep) and seven pre-existing style-debt files elsewhere in the repo that pre-commit's --all-files mode picked up. All changes are formatter output, no semantic edits. Same lint-fix bundle applied to PR #33 in commit e1b93a0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Spawn N independent vanilla session-server processes on the SGLang gateway node; callee-side random pick per session. Adds structured per-request observability (chat_start / chat_done with timing buckets + token counts) and a per-worker stats heartbeat for memory/throughput. This PR is the safe (multi-backend + logging) subset of the original PR #33 design. The lock-restoration + DELETE cancellation channel + SessionStateConflictError 409 surface — which originally shipped in this PR — have been split out to a follow-up so they can be reviewed and landed independently. See the lock-restore PR for that work. Session-server changes: - Spawn N independent backends + callee-side pick (miles/ray/rollout.py, miles/utils/arguments.py) - chat_start / chat_done structured logs with req_id (uuid8) and timing buckets (lock_wait_ms, tokenize_in_ms, proxy_elapsed_ms, tokenize_out_ms, total_ms, inflight_now, prompt_tokens, completion_tokens). One INFO log per request, success or short-circuit. - Per-worker _stats (reqs_total, turns_completed, inflight) + _worker_stats registry keyed by session_server_port. Background _stats_logger_loop in session_server.py emits 30s heartbeat with rss_mb / vms_mb via psutil. - pid=<n> prefix on every record in each subprocess via logging.basicConfig in run_session_server (force=True overrides ray-set handlers). - Enriched state_changed_during_proxy warning with worker_port, inflight_chat_count, caller_request_id (x-request-id), proxy_elapsed_ms — grep-correlatable with Layer-1 client retries. - prepare_pretokenized + update_pretokenized_state run in asyncio.to_thread so the event loop isn't blocked by sync TITO tokenizer calls (independent perf win, ~40% off p99 in microbench). - proxy_transport_error warning includes url + elapsed_ms. Dependencies: - psutil hard-pinned in requirements.txt (consumed by the stats heartbeat; no graceful fallback — we want memory metrics in all environments). Race-tests baseline preserved: 5 pass / 4 fail (same as prod). The 4 failures are the documented get_or_create_session auto-create-after-delete behaviour, unrelated to this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Apply formatter fixes flagged by CI's pre-commit run. Eleven files touched: four of them inside this PR's scope (generate_utils/openai_endpoint_utils.py, session/sessions.py, session/linear_trajectory.py, ray/rollout.py — single-line/spacing fixes only) and seven pre-existing style-debt files elsewhere in the repo that pre-commit's --all-files mode picked up. All changes are formatter output, no semantic edits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…via cancellation channel PR #31 split session.lock around the SGLang proxy call to keep DELETE non-blocking while a chat was in flight. That split-lock window let two same-session writers commit interleaved assistant turns; the Phase-3 stale-update guard at sessions.py:241 silently dropped the second commit via `logger.warning(... skipping state update); return backend.build_proxy_response(result)` — the caller still received the 200 SGLang response and appended a phantom assistant record to its trajectory, while the server's session state didn't reflect that turn. The next chat then hit a cursor mismatch. Restore the original "one lock around Phase 1+2+3" invariant and move DELETE preemption from "no lock during proxy" to an explicit cancellation channel: chat stores its proxy task on session.current_proxy_task, DELETE cancels it (after setting session.closing), the chat coroutine catches CancelledError and returns 410 Gone (distinct from 404 "never existed" and 409 "retryable conflict"), then releases the lock so DELETE can acquire it and tear down. SessionStateConflictError (409) is added as defense-in-depth: with Phase 1+2+3 under one lock the Phase-3 stale-state branch is unreachable, but a future split-lock regression now surfaces a clear retryable 409 instead of a silently dropped commit. Two tests in test_session_race_conditions.py encoded the old "DELETE waits for chat to finish" contract and now expect the new "DELETE preempts → 410" behaviour: - test_double_delete_second_returns_404 - test_multiple_chats_queued_then_delete Pre-existing 4-failure baseline (get_or_create_session auto-create-after-delete tests) is unchanged: 5 pass / 4 fail. Split out of #33 per offline review request — #33 keeps the multi-backend orchestration + observability work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Apply formatter fixes flagged by CI's pre-commit run. Three files inside this PR's scope had style-fixes (linear_trajectory.py, test_session_race_conditions.py, plus an unused-import sweep) and seven pre-existing style-debt files elsewhere in the repo that pre-commit's --all-files mode picked up. All changes are formatter output, no semantic edits. Same lint-fix bundle applied to PR #33 in commit e1b93a0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rmfan marked this pull request as ready for review June 6, 2026 01:50

rmfan requested a review from a team June 6, 2026 01:50

odp requested changes Jun 6, 2026

View reviewed changes

Comment thread miles/rollout/session/session_server.py Outdated

odp reviewed Jun 6, 2026

View reviewed changes

Comment thread miles/rollout/session/session_errors.py Outdated

odp requested changes Jun 6, 2026

View reviewed changes

Comment thread miles/rollout/session/session_server.py Outdated

rmfan mentioned this pull request Jun 9, 2026

session-server: run TITO tokenization off the event loop #34

Merged

rmfan mentioned this pull request Jun 9, 2026

fix(session-server): restore lock across Phase 1+2+3, preempt DELETE via cancellation channel #38

Draft

3 tasks

rmfan force-pushed the feat/session-server-multi-backend-vanilla branch from cdee1f0 to d079b36 Compare June 9, 2026 17:04

rmfan changed the title ~~feat(session-server): multi-backend vanilla + lock-restore via cancellation channel~~ feat(session-server): multi-backend vanilla + per-request observability Jun 9, 2026

rmfan force-pushed the feat/session-server-multi-backend-vanilla branch from b57a1ce to e1b93a0 Compare June 9, 2026 20:14

rmfan and others added 2 commits June 9, 2026 15:12

rmfan force-pushed the feat/session-server-multi-backend-vanilla branch from e1b93a0 to 577283b Compare June 9, 2026 22:12

nightlessbaron mentioned this pull request Jun 10, 2026

feat(session): wire aborted turns to provisional record #40

Open

odp approved these changes Jun 10, 2026

View reviewed changes

rmfan merged commit e69b51a into prod Jun 10, 2026
15 of 17 checks passed

rmfan deleted the feat/session-server-multi-backend-vanilla branch June 10, 2026 22:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(session-server): multi-backend vanilla + per-request observability#33

feat(session-server): multi-backend vanilla + per-request observability#33
rmfan merged 2 commits into
prodfrom
feat/session-server-multi-backend-vanilla

rmfan commented Jun 5, 2026 •

edited

Loading

Uh oh!

rmfan commented Jun 6, 2026

Uh oh!

rmfan commented Jun 6, 2026

Uh oh!

rmfan commented Jun 6, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

rmfan commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's in this PR

Multi-backend orchestration

Per-request observability (sessions.py)

Dependencies

What's NOT in this PR

Test plan

Observable behavior changes

Uh oh!

rmfan commented Jun 6, 2026

Uh oh!

rmfan commented Jun 6, 2026

Uh oh!

rmfan commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Bench results (multibackend stub rig) + logging validation

1. Bench results

2. Logging validation (commit ca684017)

3. Operational guide

Files

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rmfan commented Jun 5, 2026 •

edited

Loading

Per-request observability (`sessions.py`)

rmfan commented Jun 6, 2026 •

edited

Loading

2. Logging validation (commit `ca684017`)