feat: zero-copy full-tunnel mux + base64 off mux thread #881

Merged
therealaleph merged 1 commit into therealaleph:main from dazzling-no-more:feature/tunnel-mux-zero-copy
May 8, 2026

@dazzling-no-more
Contributor

Summary

Performance refactor of the full-tunnel mode hot data path. Two headline wins, both internal — wire protocol unchanged, no config or behavior changes.

1. Zero-copy reads via Bytes/BytesMut

tunnel_loop and the SOCKS5 UDP receive loop drop their per-iteration Vec::to_vec() copies. MuxMsg::{ConnectData,Data,UdpOpen,UdpData} now carry Bytes instead of Vec<u8>/Arc<Vec<u8>>; the Arc::try_unwrap dance for pending_client_data is gone (Bytes is already Arc-backed).

The TCP path is threshold-based to avoid a memory regression we identified in review:

  • n ≥ 32 KB (half-buffer streaming): BytesMut::split().freeze() — saves the 64 KB memcpy on hot downloads.
  • n < 32 KB (TLS records, HTTP/2 frames, idle keepalive): Bytes::copy_from_slice + buf.clear() — payload-sized retention, buffer reused. Without this split, bytes 1.x's whole-allocation refcount would pin a full 64 KB per queued tiny read under semaphore stall (worst case ~96 MB on a backpressured tunnel).
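The threshold rule above can be modelled without the bytes crate. The sketch below is a std-only illustration of the decision and of what each branch keeps alive while a message sits in the queue. The names `Handoff`, `choose_handoff`, and `retained_bytes` are hypothetical, and the 64 KB buffer size is inferred from the description rather than taken from the code.

```rust
/// Read-buffer capacity implied by the PR description (assumption: 64 KB).
const READ_BUF_CAP: usize = 64 * 1024;
/// Half-buffer threshold named in the PR description.
const ZERO_COPY_THRESHOLD: usize = 32 * 1024;

/// How a completed read of `n` bytes is handed to the mux (hypothetical type).
#[derive(Debug, PartialEq)]
enum Handoff {
    /// Models `BytesMut::split().freeze()`: no memcpy, but the whole
    /// allocation stays alive for as long as the message is queued.
    SplitFreeze,
    /// Models `Bytes::copy_from_slice` + `buf.clear()`: one memcpy,
    /// payload-sized retention, read buffer reused.
    CopyAndClear,
}

/// Picks the hand-off strategy for a read of `n` bytes.
fn choose_handoff(n: usize) -> Handoff {
    if n >= ZERO_COPY_THRESHOLD {
        Handoff::SplitFreeze
    } else {
        Handoff::CopyAndClear
    }
}

/// Bytes kept alive per queued message under the chosen strategy.
fn retained_bytes(n: usize) -> usize {
    match choose_handoff(n) {
        Handoff::SplitFreeze => READ_BUF_CAP,
        Handoff::CopyAndClear => n,
    }
}
```

Under this model a 100-byte keepalive retains 100 bytes instead of 64 KB, while a read of at least half the buffer wastes at most half of the allocation it pins.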

The UDP path applies the same lesson earlier: a fixed Vec<u8> recv buffer, with Bytes::copy_from_slice only after the 9 KB size guard. We tried recv_buf_from + split first, but it pinned the full ~65 KB datagram allocation behind every queued DNS reply.
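A rough std-only sketch of the guard-then-copy shape described for the UDP path. `Arc<[u8]>` stands in for `Bytes`, and `take_datagram` is a hypothetical helper. The exact guard value (9 * 1024 here) and the choice to drop oversized datagrams are assumptions, not taken from the code.

```rust
use std::sync::Arc;

/// Size of the fixed, reused receive buffer (assumption: max UDP datagram).
const RECV_BUF_LEN: usize = 65_535;
/// Size guard from the description; the exact byte value is an assumption.
const MAX_UDP_PAYLOAD: usize = 9 * 1024;

/// Copies the first `n` received bytes into a payload-sized, reference-counted
/// buffer, but only after the size guard has passed. Returns `None` for
/// oversized datagrams (dropping them is an assumption) or an out-of-range `n`.
fn take_datagram(recv_buf: &[u8], n: usize) -> Option<Arc<[u8]>> {
    if n > MAX_UDP_PAYLOAD {
        return None;
    }
    // The queued message owns exactly `n` bytes; `recv_buf` is free for reuse.
    recv_buf.get(..n).map(|payload| Arc::from(payload))
}
```

Because the copy happens after the guard, a queued DNS reply holds only its own bytes and the receive buffer is never shared with the queue.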

2. Base64 encoding moved off the single mux thread

A new internal PendingOp { data: Option<Bytes>, encode_empty: bool } flows through mux_loop with raw bytes. The actual B64.encode(...) happens in fire_batch's spawned task, after the per-deployment semaphore permit is acquired. Up to ~3 MB of encoding per batch (50 ops × 64 KB) no longer runs serially on the single mux task.
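A minimal std-only sketch of the deferred-encoding idea: raw bytes travel in `PendingOp`, and base64 is produced only inside the worker that `fire_batch` spawns. Several parts are stand-ins rather than the PR's code: `Arc<[u8]>` replaces `Bytes`, a std thread replaces the spawned task, the semaphore is omitted, `b64_encode` replaces `B64.encode`, and `encode_pending` returns `Option<String>` instead of a `BatchOp`.

```rust
use std::sync::Arc;
use std::thread;

/// Stand-in for the PR's `PendingOp`, with `Arc<[u8]>` in place of `Bytes`.
struct PendingOp {
    data: Option<Arc<[u8]>>,
    encode_empty: bool,
}

/// Minimal standard-alphabet base64 with padding (stand-in for `B64.encode`).
fn b64_encode(input: &[u8]) -> String {
    const TABLE: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(TABLE[(n >> 18) as usize & 63] as char);
        out.push(TABLE[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { TABLE[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { TABLE[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Encoding contract: non-empty data is encoded, empty data with
/// `encode_empty` set becomes `Some("")`, anything else empty is `None`.
fn encode_pending(p: &PendingOp) -> Option<String> {
    match p.data.as_deref() {
        Some(d) if !d.is_empty() => Some(b64_encode(d)),
        _ if p.encode_empty => Some(String::new()),
        _ => None,
    }
}

/// Encodes a whole batch on a worker thread, so the caller (the "mux")
/// only ever moves raw bytes around.
fn fire_batch(ops: Vec<PendingOp>) -> thread::JoinHandle<Vec<Option<String>>> {
    thread::spawn(move || ops.iter().map(encode_pending).collect())
}
```

The caller only builds the `Vec<PendingOp>` and hands it off, so the encoding cost lands on whichever worker runs the batch.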

3. Code quality (drive-bys)

  • New BatchAccum::push_or_fire collapses four near-identical match arms (~25 lines each) into ~10 lines each.
  • should_fire(pending_len, payload_bytes, op_bytes) predicate extracted from the inline cap check, with saturating_add so the helper's contract is self-contained instead of relying on caller-side bounds.
  • encode_pending(p) -> BatchOp extracted as a free function so the encoding contract (non-empty data → encoded, empty connect_data → Some(""), anything else empty → None) is directly testable.
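To make the batching helpers concrete, here is a hedged std-only sketch of how a should_fire predicate and a push_or_fire accumulator could fit together. The `should_fire` parameter list follows the description, but the rest is hypothetical: the predicate's exact semantics, the `push_or_fire` signature, and the `MAX_BATCH_PAYLOAD` value are assumptions, and `MAX_BATCH_OPS = 50` is inferred from "50 ops × 64 KB".

```rust
/// Op-count cap (inferred from "50 ops × 64 KB").
const MAX_BATCH_OPS: usize = 50;
/// Payload cap in bytes (assumption; the real value is not given).
const MAX_BATCH_PAYLOAD: usize = 50 * 64 * 1024;

/// True when the pending batch must be flushed before accepting an op of
/// `op_bytes`. An empty batch never fires, so one oversized op can still go
/// out alone. `saturating_add` keeps the check overflow-safe for any input.
fn should_fire(pending_len: usize, payload_bytes: usize, op_bytes: usize) -> bool {
    if pending_len == 0 {
        return false;
    }
    pending_len >= MAX_BATCH_OPS
        || payload_bytes.saturating_add(op_bytes) > MAX_BATCH_PAYLOAD
}

/// Hypothetical accumulator; ops are plain byte vectors here.
#[derive(Default)]
struct BatchAccum {
    ops: Vec<Vec<u8>>,
    payload_bytes: usize,
}

impl BatchAccum {
    /// Pushes `op`, flushing the pending batch first if a cap would be hit.
    /// Returns the flushed batch (if any) and the index of `op` within the
    /// batch it landed in, which restarts at 0 after a flush.
    fn push_or_fire(&mut self, op: Vec<u8>) -> (Option<Vec<Vec<u8>>>, usize) {
        let fired = if should_fire(self.ops.len(), self.payload_bytes, op.len()) {
            self.payload_bytes = 0;
            Some(std::mem::take(&mut self.ops))
        } else {
            None
        };
        self.payload_bytes = self.payload_bytes.saturating_add(op.len());
        self.ops.push(op);
        (fired, self.ops.len() - 1)
    }
}
```

Returning the index from the same call that may flush keeps reply lookups aligned with the batch an op was actually sent in.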

Public API change

TunnelMux::udp_open and udp_data now take data: impl Into<Bytes> instead of Vec<u8>. Existing callers passing Vec<u8>, &'static [u8], Bytes, or BytesMut all keep compiling; no boundary change for in-tree consumers.

Test plan

  • cargo build --bins --lib clean
  • cargo test --lib passes — 208/208 (was 200, +8 new tests)
    • encode_pending_* × 4: non-empty data → base64; empty Data/UdpData/Closed omitted; empty connect_data → d: ""; non-empty connect_data → encoded
    • should_fire_* × 3: first-op-never-fires; MAX_BATCH_OPS boundary; payload-cap boundary
    • batch_accum_reindexes_after_flush: post-flush reply indices restart at 0 (regression test for fire_batch's batch_resp.r.get(idx) lookup)

@dazzling-no-more changed the title from "perf: zero-copy full-tunnel mux + base64 off mux thread" to "feat: zero-copy full-tunnel mux + base64 off mux thread" on May 7, 2026
@github-actions (bot) added the "type: feature" label (feat: PR, auto-applied by release-drafter) on May 7, 2026
@therealaleph merged commit 54552bb into therealaleph:main on May 8, 2026
2 checks passed
therealaleph added a commit that referenced this pull request May 8, 2026
…#881)

Bumps Cargo.toml v1.9.17 → v1.9.18 and ships the changelog for the
zero-copy mux refactor merged in 54552bb. No user-visible behavior
change; perf-focused release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dazzling-no-more deleted the feature/tunnel-mux-zero-copy branch on May 8, 2026 05:16