
perf(chunk): hold ChunkPutRequest content as Bytes for zero-copy fan-out #6

Open
jacderida wants to merge 1 commit into WithAutonomi:main from jacderida:perf/chunk-put-bytes


Summary

  • Replace ChunkPutRequest.content: Vec<u8> with content: bytes::Bytes.
  • Enable the bytes crate's serde feature so Bytes serialises identically to Vec<u8> on the wire (under postcard, both are encoded as a varint length followed by the raw bytes).
  • Update ChunkPutRequest::new / with_payment signatures and the two existing tests.

Why

A heaptrack capture against a release ant binary uploading a 20 MB file shows peak heap pinned to ~285 MB in alloc::raw_vec::RawVecInner::finish_grow, with the dominant call chain running through ChunkMessage::encode → chunk_put_to_close_group. The fan-out path at ant-client/ant-core/src/data/client/chunk.rs:173 does:

let request = ChunkPutRequest::with_payment(address, content.to_vec(), proof);

The content parameter is already a refcounted bytes::Bytes, but to_vec() deep-copies the entire 4 MB chunk into a fresh Vec<u8> — once per recipient. With CLOSE_GROUP_MAJORITY ≈ 5 peers and the AIMD store cap at 64, peak in-flight chunk-content can reach ~2.5 GB on a single client, which exceeds the 4 GB VM budget once DHT/QUIC/etc. overhead is added.
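
A back-of-envelope sketch of the estimate above. The chunk size, peer count, and store cap come from the description; the function name and the `buffers_per_request` factor are assumptions made for illustration. With one 4 MiB buffer per request the product is 1.25 GiB, and the ~2.5 GB figure is matched if each in-flight request holds two buffers (for example the `to_vec()` copy plus the encoded message).

```rust
/// Estimated bytes of chunk content held in flight (hypothetical helper).
fn in_flight_bytes(
    chunk_size: u64,
    peers_per_chunk: u64,
    concurrent_chunks: u64,
    buffers_per_request: u64, // assumption: how many full copies each request holds
) -> u64 {
    chunk_size * peers_per_chunk * concurrent_chunks * buffers_per_request
}

fn main() {
    const MIB: u64 = 1024 * 1024;
    const GIB: f64 = (1024 * 1024 * 1024) as f64;

    let chunk = 4 * MIB; // 4 MB chunk
    let peers = 5; // CLOSE_GROUP_MAJORITY, approximately
    let store_cap = 64; // AIMD store cap

    let one_copy = in_flight_bytes(chunk, peers, store_cap, 1);
    let two_copies = in_flight_bytes(chunk, peers, store_cap, 2);

    println!("one buffer per request:  {:.2} GiB", one_copy as f64 / GIB);
    println!("two buffers per request: {:.2} GiB", two_copies as f64 / GIB);
}
```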

Switching content to Bytes removes the requirement for the caller to copy. The ant-client follow-up PR drops the to_vec() and passes the Bytes straight through, so each peer's spawned task shares a single 4 MB backing buffer via refcount instead of holding N independent copies.
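
The difference between the two fan-out strategies can be sketched with the standard library alone. This uses `Arc<[u8]>` as a stand-in for `bytes::Bytes` so the example has no dependencies; `PutRequest`, `fan_out_shared`, and `fan_out_copied` are hypothetical names, not the real ant-protocol or ant-client types.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a put request. The real ChunkPutRequest
/// holds bytes::Bytes; Arc<[u8]> has comparable refcounted-clone behaviour.
struct PutRequest {
    content: Arc<[u8]>,
}

/// One request per peer, all sharing a single backing buffer.
fn fan_out_shared(content: &Arc<[u8]>, peers: usize) -> Vec<PutRequest> {
    (0..peers)
        .map(|_| PutRequest { content: Arc::clone(content) })
        .collect()
}

/// The pre-change behaviour: a deep copy of the payload per peer.
fn fan_out_copied(content: &[u8], peers: usize) -> Vec<Vec<u8>> {
    (0..peers).map(|_| content.to_vec()).collect()
}

fn main() {
    let content: Arc<[u8]> = Arc::from(vec![0u8; 4 * 1024 * 1024]);

    // Cloning the handle bumps a refcount; no payload bytes are copied.
    let shared = fan_out_shared(&content, 5);
    assert!(shared.iter().all(|r| Arc::ptr_eq(&r.content, &content)));
    assert_eq!(Arc::strong_count(&content), 6); // original + 5 requests

    // to_vec() allocates a fresh 4 MiB buffer for every peer.
    let copied = fan_out_copied(&content, 5);
    assert!(copied.iter().all(|v| v.as_ptr() != content.as_ptr()));
}
```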

Wire compatibility

Identical. Under postcard + serde, both Vec<u8> and bytes::Bytes are encoded as a varint length followed by the raw bytes. A mixed network of old and new clients/servers will interoperate.
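
To illustrate why the container type does not affect the wire bytes, here is a hand-rolled sketch of a varint-length-prefixed encoding. It is not postcard's implementation; `write_varint` and `encode_bytes` are hypothetical helpers, and `Arc<[u8]>` again stands in for `bytes::Bytes`. The encoder only ever sees a `&[u8]`, so any owner of the same bytes produces the same output.

```rust
use std::sync::Arc;

/// Append `value` as an unsigned LEB128 varint (7 bits per byte,
/// high bit set on every byte except the last).
fn write_varint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Varint length followed by the raw payload bytes.
fn encode_bytes(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 10);
    write_varint(payload.len() as u64, &mut out);
    out.extend_from_slice(payload);
    out
}

fn main() {
    let owned: Vec<u8> = vec![1, 2, 3];
    let shared: Arc<[u8]> = Arc::from(vec![1u8, 2, 3]);

    // Both containers deref to the same slice, so the encoding is identical.
    assert_eq!(encode_bytes(&owned), encode_bytes(&shared));
    assert_eq!(encode_bytes(&owned), vec![3, 1, 2, 3]);

    // A 300-byte payload needs a two-byte length prefix: 0xAC 0x02.
    let long = encode_bytes(&vec![0u8; 300]);
    assert_eq!(long[..2], [0xAC_u8, 0x02]);
    assert_eq!(long.len(), 302);
}
```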

Test plan

  • cargo check --all-features clean
  • cargo test --lib chunk — all 17 chunk tests pass (the two ChunkPutRequest tests updated to construct Bytes::from_static(...))
  • Downstream consumer rebuild — ant-client PR also opened against WithAutonomi/ant-client, which patches this branch in via [patch.crates-io] and drops the to_vec(). ant-node also reads request.content as Vec<u8> in storage/handler.rs and will need a matching update before its next release.

🤖 Generated with Claude Code

Replace `content: Vec<u8>` with `content: bytes::Bytes` on
ChunkPutRequest. Wire format is unchanged — Bytes serialises as a byte
sequence under postcard/serde, identical to Vec<u8> — but the in-memory
representation is now refcounted, so callers that send the same chunk
to multiple peers (notably close-group replication) share a single
backing buffer instead of deep-copying the 4 MB payload per peer.

Heaptrack against a 20 MB upload on a release ant binary showed the
client's peak heap dominated by RawVecInner::finish_grow calls in
ChunkPutRequest construction, with 168 MB consumed via
ChunkMessage::encode → chunk_put_to_close_group. The fan-out path was
running `content.to_vec()` once per recipient at chunk.rs:173, which
the new Bytes-typed field eliminates from the caller side.

This commit is the protocol-side half of the fix; the ant-client side
drops the `to_vec()` in a follow-up PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jacderida added a commit to WithAutonomi/ant-node that referenced this pull request on May 11, 2026:

`ChunkPutRequest.content` is now `bytes::Bytes` upstream
(WithAutonomi/ant-protocol#6) so that close-group fan-out on the client
side can share one refcounted buffer across N peer sends instead of
deep-copying the 4 MB chunk per peer.

ant-node is on the receiving side of that wire format. The storage
handler's existing reads — `.len()`, `compute_address(&request.content)`,
`self.storage.put(&address, &request.content)` — all coerce
`Bytes`/`&Bytes` to `&[u8]` transparently, so the production hot path
needs no change. Two adjustments cover the remaining call sites:

- `FreshWriteEvent.data: Vec<u8>` still owns the chunk for replication
  fan-out. Materialise once at the boundary via `request.content.to_vec()`
  on the success path. This is one copy on a node that has already
  accepted the chunk — node VMs aren't memory-constrained the way the
  client VMs are. Propagating `Bytes` deeper into the replication path
  is a worthwhile follow-up but out of scope for this fix.

- Test sites that constructed `ChunkPutRequest::new(addr, content.to_vec())`
  from byte-string literals now use `Bytes::copy_from_slice(content)`,
  and the one site with an owned `Vec<u8>` uses `Bytes::from(content)`.

[patch.crates-io] and [patch."https://github.com/WithAutonomi/ant-protocol"]
override the dep with the jacderida/ant-protocol perf branch commit so
ant-node builds against the new field type both from `main` (crates.io)
and from testnet branches like `fix/stability-improvements` (which
resolve ant-protocol via saorsa-core's git-source patch). The pin should
be removed once a crates.io release including the ant-protocol change
is published.

Test plan
---------
- `cargo check --lib` — clean
- `cargo test --lib storage::handler` — 14/14 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>