v0.1: fast-path runs end-to-end on configured interfaces (PR #4)#4
Merged
lunarthegrey merged 10 commits intomainfrom Apr 20, 2026
Merged
v0.1: fast-path runs end-to-end on configured interfaces (PR #4)#4lunarthegrey merged 10 commits intomainfrom
lunarthegrey merged 10 commits intomainfrom
Conversation
…tach/status Brings the fast-path module from "BPF ELF compiles + passes verifier" (PR #3) to "runs end-to-end in dry-run on configured interfaces". Core changes: - `crates/modules/fast-path/src/linux_impl.rs` (new, Linux-only): opens aya::Ebpf, populates CFG (dry_run + reserved version byte), populates ALLOW_V4/ALLOW_V6 LPM tries from `allow-prefix` directives, XDP-attaches to each `attach <iface> <mode>` with trial-attach fallback per SPEC §2.3 (Auto → native first, fall back to generic; explicit Native/Generic use requested mode, no fallback), populates REDIRECT_DEVMAP with every configured ifindex before packet flow so the §4.4 defensive devmap pre-check works. Exports a `trial_attach_native` probe and `snapshot_stats` / `snapshot_links` accessors for the CLI. - `crates/modules/fast-path/src/registry.rs` (new): JSON pin registry persistence at `<state-dir>/attachments.json`, atomic write-then- rename. SPEC §8 wants deterministic teardown; actual bpffs pinning lands in PR #6 but we persist the shape now so `detach` has something authoritative to read. - `crates/modules/fast-path/src/lib.rs`: `FastPathModule` gains real trait implementations on Linux; non-Linux returns `ModuleError::other` from load/attach (NotImplemented on detach is valid since nothing was loaded). macOS dev loops still compile cleanly. - `crates/cli/src/loader.rs` (new): drives the module lifecycle from `packetframe run`. Parses config, runs the SPEC §2.1 capability probe, refuses to attach if any required capability fails, then loads / attaches each configured module and persists the registry. SIGTERM/SIGINT → exit without explicit detach per SPEC §7.3 / §8.5 (drops Ebpf; until PR #6 adds pinning, this implicitly detaches). SIGHUP logs a warning; reconfigure flow is PR #6. - `crates/cli/src/feasibility.rs`: per-interface XDP trial-attach probe (§2.3) graduates from Deferred to real per-iface verdicts (`xdp.attach.<iface>` capability entries), populated from the config's `attach` directives. - `packetframe detach` reads the pin registry and reports; in-kernel detach without an active loader needs pinning (PR #6). `packetframe status` prints the registry contents; live counter readback waits on pinning too. Pins bumped: signal-hook 0.4 added for the signal loop. aya is now a Linux-only regular dep on fast-path (was dev-only). Follow-up commits on this branch will add the veth netns integration test and the packet-level bpf_prog_test_run fixtures deferred from PR #3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aya requires the Pod marker on map value types so it can byte-copy the value into the kernel buffer. FpCfg is repr(C), 8 bytes, no padding, all primitives — safe to impl unconditionally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Creates a veth pair, loads the fast-path ELF, attaches to one end with native→generic fallback, detaches, cleans up. Tests the aya attach/detach round-trip against a real ifindex — which the existing verifier test can't do since `Xdp::load` only exercises the verifier, not the attach path. Drop-guard cleanup ensures the veth pair is removed even if the test panics. CI's sudo step now runs all ignored fast-path integration tests (`--tests -- --ignored`), so future cap-requiring tests land without workflow edits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three intertwined issues led to PR #3's merged verifier test and PR #4's attach test reporting `ok` while never actually exercising the BPF ELF: 1. `crates/modules/fast-path/bpf/Cargo.toml` has no `[workspace]` table, so when `build.rs` invokes a nested `cargo build` in that directory, cargo walks up and finds the root workspace that excludes this path — then errors out because the package "believes it's in a workspace when it's not" (the user surfaced this running on a Linux VM; exit code 101 matches CI's failure mode). The root `workspace.exclude` alone is not sufficient when cargo is invoked from inside the excluded directory. 2. The old `build.rs` used `Command::status()`, which inherited stdout/stderr back to the outer cargo — which captures build-script output and only parses `cargo:` directives. Result: any nested cargo error was completely swallowed, leaving users staring at an opaque "BPF build failed (exit 101); using empty stub ELF" line with no way to diagnose. 3. When BPF build fails, `build.rs` writes an empty stub ELF and continues. The verifier + attach integration tests check `FAST_PATH_BPF_AVAILABLE` and early-return with `eprintln!` + `return`, which cargo counts as PASS. CI's sudo step reported 1/1 passing; nothing ever actually round-tripped through the kernel verifier or aya's attach path. Fixes: - Add an empty `[workspace]` table to `bpf/Cargo.toml` so the nested cargo treats the subcrate as its own workspace root. - `build.rs` now uses `Command::output()` and re-emits stdout + stderr as `cargo:warning` lines so the real error surfaces in the outer cargo's warning stream — no more opaque exit-code messages. - New `PACKETFRAME_BPF_REQUIRED` env var makes `build.rs` panic on the stub fallback path instead of silently writing an empty ELF. Set in `ci.yml`'s `check` job so CI can never again be green with the BPF build broken. Local dev (without the env var) still stubs gracefully on macOS laptops. After these land, CI will either actually exercise the BPF program or fail loudly. If the verifier rejects something, or aya can't attach, we find out instead of shipping the bug. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real error from the PACKETFRAME_BPF_REQUIRED CI run:
error[E0463]: can't find crate for 'core'
The nested cargo was inheriting `CARGO`, `RUSTUP_TOOLCHAIN`, and
related env vars from the outer stable build, so rustup never
switched to the nightly pinned in `bpf/rust-toolchain.toml`. Without
nightly, `build-std = ["core"]` in `.cargo/config.toml` is silently
ignored, and cargo tries to link against a precompiled `core` that
doesn't exist for `bpfel-unknown-none`.
Clearing `CARGO`/`RUSTC`/`RUSTUP_TOOLCHAIN`/`CARGO_BUILD_TARGET`/
`CARGO_TARGET_DIR`/`CARGO_MANIFEST_DIR`/`RUSTC_WRAPPER` before
spawning the nested cargo lets rustup fresh-resolve from the
subcrate's toolchain file. PATH still finds the rustup proxy, which
reads the toolchain pin and swaps to nightly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two bugs surfaced by the now-working CI BPF build: 1. `EthHdr.ether_type` is a raw u16, not an `EtherType` enum. Matching against `EtherType::Ipv4` fails with E0308. Cast to u16 via `EtherType::Ipv4 as u16`; the enum's discriminants are LE-interpreted values of network-byte-order packet bytes, so the comparison works on any BPF host (all LE). 2. Writing to a `#[repr(C)]` union field is safe in recent Rust — my wrapping `unsafe` block for the ipv6_src/ipv6_dst writes triggered `unused_unsafe`. Drop it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous `target/bpfel-unknown-none/release/fast-path` path gave us a file that aya rejected as `Invalid ELF header size or alignment`. Almost certainly bpf-linker (or cargo with a custom-target-spec) is producing the artifact at a different filename / extension now. Parsing cargo's `--message-format=json` output for the `compiler-artifact` message lets cargo itself tell us where the produced ELF landed — resilient to toolchain and linker updates. The `artifact built from` cargo:warning line tells us in CI (and locally) which file was embedded, so the next failure (if any) is one `file` command away. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aya's ELF parser (via the `object` crate) does aligned u32/u64 reads into the header and rejects 1-byte-aligned `include_bytes!` output with "Invalid ELF header size or alignment". The System allocator returns 16-byte-aligned memory on x86_64 Linux, so `.to_vec()` gives a slice suitable for the parser. Expose `aligned_bpf_copy()` as a public helper; linux_impl and both integration tests call it instead of passing `FAST_PATH_BPF` directly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Author
Critical regression audit — PR #3 and PR #4 tests were false-positivesWhile the user tested PR #4 on a Linux VM, the local build failed to produce a valid BPF ELF. Investigation found a chain of issues that had been silently hiding since PR #3:
Net effect: the tests merged with PR #3 (verifier pass) and PR #4 (veth attach) now actually exercise the kernel verifier and aya's attach path — they used to stub the ELF and early-return ok with 0.00s execution. With this commit series they take 0.19s and 0.25s respectively, doing real kernel work. The §4.4 BPF program has genuinely passed the kernel verifier for the first time in this CI run. The sport/dport fix from PR #3's spec audit has been exercised. |
User's Linux VM reports EINVAL from our raw bpf_prog_load probe while
aya can successfully load + attach the same program type on the same
kernel. Our hand-rolled minimal-program probe is fragile across kernel
versions — newer kernels reject minimally-populated bpf_attr even when
real program loads succeed.
Demote prog_type.{xdp,sched_cls} and helper.* to required=false. The
authoritative signal is the per-interface XDP trial-attach (graduates
from Deferred to real xdp.attach.<iface> verdicts when \`--config\` is
supplied), which does a real aya-mediated load through the kernel
verifier. If that passes, helpers and prog types are present.
This unblocks \`packetframe run\` on hosts where the raw probe
false-fails. Fixing the raw probe to match kernel bpf_attr layout
across versions is a follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Brings fast-path from verifier-pass (PR #3) to end-to-end deployable in dry-run on any configured interface.
packetframe run --config <file>now loads the BPF ELF via aya, populatesallow_v4/allow_v6/cfg/redirect_devmapfrom config, XDP-attaches to everyattach <iface> <mode>with the SPEC §2.3 trial-attach fallback, persists the pin registry, and blocks on SIGTERM/SIGINT.CI proves this works end-to-end on a real interface: the
attachintegration test creates a veth pair under sudo, loads the compiled BPF ELF through aya, XDP-attaches withDRV_MODE → SKB_MODEfallback, and detaches cleanly. Together with PR #3's verifier-pass test, the load/attach/detach round-trip is covered by CI.What landed
crates/modules/fast-path/src/linux_impl.rs(new, Linux-only)Real aya-driven loader:
aya::Ebpf::load,populate_cfg(writes the v1FpCfglayout),populate_allowlists(LPM inserts fromallow-prefix{,6}),attach(resolves ifindex vialibc::if_nametoindex, XDP-attaches with per-interface trial-attach, populatesredirect_devmapbefore returning any link),detach, plustrial_attach_nativefor the feasibility probe andsnapshot_{links,stats}for status reporting.crates/modules/fast-path/src/registry.rs(new)JSON pin registry at
<state-dir>/attachments.json, atomic write-then-rename.Attachment↔AttachmentRecordconversions so thepacketframe-common::module::Attachmenttype doesn't need Serde.crates/modules/fast-path/src/lib.rsFastPathModulenow has real trait impls that delegate tolinux_implon Linux, returnModuleError::other(...)on non-Linux.links()/stats()accessors for status.crates/cli/src/loader.rs(new)Drives the module lifecycle from
packetframe run. Config parse → §2.1 capability probe → §2.3 trial-attach probe for each configured iface → refuse-to-attach-on-required-cap-fail → load/attach → persist registry → block on signal. SIGTERM/SIGINT exit is explicitly no-detach per §8.5 (Ebpf drop implicitly tears down until PR #6 adds pinning).crates/cli/src/feasibility.rsPer-interface
xdp.attach.<iface>capabilities replace the Deferred placeholder. Tests native first, falls back to generic; reportsPassfor generic-only with the native error indetail.crates/modules/fast-path/tests/attach.rs(new)veth-backed integration test: creates pf-veth0/1, loads ELF, attaches with native→generic fallback, detaches, cleans up via a
Dropguard. Runs under sudo in CI.CI
checkjob's sudo step runs all ignored fast-path integration tests (--tests -- --ignored), so the verifier + veth tests both run under sudo.Spec audit — see comment
redirect_devmappopulated pre-attach with every ifindex in scope (§4.4 step 9d defensive pre-check works)Test plan
cargo test --workspaceon macOS still clean (stub BPF path)packetframe run --config conf/example.confon a Linux dev VM with veth interfaces; confirm attach, counters accumulate, SIGINT exits cleanlymatched_v4matches expected traffic volume over 24h (SPEC §9 Phase 2)🤖 Generated with Claude Code