Problem Statement
The VM driver's nftables input chain for TAP networking uses policy accept, making the per-port accept rule for the gateway gRPC port redundant. All guest traffic to the host is accepted, not just the gateway port. This is a pre-existing gap from the original iptables implementation — the old code appended accept rules to the system's default-accept INPUT chain, so the per-port restriction was never enforced. A compromised VM guest can probe any port on the host.
Technical Context
TAP networking is used only in the QEMU/GPU passthrough path (not the default libkrun/gvproxy path). Each VM gets a point-to-point /30 subnet with a TAP interface on the host. The nftables table created per TAP device includes postrouting (NAT masquerade), forward, and input chains. The input chain is the issue — it should restrict which host ports the guest can reach, but policy accept makes all ports reachable.
The old iptables code had a comment saying "scope to the specific port so the guest cannot reach other host services" but the system INPUT chain's default ACCEPT policy meant the restriction was never enforced. The nftables migration (#1335) carried this gap forward.
Affected Components
| Component | Key Files | Role |
|---|---|---|
| VM TAP ruleset | crates/openshell-driver-vm/src/nft_ruleset.rs | Generates nftables rules for TAP networking |
| VM runtime | crates/openshell-driver-vm/src/runtime.rs | Loads/tears down TAP networking rules |
Technical Investigation
Code References
| Location | Description |
|---|---|
| crates/openshell-driver-vm/src/nft_ruleset.rs:42-53 | Input chain with policy accept; the per-port accept rule is redundant |
| crates/openshell-driver-vm/src/runtime.rs:429-501 | setup_tap_networking() loads the ruleset |
| crates/openshell-driver-vm/src/runtime.rs:129 | gateway_port.unwrap_or(0): with policy drop, port 0 would block all guest→host traffic |
Current Behavior
chain input {
    type filter hook input priority 0; policy accept;  # all traffic accepted
    iifname "vmtap-abcd" tcp dport 8080 accept         # redundant
}
A compromised VM guest can reach any TCP/UDP service the host exposes on 0.0.0.0 or the TAP interface IP — SSH, databases, debug ports, etc.
What Would Need to Change
Change the input chain to policy drop with explicit accepts:
- ct state established,related accept: return traffic from host-initiated connections
- ICMP echo-request, destination-unreachable, time-exceeded: path MTU discovery and diagnostics
- TCP dport gateway_port accept: the gateway gRPC port (the only intended service)
The postrouting and forward chains do not need changes. The QEMU path uses static IP (no DHCP) and passes DNS via environment variable (no local DNS service on the TAP interface).
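The hardened input chain described above could be rendered roughly as follows. This is a minimal sketch, not the actual code in nft_ruleset.rs: the function name `render_input_chain` and its parameters are hypothetical. One assumption goes beyond the list above. A base chain's policy applies to every packet that reaches the input hook, not only packets arriving on the TAP interface, so the sketch accepts non-TAP traffic first and lets it continue to the host's other chains. That guard should be dropped or replaced if the real ruleset scopes the chain some other way.

```rust
/// Render a hardened `input` chain for one TAP device.
/// Hypothetical helper for illustration; not the real nft_ruleset.rs API.
fn render_input_chain(tap_name: &str, gateway_port: u16) -> String {
    let lines = [
        "chain input {".to_string(),
        "    type filter hook input priority 0; policy drop;".to_string(),
        // Assumption: leave traffic from other interfaces to the host's own chains.
        format!("    iifname != \"{}\" accept", tap_name),
        // Return traffic for connections the host initiated toward the guest.
        format!("    iifname \"{}\" ct state established,related accept", tap_name),
        // ICMP needed for path MTU discovery and basic diagnostics.
        format!(
            "    iifname \"{}\" icmp type {{ echo-request, destination-unreachable, time-exceeded }} accept",
            tap_name
        ),
        // The only intended guest-to-host service: the gateway gRPC port.
        format!("    iifname \"{}\" tcp dport {} accept", tap_name, gateway_port),
        "}".to_string(),
    ];
    lines.join("\n")
}
```

Calling `render_input_chain("vmtap-abcd", 8080)` yields a chain whose only new-connection accept for the guest is TCP port 8080; anything else arriving on the TAP interface falls through to the drop policy.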
Edge Case: gateway_port == 0
With policy drop, a zero gateway port would block all guest→host traffic including the gateway connection the guest needs. The driver should validate that gateway_port > 0 when the QEMU backend is in use.
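One way to express that validation is sketched below. It assumes the gateway port is carried as an `Option<u16>`, which the `gateway_port.unwrap_or(0)` call suggests but does not confirm, and the function name and error strings are hypothetical. The idea is to fail early instead of silently falling back to port 0.

```rust
/// Reject a missing or zero gateway port before TAP rules are generated.
/// Hypothetical helper; the Option<u16> type is an assumption.
fn require_gateway_port(gateway_port: Option<u16>) -> Result<u16, String> {
    match gateway_port {
        Some(port) if port > 0 => Ok(port),
        // Port 0 would produce an accept rule that matches nothing useful.
        Some(_) => Err("gateway_port must be greater than 0 for TAP networking".to_string()),
        // With policy drop there is no safe default, so a missing port is an error.
        None => Err("gateway_port is required for TAP networking".to_string()),
    }
}
```

A caller on the QEMU path would use the returned port when building the ruleset and surface the error to the user otherwise.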
Proposed Approach
Change policy accept to policy drop in the input chain and add explicit accept rules for established/related, ICMP, and the gateway port. Update unit tests. Validate with a manual test on a GPU host if available.
Scope Assessment
- Complexity: Low
- Confidence: High
- Estimated files to change: 1-2
- Issue type: fix
Risks & Open Questions
- Verify no other host services need to be reachable from the VM guest (investigation says no — static IP, no DHCP, external DNS)
- The forward chain also has policy accept with redundant explicit accepts; a separate hardening pass could tighten it, but it is a different concern (transit traffic vs. traffic to the host)
Test Considerations
- Update existing unit test in nft_ruleset.rs to assert policy drop and check for ct state / ICMP rules
- Manual e2e test on GPU/QEMU path to confirm sandbox still connects to gateway
- No new test infrastructure needed
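For the unit test update, a small string-level helper may be enough to assert the input chain's policy without matching the entire ruleset. The sketch below is hypothetical: `input_chain_policy` is not an existing function, and it assumes the generated ruleset declares the chain as `chain input {` with the policy on the base-chain line, as in the Current Behavior snippet.

```rust
/// Extract the policy of the `input` chain from a rendered nftables ruleset.
/// Hypothetical test helper; returns None if the chain or its policy is absent.
fn input_chain_policy(ruleset: &str) -> Option<&str> {
    // Start at the input chain so other chains' policies are ignored.
    let start = ruleset.find("chain input {")?;
    let body = &ruleset[start..];
    // The policy is declared before any closing brace in the chain.
    let end = body.find('}')?;
    let declaration = &body[..end];
    let after_keyword = declaration.split("policy ").nth(1)?;
    Some(after_keyword.split(';').next()?.trim())
}
```

A test could then assert that the helper returns `Some("drop")` for the generated ruleset and separately check that the ct state and ICMP rules are present.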
Created by spike investigation. Use build-from-issue to plan and implement.