Skip to content

vfio_assigned_device: quiesce device via Command register on reset#3338

Open
jstarks wants to merge 1 commit intomicrosoft:mainfrom
jstarks:reset2
Open

vfio_assigned_device: quiesce device via Command register on reset#3338
jstarks wants to merge 1 commit intomicrosoft:mainfrom
jstarks:reset2

Conversation

@jstarks
Copy link
Copy Markdown
Member

@jstarks jstarks commented Apr 21, 2026

Before issuing VFIO_DEVICE_RESET (or when the device doesn't support it at all), clear Bus Master Enable, Memory Space Enable, and I/O Space Enable in the PCI Command register. VFIO passes these bits through to real hardware, so this stops the device from initiating DMA and from decoding MMIO/PIO on the physical bus.

Without this, a device that lacks VFIO_DEVICE_RESET would retain BME after a VM reset, allowing it to continue DMA through the IOMMU with stale mappings.

Before issuing VFIO_DEVICE_RESET (or when the device doesn't support it
at all), clear Bus Master Enable, Memory Space Enable, and I/O Space
Enable in the PCI Command register. VFIO passes these bits through to
real hardware, so this stops the device from initiating DMA and from
decoding MMIO/PIO on the physical bus.

Without this, a device that lacks VFIO_DEVICE_RESET would retain BME
after a VM reset, allowing it to continue DMA through the IOMMU with
stale mappings.
Copilot AI review requested due to automatic review settings April 21, 2026 00:10
@jstarks jstarks requested a review from a team as a code owner April 21, 2026 00:10
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens VFIO-assigned PCI device reset behavior by quiescing the physical device via the PCI Command register prior to attempting VFIO_DEVICE_RESET, preventing post-reset DMA/MMIO/PIO activity from persisting on real hardware.

Changes:

  • Clear Bus Master Enable, Memory Space Enable, and I/O Space Enable in the physical device’s PCI Command register during reset().
  • Ensure quiescing occurs before the existing VFIO reset path, covering devices that don’t support VFIO_DEVICE_RESET.

Comment on lines +588 to +594
let command = cfg_space::Command::from_bits(
self.read_phys_config(HeaderType00::STATUS_COMMAND.0) as u16,
)
.with_pio_enabled(false)
.with_mmio_enabled(false)
.with_bus_master(false);
self.write_phys_config(HeaderType00::STATUS_COMMAND.0, command.into_bits().into());
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read_phys_config() returns !0 on read failure. If that happens here, Command::from_bits(0xFFFF) will preserve reserved bits as 1, and the subsequent write may push an invalid Command value to hardware (potentially toggling unrelated control bits). Consider reading the config DWORD with the fallible helper (or otherwise detecting the error) and skipping the quiesce write on failure, or explicitly clearing reserved bits before writing.

Suggested change
let command = cfg_space::Command::from_bits(
self.read_phys_config(HeaderType00::STATUS_COMMAND.0) as u16,
)
.with_pio_enabled(false)
.with_mmio_enabled(false)
.with_bus_master(false);
self.write_phys_config(HeaderType00::STATUS_COMMAND.0, command.into_bits().into());
let raw_command = self.read_phys_config(HeaderType00::STATUS_COMMAND.0);
if raw_command == u32::MAX {
tracing::warn!(
pci_id = self.pci_id.as_str(),
"failed to read physical PCI command register; skipping quiesce write"
);
} else {
let command = cfg_space::Command::from_bits(raw_command as u16)
.with_pio_enabled(false)
.with_mmio_enabled(false)
.with_bus_master(false);
self.write_phys_config(HeaderType00::STATUS_COMMAND.0, command.into_bits().into());
}

Copilot uses AI. Check for mistakes.
// from mastering DMA and from decoding MMIO/PIO on the physical
// bus. Without this, a device that lacks VFIO_DEVICE_RESET support
// would continue mastering the bus after a VM reset.
let command = cfg_space::Command::from_bits(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an else of the if *supports_reset { ... } block below? In other words: prefer VFIO reset then fallback to this if not supported

@jakeatmicrosoft
Copy link
Copy Markdown
Contributor

I think this is a good change, but if you just clear BME while a device is operating, it will probably fall into an error state, requiring a reset before it is useful again. Can you force one of the other PCI reset paths, like moving it from D0->D3->D0?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants