IOMMU Remapping Grounding
This note records primary-source facts for IOMMU/remapping work. The Intel
VT-d path has landed under #[cfg(feature = "qemu")] in kernel/src/iommu.rs
as a QEMU q35 smoke (make run-iommu-remapping); AMD-Vi table programming
remains future work. DMAPool has manager-owned domain identity and
mapping-lifecycle preflight records. For the QEMU Intel IOMMU path, real VT-d
table programming, hardware-DMA translation proof, two-phase
invalidation/IOTLB-flush revocation, and IOMMU-backed hostile stale-DMA smokes
have all landed (see
ddf-iommu-qemu-intel-remapping-smoke).
For QEMU shapes without intel-iommu, the kernel-owned bounce-buffer fallback
remains active (remapping_tables=not-programmed,
hostile_hardware_isolation=not-claimed). AMD-Vi table programming and a
bounce-buffer policy for non-IOMMU devices remain open.
Sources
- Intel, Intel Virtualization Technology for Directed I/O Architecture
Specification, content ID 671081. Intel page metadata on 2026-05-12 listed
Date 2022-06-02 and Version 5.1 (Latest). Sections used: 6.2.2
“Context-Cache”, 6.2.4 “IOTLB”, 6.5.1 “Register-based Invalidation
Interface”, 6.5.2 “Queued Invalidation Interface”, 6.5.3 “IOTLB Invalidation
Considerations”, 6.6 “Set Root Table Pointer Operation”, 6.8 “Write Buffer
Flushing”, 7.10 “Software Steps to Drain Page Requests & Responses”, 8.3
“DMA Remapping Hardware Unit Definition Structure”, 8.3.1 “Device Scope
Structure”, 9.1 “Root Entry”, 9.3 “Context Entry”, 9.4 “Scalable-Mode
Context-Entry”, and 11.4.5-11.4.9 covering the root-table-address,
invalidation, fault, protected-memory-range, and invalidation-queue registers.
- AMD, AMD I/O Virtualization Technology (IOMMU) Specification 48882,
48882-PUB Rev 3.10, February 2025. Sections used: 2.2 device table,
device-table entry, I/O page table, and interrupt-remapping material; 2.4
“Commands”; 2.5 “Event Logging”; 3.4 “IOMMU MMIO Registers”;
IVRS/device-table/page-table, command-buffer, completion-wait, invalidation,
and event-log material.
- QEMU, qemu-manpage entries for -device intel-iommu, -device amd-iommu, and
-device virtio-iommu-pci; and QEMU PCI developer documentation for PCI IOMMU
and IOTLB notifier APIs. These are current-master QEMU docs, not a frozen
release manual; the qemu-manpage and PCI developer pages observed on
2026-05-12 were generated for QEMU version 11.0.50.
Intel VT-d Grounding
Intel VT-d identifies DMA request sources through PCI requester/source IDs and
resolves them through DMA remapping hardware units described by DMAR DRHD
structures. The table path is rooted at a root table and context tables. Root
entries select context tables, context entries bind a source to a translation
type, domain identifier, address width, and second-level page-table root, and
scalable-mode context entries extend that context format. The landed QEMU smoke
(kernel/src/iommu.rs, cfg(qemu)) uses exactly this path: DRHD unit,
PCI segment and BDF/source ID, domain ID, aw-bits=39 address width, and a
3-level second-level page-table root. Scalable-mode context entries, 48-bit
IOVA space, interrupt remapping, and multi-device domains remain out of scope
for the current slice.
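
The legacy-mode lookup described above can be sketched as plain index arithmetic. This is a minimal illustration, not the code in kernel/src/iommu.rs: the names SourceId, root_entry_offset, and context_entry_offset are hypothetical, and the sketch assumes the legacy layout in which the root table is indexed by bus number, each context table is indexed by device/function, and both entry formats are 16 bytes wide.

```rust
/// PCI requester/source ID: bus in bits 15:8, device in bits 7:3,
/// function in bits 2:0. (Hypothetical type name, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SourceId(pub u16);

impl SourceId {
    /// Builds a source ID from bus/device/function, rejecting values that do
    /// not fit the 5-bit device and 3-bit function fields.
    pub fn from_bdf(bus: u8, device: u8, function: u8) -> Option<Self> {
        if device >= 32 || function >= 8 {
            return None;
        }
        Some(SourceId(
            ((bus as u16) << 8) | ((device as u16) << 3) | function as u16,
        ))
    }

    /// Bus number: selects the root entry.
    pub fn bus(self) -> u8 {
        (self.0 >> 8) as u8
    }

    /// Combined device/function byte: selects the context entry.
    pub fn devfn(self) -> u8 {
        (self.0 & 0xff) as u8
    }
}

/// Assumed size of a legacy-mode root or context entry (128 bits).
pub const LEGACY_ENTRY_BYTES: usize = 16;

/// Byte offset of the root entry for this source within the root table.
pub fn root_entry_offset(sid: SourceId) -> usize {
    sid.bus() as usize * LEGACY_ENTRY_BYTES
}

/// Byte offset of the context entry within the context table that the
/// root entry points at.
pub fn context_entry_offset(sid: SourceId) -> usize {
    sid.devfn() as usize * LEGACY_ENTRY_BYTES
}
```

For a device at bus 0, device 3, function 0, SourceId::from_bdf(0, 3, 0) yields source ID 0x0018, root-entry offset 0, and context-entry offset 384.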
Invalidation is part of the mapping lifetime, not a diagnostic detail. Intel’s
register-based and queued invalidation interfaces cover context-cache,
IOTLB, device-TLB, interrupt-entry-cache, and wait/completion descriptors. The
landed smoke uses register-based context-cache invalidation (CCMD.ICC global
granularity) and domain-selective IOTLB invalidation (IOTLB.IVT,
ECAP.IRO-decoded offset), both with bounded completion-bit polling. Page reuse
is ordered strictly after invalidation completion; if polling is exhausted
without observing completion, revocation fails closed and the backing pages
are not freed. Queued invalidation is not enabled in the current slice
(GCMD.QIE is not set). Fault-reporting
registers (FSTS.PPF, FRCD[0].F) are the minimum diagnostic surface for
translation failures and protection faults, and are exercised by the
unmapped-IOVA and stale-DMA hostile proofs.
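
The fail-closed ordering between invalidation completion and page reuse can be sketched as follows. This is an illustrative model under stated assumptions, not the landed implementation: wait_for_completion, revoke, InvalidationError, and PageDisposition are hypothetical names, and the still_pending closure stands in for reading a completion bit such as CCMD.ICC or IOTLB.IVT from the hardware unit.

```rust
/// Outcome of a bounded completion poll. (Hypothetical names throughout.)
#[derive(Debug, PartialEq, Eq)]
pub enum InvalidationError {
    /// The completion bit never cleared within the poll budget.
    Timeout,
}

/// Calls `still_pending` up to `max_polls` times. Returns the number of
/// reads it took to observe completion, or `Timeout` if the budget ran out.
pub fn wait_for_completion(
    mut still_pending: impl FnMut() -> bool,
    max_polls: u32,
) -> Result<u32, InvalidationError> {
    for attempt in 1..=max_polls {
        if !still_pending() {
            return Ok(attempt);
        }
        core::hint::spin_loop();
    }
    Err(InvalidationError::Timeout)
}

/// What the caller may do with the backing pages after a revocation attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum PageDisposition {
    /// Invalidation completed; the pages may be reused.
    Free,
    /// Completion was never observed; the pages stay owned and unreused.
    Quarantined,
}

/// Fail-closed revocation: pages are released only after the invalidation
/// is observed to complete.
pub fn revoke(still_pending: impl FnMut() -> bool, max_polls: u32) -> PageDisposition {
    match wait_for_completion(still_pending, max_polls) {
        Ok(_) => PageDisposition::Free,
        Err(InvalidationError::Timeout) => PageDisposition::Quarantined,
    }
}
```

The point of returning a disposition rather than freeing inside the poll loop is that the timeout path has no way to reach the allocator: a device that might still hold a stale translation never sees its pages handed to another owner.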
QEMU’s intel-iommu documentation is useful for focused emulator smokes but
should not be treated as hardware coverage. It is q35-only in QEMU current
master. Relevant options include intremap, caching-mode, device-iotlb,
and aw-bits=39|48; QEMU documents 39-bit IOVA space for 3-level IOMMU page
tables and 48-bit IOVA space for 4-level tables.
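
The relationship between aw-bits and page-table depth follows from 4 KiB pages and 9 index bits per level: 12 + 3 × 9 = 39 and 12 + 4 × 9 = 48. The sketch below shows that arithmetic; the function names are hypothetical and it assumes only 4 KiB leaf mappings (no large pages).

```rust
/// 4 KiB pages: the low 12 IOVA bits are the page offset.
pub const PAGE_SHIFT: u32 = 12;
/// Each table level resolves 9 bits (512 eight-byte entries per 4 KiB table).
pub const LEVEL_BITS: u32 = 9;

/// Maps the two address widths QEMU documents to their page-table depth.
pub fn levels_for_aw_bits(aw_bits: u32) -> Option<u32> {
    match aw_bits {
        39 => Some(3),
        48 => Some(4),
        _ => None,
    }
}

/// True when `iova` has no bits set at or above the address width.
pub fn iova_in_range(iova: u64, aw_bits: u32) -> bool {
    aw_bits < 64 && (iova >> aw_bits) == 0
}

/// Index into the table at `level`, where level 1 is the leaf table.
/// Returns `None` for levels outside 1..=4.
pub fn table_index(iova: u64, level: u32) -> Option<usize> {
    if level == 0 || level > 4 {
        return None;
    }
    let shift = PAGE_SHIFT + LEVEL_BITS * (level - 1);
    Some(((iova >> shift) & 0x1ff) as usize)
}
```

With aw-bits=39, a walk starts at level 3 and consumes IOVA bits 38:30, then 29:21, then 20:12; an IOVA with any bit set at or above bit 39 is out of range for that width.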
AMD-Vi Grounding
AMD-Vi uses a different vocabulary and table root. Device requests are keyed by DeviceID and resolved through a Device Table Entry. A DTE carries validity, translation, interrupt-remapping, DomainID, mode/page-table-depth, and page-table-root information. Future shared capOS abstractions can name the logical domain and IOVA lifetime generically, but AMD-specific code should not pretend it is programming Intel root/context tables.
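
One way to keep the two vocabularies apart in shared code is to make the vendor-specific table slot an explicit enum, so that an AMD path can never produce Intel root/context offsets. The sketch below is hypothetical (Vendor, TableSlot, and locate are not capOS names); it assumes 16-byte Intel legacy-mode entries and a 32-byte (256-bit) AMD Device Table Entry indexed directly by the 16-bit DeviceID.

```rust
/// Which remapping architecture a hardware unit implements. (Hypothetical.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Vendor {
    IntelVtdLegacy,
    AmdVi,
}

/// Where a requester's binding lives, in vendor-specific terms.
#[derive(Debug, PartialEq, Eq)]
pub enum TableSlot {
    /// Intel: a root entry selected by bus, then a context entry by devfn.
    IntelLegacy {
        root_entry_offset: usize,
        context_entry_offset: usize,
    },
    /// AMD: a single Device Table Entry selected by DeviceID.
    AmdVi { dte_offset: usize },
}

/// Assumed size of an Intel legacy-mode root or context entry.
pub const VTD_LEGACY_ENTRY_BYTES: usize = 16;
/// Assumed size of an AMD Device Table Entry.
pub const AMD_DTE_BYTES: usize = 32;

/// Resolves a 16-bit requester ID to its vendor-specific table slot.
pub fn locate(vendor: Vendor, requester_id: u16) -> TableSlot {
    match vendor {
        Vendor::IntelVtdLegacy => TableSlot::IntelLegacy {
            root_entry_offset: (requester_id >> 8) as usize * VTD_LEGACY_ENTRY_BYTES,
            context_entry_offset: (requester_id & 0xff) as usize * VTD_LEGACY_ENTRY_BYTES,
        },
        Vendor::AmdVi => TableSlot::AmdVi {
            dte_offset: requester_id as usize * AMD_DTE_BYTES,
        },
    }
}
```

Callers that match on TableSlot are forced to handle the AMD case with DTE terminology, while a generic domain handle and IOVA lifetime can sit above this layer unchanged.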
AMD invalidation and completion are command-buffer operations. The future mapping lifetime must include command-buffer invalidation commands, completion wait, and event-log handling. The event log is the basic hardware-facing diagnostic record for malformed requests, page faults, and table errors; the MMIO register set covers control/status, command and event pointers, event-log state, alternate event-log buffers, device-table segment bases, and extended features.
QEMU’s amd-iommu documentation is also q35-only in current master. The
documented options include dma-remap for DMA address translation and
permission checking and intremap for interrupt remapping. Treat these as
emulator smoke inputs until capOS has separate hardware or provider evidence.
QEMU Test Surface
QEMU provides the emulator-level test surface for IOMMU smokes:
- intel-iommu on q35 with aw-bits=39 (3-level second-level page tables) is the shape used by the landed make run-iommu-remapping smoke, pinned to QEMU 8.2.2. The smoke asserts table programming, hardware-DMA translation (mapped_iova_translated=hardware-dma), unmapped-IOVA fault observation (unmapped_iova_fault=observed), two-phase invalidation/IOTLB-flush, and IOMMU-backed hostile stale-DMA proofs.
- amd-iommu on q35 with DMA remapping enabled is grounded here for a future AMD-Vi table-programming slice.
- virtio-iommu-pci on q35 x86_64 or virt ARM covers a portable virtio-IOMMU frontend if selected later.
- PCI IOMMU/IOTLB notifier APIs in QEMU developer docs describe how emulated devices observe translation changes; they are not guest architectural requirements.
QEMU citations in the Sources section are current-master documentation observed
on 2026-05-12. Tests pin the local qemu-system-x86_64 --version, machine
type, and full device option string in the smoke evidence.
Implementation Status and Future Slices
Intel VT-d QEMU smoke (landed, cfg(qemu)):
- DMAR/DRHD discovery, MMIO/fault-status diagnostics, and disabled IOVA ledger preflight records: landed as prerequisites.
- kernel/src/iommu.rs real VT-d legacy-mode entry programming, RTAR write, GCMD/GSTS SRTP-then-TE handshake, hardware-DMA translation proof via virtio-rng, unmapped-IOVA fault observation via FSTS/FRCD, two-phase invalidation/IOTLB-flush revocation, and IOMMU-backed hostile stale-DMA smokes: all landed as of 2026-05-14 (slices A1/A2/B/C). See ddf-iommu-qemu-intel-remapping-smoke.
- IOVA export stays disabled for this slice (iova_export=disabled-this-slice); hostile_hardware_isolation=not-claimed in all evidence.
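
The SRTP-then-TE handshake named in the list above can be sketched against a mock register interface. This is an assumption-laden illustration rather than the code in kernel/src/iommu.rs: the Regs trait, MockUnit, and enable_translation are hypothetical, the register offsets (GCMD 0x18, GSTS 0x1c, root table address 0x20) and bit positions (SRTP/RTPS bit 30, TE/TES bit 31) are taken from the VT-d register layout, and the sketch omits both the global context-cache/IOTLB invalidation that normally follows a root-pointer update and the preservation of other already-enabled GCMD bits.

```rust
pub const GCMD_REG: usize = 0x18;
pub const GSTS_REG: usize = 0x1c;
pub const RTADDR_REG: usize = 0x20;
pub const GCMD_TE: u32 = 1 << 31;
pub const GCMD_SRTP: u32 = 1 << 30;
pub const GSTS_TES: u32 = 1 << 31;
pub const GSTS_RTPS: u32 = 1 << 30;

/// Hypothetical MMIO access trait so the sequence can run against a mock.
pub trait Regs {
    fn read32(&mut self, offset: usize) -> u32;
    fn write32(&mut self, offset: usize, value: u32);
    fn write64(&mut self, offset: usize, value: u64);
}

#[derive(Debug, PartialEq, Eq)]
pub enum EnableError {
    Misaligned,
    RootPointerTimeout,
    TranslationEnableTimeout,
}

/// Reads GSTS up to `max_polls` times; true once all `mask` bits are set.
fn poll_status(regs: &mut impl Regs, mask: u32, max_polls: u32) -> bool {
    (0..max_polls).any(|_| (regs.read32(GSTS_REG) & mask) == mask)
}

/// Writes the root pointer, latches it with SRTP, then enables translation.
/// Each step is confirmed through GSTS before the next one starts.
pub fn enable_translation(
    regs: &mut impl Regs,
    root_table_phys: u64,
    max_polls: u32,
) -> Result<(), EnableError> {
    if root_table_phys & 0xfff != 0 {
        return Err(EnableError::Misaligned);
    }
    regs.write64(RTADDR_REG, root_table_phys);
    regs.write32(GCMD_REG, GCMD_SRTP);
    if !poll_status(regs, GSTS_RTPS, max_polls) {
        return Err(EnableError::RootPointerTimeout);
    }
    regs.write32(GCMD_REG, GCMD_TE);
    if !poll_status(regs, GSTS_TES, max_polls) {
        return Err(EnableError::TranslationEnableTimeout);
    }
    Ok(())
}

/// Mock unit: acknowledges commands immediately unless `stuck` is set.
pub struct MockUnit {
    pub gsts: u32,
    pub rtaddr: u64,
    pub stuck: bool,
}

impl Regs for MockUnit {
    fn read32(&mut self, offset: usize) -> u32 {
        if offset == GSTS_REG { self.gsts } else { 0 }
    }
    fn write32(&mut self, offset: usize, value: u32) {
        if offset == GCMD_REG && !self.stuck {
            if (value & GCMD_SRTP) != 0 {
                self.gsts |= GSTS_RTPS;
            }
            if (value & GCMD_TE) != 0 {
                self.gsts |= GSTS_TES;
            }
        }
    }
    fn write64(&mut self, offset: usize, value: u64) {
        if offset == RTADDR_REG {
            self.rtaddr = value;
        }
    }
}
```

Keeping translation enable behind a confirmed root-pointer status means a unit that never acknowledges SRTP is reported as RootPointerTimeout and TE is never written.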
Future slices (not yet started):
- AMD-Vi table programming: separate source grounding and evidence; AMD-specific DTE, DeviceID, command-buffer, and event-log names must not be conflated with Intel root/context tables.
- Source-grounding refresh for AMD or additional Intel features (48-bit IOVA, scalable-mode context entries, interrupt remapping, device-IOTLB) when a real branch selects them.
- Bounce-buffer policy for QEMU shapes without intel-iommu: an explicit decision on IOMMU/remapping or an explicit bounce-buffer policy for non-IOMMU devices remains open.
- Trusted multi-device sharing groups, production NIC or storage driver ownership, and moving the live virtio-net path off bounce buffers are not in scope for the current slice.