# Security And Verification Backlog

Detailed decompositions for security and verification work. `docs/tasks/README.md`
links here but should not inline these subtasks.

## Stage-6 Trust-Boundary Refresh

- [ ] Refresh trust-boundary docs after Stage 6 IPC/capability-transfer work.

## Untrusted-Service Hardening Pass

Cover unmapped pointers, kernel-half pointers, invalid capability IDs,
corrupted rings, SQ/CQ overflow behavior, and a service without Console
authority. Audit manifest, ELF, SQE, params, and result-buffer paths so
untrusted input fails closed instead of reaching kernel panic paths.

Completed context:
- [x] Panic-surface inventory: audited `panic!`, `assert!`, `unwrap`, and
      `expect` reachable from manifest, ELF, SQE, params, result-buffer, IPC,
      and spawn inputs.
- [x] Ring/user-pointer hostile demos: added unmapped params/result-pointer,
      kernel-half params-path, invalid-capability-ID, corrupted RETURN
      `call_id`, corrupted SQ/CQ head, undersized-params, undersized-result,
      and SQ/CQ overflow coverage.
- [x] No-authority smoke: empty-CapSet service verifies expected cap lookups
      fail and invalid-cap CALLs return controlled CQEs; after removal of
      syscall 0, it proves a no-authority process cannot write and can only
      `exit`/`cap_enter`.

Remaining decomposition:
- Quota and exhaustion smokes (`make run-untrusted-exhaustion`, two QEMU passes;
  covered 2026-05-25 06:42 EEST):
  - [x] Cap-table and endpoint-queue exhaustion fail closed without corrupting
        existing calls. Endpoint-queue is proven by the small-scratch core pass
        (per-owner queue ceiling -> `Overloaded`, then a held console call still
        completes). Cap-table is proven by the small-scratch core pass and the
        default-profile `*-captable` companion pass: single-frame
        `MemoryObject` allocations first return bounded `FrameAllocator`
        success replies, then continue until the per-process cap-slot ledger
        fails closed (`Overloaded: failed to reserve MemoryObject cap slot`);
        a held console call still completes after the boundary.
  - [x] Scratch/result-buffer pressure returns controlled errors and later
        valid calls still complete (core pass: ring-scratch oversize CALL
        rejected with `CAP_ERR_INVALID_REQUEST`, reply-scratch clamp returns a
        serialized exception, then a valid console write completes).
  - [x] Repeated invalid submissions stay bounded: each structurally invalid
        SQE returns a controlled per-SQE error CQE and the ring stays usable (a
        recovery NOP completes). Note: the per-key token-bucket log aggregation
        in `docs/authority-accounting-transfer-design.md` §3 (D1/D2 suppressed-
        count summary line) is still a design target, not implemented; the
        smoke asserts bounded per-SQE rejection, not the summary line.
  - [ ] Frame-grant-page exhaustion: not cleanly reachable from a smoke. For
        single-page allocations the cap-slot ceiling (`PROCESS_CAP_SLOT_LIMIT`,
        256) is reached far before the frame-grant ledger
        (`PROCESS_FRAME_GRANT_PAGE_LIMIT`, 4096 pages), and reaching 4096 grant
        pages needs large contiguous allocations whose failure mode is physical
        fragmentation, not the grant ledger. The cap-table pass exercises the
        same fail-closed preflight-reserve path. Remaining gap.
- [x] Fail-closed cleanup: the `FrameAllocator` success-path result
      serialization now honors the caller's effective reply-scratch capacity,
      so small-scratch processes can receive bounded `MemoryObject` result caps
      before cap-slot exhaustion fails closed. Closed by
      [`security-reply-scratch-success-path-limit-local-proof`](../tasks/done/2026-06-06/security-reply-scratch-success-path-limit-local-proof.md).

## Kani Harness Bounds Refresh

- [ ] Revisit Kani harness bounds and proof shape once capability transfer,
      resource accounting, or user-buffer validation has more concrete proof
      obligations. Keep current bounds practical for `make kani-lib`; expand
      only when the added verifier cost buys a specific kernel invariant.

## DMA Assurance Model Operationalization

`dma-assurance-model-v0` (2026-05-24) landed the accepted proposal
(`docs/proposals/dma-assurance-model-proposal.md`) and inspectable-only TLA+/Alloy
skeletons (`models/dma/`), but stopped there: no run target, no CI gate, no
reconciliation with DMA code landed since. Kickoff task:
[`dma-assurance-model-operationalization`](../tasks/done/2026-06-04/dma-assurance-model-operationalization.md)
(decomposition — reconciled the v0 model with landed code and emitted the
per-tool slices below).

- [x] Reconcile `models/dma/` with landed invariants (ownership-generation on
      recycle, map-record-before-PTE-install ordering, drive-pin, epoch fence,
      scrub-before-free): gap table in [`models/dma/README.md`](../../models/dma/README.md)
      grounded against the landed symbols, done 2026-06-04.
- [x] `make model-dma-tla` — bounded TLC run of `dma_authority.tla` (pinned TLC
      2.19 / tla2tools 1.7.4 + pinned Temurin JRE 17.0.19), lifecycle ordering
      plus generation-keyed stale completion, record-before-PTE-install split,
      drive-pin/quarantine, and queue-enable epoch-fence interleavings, checked
      clean at 2 devices / 2 domains / 2 pages / 2 iovas, generations 0..1, done
      2026-06-04:
      [`dma-assurance-model-tla-checked-gate`](../tasks/done/2026-06-04/dma-assurance-model-tla-checked-gate.md).
- [x] `make model-dma-alloy` — Alloy analysis of `dma_authority.als` (pinned
      Alloy Analyzer 6.2.0), device/domain/IOVA/page/alias authority graph plus
      the ownership-generation stale-handle gate, checked at scope `for 4`, done
      2026-06-04:
      [`dma-assurance-model-alloy-checked-gate`](../tasks/done/2026-06-04/dma-assurance-model-alloy-checked-gate.md).
- [x] `make kani-dma-authority` — bounded Kani over an extracted pure
      DMA-authority core (`capos_lib::dma_authority`: ownership-generation bump
      on recycle, stale-handle rejection without mutation, no-re-expose before
      completion), `make kani-lib` style, done 2026-06-04. Faithful extraction
      of the `device_dma.rs` authority arithmetic; routing the kernel call site
      through the core is a tracked follow-up (kernel is `no_std`/`no_main`, not
      host-built):
      [`dma-assurance-model-kani-authority-core`](../tasks/done/2026-06-04/dma-assurance-model-kani-authority-core.md).
- [x] `make model-dma-deferred-completion-loom` — focused Loom (pinned 0.7.2) over
      the `DeferredCompletionQueue` reservation budget and the multi-CPU TLB
      shootdown generation re-read (deferred-EOI / completion concurrency the ring
      Loom does not cover), done 2026-06-04:
      [`dma-assurance-model-deferred-completion-loom`](../tasks/done/2026-06-04/dma-assurance-model-deferred-completion-loom.md).
- [x] CI wiring into the GitHub gate and local aggregate now that each target has
      a checked result. `make dma-assurance-model-check` runs Alloy/TLA+/Loom/
      Kani locally when `cargo-kani` is installed; GitHub CI runs
      `model-dma-alloy`, `model-dma-tla`, and
      `model-dma-deferred-completion-loom` in `dma-assurance-models`, and
      `kani-dma-authority` in `kani-proofs`. Done 2026-06-05:
      [`dma-assurance-model-ci-wiring`](../tasks/done/2026-06-05/dma-assurance-model-ci-wiring.md).

## Scheduler & IRQ Assurance Models

The scheduler is the densest unmodeled concurrency surface in the kernel
(per-CPU atomics read lock-free from ISR context while another path holds the
scheduler mutex via `try_lock`, plus IPI cross-CPU activation) and has zero
formal coverage today (smoke + measured suppressed-tick counters only). The IRQ
MSI-X waiter race was fixed by reproduction, not a model. Mirrors the DMA
operationalization pattern; tasks reuse the TLC/Alloy/Kani pins that track lands.

- [x] **S1** [`scheduler-nohz-activation-model`](../tasks/done/2026-06-04/scheduler-nohz-activation-model.md)
      (done 2026-06-04 09:00 UTC) -- TLA+/TLC for the nohz activation/rollback
      lifecycle + a focused Loom for the lock-free `NOHZ_ACTIVE_CPUS` bit vs
      locked `nohz_activation[slot]` record race. `make model-scheduler-nohz-tla`
      checks no timer-less CPU (`NoTimerlessStall` + `EventuallyReArmed`),
      bit/record agreement (`EventuallyConsistent`), and that a staled remote
      activation is dropped not applied to a newer lease (`NoStaleActivation` +
      `StaledRecordEventuallyCleared`); `make model-scheduler-nohz-loom` checks
      the lock-free-bit ↔ locked-record reconciliation keeps the timer armed.
      Checked results + mutation/non-vacuity evidence in
      [`models/scheduler/README.md`](../../models/scheduler/README.md).
- [x] **S2** [`scheduler-lapic-oneshot-timer-model`](../tasks/done/2026-06-04/scheduler-lapic-oneshot-timer-model.md)
      (done 2026-06-04) -- Kani over the extracted pure count/clamp arithmetic
      (`capos_lib::clockevent`) + a TLA+ mode-transition lemma pinning the
      halt-first reprogram ordering. `make kani-lapic-oneshot` proves the clamp
      window is well-formed, the armed count is in `[1, u32::MAX]` with no `u128`
      overflow, and the count round-trips to the request within one LAPIC count
      (3/3 SUCCESSFUL). `make model-scheduler-lapic-oneshot-tla` checks that after
      the halt-first reprogram the next fire is the one-shot at the armed count,
      never the periodic reload (`OneshotModeBoundedCount` + `HaltedDisarmed` + the
      `periodicFiredInOneshotMode` sentinel), and that every fire path (user- and
      kernel-mode consumption) restores a timer source (`NoTimerlessStall` +
      `EventuallyReArmed` / `FiredEventuallyRestored` liveness). Checked results +
      mutation/non-vacuity evidence in
      [`models/scheduler/README.md`](../../models/scheduler/README.md).
- [x] **S3/S4** [`scheduler-cpu-isolation-lease-authority-model`](../tasks/done/2026-06-04/scheduler-cpu-isolation-lease-authority-model.md)
      (done 2026-06-04 07:04 UTC) -- Alloy for the lease/grant relational
      invariants + TLA+ for the two-lock teardown and the documented non-atomic
      createLease-vs-revokeGrant SMP window. `make model-scheduler-lease-alloy`
      checks the unforgeable grant->lease binding, no live lease through a
      revoked grant outside the explicitly modeled bounded window, capacity
      never undercounting a live lease, and the stale-handle generation gate;
      `make model-scheduler-lease-tla` checks generation advances exactly once
      per termination, no capacity double-free, the single chokepoint always
      runs unregister + SQPOLL-stop + nohz-rollback before recycle, no stranded
      generation (liveness), and that the renew deadline branch never
      resurrects. Checked results + non-vacuity evidence in
      [`models/scheduler/README.md`](../../models/scheduler/README.md).
- [x] **IRQ** [`irq-msix-waiter-determinism-model`](../tasks/done/2026-06-04/irq-msix-waiter-determinism-model.md)
      (done 2026-06-04 06:10 UTC) -- TLA+/TLC for the waiter <-> delivery <->
      deferred-EOI ordering the RX MSI-X waiter fix established.
      `make model-irq-waiter-tla` checks no spurious/early injection
      (`NoCompletedEarly`), exactly-once delivery/EOI/completion accounting, EOI
      drain before route re-arm (`EpochDrainSound`), and the `NoLostWake`
      liveness property; checked result + mutation evidence in
      [`models/irq/README.md`](../../models/irq/README.md).

## Preserved Completed Security Context

These are completed and should not be re-read by default. They remain here so
future work can find their design context without bloating `docs/tasks/README.md`.

- [x] Authority graph and resource accounting design for transfer model:
      `docs/authority-accounting-transfer-design.md`.
- [x] Supply-chain and generated-code TCB hardening: pinned Limine and
      external build downloads, generated-code drift checks, dependency policy,
      pinned Cap'n Proto compiler, shared `tools/capnp-build`, and deterministic
      generated-binding comparisons.
- [x] DMA isolation model before PCI/virtio/user-driver work:
      `docs/dma-isolation-design.md` defines short-term QEMU bounce-buffer
      decision, `DMAPool`, `DeviceMmio`, `Interrupt` invariants, and the
      userspace-driver transition gate.
- [x] ELF parser arbitrary-input coverage: proptest coverage plus a bounded
      `cargo-fuzz` target.
- [x] Telnet IAC filter fuzz coverage: `TelnetFilter` extracted to
      `capos-lib::telnet`, `fuzz/fuzz_targets/telnet_filter.rs` exercises
      the state machine with structural assertions (Normal/AfterIac
      emission rules, monotonic emit count). Will travel with the parser
      when networking moves to userspace per `docs/proposals/networking-proposal.md`.
- [x] Telnet IAC filter differential round-trip fuzzing
      (`fuzz/fuzz_targets/telnet_filter_roundtrip.rs`): synthesize
      arbitrary RFC 854 event streams (`Data`, WILL/WONT/DO/DONT, SB
      blocks with payload), encode to wire bytes, and assert that filter
      output equals the concatenated `Data` payloads. Found a real
      EXOPL handling bug in the original filter -- the option byte right
      after `IAC SB` was being interpreted as the start of an `IAC IAC`
      escape, leaving the filter stuck in the subnegotiation state and
      silently dropping all subsequent data bytes. Fixed via a new
      `AfterSb` state that consumes the option byte unconditionally
      before entering payload parsing.
- [x] Line discipline extraction and fuzz coverage: pure
      `LineDiscipline` lives in `capos-lib::line_discipline`, returning
      `LineStep { outcome, echo }` descriptions. The kernel transport
      drives it and translates `Echo::Byte`/`Echo::Backspace`/Cancelled/
      Submitted/Reprompt into the existing `send_*_track_cr` calls.
      Backed by `fuzz/fuzz_targets/line_discipline.rs` with structural
      invariants (line_len <= max_bytes, ±1 line_len delta per Pending
      step, Cancelled clears, echo only when buffer grows/shrinks).
- [ ] **Future**: differential fuzzing against an external Telnet library
      (libtelnet or a Rust port) to catch RFC conformance bugs the
      structural and round-trip targets cannot express. Tracked as a
      follow-up to Track S.14.
- [x] Ring SQE wire validation extraction and fuzz coverage: lifted the
      per-opcode `*_sqe_has_unsupported_fields` predicates from the
      kernel into `capos_config::ring`, exposed a unified
      `sqe_wire_validation_error` entry point, and reroute the kernel
      through it. Added `fuzz/fuzz_targets/sqe_validation.rs` plus 12
      host unit tests covering the classification boundaries each
      opcode imposes. Closes the originally planned three-parser fuzz
      set (`elf::parse`, `manifest::decode`, ring SQE decoder).
- [x] Well-formed SQE generator oracle
      (`docs/tasks/done/2026-06-06/security-sqe-well-formed-generator-fuzz-local-proof.md`):
      the test/fuzz-only `sqe-validation-oracle` feature exposes
      `capos_config::ring::sqe_oracle`, which generates validator-accepted SQEs
      for every opcode accepted by the current build, plus one-field rejecting
      mutations for reserved fields, unsupported flags, constrained
      cap/pointer/size fields, opcode-specific constraints, and
      session-disclosure reserved bits. The existing `sqe_validation` fuzz target
      keeps arbitrary-byte coverage and runs the positive/negative oracle on each
      input. This closes the shared wire-validator oracle gap only; it does not
      claim cap-table lookup, userspace pointer mapping, transfer-descriptor
      loading, or full kernel ring semantic coverage.
- [x] **Track S.17 -- sanitizers on host tests** -- partially landed.
      `make sanitizer-host-tests` runs AddressSanitizer over the
      `capos-lib` and `capos-config` host suites (crate set / features
      mirror the `test-lib` / `test-config` aliases). Outcome: **zero
      findings** -- both suites pass clean under ASan, including the
      named `unsafe` suspects (FrameBitmap slot indexing, CapTable
      generation counters, `lazy_buffer` raw `&mut [u8]`). The "cheap to
      add" claim holds **for ASan only**: it needs no `-Zbuild-std`
      because its libc interceptors cover the uninstrumented precompiled
      std.
      - ThreadSanitizer (`make sanitizer-host-tests-tsan`) is **blocked
        upstream**, not by a capOS defect. TSan changes the crate ABI,
        so rustc refuses to link sanitized code against the
        uninstrumented precompiled std (`mixing -Zsanitizer will cause an
        ABI mismatch`). Instrumenting std needs `-Zbuild-std`, which then
        fails with `duplicate lang item in crate core: sized` for
        build-script-bearing dependencies (typenum / libc / cfg-if /
        subtle) when the sanitizer target equals the host triple --
        reproduced four ways (plain `-Zbuild-std`, renamed
        `--target` JSON spec, `target-applies-to-host=false`, and
        `-Zhost-config`). The TSan target is kept wired so it starts
        passing once the upstream `-Zbuild-std` + build-script issue is
        fixed. capOS concurrency invariants are meanwhile covered by the
        dedicated Loom model (`cargo test-ring-loom`); these host unit
        tests spawn no threads, so TSan's marginal value here is low.