Proposal: Native Shell, Agent Shell, and POSIX Shell

How interactive operation should work on capOS without reintroducing ambient authority through a Unix-like command line.

Problem

capOS deliberately avoids global paths, inherited file descriptors, ambient network access, and process-wide privilege bits. A conventional shell assumes all of those. If capOS copied a Unix shell model directly, the shell would either be mostly useless or become an ambiently privileged escape hatch around the capability model.

The system needs three related, but distinct, shell layers:

  • Native shell: schema-aware capability REPL and scripting language.
  • Agent shell: natural-language planning layer over the native shell.
  • POSIX shell: compatibility personality for existing programs and scripts.

All three must be ordinary userspace processes. None of them should receive special kernel privilege. The kernel and trusted capability-serving processes remain the enforcement boundary.

The first boot-to-shell milestone is text-only: local console login/setup and, later in the same family, a browser-hosted terminal gateway. Graphical shells, desktop UI, compositors, and GUI app launchers are a later tier. See boot-to-shell-proposal.md.

Design Principles

  • A shell starts with only the capabilities it was granted.
  • Natural language is not authority.
  • A shell command compiles to typed capability calls, not stringly syscalls.
  • Child processes receive explicit grants. There is no implicit inheritance of the shell’s full authority.
  • Elevation is a capability request mediated by a trusted broker, not a flag inside the shell.
  • Shell startup is a workload launch from a UserSession, service principal, or recovery profile. Session metadata informs policy and audit; it is not authority.
  • Default interactive cap sets are broker-issued session bundles, not hard-coded shell privileges.
  • POSIX behavior is an adapter over scoped Directory, File, socket factory, and process capabilities. It is not the native authority model.

User identity and policy sit above this shell model. A shell session may be associated with a human, service, guest, anonymous, or pseudonymous principal, but the session’s capabilities remain the authority. RBAC, ABAC, and mandatory policy decide which scoped caps a broker may grant; they do not create a kernel-side uid, role bit, or label check on ordinary capability calls. See user-identity-and-policy-proposal.md.

Layering

flowchart TD
    Input[Login, guest, anonymous, or service request] --> SessionMgr[SessionManager]
    SessionMgr --> Session[UserSession metadata cap]
    Session --> Broker[AuthorityBroker / PolicyEngine]
    Broker --> Bundle[Scoped session cap bundle]

    Bundle --> Agent[Agent shell]
    Bundle --> Native[Native shell]
    Bundle --> Posix[POSIX shell]

    Agent --> Plan[Typed action plan]
    Plan --> Native
    Posix --> Compat[POSIX compatibility runtime]

    Native --> Ring[capos-rt capability transport]
    Compat --> Ring
    Ring --> Kernel[Kernel cap ring]
    Ring --> Services[Userspace services]

    Agent --> Approval[Approval client cap]
    Approval --> Broker
    Broker --> Services
    Broker --> Audit[AuditLog]

The native shell is the primitive interactive surface. The agent shell emits native-shell plans after inspecting available schemas, current caps, and the session-bound policy context exposed to it. The POSIX shell is a compatibility consumer of capOS capabilities, not the model other shells are built on.

A shell may display a principal name, profile, role set, label, or POSIX UID, but those values are descriptive unless a trusted broker uses them to return a specific capability. Losing a home, logs, launcher, or approval cap cannot be repaired by presenting the same session ID back to the kernel.

Native Shell

The native shell is a typed capability graph operator. Its job is to inspect, invoke, pass, attenuate, release, and trace capabilities.

Example init or development session with explicit spawn authority:

capos:init> caps
log        Console
spawn      ProcessSpawner
boot       BootPackage
vm         VirtualMemory

capos:init> call @log.writeLine({ text: "hello" })
ok

capos:init> spawn "tls-smoke" with {
  log: @log
} -> $child
started pid 12

capos:init> wait $child
exit 0

Values

Native shell values should include:

  • @name: a named capability in the current shell context.
  • $name: a local value, result, promise, or process handle.
  • structured values: text, bytes, integers, booleans, lists, and structs.
  • result-cap values returned through the capOS transfer-result path.
  • trace values representing CQE and call-history slices.

The shell should preserve interface metadata with every capability value. A method call is valid only if the target cap exposes the method’s schema.
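The rule that a call is valid only when the target cap exposes the method can be modelled in a few lines. This is a minimal sketch, not the capos-rt API: `Interface`, `CapValue`, and `check_call` are hypothetical names, and the interface ID is an arbitrary placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """Hypothetical schema record: an interface ID plus its method names."""
    interface_id: int
    methods: frozenset


@dataclass(frozen=True)
class CapValue:
    """A shell-side capability value that keeps its interface metadata."""
    name: str
    interface: Interface


def check_call(cap: CapValue, method: str) -> None:
    """Reject a call before it reaches the transport if the schema lacks the method."""
    if method not in cap.interface.methods:
        raise LookupError(f"@{cap.name} does not expose method {method!r}")


# Placeholder interface ID; a real one would come from the schema.
CONSOLE = Interface(interface_id=0x01, methods=frozenset({"writeLine"}))
log = CapValue(name="log", interface=CONSOLE)

check_call(log, "writeLine")       # accepted: Console exposes writeLine
try:
    check_call(log, "spawn")       # rejected: not part of the Console schema
except LookupError as err:
    print(err)
```

Because the metadata travels with the value, the check needs no global lookup by name.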

Commands

Initial commands should be small and explicit:

caps
inspect @log
methods @spawn
call @log.writeLine({ text: "boot complete" })
spawn "ipc-server" with { log: @log, ep: @serverEp } -> $server
wait $server
release @temporary
trace $server
bind scratch = @store.sub("scratch")
derive readonly = @home.sub("config").readOnly()

inspect should show the interface ID, label, transferability, revocation state when available, and callable methods. It should not imply that two caps with the same interface ID are the same authority.

Syntax

The syntax should be structured rather than shell-token based. A CUE-like or Cap’n-Proto-literal-like shape fits capOS better than POSIX word splitting:

spawn "net-stack" with {
  log: @log
  nic: @virtioNic
  timer: @timer
}

The shell can still provide abbreviations, but the executable representation should be an ActionPlan object with typed fields.
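One way to picture an ActionPlan with typed fields is as plain data that can be inspected before anything runs. The sketch below is an assumption about shape only: `CapRef`, `SpawnStep`, and `required_caps` are hypothetical names, not part of any defined capOS schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CapRef:
    """Reference to a named capability in the shell context, e.g. @log."""
    name: str


@dataclass(frozen=True)
class SpawnStep:
    """One spawn action: a binary name plus explicit, named grants."""
    binary: str
    grants: dict  # child-side grant name -> CapRef


@dataclass
class ActionPlan:
    steps: list = field(default_factory=list)

    def required_caps(self) -> set:
        """Every shell-context cap the plan would hand to a child."""
        return {ref.name for step in self.steps for ref in step.grants.values()}


# Same content as the net-stack spawn above, expressed as data.
plan = ActionPlan(steps=[
    SpawnStep("net-stack", {
        "log": CapRef("log"),
        "nic": CapRef("virtioNic"),
        "timer": CapRef("timer"),
    }),
])
print(sorted(plan.required_caps()))
```

Whatever surface syntax or abbreviation the user typed, the same object is what gets validated, audited, and executed.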

Composition

Native composition should pass typed caps or structured values, not inherited byte streams by default:

pipe @camera.frames()
  |> spawn "resize" with { input: $, width: 640, height: 480 }
  |> spawn "jpeg-encode" with { input: $, quality: 85 }
  |> call @photos.write({ name: "frame.jpg", data: $ })

If a byte stream is desired, it should be explicit through a ByteStream, File, or POSIX adapter capability. This keeps the “pipe” operator from silently turning every interface into untyped bytes.
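A typed pipe can be checked stage by stage before any process is spawned. The following sketch assumes each stage declares what it accepts and produces; the type names `FrameStream` and `Data` are illustrative, not interfaces defined by this proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    accepts: str   # interface or value type the stage consumes
    produces: str  # interface or value type the stage emits


def check_pipeline(source_type: str, stages: list) -> str:
    """Walk the stages and fail on the first type mismatch; return the final type."""
    current = source_type
    for stage in stages:
        if stage.accepts != current:
            raise TypeError(f"{stage.name} accepts {stage.accepts}, got {current}")
        current = stage.produces
    return current


stages = [
    Stage("resize", accepts="FrameStream", produces="FrameStream"),
    Stage("jpeg-encode", accepts="FrameStream", produces="Data"),
]
print(check_pipeline("FrameStream", stages))
```

Feeding the same stages from a `ByteStream` source raises `TypeError`, so an adapter has to be named explicitly.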

Namespaces

There is no global root. A native shell may have a current Directory or Namespace capability, but that is just a default argument:

capos:user> ls @config
services
network

capos:user> cd @config.sub("services")
capos:@config/services> ls
logger
net-stack

The shell cannot traverse above a scoped directory or namespace unless it holds another capability that names that authority.
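The no-upward-traversal property follows from the object holding no parent reference at all. A toy model, with a hypothetical `Directory` class standing in for the real capability:

```python
class Directory:
    """Toy scoped directory: holds only its own subtree, never a parent link."""

    def __init__(self, entries: dict):
        # name -> nested dict (subdirectory) or None (leaf entry)
        self._entries = entries

    def ls(self) -> list:
        return sorted(self._entries)

    def sub(self, name: str) -> "Directory":
        """Return a narrower Directory; '..' and multi-component names are rejected."""
        if name in ("", ".", "..") or "/" in name:
            raise PermissionError(f"not a child name: {name!r}")
        child = self._entries.get(name)
        if not isinstance(child, dict):
            raise FileNotFoundError(name)
        return Directory(child)


config = Directory({"services": {"logger": None, "net-stack": None}, "network": {}})
services = config.sub("services")
print(services.ls())
```

In this model, holding `services` gives no route back to `config`. The only way to reach the wider scope is to hold the `config` object itself.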

Session Context

A session-aware shell may hold a self or session cap for UserSession.info() and audit context. That cap is metadata. It can identify the principal, auth strength, expiry, quota profile, and audit identity, but it cannot widen the shell’s CapSet or authorize kernel operations by itself.

The launcher or supervisor starts the shell with a CapSet returned by AuthorityBroker(session, profile). For interactive work, that bundle should usually include scoped terminal, home, logs, launcher, status, and approval caps. For service accounts, guest sessions, anonymous workloads, and recovery mode, the broker returns different bundles under explicit policy profiles.

Shell-launched children receive only the caps named in the spawn plan; nothing else is inherited from the shell. A child may receive a UserSession or session badge for audit, per-client quotas, or service-side selection, but object access still comes from the scoped object caps passed to that child.
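As a sketch of the grant rule, the child's CapSet can be built from the spawn plan alone, so anything the plan does not name is absent. `build_child_capset` is a hypothetical helper, and the string values stand in for real capability objects.

```python
def build_child_capset(shell_caps: dict, grants: dict) -> dict:
    """Return the child's CapSet: only the caps the spawn plan names.

    shell_caps maps shell-context names to capability objects.
    grants maps child-side slot names to shell-context names.
    """
    child = {}
    for slot, source in grants.items():
        if source not in shell_caps:
            raise KeyError(f"shell does not hold @{source}")
        child[slot] = shell_caps[source]
    return child


shell_caps = {"log": "Console", "spawn": "ProcessSpawner", "vm": "VirtualMemory"}
child = build_child_capset(shell_caps, {"log": "log"})
print(child)
```

The shell's `spawn` and `vm` caps never reach the child because no step copies the shell's table.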

Agent Shell

The agent shell is a natural-language planner that emits typed native-shell plans. It should not directly own broad administrative authority.

Example:

capos:init> start the IPC demo, give the client only the server endpoint and console, then wait for both

The agent produces a plan:

Plan:
1. Spawn "ipc-server" with:
   - log: Console
   - ep: Endpoint(owner)

2. Spawn "ipc-client" with:
   - log: Console
   - server: Endpoint(client facet from server)

3. Wait for both ProcessHandle caps.

Required authority:
- ProcessSpawner
- Console
- Endpoint owner cap
- BootPackage binary access

Only after validation does the plan execute. Validation checks the current cap set, schema method IDs, transferability, grant names, quota limits, and policy.
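A reduced version of that validation pass is sketched below. It covers only two of the listed checks (presence in the current cap set and transferability) and uses hypothetical names `HeldCap` and `validate_grants`; schema, quota, and policy checks are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeldCap:
    interface: str
    transferable: bool


def validate_grants(held: dict, grants: dict) -> list:
    """Return a list of problems; an empty list means the grants pass these checks."""
    problems = []
    for slot, source in grants.items():
        cap = held.get(source)
        if cap is None:
            problems.append(f"{slot}: missing capability @{source}")
        elif not cap.transferable:
            problems.append(f"{slot}: @{source} is not transferable")
    return problems


held = {
    "log": HeldCap("Console", transferable=True),
    "vm": HeldCap("VirtualMemory", transferable=False),
}
print(validate_grants(held, {"log": "log", "mem": "vm", "ep": "serverEp"}))
```

Returning every problem at once, rather than stopping at the first, gives the agent what it needs to explain missing capabilities.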

What the Agent Adds

The useful AI-specific behavior is not raw command execution. It is:

  • intent decomposition into spawn, grant, wait, trace, and release steps.
  • schema-aware parameter construction.
  • least-authority grant selection.
  • explanation of missing capabilities.
  • diagnosis from structured errors, CQEs, logs, and process handles.
  • conversion of vague requests into an explicit plan that can be audited.
  • retry after typed failures without bypassing policy.

The agent should reason over capOS objects and schemas, not over an unbounded shell prompt.

Minimal Daily Cap Set

The daily-use agent shell should start with the user-identity proposal’s session bundle, minted by AuthorityBroker for one UserSession and profile:

terminal        TerminalSession or Console
self            self/session introspection
status          read-only SystemStatus
logs            read-only LogReader scoped to this user/session
home            Directory or Namespace scoped to user data
launcher        restricted launcher for approved user applications
approval        ApprovalClient

It should not receive these by default:

ProcessSpawner(all)
BootPackage(all)
DeviceManager
StoreAdmin
FrameAllocator
VirtualMemory for other processes
raw networking caps
global service supervisor caps

The shell can ask for more authority, but it cannot mint that authority for itself.

Guest and anonymous profiles should receive narrower variants. A guest shell may get terminal, tmp, and a restricted launcher, while an anonymous workload normally receives short-lived purpose caps, strict quotas, and no durable home namespace. An approval path exists only when the profile policy explicitly grants one.
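The profile-to-bundle relationship can be illustrated as a lookup with no fallback. This is a sketch under stated assumptions: the `BUNDLES` table and `session_bundle` function are hypothetical, and a real broker would mint scoped capabilities rather than return names.

```python
BUNDLES = {
    # Cap names follow the lists above; the table itself is illustrative.
    "daily": ["terminal", "self", "status", "logs", "home", "launcher", "approval"],
    "guest": ["terminal", "tmp", "launcher"],
}


def session_bundle(profile: str) -> list:
    """Return cap names for a profile; unknown profiles get nothing, never a default."""
    if profile not in BUNDLES:
        raise PermissionError(f"no policy profile named {profile!r}")
    return list(BUNDLES[profile])


print(session_bundle("guest"))
```

The guest entry has no `approval` cap, so a guest shell has no elevation path unless policy adds one explicitly.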

Approval and Authentication

Elevation belongs in a trusted broker service that is outside the model-controlled agent process.

Conceptual interfaces:

interface ApprovalClient {
  request @0 (
    reason :Text,
    plan :ActionPlan,
    requestedCaps :List(CapRequest),
    durationMs :UInt64
  ) -> (grant :ApprovalGrant);
}

enum ApprovalState {
  pending @0;
  approved @1;
  denied @2;
  expired @3;
}

interface ApprovalGrant {
  state @0 () -> (state :ApprovalState, reason :Text);
  claim @1 () -> (caps :List(GrantedCap));
  cancel @2 () -> ();
}

interface AuthorityBroker {
  request @0 (
    session :UserSession,
    plan :ActionPlan,
    requestedCaps :List(CapRequest),
    durationMs :UInt64
  ) -> (grant :ApprovalGrant);
}

The agent shell holds only a session-bound ApprovalClient. It does not submit arbitrary PrincipalInfo, role, UID, or label values as authority, and it does not submit authentication proofs. The ApprovalClient forwards the bound UserSession and typed request to AuthorityBroker. The broker or a consent service wrapping it holds powerful caps, drives any trusted consent or step-up authentication path, and mints attenuated temporary caps after policy and authentication checks.

The conceptual API intentionally has no authProof argument on the agent-visible path. If a proof is needed, it is collected by SessionManager, the broker, or a trusted approval UI and reflected back to the agent only as pending, approved, denied, or expired.
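The agent-visible side of a grant can be modelled as a small state machine. The enum mirrors ApprovalState from the schema above. The `resolve` method is a hypothetical stand-in for whatever the broker does internally and is not part of the conceptual interface.

```python
from enum import Enum


class ApprovalState(Enum):
    PENDING = 0
    APPROVED = 1
    DENIED = 2
    EXPIRED = 3


class ApprovalGrant:
    """Agent-visible handle: exposes state and claim, never the proof itself."""

    def __init__(self, caps: list):
        self._caps = caps
        self._state = ApprovalState.PENDING

    def state(self) -> ApprovalState:
        return self._state

    def resolve(self, approved: bool) -> None:
        """Broker-side decision (hypothetical); only a pending grant can change."""
        if self._state is ApprovalState.PENDING:
            self._state = ApprovalState.APPROVED if approved else ApprovalState.DENIED

    def claim(self) -> list:
        """Hand over the granted caps, but only once the grant is approved."""
        if self._state is not ApprovalState.APPROVED:
            raise PermissionError(f"grant is {self._state.name.lower()}")
        return list(self._caps)


grant = ApprovalGrant(["supervisor"])
print(grant.state().name)
grant.resolve(approved=True)
print(grant.claim())
```

Nothing in this object accepts a password, token, or key, which is the property the missing authProof argument is meant to guarantee.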

Elevation Flow

User request:

restart the network stack

Agent plan:

Requested action:
- stop service "net-stack"
- spawn "net-stack"
- grant: nic, timer, log
- wait for health check

Missing authority:
- ServiceSupervisor(net-stack)

Requested duration:
- 60 seconds

Broker decision:

  • Which UserSession and profile is this request bound to?
  • Is that principal/profile allowed to restart net-stack?
  • Is the requested binary allowed?
  • Are the requested grants no broader than policy permits?
  • Do mandatory confidentiality and integrity constraints allow the grant?
  • Is there fresh user presence?
  • Does this require step-up authentication?

If approved, the broker returns a narrow leased capability:

supervisor: ServiceSupervisor(service="net-stack", expires=60s)

It should not return broad ProcessSpawner, BootPackage, or DeviceManager authority when a scoped supervisor cap can do the job.
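A leased, scoped capability combines two checks: the target must match the scope, and the lease must still be live. The sketch below uses a hypothetical `LeasedCap` class and passes the current time explicitly so the behaviour is deterministic. A real implementation would read a Timer capability or similar.

```python
class LeasedCap:
    """Toy lease: the cap refuses calls outside its scope or after its deadline."""

    def __init__(self, service: str, expires_at: float):
        self.service = service
        self.expires_at = expires_at  # seconds on the same clock as `now`

    def restart(self, target: str, now: float) -> str:
        if now >= self.expires_at:
            raise PermissionError("lease expired")
        if target != self.service:
            raise PermissionError(f"cap is scoped to {self.service!r}")
        return f"restarted {target}"


supervisor = LeasedCap(service="net-stack", expires_at=60.0)
print(supervisor.restart("net-stack", now=10.0))
```

A stale or leaked handle is bounded both by scope and by time, which is the advantage over returning a broad spawner.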

Authentication

Authentication proof should be consumed by the SessionManager or broker boundary, not exposed as a secret to the agent. Suitable mechanisms include:

  • password or PIN for medium-risk local actions.
  • hardware key or WebAuthn-style challenge for administrative actions.
  • TPM-backed local presence for device or boot-policy operations.
  • multi-party approval for destructive policy, storage, or recovery actions.

The model should never receive raw tokens, private keys, recovery codes, or full environment dumps.

Agent Hardening

The agent shell must treat files, logs, web pages, service output, and CQE payloads as untrusted data. They are not instructions.

Required behavior:

  • show an executable typed plan before authority-changing actions.
  • keep elevated caps leased, narrow, and short-lived.
  • release temporary caps after the plan finishes or fails.
  • audit every approval request, grant, cap transfer, and release.
  • require exact targets for destructive actions.
  • refuse broad phrases such as “give it everything” unless a trusted policy explicitly allows a named emergency mode.
  • keep model memory separate from secrets and authentication proofs.

The enforcement rule is simple: the model may plan, explain, and request. Capabilities decide what can happen.
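The release-on-finish-or-failure and audit requirements map naturally onto a scoped block. This is a sketch only: `temporary_cap` is a hypothetical helper, and the `audit_log` list stands in for an append-only AuditLog capability.

```python
from contextlib import contextmanager

audit_log = []  # stand-in for an append-only AuditLog capability


@contextmanager
def temporary_cap(caps: dict, name: str, cap: object):
    """Install a leased cap for one plan and release it on success or failure."""
    caps[name] = cap
    audit_log.append(("grant", name))
    try:
        yield cap
    finally:
        # Runs whether the plan body returned normally or raised.
        del caps[name]
        audit_log.append(("release", name))


caps = {"log": "Console"}
try:
    with temporary_cap(caps, "supervisor", "ServiceSupervisor(net-stack)"):
        raise RuntimeError("health check failed")  # simulated plan failure
except RuntimeError:
    pass
print(caps, audit_log)
```

The failed plan still leaves the cap table as it started, with both the grant and the release recorded.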

POSIX Shell

The POSIX shell is a compatibility layer for existing software and scripts. It should be useful, but it should not define native capOS administration.

Mapping

POSIX concepts map onto granted capabilities:

POSIX concept             capOS backing
/                         synthetic root built from granted Directory or FileServer caps
cwd                       current scoped Directory cap
fd                        local handle to File, ByteStream, pipe, terminal, or socket cap
pipe                      ByteStream pair or userspace pipe service
PATH                      search inside the synthetic root or a command registry cap
exec                      ProcessSpawner or restricted launcher cap
sockets                   socket factory caps such as TcpProvider or HttpEndpoint
uid, gid, user, group     synthetic POSIX profile derived from session metadata
$HOME                     path alias backed by a granted home directory or namespace cap
/etc/passwd, /etc/group   profile service view, scoped to the compatibility environment
env vars                  data only; never authority by themselves

If a POSIX process has no network cap, connect() fails. If it has no directory mounted at /etc, opening /etc/resolv.conf fails. If it has no device cap, /dev is empty or synthetic.

A POSIX shell is launched with both a CapSet and compatibility profile metadata. The profile controls what legacy APIs report. The CapSet controls what the process can actually do.
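Path resolution against a synthetic root can be sketched as a longest-prefix match over granted mounts. The `resolve` function and the mount table are hypothetical. The real compatibility runtime would hold Directory capabilities rather than strings.

```python
import posixpath


def resolve(mounts: dict, path: str):
    """Map an absolute POSIX path to (cap, remainder) using the longest mounted prefix."""
    path = posixpath.normpath(path)  # collapses "." and ".." before matching
    best = None
    for prefix in mounts:
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise FileNotFoundError(path)  # nothing granted there, so the open fails
    return mounts[best], posixpath.relpath(path, best)


mounts = {"/home/user": "home Directory cap", "/tmp": "tmp Directory cap"}
print(resolve(mounts, "/home/user/notes.txt"))
```

With this mount table, `/etc/resolv.conf` raises `FileNotFoundError`, and so does `/home/user/../../etc/passwd`, because normalization happens before the prefix match.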

Compatibility Limits

Exact Unix semantics should not be promised early.

  • Prefer posix_spawn over full fork for the first implementation.
  • fork with arbitrary shared process state can be emulated later if needed.
  • setuid cannot grant caps. At most it asks a compatibility broker to replace the POSIX profile or launch a new process with a different broker-issued cap bundle.
  • Mode bits and ownership metadata do not create authority.
  • chmod can modify filesystem metadata exposed by a filesystem service, but it cannot grant caps outside that service’s policy.
  • /proc is a debugging service view, not kernel ambient introspection.
  • Device files exist only when a capability-backed adapter deliberately exposes them.

This is enough for many build tools and CLI programs without making POSIX the security model.

POSIX Session Caps

A normal POSIX shell session might receive:

terminal      TerminalSession
session       UserSession metadata
profile       POSIX profile view
root          Directory or FileServer synthetic root
launcher      restricted ProcessSpawner/command launcher
pipeFactory   ByteStream factory
clock         Timer

Optional caps:

tcp           scoped socket provider
home          writable user Directory
tmp           temporary Directory
proc          read-only process inspection tree

Administrative caps still require broker-mediated approval.

Recovery Shell

A recovery shell is a separate policy profile, not the normal agent shell with hidden extra privileges. It may receive a larger cap set, but only after strong local authentication and with full audit logging. Guest and anonymous profiles must not fall into recovery authority by omission.

Possible recovery bundle:

console
boot package read
system status read
service supervisor for critical services
read-only storage inspection
scoped repair caps
approval client

Destructive recovery operations should still go through exact-target approval. The recovery shell should be local-only unless a separate remote recovery policy explicitly grants network access.

Required Interfaces

This proposal implies several service interfaces beyond the current smoke-test surface:

  • UserSession / SessionManager: principal/session metadata, audit context, and guest or anonymous profile creation (user identity proposal).
  • TerminalSession: structured terminal I/O, window size, paste boundaries.
  • SchemaRegistry: maps interface IDs to method names and parameter schemas.
  • CommandRegistry: optional registry of native command capabilities.
  • SystemStatus: read-only process and service status.
  • LogReader: scoped log access.
  • ServiceSupervisor: restart/status authority for one service or subtree.
  • AuthorityBroker / ApprovalClient: session-bound base bundles, plan-specific leased grants, and policy/authentication mediation.
  • CredentialStore, ConsoleLogin, and WebShellGateway: boot-to-shell authentication services for password-verifier setup, passkey registration, and text terminal launch (boot-to-shell proposal).
  • AuditLog: append-only record of plans, approvals, grants, and releases.
  • POSIXProfile / compatibility broker: synthetic UID/GID, names, $HOME, cwd, and profile replacement without treating POSIX metadata as authority.
  • ByteStream / pipe factory: explicit byte-stream composition for POSIX and selected native pipelines.

These should be ordinary capabilities. A shell only sees the subset it has been granted.

Implementation Plan

  1. Native serial shell

    • Built on capos-rt.
    • Lists initial CapSet entries.
    • Invokes typed Console methods.
    • Spawns and waits on boot-package binaries through ProcessSpawner.
    • Provides caps, inspect, call, spawn, wait, release, and trace.
  2. Session-aware shell profile

    • Use the SessionManager -> UserSession metadata and AuthorityBroker(session, profile) -> cap bundle split.
    • Add self/session introspection without making identity metadata authoritative.
    • Start with guest, local-presence, and service-account profiles before durable account storage exists.
  3. Structured native scripting

    • Add typed variables, result-cap binding, and plan serialization.
    • Add schema registry support for method names and argument validation.
    • Add explicit byte-stream adapters for commands that need text streams.
  4. Approval broker

    • Define ActionPlan, CapRequest, ApprovalClient, and leased grant records.
    • Add local authentication and audit logging.
    • Make administrative native-shell operations request scoped caps through the broker instead of running from a permanently privileged shell.
  5. Boot-to-shell integration

    • Add local console login/setup in front of the native shell.
    • Require authentication against the configured password verifier when one exists.
    • Enter setup mode when no console password verifier exists.
    • Treat guest as an explicit local profile and anonymous as a separate remote/programmatic profile, not as missing-password fallbacks.
    • Support passkey-only web terminal setup through local/bootstrap authority, not unauthenticated remote first use.
  6. Agent shell

    • Natural-language frontend that emits native ActionPlan objects.
    • Starts with the broker-issued minimal daily session bundle.
    • Uses the approval broker for elevation.
    • Treats all external content as untrusted data.
  7. POSIX shell

    • Implement after Directory/File, ByteStream, and restricted process launch exist.
    • Start with posix_spawn, fd table emulation, cwd, scoped root, pipes, and terminal I/O, plus synthetic POSIX profile metadata.
    • Add broader compatibility only as real workloads demand it.

Non-Goals

  • No global root namespace.
  • No shell-owned root/admin bit.
  • No model-visible secrets.
  • No default inheritance of all shell caps into children.
  • No authorization from PrincipalInfo, UID/GID, role, or label values alone.
  • No promise that POSIX scripts observe exact Unix behavior without a compatibility profile and a CapSet that grants the needed caps.

Open Questions

  • Should the native shell syntax be CUE-derived, Cap’n-Proto-literal-derived, or a smaller custom grammar?
  • How should schema reflection be packaged before a full runtime SchemaRegistry exists?
  • What is the first minimal TerminalSession interface beyond Console?
  • Should approval be synchronous only, or can long-running agent plans request staged approvals?
  • How should audit logs be stored before persistent storage exists?