Proposal: Native Shell, Agent Shell, and POSIX Shell
How interactive operation should work on capOS without reintroducing ambient authority through a Unix-like command line.
Problem
capOS deliberately avoids global paths, inherited file descriptors, ambient network access, and process-wide privilege bits. A conventional shell assumes all of those. If capOS copied a Unix shell model directly, the shell would either be mostly useless or become an ambiently privileged escape hatch around the capability model.
The system needs three related but distinct shell layers:
- Native shell: schema-aware capability REPL and scripting language.
- Agent shell: natural-language planning layer over the native shell.
- POSIX shell: compatibility personality for existing programs and scripts.
All three must be ordinary userspace processes. None of them should receive special kernel privilege. The kernel and trusted capability-serving processes remain the enforcement boundary.
The first boot-to-shell milestone is text-only: local console login/setup and, later in the same family, a browser-hosted terminal gateway. Graphical shells, desktop UI, compositors, and GUI app launchers are a later tier. See boot-to-shell-proposal.md.
Design Principles
- A shell starts with only the capabilities it was granted.
- Natural language is not authority.
- A shell command compiles to typed capability calls, not stringly typed syscalls.
- Child processes receive explicit grants. There is no implicit inheritance of the shell’s full authority.
- Elevation is a capability request mediated by a trusted broker, not a flag inside the shell.
- Shell startup is a workload launch from a UserSession, service principal, or recovery profile. Session metadata informs policy and audit; it is not authority.
- Default interactive cap sets are broker-issued session bundles, not hard-coded shell privileges.
- POSIX behavior is an adapter over scoped Directory, File, socket factory, and process capabilities. It is not the native authority model.
User identity and policy sit above this shell model. A shell session may be
associated with a human, service, guest, anonymous, or pseudonymous principal,
but the session’s capabilities remain the authority. RBAC, ABAC, and mandatory
policy decide which scoped caps a broker may grant; they do not create a
kernel-side uid, role bit, or label check on ordinary capability calls. See
user-identity-and-policy-proposal.md.
Layering
flowchart TD
Input[Login, guest, anonymous, or service request] --> SessionMgr[SessionManager]
SessionMgr --> Session[UserSession metadata cap]
Session --> Broker[AuthorityBroker / PolicyEngine]
Broker --> Bundle[Scoped session cap bundle]
Bundle --> Agent[Agent shell]
Bundle --> Native[Native shell]
Bundle --> Posix[POSIX shell]
Agent --> Plan[Typed action plan]
Plan --> Native
Posix --> Compat[POSIX compatibility runtime]
Native --> Ring[capos-rt capability transport]
Compat --> Ring
Ring --> Kernel[Kernel cap ring]
Ring --> Services[Userspace services]
Agent --> Approval[Approval client cap]
Approval --> Broker
Broker --> Services
Broker --> Audit[AuditLog]
The native shell is the primitive interactive surface. The agent shell emits native-shell plans after inspecting available schemas, current caps, and the session-bound policy context exposed to it. The POSIX shell is a compatibility consumer of capOS capabilities, not the model other shells are built on.
A shell may display a principal name, profile, role set, label, or POSIX UID,
but those values are descriptive unless a trusted broker uses them to return a
specific capability. Losing a home, logs, launcher, or approval cap
cannot be repaired by presenting the same session ID back to the kernel.
Native Shell
The native shell is a typed capability graph operator. Its job is to inspect, invoke, pass, attenuate, release, and trace capabilities.
Example init or development session with explicit spawn authority:
capos:init> caps
log Console
spawn ProcessSpawner
boot BootPackage
vm VirtualMemory
capos:init> call @log.writeLine({ text: "hello" })
ok
capos:init> spawn "tls-smoke" with {
  log: @log
} -> $child
started pid 12
capos:init> wait $child
exit 0
Values
Native shell values should include:
- @name: a named capability in the current shell context.
- $name: a local value, result, promise, or process handle.
- structured values: text, bytes, integers, booleans, lists, and structs.
- result-cap values returned through the capOS transfer-result path.
- trace values representing CQE and call-history slices.
The shell should preserve interface metadata with every capability value. A method call is valid only if the target cap exposes the method’s schema.
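A minimal sketch of that validity rule, assuming the shell keeps a per-cap method table next to each named capability. `CapValue`, `CallError`, `resolve_call`, and `example_caps` are hypothetical names for illustration, not capos-rt API, and the method ordinal shown is made up.

```rust
use std::collections::HashMap;

/// Hypothetical shell-side view of a capability: the interface metadata
/// the shell keeps alongside the handle.
#[derive(Debug, Clone)]
pub struct CapValue {
    pub interface_id: u64,
    /// Method name -> ordinal, as a schema source might report it.
    pub methods: HashMap<String, u16>,
}

#[derive(Debug, PartialEq)]
pub enum CallError {
    UnknownCap(String),
    UnknownMethod { cap: String, method: String },
}

/// Resolve `@cap.method` to a method ordinal, or explain why the call is invalid.
pub fn resolve_call(
    caps: &HashMap<String, CapValue>,
    cap: &str,
    method: &str,
) -> Result<u16, CallError> {
    let value = caps
        .get(cap)
        .ok_or_else(|| CallError::UnknownCap(cap.to_string()))?;
    value
        .methods
        .get(method)
        .copied()
        .ok_or_else(|| CallError::UnknownMethod {
            cap: cap.to_string(),
            method: method.to_string(),
        })
}

/// Example context holding a single Console-like cap named `log`.
pub fn example_caps() -> HashMap<String, CapValue> {
    let mut methods = HashMap::new();
    methods.insert("writeLine".to_string(), 0u16);
    let mut caps = HashMap::new();
    caps.insert("log".to_string(), CapValue { interface_id: 0x1, methods });
    caps
}
```

With this shape, `call @log.writeLine(...)` resolves to an ordinal, while `call @log.reboot()` is rejected in the shell before anything reaches the transport.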
Commands
Initial commands should be small and explicit:
caps
inspect @log
methods @spawn
call @log.writeLine({ text: "boot complete" })
spawn "ipc-server" with { log: @log, ep: @serverEp } -> $server
wait $server
release @temporary
trace $server
bind scratch = @store.sub("scratch")
derive readonly = @home.sub("config").readOnly()
inspect should show the interface ID, label, transferability, revocation
state when available, and callable methods. It should not imply that two caps
with the same interface ID are the same authority.
Syntax
The syntax should be structured rather than shell-token based. A CUE-like or Cap’n-Proto-literal-like shape fits capOS better than POSIX word splitting:
spawn "net-stack" with {
  log: @log
  nic: @virtioNic
  timer: @timer
}
The shell can still provide abbreviations, but the executable representation
should be an ActionPlan object with typed fields.
Composition
Native composition should pass typed caps or structured values, not inherited byte streams by default:
pipe @camera.frames()
|> spawn "resize" with { input: $, width: 640, height: 480 }
|> spawn "jpeg-encode" with { input: $, quality: 85 }
|> call @photos.write({ name: "frame.jpg", data: $ })
If a byte stream is desired, it should be explicit through a ByteStream,
File, or POSIX adapter capability. This keeps the “pipe” operator from
silently turning every interface into untyped bytes.
Namespaces
There is no global root. A native shell may have a current Directory or
Namespace capability, but that is just a default argument:
capos:user> ls @config
services
network
capos:user> cd @config.sub("services")
capos:@config/services> ls
logger
net-stack
The shell cannot traverse above a scoped directory or namespace unless it holds another capability that names that authority.
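The no-upward-traversal property can be sketched as a type that offers `sub` but no parent operation. This is an illustration only: `ScopedDir` and `SubError` are hypothetical, the string label stands in for a real capability handle, and in capOS the serving process, not the shell, would enforce the scope.

```rust
/// Hypothetical stand-in for a scoped Directory cap: it only knows its own
/// label and can only hand out narrower views of itself.
#[derive(Debug, Clone, PartialEq)]
pub struct ScopedDir {
    label: String,
}

#[derive(Debug, PartialEq)]
pub enum SubError {
    Empty,
    /// `.`, `..`, or a name containing a path separator.
    NotAChildName(String),
}

impl ScopedDir {
    pub fn new(label: &str) -> Self {
        ScopedDir { label: label.to_string() }
    }

    /// Derive a child directory cap. There is deliberately no `parent()`.
    pub fn sub(&self, name: &str) -> Result<ScopedDir, SubError> {
        if name.is_empty() {
            return Err(SubError::Empty);
        }
        if name == "." || name == ".." || name.contains('/') {
            return Err(SubError::NotAChildName(name.to_string()));
        }
        Ok(ScopedDir { label: format!("{}/{}", self.label, name) })
    }

    /// Display label only; it carries no authority.
    pub fn label(&self) -> &str {
        &self.label
    }
}
```

Because the only way to obtain a `ScopedDir` is from a cap already held, `cd` can narrow the default argument but never widen it.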
Session Context
A session-aware shell may hold a self or session cap for UserSession.info()
and audit context. That cap is metadata. It can identify the principal, auth
strength, expiry, quota profile, and audit identity, but it cannot widen the
shell’s CapSet or authorize kernel operations by itself.
The launcher or supervisor starts the shell with a CapSet returned by
AuthorityBroker(session, profile). For interactive work, that bundle should
usually include scoped terminal, home, logs, launcher, status, and approval
caps. For service accounts, guest sessions, anonymous workloads, and recovery
mode, the broker returns different bundles under explicit policy profiles.
Shell-launched children inherit only the caps named in the spawn plan. A child
may receive a UserSession or session badge for audit, per-client quotas, or
service-side selection, but object access still comes from the scoped object
caps passed to that child.
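The explicit-grant rule can be sketched as a function that builds the child's CapSet from the spawn plan alone. `CapId`, the `CapSet` alias, `SpawnError`, and `child_cap_set` are hypothetical names chosen for this example; the real ProcessSpawner interface may look quite different.

```rust
use std::collections::BTreeMap;

/// Hypothetical opaque capability handle (for example, a cap-table slot).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CapId(pub u32);

pub type CapSet = BTreeMap<String, CapId>;

#[derive(Debug, PartialEq)]
pub enum SpawnError {
    /// The spawn plan named a cap the shell does not hold.
    NotHeld(String),
}

/// Build the child's CapSet from `(child_name, shell_name)` grant pairs.
/// Anything not listed in `grants` is simply absent from the result.
pub fn child_cap_set(shell: &CapSet, grants: &[(&str, &str)]) -> Result<CapSet, SpawnError> {
    let mut child = CapSet::new();
    for (child_name, shell_name) in grants {
        let cap = shell
            .get(*shell_name)
            .ok_or_else(|| SpawnError::NotHeld(shell_name.to_string()))?;
        child.insert(child_name.to_string(), *cap);
    }
    Ok(child)
}
```

The child set starts empty and only grows by named grants, so a shell holding `spawn` and `vm` does not leak them to a child that was given only `log`.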
Agent Shell
The agent shell is a natural-language planner that emits typed native-shell plans. It should not directly own broad administrative authority.
Example:
capos:init> start the IPC demo, give the client only the server endpoint and console, then wait for both
The agent produces a plan:
Plan:
1. Spawn "ipc-server" with:
- log: Console
- ep: Endpoint(owner)
2. Spawn "ipc-client" with:
- log: Console
- server: Endpoint(client facet from server)
3. Wait for both ProcessHandle caps.
Required authority:
- ProcessSpawner
- Console
- Endpoint owner cap
- BootPackage binary access
Only after validation does the plan execute. Validation checks the current cap set, schema method IDs, transferability, grant names, quota limits, and policy.
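As a rough sketch of the first validation check only (cap presence), the validator can compute which named caps a plan needs but the shell lacks. `Step`, `required`, and `missing_caps` are hypothetical; schema method IDs, transferability, quotas, and policy are out of scope here, and the assumption that spawning needs a held cap named `spawn` follows the earlier session examples.

```rust
use std::collections::BTreeSet;

/// Hypothetical, heavily simplified plan step.
#[derive(Debug, Clone)]
pub enum Step {
    Spawn { binary: String, grants: Vec<String> },
    Wait { handle: String },
}

/// Names of held caps that a plan step needs before it can run.
fn required(step: &Step) -> Vec<String> {
    match step {
        // Spawning needs spawn authority plus every cap being granted.
        Step::Spawn { grants, .. } => {
            let mut needed = vec!["spawn".to_string()];
            needed.extend(grants.iter().cloned());
            needed
        }
        // Waiting uses a handle produced by an earlier step, not a held cap.
        Step::Wait { .. } => Vec::new(),
    }
}

/// Return the sorted set of cap names the plan needs but the shell lacks.
pub fn missing_caps(plan: &[Step], held: &BTreeSet<String>) -> BTreeSet<String> {
    plan.iter()
        .flat_map(required)
        .filter(|name| !held.contains(name))
        .collect()
}
```

A non-empty result is what the agent would surface as "Missing authority" instead of attempting the plan.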
What the Agent Adds
The useful AI-specific behavior is not raw command execution. It is:
- intent decomposition into spawn, grant, wait, trace, and release steps.
- schema-aware parameter construction.
- least-authority grant selection.
- explanation of missing capabilities.
- diagnosis from structured errors, CQEs, logs, and process handles.
- conversion of vague requests into an explicit plan that can be audited.
- retry after typed failures without bypassing policy.
The agent should reason over capOS objects and schemas, not over an unbounded shell prompt.
Minimal Daily Cap Set
The daily-use agent shell should start with the user-identity proposal’s
session bundle, minted by AuthorityBroker for one UserSession and profile:
terminal TerminalSession or Console
self self/session introspection
status read-only SystemStatus
logs read-only LogReader scoped to this user/session
home Directory or Namespace scoped to user data
launcher restricted launcher for approved user applications
approval ApprovalClient
It should not receive these by default:
ProcessSpawner(all)
BootPackage(all)
DeviceManager
StoreAdmin
FrameAllocator
VirtualMemory for other processes
raw networking caps
global service supervisor caps
The shell can ask for more authority, but it cannot mint that authority for itself.
Guest and anonymous profiles should receive narrower variants. A guest shell
may get terminal, tmp, and a restricted launcher, while an anonymous
workload normally receives short-lived purpose caps, strict quotas, and no
durable home namespace. An approval path exists only when the profile policy
explicitly grants one.
Approval and Authentication
Elevation belongs in a trusted broker service that is outside the model-controlled agent process.
Conceptual interfaces:
interface ApprovalClient {
  request @0 (
    reason :Text,
    plan :ActionPlan,
    requestedCaps :List(CapRequest),
    durationMs :UInt64
  ) -> (grant :ApprovalGrant);
}
enum ApprovalState {
  pending @0;
  approved @1;
  denied @2;
  expired @3;
}
interface ApprovalGrant {
  state @0 () -> (state :ApprovalState, reason :Text);
  claim @1 () -> (caps :List(GrantedCap));
  cancel @2 () -> ();
}
interface AuthorityBroker {
  request @0 (
    session :UserSession,
    plan :ActionPlan,
    requestedCaps :List(CapRequest),
    durationMs :UInt64
  ) -> (grant :ApprovalGrant);
}
The agent shell holds only a session-bound ApprovalClient. It does not submit
arbitrary PrincipalInfo, role, UID, label values, or authentication proofs as
authority. The ApprovalClient forwards the bound UserSession and typed
request to AuthorityBroker. The broker or a consent service wrapping it holds
powerful caps, drives any trusted consent or step-up authentication path, and
mints attenuated temporary caps after policy and authentication checks.
The conceptual API intentionally has no authProof argument on the
agent-visible path. If a proof is needed, it is collected by SessionManager,
the broker, or a trusted approval UI and reflected back to the agent only as
pending, approved, denied, or expired.
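One way to picture the agent-visible side of ApprovalGrant is a small state machine in which `claim` yields caps only in the approved state. This is a sketch under assumptions: `Grant`, `ClaimError`, and `resolve` are hypothetical, caps are represented as plain names, and the one-shot behavior of `claim` is a choice made for this example, not something the conceptual interface specifies.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApprovalState {
    Pending,
    Approved,
    Denied,
    Expired,
}

#[derive(Debug, PartialEq)]
pub enum ClaimError {
    NotApproved(ApprovalState),
    AlreadyClaimed,
}

/// Hypothetical in-memory model of an ApprovalGrant.
pub struct Grant {
    state: ApprovalState,
    caps: Option<Vec<String>>,
}

impl Grant {
    pub fn pending(requested: Vec<String>) -> Self {
        Grant { state: ApprovalState::Pending, caps: Some(requested) }
    }

    /// Broker-side transition. Only a pending grant can be resolved, and
    /// it cannot be resolved back to pending.
    pub fn resolve(&mut self, decision: ApprovalState) -> bool {
        if self.state != ApprovalState::Pending || decision == ApprovalState::Pending {
            return false;
        }
        self.state = decision;
        true
    }

    pub fn state(&self) -> ApprovalState {
        self.state
    }

    /// Agent-side claim. Assumption: caps can be claimed at most once.
    pub fn claim(&mut self) -> Result<Vec<String>, ClaimError> {
        if self.state != ApprovalState::Approved {
            return Err(ClaimError::NotApproved(self.state));
        }
        self.caps.take().ok_or(ClaimError::AlreadyClaimed)
    }
}
```

The agent never supplies a proof in this model; it can only observe the state and claim what the broker already decided to release.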
Elevation Flow
User request:
restart the network stack
Agent plan:
Requested action:
- stop service "net-stack"
- spawn "net-stack"
- grant: nic, timer, log
- wait for health check
Missing authority:
- ServiceSupervisor(net-stack)
Requested duration:
- 60 seconds
Broker decision:
- Which UserSession and profile is this request bound to?
- Is that principal/profile allowed to restart net-stack?
- Is the requested binary allowed?
- Are the requested grants narrower than policy permits?
- Do mandatory confidentiality and integrity constraints allow the grant?
- Is there fresh user presence?
- Does this require step-up authentication?
If approved, the broker returns a narrow leased capability:
supervisor: ServiceSupervisor(service="net-stack", expires=60s)
It should not return broad ProcessSpawner, BootPackage, or DeviceManager
authority when a scoped supervisor cap can do the job.
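A narrow leased cap can be sketched as a value that checks both its target and its expiry on every use. `LeasedSupervisor` and `LeaseError` are hypothetical names, and time is passed in as a millisecond counter so the example stays deterministic; a real service would read its own clock rather than trust the caller.

```rust
/// Hypothetical leased supervisor cap scoped to one service.
#[derive(Debug, Clone, PartialEq)]
pub struct LeasedSupervisor {
    service: String,
    expires_at_ms: u64,
}

#[derive(Debug, PartialEq)]
pub enum LeaseError {
    Expired,
    WrongService(String),
}

impl LeasedSupervisor {
    /// Broker-side issue: bind the lease to one service and one deadline.
    pub fn issue(service: &str, now_ms: u64, duration_ms: u64) -> Self {
        LeasedSupervisor {
            service: service.to_string(),
            expires_at_ms: now_ms.saturating_add(duration_ms),
        }
    }

    /// Every use re-checks the deadline and the target service.
    pub fn restart(&self, service: &str, now_ms: u64) -> Result<(), LeaseError> {
        if now_ms >= self.expires_at_ms {
            return Err(LeaseError::Expired);
        }
        if service != self.service.as_str() {
            return Err(LeaseError::WrongService(service.to_string()));
        }
        Ok(())
    }
}
```

Holding this value lets the shell restart net-stack for 60 seconds and nothing else, which is the difference between a scoped supervisor cap and a broad ProcessSpawner.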
Authentication
Authentication proof should be consumed by the SessionManager or broker
boundary, not exposed as a secret to the agent. Suitable mechanisms include:
- password or PIN for medium-risk local actions.
- hardware key or WebAuthn-style challenge for administrative actions.
- TPM-backed local presence for device or boot-policy operations.
- multi-party approval for destructive policy, storage, or recovery actions.
The model should never receive raw tokens, private keys, recovery codes, or full environment dumps.
Agent Hardening
The agent shell must treat files, logs, web pages, service output, and CQE payloads as untrusted data. They are not instructions.
Required behavior:
- show an executable typed plan before authority-changing actions.
- keep elevated caps leased, narrow, and short-lived.
- release temporary caps after the plan finishes or fails.
- audit every approval request, grant, cap transfer, and release.
- require exact targets for destructive actions.
- refuse broad phrases such as “give it everything” unless a trusted policy explicitly allows a named emergency mode.
- keep model memory separate from secrets and authentication proofs.
The enforcement rule is simple: the model may plan, explain, and request. Capabilities decide what can happen.
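The "release temporary caps after the plan finishes or fails" requirement maps naturally onto a scope guard. The sketch below uses Rust's `Drop` to record a release on both the success and the failure path; `TempCap`, `ReleaseLog`, and `run_plan` are hypothetical, and the log vector stands in for a real release call plus audit record.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Stand-in for the audit trail of released caps.
pub type ReleaseLog = Rc<RefCell<Vec<String>>>;

/// Hypothetical temporary cap that records its own release when dropped.
pub struct TempCap {
    name: String,
    log: ReleaseLog,
}

impl Drop for TempCap {
    fn drop(&mut self) {
        // A real runtime would release the cap here and audit the release.
        self.log.borrow_mut().push(self.name.clone());
    }
}

/// Run a plan while holding a temporary cap. The guard is dropped on every
/// exit path, including the early error return.
pub fn run_plan(log: &ReleaseLog, fail: bool) -> Result<(), String> {
    let _lease = TempCap { name: "supervisor".to_string(), log: Rc::clone(log) };
    if fail {
        return Err("health check failed".to_string());
    }
    Ok(())
}
```

Tying release to scope exit means a failed or aborted plan cannot leave an elevated cap behind in the agent's CapSet.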
POSIX Shell
The POSIX shell is a compatibility layer for existing software and scripts. It should be useful, but it should not define native capOS administration.
Mapping
POSIX concepts map onto granted capabilities:
| POSIX concept | capOS backing |
|---|---|
| / | synthetic root built from granted Directory or FileServer caps |
| cwd | current scoped Directory cap |
| fd | local handle to File, ByteStream, pipe, terminal, or socket cap |
| pipe | ByteStream pair or userspace pipe service |
| PATH | search inside the synthetic root or a command registry cap |
| exec | ProcessSpawner or restricted launcher cap |
| sockets | socket factory caps such as TcpProvider or HttpEndpoint |
| uid, gid, user, group | synthetic POSIX profile derived from session metadata |
| $HOME | path alias backed by a granted home directory or namespace cap |
| /etc/passwd, /etc/group | profile service view, scoped to the compatibility environment |
| env vars | data only; never authority by themselves |
If a POSIX process has no network cap, connect() fails. If it has no
directory mounted at /etc, opening /etc/resolv.conf fails. If it has no
device cap, /dev is empty or synthetic.
A POSIX shell is launched with both a CapSet and compatibility profile metadata. The profile controls what legacy APIs report. The CapSet controls what the process can actually do.
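The split between what the profile reports and what the CapSet allows can be sketched as follows. `PosixEnv`, `PosixProfile`, `TcpProvider`, and `Errno` are hypothetical types for this example, and the choice of an EACCES-style error for a missing network cap is an assumption; the proposal only says that connect() fails.

```rust
/// Synthetic identity reported by legacy APIs. Descriptive only.
pub struct PosixProfile {
    pub uid: u32,
    pub gid: u32,
}

/// Hypothetical socket-provider cap handle.
pub struct TcpProvider;

/// Compatibility environment: profile metadata plus the caps actually held.
pub struct PosixEnv {
    pub profile: PosixProfile,
    pub tcp: Option<TcpProvider>,
}

#[derive(Debug, PartialEq)]
pub enum Errno {
    /// Illustrative error choice for "no network cap".
    Eacces,
}

impl PosixEnv {
    /// Reports whatever the profile says; grants nothing.
    pub fn getuid(&self) -> u32 {
        self.profile.uid
    }

    /// Succeeds only if a socket provider cap was granted.
    pub fn connect(&self, _addr: &str) -> Result<(), Errno> {
        match &self.tcp {
            // A real runtime would invoke the provider cap here.
            Some(_provider) => Ok(()),
            None => Err(Errno::Eacces),
        }
    }
}
```

In this model a process whose profile reports uid 0 but holds no socket provider still cannot connect, while an unprivileged-looking uid 1000 with a granted provider can.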
Compatibility Limits
Exact Unix semantics should not be promised early.
- Prefer posix_spawn over full fork for the first implementation.
- fork with arbitrary shared process state can be emulated later if needed.
- setuid cannot grant caps. At most it asks a compatibility broker to replace the POSIX profile or launch a new process with a different broker-issued cap bundle.
- Mode bits and ownership metadata do not create authority. chmod can modify filesystem metadata exposed by a filesystem service, but it cannot grant caps outside that service’s policy.
- /proc is a debugging service view, not kernel ambient introspection.
- Device files exist only when a capability-backed adapter deliberately exposes them.
This is enough for many build tools and CLI programs without making POSIX the security model.
POSIX Session Caps
A normal POSIX shell session might receive:
terminal TerminalSession
session UserSession metadata
profile POSIX profile view
root Directory or FileServer synthetic root
launcher restricted ProcessSpawner/command launcher
pipeFactory ByteStream factory
clock Timer
Optional caps:
tcp scoped socket provider
home writable user Directory
tmp temporary Directory
proc read-only process inspection tree
Administrative caps still require broker-mediated approval.
Recovery Shell
A recovery shell is a separate policy profile, not the normal agent shell with hidden extra privileges. It may receive a larger cap set, but only after strong local authentication and with full audit logging. Guest and anonymous profiles must not fall into recovery authority by omission.
Possible recovery bundle:
console
boot package read
system status read
service supervisor for critical services
read-only storage inspection
scoped repair caps
approval client
Destructive recovery operations should still go through exact-target approval. The recovery shell should be local-only unless a separate remote recovery policy explicitly grants network access.
Required Interfaces
This proposal implies several service interfaces beyond the current smoke-test surface:
- UserSession / SessionManager: principal/session metadata, audit context, and guest or anonymous profile creation (user identity proposal).
- TerminalSession: structured terminal I/O, window size, paste boundaries.
- SchemaRegistry: maps interface IDs to method names and parameter schemas.
- CommandRegistry: optional registry of native command capabilities.
- SystemStatus: read-only process and service status.
- LogReader: scoped log access.
- ServiceSupervisor: restart/status authority for one service or subtree.
- AuthorityBroker / ApprovalClient: session-bound base bundles, plan-specific leased grants, and policy/authentication mediation.
- CredentialStore, ConsoleLogin, and WebShellGateway: boot-to-shell authentication services for password-verifier setup, passkey registration, and text terminal launch (boot-to-shell proposal).
- AuditLog: append-only record of plans, approvals, grants, and releases.
- POSIXProfile / compatibility broker: synthetic UID/GID, names, $HOME, cwd, and profile replacement without treating POSIX metadata as authority.
- ByteStream / pipe factory: explicit byte-stream composition for POSIX and selected native pipelines.
These should be ordinary capabilities. A shell only sees the subset it has been granted.
Implementation Plan
1. Native serial shell
   - Built on capos-rt.
   - Lists initial CapSet entries.
   - Invokes typed Console methods.
   - Spawns and waits on boot-package binaries through ProcessSpawner.
   - Provides caps, inspect, call, spawn, wait, release, and trace.
2. Session-aware shell profile
   - Use the SessionManager -> UserSession metadata and AuthorityBroker(session, profile) -> cap bundle split.
   - Add self/session introspection without making identity metadata authoritative.
   - Start with guest, local-presence, and service-account profiles before durable account storage exists.
3. Structured native scripting
   - Add typed variables, result-cap binding, and plan serialization.
   - Add schema registry support for method names and argument validation.
   - Add explicit byte-stream adapters for commands that need text streams.
4. Approval broker
   - Define ActionPlan, CapRequest, ApprovalClient, and leased grant records.
   - Add local authentication and audit logging.
   - Make administrative native-shell operations request scoped caps through the broker instead of running from a permanently privileged shell.
5. Boot-to-shell integration
   - Add local console login/setup in front of the native shell.
   - Require a configured password verifier when one exists.
   - Enter setup mode when no console password verifier exists.
   - Treat guest as an explicit local profile and anonymous as a separate remote/programmatic profile, not as missing-password fallbacks.
   - Support passkey-only web terminal setup through local/bootstrap authority, not unauthenticated remote first use.
6. Agent shell
   - Natural-language frontend that emits native ActionPlan objects.
   - Starts with the broker-issued minimal daily session bundle.
   - Uses the approval broker for elevation.
   - Treats all external content as untrusted data.
7. POSIX shell
   - Implement after Directory/File, ByteStream, and restricted process launch exist.
   - Start with posix_spawn, fd table emulation, cwd, scoped root, pipes, and terminal I/O, plus synthetic POSIX profile metadata.
   - Add broader compatibility only as real workloads demand it.
Non-Goals
- No global root namespace.
- No shell-owned root/admin bit.
- No model-visible secrets.
- No default inheritance of all shell caps into children.
- No authorization from PrincipalInfo, UID/GID, role, or label values alone.
- No promise that POSIX scripts observe exact Unix behavior without a compatibility profile that grants the needed caps.
Open Questions
- Should the native shell syntax be CUE-derived, Cap’n-Proto-literal-derived, or a smaller custom grammar?
- How should schema reflection be packaged before a full runtime SchemaRegistry exists?
- What is the first minimal TerminalSession interface beyond Console?
- Should approval be synchronous only, or can long-running agent plans request staged approvals?
- How should audit logs be stored before persistent storage exists?