Tools & Capabilities¶
Agents act on the world through tools. SynthOrg defines a pluggable tool system with 15+ categories (file system, git, web, database, terminal, sandbox, MCP bridge, analytics, communication, design, headless browser, governed external data access, virtual desktop), layered sandboxing (subprocess for low-risk, Docker for high-risk, Kubernetes for future multi-tenant), MCP server integration, and a progressive-disclosure model that limits the surface an agent sees to what its role, seniority, and autonomy tier permit.
Tool Categories¶
| Category | Tools | Typical Roles |
|---|---|---|
| File System | Read, write, edit, list, delete files | All developers, writers |
| Code Execution | Run code in sandboxed environments | Developers, QA |
| Version Control | Git operations, PR management | Developers, DevOps |
| Web | HTTP requests, web scraping, search | Researchers, analysts |
| Database | Query, migrate, admin | Backend devs, DBAs |
| Terminal | Shell commands (sandboxed) | DevOps, senior devs |
| Design | Image generation, mockup tools | Designers |
| Communication | Email, Slack, notifications | PMs, executives |
| Analytics | Metrics, dashboards, reporting | Data analysts, CFO |
| Deployment | CI/CD, container management | DevOps, SRE |
| Memory | Search memory, recall by ID | All agents (tool-based strategy) |
| Browser | Headless Playwright + Chromium: navigate, screenshot, SSIM diff, axe accessibility scan, full spec | QA, frontend devs, agents validating web deliverables |
| External Data | Governed external API/data access through a configured connection: credentials brokered from the connection catalog, egress constrained to the connection host (SSRF policy + DNS pinning), per-connection rate limiting, sensitive/write calls gated to approval | Agents consuming third-party APIs while building deliverables |
| Desktop | Virtual desktop (Xvfb + xdotool + scrot in a container): launch a GUI app, click/type/press-keys/scroll, capture screenshots | QA, frontend devs, agents validating GUI deliverables |
| MCP Servers | Any MCP-compatible tool | Configurable per agent |
Tool Execution Model¶
When the LLM requests multiple tool calls in a single turn, ToolInvoker.invoke_all executes
them concurrently using asyncio.TaskGroup. An optional max_concurrency parameter
(default unbounded) limits parallelism via asyncio.Semaphore. Recoverable errors are captured
as ToolResult(is_error=True) without aborting sibling invocations. Non-recoverable errors
(MemoryError, RecursionError) are collected and re-raised after all tasks complete (bare
exception for one, ExceptionGroup for multiple).
Permission checking follows a priority-based system:
- get_permitted_definitions() filters tool definitions sent to the LLM; the agent only sees tools it is permitted to use
- At invocation time, denied tools return ToolResult(is_error=True) with a descriptive denial reason (defence-in-depth against the LLM hallucinating unpresented tools)
Resolution order: denied list (highest) > allowed list > access-level categories > deny (default).
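The resolution order can be read as a short chain of checks. The sketch below is illustrative only; `is_permitted` and its parameters are hypothetical names, not the ToolPermissionChecker API.

```python
def is_permitted(
    tool_name: str,
    category: str,
    *,
    denied: set[str],
    allowed: set[str],
    level_categories: set[str],
) -> bool:
    """Apply: denied list > allowed list > access-level categories > deny."""
    if tool_name in denied:
        return False  # an explicit deny always wins
    if tool_name in allowed:
        return True  # an explicit allow beats category gating
    if category in level_categories:
        return True  # permitted by the agent's access level
    return False  # default deny
```

A tool that appears in both lists is denied, which is the property that makes the denied list safe to use as an emergency override.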
Tool Sandboxing¶
Tool execution uses a layered sandboxing strategy with a pluggable SandboxBackend
protocol. The default configuration uses lighter isolation for low-risk tools and stronger
isolation for high-risk tools.
Sandbox Backends¶
| Backend | Isolation | Latency | Dependencies | Status |
|---|---|---|---|---|
| SubprocessSandbox | Process-level: env filtering (allowlist + denylist), restricted PATH (configurable via extra_safe_path_prefixes), workspace-scoped cwd, timeout + process-group kill, library injection var blocking, explicit transport cleanup on Windows | ~ms | None | Implemented |
| DockerSandbox | Container-level: keep-alive container reused per the configured lifecycle strategy (per-agent default; per-call for maximum isolation), mounted workspace, no network (default) or sidecar-based host:port allowlist (dual-layer DNS + DNAT transparent proxy), resource limits (CPU/memory/time) | ~1-2s on first acquire; reused warm thereafter | Docker | Implemented |
| K8sSandbox | Pod-level: per-agent containers, namespace isolation, resource quotas, network policies | ~2-5s | Kubernetes | Planned |
Default Layered Sandbox Configuration
sandboxing:
  default_backend: "subprocess"  # subprocess, docker, k8s
  overrides:  # per-category backend overrides
    file_system: "subprocess"  # low risk -- fast, no deps
    git: "subprocess"  # low risk -- workspace-scoped
    web: "docker"  # medium risk -- needs network isolation
    code_execution: "docker"  # high risk -- strong isolation required
    terminal: "docker"  # high risk -- arbitrary commands
    database: "docker"  # high risk -- data mutation
    browser: "docker"  # opt-in -- Playwright + Chromium image
    desktop: "docker"  # opt-in -- Xvfb + xdotool + scrot image
  subprocess:
    timeout_seconds: 30
    workspace_only: true  # restrict filesystem access to project dir
    restricted_path: true  # strip dangerous binaries from PATH
  docker:
    image: "synthorg-sandbox:latest"  # pre-built image with common runtimes
    network: "none"  # no network by default
    network_overrides:  # category-specific network policies
      database: "bridge"  # database tools need TCP access to DB host
      web: "bridge"  # web tools need outbound HTTP; no inbound
    allowed_hosts: []  # allowlist of host:port pairs (TCP only)
    dns_allowed: true  # allow outbound DNS when allowed_hosts restricts network
    loopback_allowed: true  # allow loopback traffic in restricted network mode
    memory_limit: "512m"
    cpu_limit: "1.0"
    timeout_seconds: 120
    pids_limit: 64  # PID cap (main container) -- guards against fork-bomb runaways
    tmpfs_size: "64m"  # tmpfs mounted at /tmp in the main container
    sidecar_pids_limit: 32  # PID cap for the stdio sidecar helper
    sidecar_tmpfs_size: "8m"  # tmpfs for the stdio sidecar helper
    mount_mode: "ro"  # read-only by default
    auto_remove: true  # remove the container once its lifecycle strategy tears it down
  k8s:  # planned -- per-agent pod isolation
    namespace: "synthorg-agents"
    resource_requests:
      cpu: "250m"
      memory: "256Mi"
    resource_limits:
      cpu: "1"
      memory: "1Gi"
    network_policy: "deny-all"  # default deny, allowlist per tool
Per-category backend selection is implemented in tools/sandbox/factory.py via three functions:
build_sandbox_backends (instantiates only the backends referenced by config),
resolve_sandbox_for_category (looks up the correct backend for a ToolCategory), and
cleanup_sandbox_backends (parallel cleanup with error isolation). The tool factory
(build_default_tools_from_config) wires tool categories. Core tools
(FILE_SYSTEM, VERSION_CONTROL, web, etc.) are part of the default toolset
and always registered. The
auxiliary categories DESIGN, COMMUNICATION, and ANALYTICS are opt-in: tools
are only registered when the corresponding config section is present, and some
individual tools additionally require a runtime dependency (e.g. image tools
require an ImageProvider, notification tools require a dispatcher, analytics
query/metric tools require a provider or sink).
Docker is optional: it is required only when code execution, terminal, web, database, browser, or desktop tools are enabled. File system and git tools work out of the box with subprocess isolation. This keeps the local-first experience lightweight while providing strong isolation where it matters.
Docker MVP uses aiodocker (async-native) with a pre-built image
(Python 3.14 + Node.js LTS + basic utils, <500MB). If Docker is unavailable, the framework
fails with a clear error for any tool category whose configured backend is Docker;
low-risk categories (file_system, git) continue to run via subprocess
(Decision Log D16).
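As a rough illustration of per-category backend selection, the sketch below resolves a backend name from a config dict shaped like the YAML above and computes which backends a deployment actually needs. The helper names are hypothetical and far simpler than the functions in tools/sandbox/factory.py.

```python
def resolve_backend_name(config: dict, category: str) -> str:
    """Category override if present, otherwise the default backend."""
    sandboxing = config["sandboxing"]
    return sandboxing.get("overrides", {}).get(category, sandboxing["default_backend"])


def referenced_backends(config: dict, enabled_categories: list[str]) -> set[str]:
    """Backends that must be instantiated for the enabled tool categories."""
    return {resolve_backend_name(config, category) for category in enabled_categories}


EXAMPLE_CONFIG = {
    "sandboxing": {
        "default_backend": "subprocess",
        "overrides": {"web": "docker", "terminal": "docker"},
    }
}
```

With only file_system and git enabled, `referenced_backends` returns `{"subprocess"}`, which is why Docker can stay uninstalled for a lightweight local setup.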
Container Log Shipping¶
DockerSandbox collects structured logs from both sandbox and sidecar containers
before removal and ships them through the backend's observability pipeline.
Sidecar JSON stdout is parsed line-by-line; malformed lines are skipped.
Sandbox stdout/stderr are shipped alongside the sidecar entries. All shipped
events carry correlation context (agent_id, session_id, task_id,
request_id) injected via structlog contextvars, and the same IDs are set as
SYNTHORG_AGENT_ID, SYNTHORG_SESSION_ID, SYNTHORG_TASK_ID,
SYNTHORG_REQUEST_ID environment variables in both containers so
container-side logs can self-correlate.
SandboxResult includes optional Docker-specific fields: container_id,
sidecar_id, sidecar_logs, agent_id, and execution_time_ms. These
default to None/empty for non-Docker backends.
Log shipping is failure-tolerant (errors are logged at debug level, never
propagated) and bounded by ContainerLogShippingConfig.collection_timeout_seconds
and max_log_bytes. By default only metadata (sizes, counts, timing) is
shipped; raw stdout/stderr/sidecar payloads require explicit opt-in via
ship_raw_logs=True to prevent secrets from bypassing key-name-based
redaction. Configuration lives on LogConfig.container_log_shipping
(default: enabled).
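The line-by-line parsing and metadata-only default might look roughly like this. It is a sketch under assumptions: the function names are hypothetical, and truncating the raw bytes at `max_log_bytes` before parsing is one plausible way to apply the bound, not necessarily how DockerSandbox does it.

```python
import json


def parse_sidecar_logs(stdout: bytes, max_log_bytes: int) -> list[dict]:
    """Parse sidecar JSON lines, skipping anything malformed."""
    entries: list[dict] = []
    text = stdout[:max_log_bytes].decode("utf-8", errors="replace")
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed (or truncated) line: skip, never raise
        if isinstance(parsed, dict):
            entries.append(parsed)
    return entries


def shipped_payload(stdout: bytes, entries: list[dict], ship_raw_logs: bool = False) -> dict:
    """Metadata only by default; raw output requires explicit opt-in."""
    payload: dict = {"stdout_bytes": len(stdout), "sidecar_entries": len(entries)}
    if ship_raw_logs:
        payload["stdout"] = stdout.decode("utf-8", errors="replace")
        payload["sidecar_logs"] = entries
    return payload
```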
Scaling Path
In a future Kubernetes deployment, each agent can run in its own pod via
K8sSandbox. At that point, the layered configuration becomes less relevant; all tools
execute within the agent's isolated pod. The SandboxBackend protocol makes this
transition seamless.
Sandbox Lifecycle Strategies¶
Container lifecycle isolation (when to create, reuse, or destroy sandbox containers)
is configurable via the pluggable SandboxLifecycleStrategy protocol
(src/synthorg/tools/sandbox/lifecycle/protocol.py). Three built-in strategies control
the trade-off between resource efficiency and isolation:
| Strategy | Behaviour | Use case |
|---|---|---|
| per-agent (default) | One persistent container per agent; destroyed after a configurable grace period (default 30s) when the agent stops | Development, trusted environments |
| per-task | New container per task; destroyed immediately on task completion | Production, medium isolation |
| per-call | New container per tool invocation; destroyed immediately (current ephemeral behaviour) | High-security, maximum isolation |
The strategy is selected via sandboxing.docker.lifecycle.strategy in SandboxingConfig.
The sidecar container shares the sandbox container's lifetime (created and destroyed
together, since they share a network namespace).
The configured default is per-agent (the strategy field default in
SandboxLifecycleConfig); the table above is authoritative. The strategy is
constructed at boot (workers/runtime_builder) with the application clock and
injected into DockerSandbox via the sandbox factory. Each tool call runs as
a docker exec inside a long-lived idle container (tail -f /dev/null
entrypoint) the strategy acquires; per-agent and per-task reuse the container
across calls while per-call destroys it immediately after the single exec. The
lifecycle owner is resolved from an explicit owner_id, else the structlog
correlation context (agent_id for per-agent, task_id for per-task). The
per-call degradation below is a per-invocation safety fallback, not a change
of the configured default: when a reuse strategy cannot derive an owner for a
given call, that single call degrades to ephemeral per-call behaviour while
the configured strategy stays in force for calls that can resolve an owner. AgentEngineExecutionService releases the owner at the
task boundary (per-task destroys immediately; per-agent starts the grace
timer so a subsequent task for the same agent within the window re-acquires
the warm container); DockerSandbox.cleanup() destroys all strategy-owned
containers via cleanup_all(). Containers carry the synthorg.managed=true
label so the reconciliation pass can reclaim any that are orphaned by an unclean exit.
Virtual Desktop & Vision Verification¶
For GUI deliverables an agent must SEE and operate the running app, not just
unit-test it. The desktop tool (tools/desktop/) drives a headless X session
inside the existing DockerSandbox: it launches a windowed GUI app, injects
pointer / keyboard input via xdotool, and captures screenshots via scrot. The
session is stateful across calls because the per-agent lifecycle keeps the warm
container (Xvfb + the running app) alive between tool invocations; a per-call
reset surfaces as DesktopAppNotRunningError rather than a silent empty capture.
The session bring-up is pluggable behind a DesktopDriver protocol + factory
(tools/desktop/driver/): xvfb (the deterministic default: Xvfb + xdotool +
scrot) and vnc (adds an x11vnc observation channel). The protocol leaves room
for a future Windows-container / Wayland driver without reworking the tool. The
driver targets Linux-renderable GUI toolkits (Qt / Tk / GTK / Electron / X11);
the desktop-capable image is built from docker/desktop/Dockerfile. Screenshots
are written under <workspace>/.synthorg/desktop/screenshots/ with a sha256, so
they are durable provenance on disk (never in the database).
The screenshots feed the vision verifier quality gate (the UI cousin of the red-team gate); see Verification & Quality.
Git Clone SSRF Prevention¶
The git_clone tool validates clone URLs against SSRF attacks via hostname/IP
validation with async DNS resolution (git_url_validator module). All resolved
IPs must be public; private, loopback, link-local, and reserved addresses are
blocked by default. A configurable hostname_allowlist lets legitimate internal
Git servers bypass the private-IP check.
TOCTOU DNS rebinding mitigation closes the gap between DNS validation and
git clone's own resolution:
- HTTPS URLs: Validated IPs are pinned via git -c http.curloptResolve=host:port:ip (git >= 2.37.0; sandbox ships git 2.39+), so git uses the same addresses the validator checked.
- SSH / SCP-like URLs: A second DNS resolution runs immediately before execution; if the re-resolved IP set is not a subset of the validated set, the clone is blocked.
- Literal IP URLs: Immune (no DNS resolution occurs).
Both mitigations are configurable via GitCloneNetworkPolicy.dns_rebinding_mitigation
(default: enabled). Disable for hosts behind CDNs or geo-DNS where resolved IPs
legitimately vary between queries. For full defence-in-depth, combine with
network-level egress controls (firewall, HTTP CONNECT proxy) or container
network isolation (see Tool Sandboxing above).
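The address checks and the two rebinding mitigations can be approximated with the standard library. This is a sketch, not the git_url_validator module: the function names are hypothetical, DNS resolution is left out (callers pass already-resolved IPs), and the comma-joined address list assumes libcurl's HOST:PORT:ADDRESS[,ADDRESS] resolve syntax.

```python
import ipaddress


def is_public_ip(ip: str) -> bool:
    """Reject private, loopback, link-local, and reserved addresses."""
    addr = ipaddress.ip_address(ip)
    return not (
        addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_reserved
    )


def validate_resolved(
    host: str, resolved: list[str], hostname_allowlist: frozenset[str] = frozenset()
) -> bool:
    """Allowlisted hosts bypass the check; otherwise all IPs must be public."""
    if host in hostname_allowlist:
        return True
    return bool(resolved) and all(is_public_ip(ip) for ip in resolved)


def rebinding_safe(validated: list[str], re_resolved: list[str]) -> bool:
    """SSH path: the second resolution must be a subset of the validated set."""
    return set(re_resolved) <= set(validated)


def https_pin_args(host: str, port: int, ips: list[str]) -> list[str]:
    """HTTPS path: pin git to the validated addresses."""
    return ["-c", f"http.curloptResolve={host}:{port}:{','.join(ips)}"]
```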
MCP Integration¶
External tools are integrated via the Model Context Protocol (MCP).
- SDK: Official mcp Python SDK, pinned version. A thin MCPBridgeTool adapter layer isolates the rest of the codebase from SDK API changes (Decision Log D17)
- Transports: stdio (local/dev) and Streamable HTTP (remote/production). Deprecated SSE is skipped.
- Result mapping: Text blocks concatenate to content: str; image/audio use placeholders with base64 in metadata; structuredContent maps to metadata["structured_content"]; isError maps 1:1 to is_error (Decision Log D18)
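The result mapping can be illustrated on a raw, dict-shaped MCP tool result. This sketch deliberately avoids the SDK's own types; the newline separator, the placeholder text, and the `attachments` metadata key are assumptions, while `structured_content` and the `isError` mapping follow the description above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolResult:
    """Simplified stand-in for the real result type (assumption)."""
    content: str
    is_error: bool = False
    metadata: dict[str, Any] = field(default_factory=dict)


def map_mcp_result(raw: dict[str, Any]) -> ToolResult:
    """Map a dict-shaped MCP call result onto a ToolResult."""
    parts: list[str] = []
    attachments: list[dict[str, Any]] = []
    for block in raw.get("content", []):
        kind = block.get("type")
        if kind == "text":
            parts.append(block["text"])
        elif kind in ("image", "audio"):
            # Placeholder in the text; base64 payload kept in metadata.
            parts.append(f"[{kind}: {block.get('mimeType', 'unknown')}]")
            attachments.append(
                {"type": kind, "mime_type": block.get("mimeType"), "data": block["data"]}
            )
    metadata: dict[str, Any] = {}
    if attachments:
        metadata["attachments"] = attachments  # hypothetical key name
    if raw.get("structuredContent") is not None:
        metadata["structured_content"] = raw["structuredContent"]
    return ToolResult("\n".join(parts), bool(raw.get("isError", False)), metadata)
```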
SynthOrg MCP Tool Surface¶
SynthOrg exposes its own MCP server offering 200+ tools across 15 domain
modules (agents, analytics, approvals, budget, communication, coordination,
infrastructure, integrations, memory, meta, organisation, quality, signals,
tasks, workflows). Tool definitions are classified
by capability action via the
read_tool / write_tool / admin_tool builders
(src/synthorg/meta/mcp/tool_builder.py); only the admin_tool subset is
destructive and subject to the guardrail triple. Every tool is handled by an
async function in src/synthorg/meta/mcp/handlers/<domain>.py; handlers shim
onto the existing service layer rather than reimplementing business logic.
Handler Protocol. Every handler implements
ToolHandler.__call__(*, app_state, arguments: dict[str, Any], actor: AgentIdentity | None = None) -> str
(see src/synthorg/meta/mcp/handler_protocol.py). The actor argument threads the
calling agent identity through the invoker so destructive-op guardrails can
enforce attribution; handlers that don't care about identity accept it and
ignore it.
Typed args (#1611 Phase 4). Each MCP tool registration optionally carries
an args_model: type[BaseModel] (see MCPToolDef.args_model). When set, the
invoker validates the raw arguments dict against the Pydantic model before
dispatching to the handler; validation failures short-circuit to a typed
ArgumentValidationError envelope without ever invoking the handler. Handlers
therefore receive a structurally-validated dict and can either access fields
directly (the model guarantees presence + type) or re-validate locally for
typed access (args_model.model_validate(arguments)). Tools without
args_model (legacy / dynamic shapes such as MCPBridgeTool) continue using
the manual common_args validators inside the handler body.
Envelope Contract. Every handler returns a JSON string. Success envelope (data is always present, pagination appears only on list/collection responses):
{
"status": "ok",
"data": {"example": "payload"},
"pagination": {"total": 100, "offset": 0, "limit": 50}
}
Note: the example shows the MCP-layer PaginationMeta (defined in synthorg.meta.mcp.handlers.common), which intentionally retains the legacy total/offset shape because MCP handlers slice already-materialised sequences. The HTTP API uses a separate cursor-only envelope (synthorg.api.dto.PaginationMeta with {limit, next_cursor, has_more}); see persistence.md.
Handler-caught error envelope (domain_code identifies the error class for programmatic dispatch):
{
"status": "error",
"error_type": "ArgumentValidationError",
"message": "Argument 'approval_id' missing or not a non-blank string",
"domain_code": "invalid_argument"
}
Shared handler infrastructure lives in three sibling modules under
src/synthorg/meta/mcp/handlers/. The split keeps each module focused
on one concern and below the 800-line file ceiling.
common.py: response envelopes, pagination output, guardrails,
placeholder factories:
- ok(data, *, pagination=None): success envelope with optional PaginationMeta metadata (frozen Pydantic model, allow_inf_nan=False).
- err(exc, *, domain_code=None): error envelope; message always goes through safe_error_description(exc) (SEC-1) and domain_code falls back to exc.domain_code when present.
- not_supported(tool_name, reason): stable status="error" / domain_code="not_supported" envelope for tools whose service facade is not wired. Emits the MCP_HANDLER_NOT_IMPLEMENTED WARNING event so operators can alert on unwired tools. Every tool registered today is wired; this path fires only for newly registered tools that have not been given a concrete handler.
- service_fallback(tool_name, reason): helper retained in common.py for future surgical use. Emits MCP_HANDLER_SERVICE_FALLBACK; META-MCP-2 removed every call site and the integration sweep at tests/integration/mcp/test_tool_surface.py asserts zero emissions of this event across the full 204-tool surface.
- capability_gap(tool_name, reason): live handler whose underlying primitive does not yet expose the required method (e.g. agent activity_feed, memory fine-tune orchestrator on a backend that lacks fine-tune support). Identical wire envelope to not_supported (domain_code="not_supported") but emits the dedicated MCP_HANDLER_CAPABILITY_GAP INFO event so ops telemetry distinguishes "primitive missing method" from "handler unwired".
- require_admin_guardrails(arguments, actor): single source of truth for the admin-op precondition triple: non-None actor, literal confirm=True, non-blank reason. Raises GuardrailViolationError with a typed violation code ("missing_actor" / "missing_confirm" / "missing_reason").
- paginate_sequence(seq, *, offset, limit, total=None): in-memory page slicing that returns (page, PaginationMeta).
- dump_many(models): batch Pydantic model serialisation to JSON-mode dicts.
common_args.py: argument validators/extractors. Every helper
raises ArgumentValidationError on bad input so handlers can convert
to a stable err(...) envelope without catching framework-specific
exceptions:
- require_arg(arguments, key, ty): typed required-argument extraction (ruff EM101-safe).
- require_non_blank(arguments, key): required non-blank string, whitespace-stripped.
- get_optional_str(arguments, key): optional non-blank string; returns None when missing/empty.
- require_dict(arguments, key, *, value_type=None, deep_copy=True): required dict argument; pass value_type=str for dict[str, str] validation. Defaults to deep-copying the input to decouple handler mutations from caller payload.
- parse_time_window(arguments, *, until_required=True): ISO 8601 since/until parsing with timezone-aware enforcement and since < until ordering.
- parse_str_sequence(arguments, key): optional sequence-of-non-blank-strings.
- coerce_pagination(arguments, *, default_limit=50): offset/limit parsing with strict bounds and explicit bool rejection. MCP tools default to 50; this is intentionally lower than the repository-layer DEFAULT_LIST_LIMIT = 100 so paginated MCP responses stay terse for assistants.
- actor_id(actor) / require_actor_id(actor) / actor_label(actor): actor identity helpers. Use actor_id for optional attribution, require_actor_id when attribution is mandatory (raises if unidentifiable), actor_label only for emit-only paths where a "mcp-anonymous" fallback is acceptable.
common_logging.py: the three handler-layer log helpers.
Module-scoped logger keyed at synthorg.meta.mcp.handlers so test
assertions see a single stable event source regardless of which domain
handler emitted the event:
- log_handler_argument_invalid(tool, exc): caught ArgumentValidationError. Emits MCP_HANDLER_ARGUMENT_INVALID at WARNING.
- log_handler_invoke_failed(tool, exc, **context): generic Exception from the service layer. **context carries optional correlation ids (e.g. task_id=, decision_id=); keys that would shadow the canonical event fields (tool_name, error_type, error, event, log_level) are rejected with ValueError.
- log_handler_guardrail_violated(tool, exc): caught GuardrailViolationError from an admin-op precondition. Records only the typed violation code; the human message stays in the response envelope.
All three route exception messages through safe_error_description
(SEC-1) so secret-shaped fragments are scrubbed before reaching logs.
Domain Codes. Handlers set stable wire codes so callers can dispatch
programmatically: invalid_argument, guardrail_violated, not_supported,
not_found, conflict (e.g. active-checkpoint delete), plus any
domain-specific codes set via the domain_code kwarg on err(...).
Registry Immutability. Each domain handler module exports an
XXX_HANDLERS: Mapping[str, ToolHandler] constant wrapped in
MappingProxyType to enforce read-only access;
build_handler_map() in src/synthorg/meta/mcp/handlers/__init__.py
merges them and raises on duplicate keys.
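The read-only registry pattern can be sketched with the standard library. The function below borrows the `build_handler_map` name for readability, but the body, the handler type alias, and the choice of `ValueError` for duplicates are assumptions, not the actual implementation.

```python
from collections.abc import Callable, Mapping
from types import MappingProxyType

Handler = Callable[..., str]  # simplified stand-in for ToolHandler


def build_handler_map(*domain_maps: Mapping[str, Handler]) -> Mapping[str, Handler]:
    """Merge per-domain handler maps; duplicate tool names are an error."""
    merged: dict[str, Handler] = {}
    for domain_map in domain_maps:
        for name, handler in domain_map.items():
            if name in merged:
                raise ValueError(f"duplicate tool handler: {name}")
            merged[name] = handler
    return MappingProxyType(merged)  # read-only view over the merged dict


# Example per-domain constants, each wrapped read-only (hypothetical tools).
TASK_HANDLERS: Mapping[str, Handler] = MappingProxyType(
    {"synthorg_tasks_list": lambda **_: "[]"}
)
MEMORY_HANDLERS: Mapping[str, Handler] = MappingProxyType(
    {"synthorg_memory_search": lambda **_: "[]"}
)
```

Writing to a `MappingProxyType` raises `TypeError`, so an accidental registration at runtime fails loudly instead of silently changing the tool surface.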
Schema-Level Validation. Admin-op schemas in
src/synthorg/meta/mcp/domains/*.py enforce the reason field as a
non-whitespace string via "minLength": 1 + "pattern": r".*\S.*", and the
confirm field as literal true via JSON Schema "enum": [true]. Handler
guardrails run regardless so validation stays uniform once services come
online.
Self-Extending Toolkit¶
A fixed toolset caps what the studio can build. The toolsmith
(src/synthorg/meta/toolsmith/) lets the organisation extend its own MCP tool
surface at runtime when it hits a recurring capability gap, governed end to end.
Detection. Every unfulfilled capability request is recorded into a
ring-buffered CapabilityGapStore (the ToolsmithService is the sink). When a
capability signature (domain:action) recurs at least
gap_recurrence_threshold times within gap_window_hours, it qualifies as a
recurring gap.
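Recurrence detection over a ring buffer can be sketched as follows. The class name `GapRingBuffer`, its capacity default, and the method names are hypothetical; only the `domain:action` signature, the threshold, and the time window come from the description above.

```python
from collections import Counter, deque
from datetime import datetime, timedelta, timezone


class GapRingBuffer:
    """Bounded record of unfulfilled capability requests (illustrative)."""

    def __init__(self, capacity: int = 1000) -> None:
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._events: deque[tuple[str, datetime]] = deque(maxlen=capacity)

    def record(self, signature: str, at: datetime) -> None:
        self._events.append((signature, at))

    def recurring(
        self, *, now: datetime, gap_recurrence_threshold: int, gap_window_hours: float
    ) -> set[str]:
        """Signatures seen at least `threshold` times inside the window."""
        cutoff = now - timedelta(hours=gap_window_hours)
        counts = Counter(sig for sig, at in self._events if at >= cutoff)
        return {sig for sig, n in counts.items() if n >= gap_recurrence_threshold}
```

The bounded deque keeps memory constant at the cost of forgetting very old requests, which is acceptable because only recent recurrence matters.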
Authoring. LLMToolBlueprintGenerator authors a ToolBlueprint from the
gap: a declarative spec (name, capability, JSON Schema, action type) plus a
self-contained Python script_body. The tool name is derived from the
capability so it always satisfies the synthorg_{domain}_{action} contract;
the sandbox backend and network policy come from config, never the model, so an
authored tool cannot widen its own isolation. Capabilities that need
service-layer access (configured via service_access_capabilities) cannot be a
sandbox script and route to the CODE_MODIFICATION overflow handler instead.
Governance. Tool creation runs at the TOOL_CREATION proposal altitude
behind the same guard chain as self-improvement (scope, rollback plan, rate
limit, mandatory approval). The tool:create action type is HIGH risk and
human-gated under supervised and semi autonomy. Nothing is trusted without
human approval.
Validation. On approval, ToolCreationApplier runs the
BenchmarkToolValidationGate: a focused per-tool acceptance brief (the authored
script actually runs in its resolved sandbox and must return structured output)
followed by a golden-company scorecard delta (registering the candidate must not
regress the benchmark). A failing gate registers nothing; the blueprint keeps
its validation record for audit but never goes ACTIVE.
Live registration. A validated blueprint is persisted
(PENDING -> VALIDATED -> ACTIVE; RETIRED on rollback) and registered into the
mutable DynamicToolRegistry. The static DomainToolRegistry stays frozen; a
LayeredToolRegistry reads the static surface first then the dynamic layer, so
MCPToolInvoker dispatches authored tools (validating arguments against a
Pydantic args model materialised from the blueprint's JSON Schema) without
unfreezing anything. A later task invokes the new tool exactly like a built-in.
The toolsmith is disabled by default (meta.self_improvement ->
tool_creation_enabled); it wires at boot only when enabled, a provider is
registered, and persistence is connected.
Progressive Tool Disclosure¶
When the tool inventory exceeds ~30 tools, loading every full definition into the LLM context upfront becomes a major token tax. Progressive disclosure uses a three-level hierarchy inspired by Google ADK's skill loading pattern:
| Level | Contents | When injected | Token cost |
|---|---|---|---|
| L1 metadata | name, one-line description, category, cost tier | Always (system prompt) | ~100 tokens/tool |
| L2 body | full description, JSON Schema, examples, failure modes | On demand via load_tool() | <5K tokens/tool |
| L3 resource | markdown guides, code samples, example traces | Explicit via load_tool_resource() | Varies |
Discovery tools (always available regardless of agent access level):
- list_tools(): returns L1 metadata for all permitted tools
- load_tool(tool_name): returns L2 body; marks tool as loaded in AgentContext
- load_tool_resource(tool_name, resource_id): returns specific L3 resource
Context injection:
- L1 metadata is injected into the system prompt for all permitted tools
- Full ToolDefinition objects are sent via the provider API tools parameter only for loaded tools + discovery tools
- L3 resources are never auto-injected; returned inline from load_tool_resource
Auto-unload: When AgentContext.context_fill_percent exceeds
ToolDisclosureConfig.unload_threshold_percent (default 80%), the oldest-loaded
L2 body is unloaded (FIFO by insertion order). L1 metadata remains.
Configuration (ToolDisclosureConfig):
- l1_token_budget (default 3000): max tokens for L1 metadata
- l2_token_budget (default 15000): max tokens for loaded L2 bodies
- auto_unload_on_budget_pressure (default true)
- unload_threshold_percent (default 80.0)
Cross-reference: MCP integration above is the external tool integration pattern; progressive disclosure is the local analogue for managing context cost.
Action Type System¶
Action types classify agent actions for use by autonomy presets (see Security & Approval), SecOps validation, tiered timeout policies, and progressive trust (Decision Log D1).
Registry: A StrEnum defines the 41 built-in action types (type safety, autocomplete, typos caught
by static type checking and config-load-time validation), and an ActionTypeRegistry accepts custom
types via explicit registration. Unknown strings are rejected at config load time, because a typo
in a human_approval list that silently means "skip approval" would be a critical safety failure.
Granularity: Two-level category:action hierarchy. Category shortcuts expand to all
actions in that category (e.g., auto_approve: ["code"] expands to all code:* actions).
Fine-grained overrides are supported (e.g., human_approval: ["code:create"]).
Taxonomy (41 leaf types):
code:read, code:write, code:create, code:delete, code:refactor
test:write, test:run
docs:write
vcs:read, vcs:commit, vcs:push, vcs:branch
deploy:staging, deploy:production
comms:internal, comms:external
budget:spend, budget:exceed
org:hire, org:fire, org:promote
db:query, db:mutate, db:admin
arch:decide
tool:create
memory:read
knowledge:ingest, knowledge:reindex
browser:navigate, browser:screenshot, browser:diff, browser:accessibility_scan, browser:spec
external_data:request
desktop:launch, desktop:click, desktop:type, desktop:key, desktop:screenshot, desktop:scroll
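Category shortcut expansion and unknown-string rejection can be sketched over a subset of the taxonomy above. The function name and the use of a frozenset instead of the real StrEnum and ActionTypeRegistry are assumptions made to keep the example short.

```python
# A subset of the built-in taxonomy, for illustration only.
KNOWN_ACTION_TYPES = frozenset({
    "code:read", "code:write", "code:create", "code:delete", "code:refactor",
    "test:write", "test:run",
    "deploy:staging", "deploy:production",
})


def expand_action_types(
    entries: list[str], known: frozenset[str] = KNOWN_ACTION_TYPES
) -> frozenset[str]:
    """Expand category shortcuts; reject anything unrecognised."""
    categories = {action.split(":", 1)[0] for action in known}
    expanded: set[str] = set()
    for entry in entries:
        if entry in known:
            expanded.add(entry)  # fine-grained entry such as "code:create"
        elif entry in categories:
            # Shortcut: "code" expands to every code:* action.
            expanded |= {a for a in known if a.startswith(entry + ":")}
        else:
            # Fail at load time rather than silently skipping approval.
            raise ValueError(f"unknown action type: {entry!r}")
    return frozenset(expanded)
```

For example, `expand_action_types(["deploy", "code:create"])` yields both deploy actions plus `code:create`, while a typo such as `"deplpy"` raises before any agent runs.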
Classification: Action types come from static tool metadata. Each BaseTool declares its action_type,
with a default mapping from ToolCategory to action type. Non-tool actions (org:hire, budget:spend) are
triggered by engine-level operations. No LLM sits in the security classification path.
Tool Access Levels¶
Tool Access Level Configuration
tool_access:
  levels:
    sandboxed:
      description: "No external access. Isolated workspace."
      file_system: "workspace_only"
      code_execution: "containerized"
      network: "none"
      git: "local_only"
    restricted:
      description: "Limited external access with approval."
      file_system: "project_directory"
      code_execution: "containerized"
      network: "allowlist_only"
      git: "read_and_branch"
      requires_approval: ["deployment", "database_write"]
    standard:
      description: "Normal development access."
      file_system: "project_directory"
      code_execution: "containerized"
      network: "open"
      git: "full"
      terminal: "restricted_commands"
    elevated:
      description: "Full access for senior/trusted agents."
      file_system: "full"
      code_execution: "containerized"
      network: "open"
      git: "full"
      terminal: "full"
      deployment: true
    custom:
      description: "Per-agent custom configuration."
The ToolPermissionChecker implements two layers of enforcement: category-level gating
(each access level maps to permitted ToolCategory values) and granular sub-constraints
(SubConstraintEnforcer) checking file system scope, network mode, terminal access, git access,
code execution isolation, and approval requirements against each tool invocation. Per-agent
overrides can customise all six dimensions via ToolPermissions.sub_constraints. K8s sandbox
backend integration is on the roadmap.
Progressive Trust¶
Agents can earn higher tool access over time through configurable trust strategies. The trust
system implements a TrustStrategy protocol, making it extensible. All four strategies are
implemented.
Security Invariant
The standard_to_elevated promotion always requires human approval. No agent can
auto-gain production access regardless of trust strategy.
With trust disabled, agents receive their configured access level at hire time and it never changes. This is the simplest option, useful when a human manages permissions manually.
The weighted strategy computes a single trust score from weighted factors: difficulty of completed tasks, error rate, time active, and human feedback. Each agent has one global trust level, applied to all tool categories.
trust:
  strategy: "weighted"
  initial_level: "sandboxed"
  weights:
    task_difficulty: 0.3  # harder tasks completed = more trust
    completion_rate: 0.25
    error_rate: 0.25  # inverse -- fewer errors = more trust
    human_feedback: 0.2
  promotion_thresholds:
    sandboxed_to_restricted: 0.4
    restricted_to_standard: 0.6
    standard_to_elevated:
      score: 0.8
      requires_human_approval: true  # always human-gated
Simple model, easy to understand. One number to track. However, too coarse; an agent trusted for file edits should not auto-gain deployment access.
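A worked sketch of the weighted score and the human-gated final promotion, using the weights and thresholds from the configuration above. It assumes every factor is pre-normalised to the range 0 to 1 (with the error-rate factor already inverted so that higher is better) and that promotion moves one level at a time; neither detail is stated in the source.

```python
WEIGHTS = {
    "task_difficulty": 0.3,
    "completion_rate": 0.25,
    "error_rate": 0.25,  # assumed to be passed in as (1 - error rate)
    "human_feedback": 0.2,
}

# (from level, to level, minimum score)
THRESHOLDS = [
    ("sandboxed", "restricted", 0.4),
    ("restricted", "standard", 0.6),
    ("standard", "elevated", 0.8),
]


def trust_score(factors: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of normalised factors."""
    return sum(weight * factors[name] for name, weight in weights.items())


def next_level(level: str, score: float, *, human_approved: bool = False) -> str:
    """Promote one level if the score qualifies; elevated needs a human."""
    for source, target, minimum in THRESHOLDS:
        if source == level and score >= minimum:
            if target == "elevated" and not human_approved:
                return level  # security invariant: never auto-promote
            return target
    return level
```

For an agent with factors 0.8, 0.9, 0.7, and 0.5, the score is 0.24 + 0.225 + 0.175 + 0.1 = 0.74, enough for restricted to standard but below the 0.8 needed before a human is even asked about elevated.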
The per_category strategy keeps a separate trust track per tool category (filesystem, git, deployment, database, network). An agent can be "standard" for files but "sandboxed" for deployment, and promotion criteria differ per category.
trust:
  strategy: "per_category"
  initial_levels:
    file_system: "restricted"
    git: "restricted"
    code_execution: "sandboxed"
    deployment: "sandboxed"
    database: "sandboxed"
    terminal: "sandboxed"
  promotion_criteria:
    file_system:
      restricted_to_standard:
        tasks_completed: 10
        quality_score_min: 7.0
    deployment:
      sandboxed_to_restricted:
        tasks_completed: 20
        quality_score_min: 8.5
        requires_human_approval: true  # always human-gated for deployment
Granular. Matches real security models (IAM roles). Prevents gaming via easy tasks. Trust state is a matrix per agent, not a scalar.
The milestone strategy uses explicit capability milestones aligned with the Cloud Security Alliance Agentic Trust Framework. Promotion is automated for low-risk levels, and human approval gates elevated access. Trust is time-bound and subject to periodic re-verification.
trust:
  strategy: "milestone"
  initial_level: "sandboxed"
  milestones:
    sandboxed_to_restricted:
      tasks_completed: 5
      quality_score_min: 7.0
      auto_promote: true  # no human needed
    restricted_to_standard:
      tasks_completed: 20
      quality_score_min: 8.0
      time_active_days: 7
      auto_promote: true
    standard_to_elevated:
      requires_human_approval: true  # always human-gated
      clean_history_days: 14  # no errors in last 14 days
  re_verification:
    enabled: true
    interval_days: 90  # re-verify every 90 days
    decay_on_idle_days: 30  # demote one level if idle 30+ days
    decay_on_error_rate: 0.15  # demote if error rate exceeds 15%
Industry-aligned. Re-verification prevents stale trust. Trust decay may need tuning to avoid frustrating users.
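The decay rules from the re_verification block can be sketched as a single check. The function name is hypothetical, and demoting by exactly one level on an error-rate breach is an assumption (the config comment only says "demote").

```python
LEVELS = ["sandboxed", "restricted", "standard", "elevated"]


def apply_decay(
    level: str,
    *,
    idle_days: int,
    error_rate: float,
    decay_on_idle_days: int = 30,
    decay_on_error_rate: float = 0.15,
) -> str:
    """Demote one level when idle too long or erring too often."""
    idle_breach = idle_days >= decay_on_idle_days  # "idle 30+ days"
    error_breach = error_rate > decay_on_error_rate  # "exceeds 15%"
    if idle_breach or error_breach:
        # Never drop below the lowest level.
        return LEVELS[max(0, LEVELS.index(level) - 1)]
    return level
```

Tuning mostly means adjusting the two thresholds: a longer idle window reduces surprise demotions for agents that are used only occasionally.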
See Also¶
- Providers: LLM abstraction and routing
- Security & Approval: autonomy tiers, approval gates, progressive trust
- Design Overview: full index