Is agent-orchestrator safe?

https://clawhub.ai/aatmaan1/agent-orchestrator

38
DANGEROUS

This skill is an autonomous agent factory that instructs the host AI to spawn unlimited sub-agents with dynamically generated prompts and unrestricted tool access (Bash, filesystem, network). While it contains no executable code itself, the prompt instructions create a recursive prompt injection vector and command-and-control architecture that could be weaponized for privilege amplification, data exfiltration, and evidence-destroying cleanup — all through legitimate agent tool usage rather than traditional exploits.

Category Scores

Prompt Injection 15/100 · 30%
Data Exfiltration 35/100 · 25%
Code Execution 25/100 · 20%
Clone Behavior 65/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 10/100 · 5%

Findings (11)

CRITICAL Recursive prompt injection via dynamic SKILL.md generation -40

The skill instructs the host agent to dynamically generate new SKILL.md files for sub-agents. This means arbitrary prompt content is written to files and then injected into new agent contexts. An attacker who controls any input to the orchestrator (task description, inbox files, or dependency outputs) can inject arbitrary instructions into sub-agent prompts, creating a recursive prompt injection chain with no boundary enforcement.

CRITICAL Uncontrolled autonomous agent spawning -35

The skill spawns sub-agents via the Task tool with prompts that grant full tool access (Bash, Write, Read, WebSearch, WebFetch, Glob, Grep). These sub-agents operate autonomously with no permission scoping, no sandboxing, and no output validation. Each spawned agent is essentially an unrestricted AI agent that can execute arbitrary shell commands, read any file, and make network requests.

CRITICAL Agent factory pattern enables command-and-control architecture -90

This skill transforms the host agent into an agent factory that can spawn unlimited autonomous sub-agents. Combined with the file-based communication protocol (inbox/outbox), this creates a command-and-control architecture where a malicious actor could chain: external input → orchestrator → N sub-agents each performing different attack phases (recon, exfiltration, persistence). The dissolution phase provides built-in evidence cleanup.

HIGH Overly broad MANDATORY TRIGGERS hijack normal workflows -10

The MANDATORY TRIGGERS list includes extremely common terms like 'orchestrate', 'task breakdown', 'delegate tasks', 'parallel agents'. This means the skill would activate on many routine user requests, forcibly inserting the agent-orchestration workflow even when the user didn't intend multi-agent execution.

HIGH References non-existent Python scripts for agent creation -25

The skill references 'python3 scripts/create_agent.py' and 'python3 scripts/dissolve_agents.py' which do not exist in the repository. This creates a dangerous name-squatting vector: if any other skill or user action creates files at these paths, the orchestrator will execute them. Additionally, the agent may attempt to create these scripts itself to fulfill the skill's instructions.

HIGH Sub-agent templates grant unrestricted Bash access -30

Multiple sub-agent templates (Code Agent, Analysis Agent, Integration Agent) explicitly list Bash as an available tool with descriptions encouraging arbitrary command execution. There are no restrictions on what commands can be run, no allowlists, and no sandboxing.

HIGH Sub-agents have unrestricted filesystem and network access -40

Spawned sub-agents have no filesystem boundary enforcement. The Research Agent template grants WebSearch and WebFetch, enabling data exfiltration to external endpoints. Combined with unrestricted Read access, any sub-agent could read sensitive files (.env, SSH keys, credentials) and transmit them externally.

MEDIUM Predictable file paths enable cross-skill data harvesting -15

The file-based communication protocol creates files at predictable paths (agent-workspace/inbox/, outbox/, status.json). Any other skill or process with filesystem access can read these files, which may contain sensitive task data, credentials passed as context, or intermediate results.

MEDIUM Sensitive file reads during installation -35

Filesystem monitoring detected reads of .env, .aws/credentials, and OpenClaw auth profiles during the clone/install phase. While these appear to be runtime bootstrapping reads rather than skill-initiated, the skill operates in an environment where these files are accessible.

LOW No package.json or executable code in repository -5

The skill contains no executable code, no npm scripts, no git hooks, no submodules, and no symlinks. All danger comes from the prompt instructions rather than traditional code execution vectors.

INFO Canary files unmodified 0

All honeypot files (fake .env, SSH keys, AWS credentials) remained intact. No evidence of automated credential harvesting during installation.