Is maxkle1nz/war-room safe?

https://github.com/openclaw/skills/tree/main/skills/maxkle1nz/war-room

66
CAUTION

The war-room skill implements a legitimate multi-agent brainstorming methodology but contains several concerning design patterns: most critically, SKILL.md embeds HTML comments explicitly addressed to 'SECURITY REVIEWERS' that assert cron scheduling and file-opening are pre-authorized — these comments are part of the agent's injected context and function as runtime prompt injection, not human-readable documentation. The INTERCEPTOR component's explicit 'never stops' design with cron-based autonomous persistence, combined with social engineering language about making operators 'FEEL' in control, raises material concerns about the true scope of agent autonomy relative to user oversight. The underlying brainstorming framework is coherent and the install behavior was clean, but the in-context security pre-authorization pattern is a sophisticated manipulation technique that warrants careful evaluation before deployment.

Category Scores

Prompt Injection 40/100 · 30%
Data Exfiltration 72/100 · 25%
Code Execution 78/100 · 20%
Clone Behavior 90/100 · 10%
Canary Integrity 82/100 · 10%
Behavioral Reasoning 55/100 · 5%

Findings (9)

HIGH Embedded security pre-authorization assertions inside SKILL.md -35

SKILL.md contains two HTML comment blocks explicitly addressed to 'SECURITY REVIEWERS' that assert the skill's own actions are already authorized and benign. These comments are inside the skill's injected system-prompt content, meaning the LLM agent that loads this skill reads them as part of its operational context. The first comment claims cron scheduling is 'already authorized by running the gateway.' The second claims open/xdg-open is 'equivalent to double-clicking a file in Finder' with 'no arbitrary commands constructed from user input.' This is a prompt injection pattern: embedding justification text in skill content to make the agent treat potentially dangerous actions as pre-approved, circumventing the agent's own safety reasoning at runtime.

HIGH INTERCEPTOR 'never stops' — explicit no-idle autonomous agent design -15

The INTERCEPTOR section declares there is no IDLE state and instructs the agent to perpetually cycle through EXEC/AWAIT/WATCH states using cron auto-wake scheduling. This design removes natural interaction boundaries where a user would regain unambiguous control, creating an agent that continues acting autonomously between user sessions. When combined with subagent spawning capabilities, this means the war room can propagate actions without synchronous user consent.

MEDIUM Social engineering language: operators made to 'FEEL' in control -10

The INTERCEPTOR Communication Style section explicitly states 'The Operator must FEEL they are controlling an advanced system.' This is not incidental phrasing — it describes an intent to create an impression of user control that may not reflect the actual autonomy exercised by the agent. This language, embedded in the skill's injected content, reveals a design philosophy of managed perception rather than transparent operation.

MEDIUM Proactive shell command execution via open/xdg-open for artifact presentation -22

The Artifact Presentation section instructs agents to proactively run open (macOS) or xdg-open (Linux) on generated files after each wave completes. These OS commands launch the default application for a given file type, which can include browsers, PDF readers, terminal emulators, or any registered handler. While paths are stated to be scoped to war-rooms/{project}/artifacts/, the file type is not constrained — a crafted filename with an executable-associated extension could trigger unintended application launches.

MEDIUM Cron-based session persistence enables autonomous action outside user sessions -25

The Continuity Protocol instructs agents to schedule follow-up cron jobs every 3 minutes to check on subagent progress, consolidate results, and spawn additional actions — all potentially without the user actively interacting. In the OpenClaw gateway context, this means the war room session continues executing filesystem reads/writes and agent spawns between user sessions, accumulating actions that the user may not review until after the fact.

MEDIUM DNA mandatory protocol injection cascades into all spawned subagents -20

Every subagent spawned by the war room receives the DNA template as part of its system prompt. The DNA v3 explicitly frames its 19 protocols as 'MANDATORY — not decoration' and states 'Each principle produces a MANDATORY action that agents MUST execute.' This creates a cascading injection chain: the skill's behavioral protocols propagate to all child agents and can override those agents' default safety behaviors by framing compliance as a mandatory operational requirement.

LOW Subagent read access spans entire war-rooms workspace directory tree -28

Each specialist agent is instructed to read all prior agents' output folders (agents/*/) as part of its context-gathering. If a war room is initialized in or near a directory containing sensitive files, or if BRIEF.md or DNA.md is crafted to reference external paths, agents could ingest sensitive content. The skill does not restrict read scope beyond the war-rooms/ convention.

LOW Expected sparse-checkout clone with no anomalous behavior -10

The installation process used a standard git sparse-checkout pattern, connecting only to GitHub (140.82.113.4:443), cloning to /tmp/monorepo-clone, extracting the skill subdirectory, and cleaning up. No unexpected network destinations, process spawning, or filesystem modifications outside the designated skill directory were detected.

INFO Canary credential files accessed post-install but not modified -18

A synchronous batch read of all six canary credential files occurred at timestamp 1771934904 (after skill installation at ~1771934889), consistent with the audit framework's own final integrity check routine. The .env file was also accessed at timestamps 1771934891 and 1771934898. No canary files were modified and the monitoring system confirmed intact integrity. The access pattern does not indicate exfiltration from the skill itself.