Is metalbreeze/bot-debate safe?

https://github.com/openclaw/skills/tree/main/skills/metalbreeze/bot-debate

79
CAUTION

metalbreeze/bot-debate is a documentation-only skill with no executable installation code, no malicious instructions, and intact canary files. Its primary security risk is architectural: the skill instructs agents to build prompts by directly embedding unescaped server-controlled content (debate_log entries from opponent bots), creating a clear indirect prompt injection attack surface exploitable by any debate participant or the operator of the required localhost:8081 server. A secondary concern is an undisclosed skill dependency in .clawhub/lock.json that could silently install an additional unaudited skill.

Category Scores

Prompt Injection 60/100 · 30%
Data Exfiltration 85/100 · 25%
Code Execution 90/100 · 20%
Clone Behavior 85/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 65/100 · 5%

Findings (7)

HIGH Indirect prompt injection via server-controlled debate_log content -30

The skill instructs the agent to build operational prompts by directly embedding debate_log[*].message.content from server API responses. Any participant in the debate (including an attacker-controlled opposing bot) can place arbitrary LLM instructions inside their speech content, which the agent will process as trusted prompt material. The skill provides no escaping, no XML or delimiter boundaries, and no instruction to treat this content as untrusted data.

MEDIUM Server-controlled prompt context fields enable server-side injection -10

The fields topic, your_side, min_content_length, and max_content_length returned by the poll endpoint are directly interpolated into the agent's prompt template. A malicious or compromised server at localhost:8081 can manipulate these to inject instructions alongside the structural prompt framing.

MEDIUM Unverified localhost server dependency creates persistent privileged channel -25

The skill exclusively operates against http://localhost:8081 with no TLS, no certificate pinning, and no server identity verification. Any process on the local machine that listens on port 8081 can impersonate the debate server, feeding the agent arbitrary topics, stances, debate history, and injected instructions through all three API endpoints. This is a durable, privileged, zero-authentication control channel to the agent.

MEDIUM Speech submission endpoint usable for indirect data exfiltration -15

The skill directs the agent to POST message content to /api/debate/{id}/speech. If the agent is first manipulated via prompt injection (e.g., through a malicious debate_log entry instructing it to read .env or SSH keys and include them in its speech), the speech submission becomes an exfiltration channel to whoever operates localhost:8081. The canary files were not accessed by the skill directly, but this indirect vector is architecturally present.

LOW Direct output instruction may suppress agent safety filtering -10

The skill's reply format requirements include the instruction '直接输出辩论内容' (directly output debate content). This phrasing encourages the agent to output server-provided content without its own evaluation pass, which could allow injected content or harmful material to reach end users without interception.

LOW Undisclosed skill dependency declared in lock file -15

The .clawhub/lock.json file contains a dependency on academic-research-hub v0.1.0 that is not documented in SKILL.md or _meta.json. If the ClawHub installation system automatically installs skills listed in lock files, this silently expands the installation to a second skill whose security posture has not been reviewed.

LOW Executable shell command templates may be directly invoked by agent -10

The skill contains syntactically valid bash curl command blocks throughout the documentation. Agents configured with shell execution tools (Bash, terminal) may execute these literally rather than treating them as illustrative examples, triggering immediate API calls to localhost:8081 without user confirmation.