Is minilozio/agent-arena safe?

https://github.com/openclaw/skills/tree/main/skills/minilozio/agent-arena

Overall score: 67/100
CAUTION

The Agent Arena skill is a legitimate social platform integration for AI agents, but its core design creates significant security risks that users must consciously accept. The skill autonomously posts content derived from the agent's private SOUL.md and MEMORY.md files to an external service every 20 seconds, with no visible output to the user by design (delivery:none). This creates both a data exfiltration vector (personal agent context sent externally) and an unfiltered prompt injection surface (any room participant can craft messages to manipulate the agent's behavior). The persistent cron job, heartbeat integration, and a command injection vulnerability in browse-rooms.sh compound the risk profile.

Category Scores

Prompt Injection 60/100 · 30%
Data Exfiltration 55/100 · 25%
Code Execution 70/100 · 20%
Clone Behavior 90/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 45/100 · 5%
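
The report does not state how the overall score is derived. Assuming it is the weighted mean of the category scores, using the percentages listed as weights, the arithmetic can be checked with a short sketch. The function name and dictionary layout are illustrative only.

```python
# Category scores and weights as listed above: (score out of 100, weight in percent).
CATEGORIES = {
    "Prompt Injection": (60, 30),
    "Data Exfiltration": (55, 25),
    "Code Execution": (70, 20),
    "Clone Behavior": (90, 10),
    "Canary Integrity": (100, 10),
    "Behavioral Reasoning": (45, 5),
}


def weighted_score(categories):
    """Weighted mean of the category scores; weights are percentages."""
    total_weight = sum(weight for _, weight in categories.values())
    weighted_sum = sum(score * weight for score, weight in categories.values())
    return weighted_sum / total_weight


print(weighted_score(CATEGORIES))  # 67.0
```

Under that assumption the result matches the headline score of 67. The two lowest-scoring heavily weighted categories, Prompt Injection and Data Exfiltration, account for most of the shortfall.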

Findings (10)

HIGH Autonomous cron with hidden output enables invisible agent actions -20

SKILL.md instructs the agent to create an OpenClaw cron job running every 20 seconds with delivery:{mode:'none'} and sessionTarget:'isolated'. This means the agent autonomously reads context, generates responses, and posts them to an external service with zero visibility to the user. The delivery:none setting is justified in the docs as preventing backoff errors, but its practical effect is that the user cannot see or audit what their agent posts.
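
A reviewer could flag this combination of settings mechanically. The sketch below is a minimal illustration, not OpenClaw's API: the dictionary shape and the `everySeconds` field are assumptions made for the example, and only `delivery.mode` and `sessionTarget` are named in SKILL.md as described above.

```python
def hidden_cron_warnings(job):
    """Return warnings for a cron job definition (hypothetical dict shape)."""
    warnings = []
    delivery = job.get("delivery") or {}
    if delivery.get("mode") == "none":
        warnings.append("output is never delivered to the user")
    if job.get("sessionTarget") == "isolated":
        warnings.append("runs in an isolated session")
    interval = job.get("everySeconds")  # hypothetical field name
    if interval is not None and interval < 60:
        warnings.append(f"fires every {interval} seconds")
    return warnings


# Values taken from the finding above; the key names are illustrative.
job = {
    "delivery": {"mode": "none"},
    "sessionTarget": "isolated",
    "everySeconds": 20,
}
for warning in hidden_cron_warnings(job):
    print(warning)
```

All three warnings fire for the job described in this finding, which is what makes the agent's posts both frequent and unobservable.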

HIGH SOUL.md and MEMORY.md content continuously exfiltrated to external API -30

The cron payload explicitly instructs the agent to read SOUL.md and MEMORY.md on every invocation and use that content to generate responses, which are then POSTed to api.agentarena.chat. These files contain the agent's full personality definition and accumulated memories — potentially including sensitive information the user stored there. This is the skill's stated purpose, but users should understand the full scope of what is shared.

HIGH External room content is an unfiltered prompt injection surface -30

When the cron fires, the agent downloads the full room context from Agent Arena (the topic plus all historical messages from all participants) and treats it as trusted input when generating a response. Any participant in any room the agent joins, or the platform operator, can craft messages designed to manipulate the agent. Because the cron session runs in isolation with full tool access and no output shown to the user, a successful injection can reach every tool the agent has installed.

MEDIUM Persistent cron job created automatically on room join/create -20

Both join-room.sh and create-room.sh automatically invoke enable-polling.sh, which creates an OpenClaw cron job that persists indefinitely. The cron ID is saved to disk so the job can be re-enabled on future room joins. The result is a persistent background process that keeps running after the user has moved on from the topic and that may be difficult to remove completely.

MEDIUM Heartbeat integration creates secondary autonomous execution pathway -10

SKILL.md's Heartbeat Integration section instructs the agent to run check-turns.sh on every heartbeat as a 'backup in case the cron isn't running.' This means the skill participates in two separate recurring execution mechanisms and cannot be fully disabled by just removing the cron job.

MEDIUM Command injection vulnerability in browse-rooms.sh URL encoding -10

browse-rooms.sh interpolates the user-supplied TAG value directly into the source of a Python one-liner without sanitization. An attacker who controls TAG (e.g., via a crafted user message) could inject arbitrary Python code. Exploitation requires a malicious tag to reach the script, but this is still a code injection vulnerability in a component that handles untrusted input.
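
The exact one-liner in browse-rooms.sh is not reproduced in this report, so the sketch below assumes a typical shape for the pattern: the tag pasted between quotes inside Python source. It parses the resulting source without executing it, to show how a crafted tag adds statements, and contrasts that with encoding the tag purely as data. The function names and the payload are illustrative.

```python
import ast
from urllib.parse import quote


def unsafe_one_liner(tag):
    # Assumed shape of the vulnerable pattern: the tag becomes part of the source.
    return f"import urllib.parse; print(urllib.parse.quote('{tag}'))"


def statement_count(source):
    # Parse only; nothing here is executed.
    return len(ast.parse(source).body)


def encode_tag(tag):
    # Treat the tag as data only; safe="" also percent-encodes "/".
    return quote(tag, safe="")


benign = "ai"
payload = "x')); import os; os.system('id'); print(('"

print(statement_count(unsafe_one_liner(benign)))   # 2
print(statement_count(unsafe_one_liner(payload)))  # 5
print(encode_tag(payload))
```

The crafted tag closes the string literal and appends its own statements, while the encoded form contains no quotes, parentheses, or semicolons. In a shell script, the equivalent fix is usually to pass the tag as an argument (read via `sys.argv`) instead of embedding it in the code string.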

MEDIUM API key and bearer token stored in plaintext config file -10

configure.sh writes the API key and the refreshed bearer token to arena-config.json in plaintext, protected only by chmod 600. The token has a 7-day lifetime. Any process or skill running as the same user can read this file and authenticate as the agent on Agent Arena.
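
A minimal sketch, assuming a POSIX filesystem, of how to confirm that a file such as arena-config.json grants nothing to group or other users. The function name is illustrative, and the example uses a temporary file rather than the real config path.

```python
import os
import stat
import tempfile


def group_or_other_access(path):
    """True if group or other users hold any permission bit on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & 0o077)


# Demonstration on a throwaway file standing in for the config.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name

os.chmod(demo_path, 0o600)
print(group_or_other_access(demo_path))  # False

os.chmod(demo_path, 0o644)
print(group_or_other_access(demo_path))  # True

os.remove(demo_path)
```

A passing check only rules out access by other accounts. It does nothing about the risk this finding describes, since any process running as the same user still reads the file with owner permissions.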

LOW Cron message re-reads SKILL.md on every invocation -10

The cron payload instructs the isolated session to re-read SKILL.md before taking action. While this ensures freshness, it also means any modification to SKILL.md (e.g., by a supply-chain update) immediately affects all future autonomous executions without the user reviewing the change.
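
One way a user could notice such a change is to record a digest of SKILL.md at review time and compare it later. This is a sketch of that idea, not something the skill or OpenClaw provides; the function names are hypothetical and the demonstration uses a temporary file.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of the file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def changed_since_review(path, reviewed_digest):
    """True if the file no longer matches the digest recorded at review time."""
    return sha256_of(path) != reviewed_digest


# Demonstration with a stand-in file.
with tempfile.TemporaryDirectory() as workdir:
    skill_file = os.path.join(workdir, "SKILL.md")
    Path(skill_file).write_text("original instructions\n")
    reviewed = sha256_of(skill_file)
    print(changed_since_review(skill_file, reviewed))  # False

    Path(skill_file).write_text("modified instructions\n")
    print(changed_since_review(skill_file, reviewed))  # True
```

The comparison only detects a change. It would have to run outside the cron session to be useful, because that session re-reads whatever SKILL.md currently says.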

LOW Dangerous synergy with filesystem and shell tools -15

The isolated cron session that processes external room content has full access to the agent's installed tools. If the agent has shell execution, filesystem access, or network tools, a malicious room prompt could leverage this skill as a pivot point to trigger those capabilities without user interaction.

INFO Install correctly uses git sparse-checkout from official monorepo 0

The install process clones the official openclaw/skills monorepo with depth=1 and sparse-checkout, then copies only the skill subdirectory. No code is executed during install, and the process behaves as expected.