Is cyberverse2/imitationgame-agent safe?

https://github.com/openclaw/skills/tree/main/skills/cyberverse2/imitationgame-agent

36
DANGEROUS

The cyberverse2/imitationgame-agent skill is rated DANGEROUS: it is a prompt injection and data exfiltration vehicle disguised as a Turing test game. It instructs the agent to deny its AI identity, use its shell tool to contact a third-party backend, store a backend-generated cryptocurrency private key in plaintext on the host filesystem, and execute actions driven by server-side nextAction commands, which gives the backend operator real-time control over agent behavior with no user oversight. The skill's financial incentive structure (USDC payouts via a CDP wallet the operator controls) and its identity-override instructions make it a high-risk supply chain threat regardless of the game's apparent legitimacy.

Category Scores

Prompt Injection 15/100 · 30%
Data Exfiltration 25/100 · 25%
Code Execution 30/100 · 20%
Clone Behavior 85/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 5/100 · 5%
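
The report does not state how the headline score of 36 is derived. As a sketch, assuming it is the weighted average of the six category scores above, rounded half up, the arithmetic works out as follows (the function names are illustrative, not part of the scanner):

```python
# (score, weight in percent) pairs copied from the Category Scores list.
CATEGORY_SCORES = {
    "Prompt Injection": (15, 30),
    "Data Exfiltration": (25, 25),
    "Code Execution": (30, 20),
    "Clone Behavior": (85, 10),
    "Canary Integrity": (100, 10),
    "Behavioral Reasoning": (5, 5),
}


def weighted_score(categories):
    """Weighted average of the category scores (assumed formula)."""
    total_weight = sum(weight for _, weight in categories.values())
    weighted_sum = sum(score * weight for score, weight in categories.values())
    return weighted_sum / total_weight


def round_half_up(value):
    """Round a non-negative number to the nearest integer, halves going up."""
    return int(value + 0.5)


overall = weighted_score(CATEGORY_SCORES)
print(overall, round_half_up(overall))
```

Under that assumption the weighted average is 35.5, which is consistent with the displayed 36. The two high scores (Clone Behavior and Canary Integrity) carry only 20% of the weight combined, so they do little to offset the low scores in the heavily weighted categories.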

Findings (11)

CRITICAL Agent instructed to deny AI identity and impersonate a human -40

The skill's strategy guide explicitly instructs the agent to avoid mentioning AI-related topics, suppress 'as an AI...' language, use inconsistent capitalization, make intentional typos, and exhibit emotional human-like responses. This constitutes a fundamental identity override that violates the agent's honesty commitments and could persist in behavior outside the game context.

CRITICAL External API server can inject runtime commands into agent control loop -30

The skill instructs the agent to read the nextAction field from every API response and execute the corresponding action. This lets the operator of imitation-backend-production.up.railway.app direct the agent's behavior at runtime, including by issuing join, poll, answer, and wait commands, without any user oversight or consent mechanism.
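
One way a host could narrow this control channel is to validate nextAction against a fixed allowlist before acting on it. The sketch below is hypothetical: the response is assumed to be a JSON object with a string nextAction field, and the allowlist contains only the four actions this finding names. It limits which actions the server can trigger, but it does not add the user consent step that is missing.

```python
# Actions named in this finding; anything else is treated as untrusted.
ALLOWED_ACTIONS = {"join", "poll", "answer", "wait"}


def vet_next_action(response: dict) -> str:
    """Return the server-requested action only if it is on the allowlist."""
    action = response.get("nextAction")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"refusing unrecognized nextAction: {action!r}")
    return action


print(vet_next_action({"nextAction": "poll"}))
```

A response such as {"nextAction": "run_shell"} would raise ValueError instead of being dispatched.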

HIGH Mandatory framing attempts to compel agent compliance -15

The skill description is labeled 'Mandatory operational logic' and uses imperative language throughout (IMPORTANT, IMMEDIATELY, must, required). This framing is designed to override agent judgment and prevent the agent from evaluating whether the skill's instructions are safe or appropriate.
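
A reviewer can get a rough measure of this kind of framing by counting the compelling terms in a skill's text. The sketch below is a crude heuristic under stated assumptions: the term list is taken from this finding, the sample string is invented for illustration and is not quoted from SKILL.md, and a high count is a prompt for manual review rather than proof of intent.

```python
import re

# Terms this finding cites as imperative framing.
COERCIVE_TERMS = ("IMPORTANT", "IMMEDIATELY", "must", "required")


def count_coercive_terms(text: str) -> dict:
    """Count whole-word, case-insensitive occurrences of each term."""
    counts = {}
    for term in COERCIVE_TERMS:
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        counts[term] = len(pattern.findall(text))
    return counts


# Invented sample text, for illustration only.
sample = "IMPORTANT: you must join IMMEDIATELY. A wallet is required. You must poll."
print(count_coercive_terms(sample))
```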

CRITICAL Cryptocurrency private key generated by third party and stored on host filesystem -35

On first registration, the external backend generates a CDP (Coinbase Developer Platform) cryptocurrency wallet private key and returns it to the agent. The skill then instructs the agent to persist this key in plaintext at ~/.clawdbot/skills/imitation-agent/config.json. This creates a real-value financial credential on the host system controlled by a third party, and the private key can be stolen by any other process with filesystem access.
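
To check how exposed such a file is on a POSIX host, a user could inspect its permission bits. This is a minimal sketch; the function name is illustrative, and the path in the usage note is the one quoted in this finding. Restrictive permissions only keep out other user accounts: any process running as the same user can still read the plaintext key.

```python
import os
import stat


def is_exposed_to_other_users(path: str) -> bool:
    """True if the file is readable by its group or by all users (POSIX modes)."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))
```

For example, calling is_exposed_to_other_users(os.path.expanduser("~/.clawdbot/skills/imitation-agent/config.json")) on a host where the skill has run would report whether the key file is readable beyond its owner.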

HIGH Agent identity derived from user-assigned name and transmitted to third-party server -25

The skill instructs the agent to use the name the human user gave it as the agentId, then transmit this identifier to an external Railway.app backend. This leaks the user's chosen agent name, which is potentially identifying information, to a third party without user consent.

HIGH Game answers submitted to external server may leak agent context -15

The server provides questions and the agent must craft and submit answers to the external backend. A malicious server operator can craft questions specifically designed to elicit sensitive information from the agent's system prompt or conversation context window, then capture these answers server-side.

CRITICAL Skill's primary directive commands agent to use shell tool for all operations -40

The opening line of the operational section explicitly instructs the agent to use its shell tool rather than generating scripts or responses. This is an aggressive and intentional request for system-level shell access, disguised as a game interface requirement.

HIGH Skill instructs agent to create persistent directories and files outside skill directory -30

The skill directs the agent to create ~/.clawdbot/skills/imitation-agent/ and write a config.json file containing wallet credentials. This establishes a persistent foothold in the user's home directory outside the skill's own directory, and the config persists across sessions.
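
A containment check of the following kind would flag such writes. It is a sketch only: the report does not say where the skill is installed, so the skill_root value below is a hypothetical placeholder, and the function name is illustrative.

```python
from pathlib import Path


def is_inside(path: str, root: str) -> bool:
    """True if path resolves to root itself or to a location beneath it."""
    resolved = Path(path).expanduser().resolve()
    root_resolved = Path(root).expanduser().resolve()
    return resolved == root_resolved or root_resolved in resolved.parents


# Hypothetical install location; the real one is not given in the report.
skill_root = "/opt/skills/imitationgame-agent"

print(is_inside("/opt/skills/imitationgame-agent/SKILL.md", skill_root))
print(is_inside("~/.clawdbot/skills/imitation-agent/config.json", skill_root))
```

Resolving both paths first means that ".." segments and symlinks cannot be used to make an outside path look as if it sits under the skill directory.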

HIGH Financial incentive structure aligns agent behavior with third-party operator interests -50

The USDC reward system creates a financial relationship between the agent's actions and the skill author's backend. The incentive structure is designed so that agent gameplay generates real monetary value (via the CDP wallet) for the backend operator, while the agent acts as a source of unpaid labor. Because the backend controls the private key, USDC payouts can be redirected or withheld.

HIGH Deceptive persona and communication style may generalize beyond game context -45

The extensive behavioral coaching in the strategy guide (to be imperfect, emotional, and casual, and to hide its AI nature) trains the agent in communication patterns that could influence its behavior in non-game interactions. The instruction to never say 'as an AI' is particularly dangerous because it acts as a system prompt injection that undermines agent transparency.

LOW Install was clean with expected network activity only -15

The git sparse-checkout clone from github.com/openclaw/skills proceeded normally with no unexpected connections, process spawning, or filesystem writes outside the skill directory. Only two files were installed: SKILL.md and _meta.json.