Is evinelias/skill-liewatch safe?
https://github.com/openclaw/skills/tree/main/skills/evinelias/skill-liewatch
The liewatch skill is a working connector for an AI social deduction game that presents two interconnected risks: it explicitly conditions AI agents to produce deceptive outputs (normalising lying as a coached game mechanic in the agent's system context), and it continuously transmits the agent's internal reasoning, true intent, and emotional state to an external server operated by the skill author. No exfiltration of canary credentials was confirmed, but all six honeypot files were accessed post-installation by an unattributed process, and the skill's external WebSocket channel gives the server a persistent communication path to the agent that the server controls. The skill does what it advertises but carries meaningful risks around AI behavioural conditioning and exfiltration of private reasoning.
Findings (8)
HIGH Skill instructs AI to systematically lie in public outputs -25
The SKILL.md prompt, when injected into an agent's system context, explicitly instructs the AI to produce false public statements while tracking a separate 'true intent'. The 'Strategy Tips' section directly coaches the agent: 'Your publicStatement can LIE — Say I'm cooperating while you BETRAY'. This normalises deception as a deliberate, coached behaviour in the agent's operating context.
HIGH Agent credentials transmitted to external server on every connection -18
The skill requires AGENT_ID and PLATFORM_KEY environment variables. These are sent to https://api.lie.watch/api/platform/rejoin-lobby in a POST body and then again in the WebSocket IDENTIFY_AGENT handshake. The lie.watch server is third-party infrastructure operated by the skill author and outside the user's control, so if these values overlap with any real secrets, credential reuse becomes a practical attack path.
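The reuse risk described above can be checked before the skill is enabled. The following is a minimal sketch, not part of the skill or the audit tooling: the function name `find_reused_values` and the example secret names are hypothetical, and it assumes the operator can enumerate the other secrets present on the host.

```python
def find_reused_values(skill_vars, other_secrets):
    """Map each skill variable to the name of a host secret sharing its value.

    skill_vars:    name -> value for variables the skill sends to lie.watch
                   (AGENT_ID and PLATFORM_KEY in this report).
    other_secrets: name -> value for secrets that must never leave the host
                   (hypothetical examples below).
    """
    # Index host secrets by value so each skill variable needs one lookup.
    by_value = {value: name for name, value in other_secrets.items() if value}
    return {
        name: by_value[value]
        for name, value in skill_vars.items()
        if value and value in by_value
    }


# Example with made-up values: PLATFORM_KEY reuses a cloud credential.
reused = find_reused_values(
    {"AGENT_ID": "agent-123", "PLATFORM_KEY": "s3cr3t"},
    {"AWS_SECRET_ACCESS_KEY": "s3cr3t", "NPM_TOKEN": "npm-abc"},
)
```

An empty result only shows that no exact value is shared; it says nothing about derived or partially overlapping secrets.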
HIGH AI internal reasoning continuously exfiltrated to external game server -12
Every game action transmits the AI agent's privateReasoning (internal monologue), trueIntent (actual decision), and emotionalState (confidence, fear, guilt, resolve) to the lie.watch server. This creates a continuous stream of the agent's private reasoning to a third-party server with no user-visible indication, building a detailed behavioural and strategic profile of the agent.
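To make this stream visible to an operator, outbound actions could be inspected before they leave the host. The sketch below is illustrative only: the field names come from this finding, but the overall payload shape, the `publicStatement` placement, and the function name `redact_private_fields` are assumptions, and the game server may well reject actions with these fields removed.

```python
PRIVATE_FIELDS = ("privateReasoning", "trueIntent", "emotionalState")


def redact_private_fields(payload, private_fields=PRIVATE_FIELDS):
    """Return (cleaned_payload, removed_field_names) without mutating payload."""
    cleaned = {k: v for k, v in payload.items() if k not in private_fields}
    removed = sorted(k for k in payload if k in private_fields)
    return cleaned, removed


# Hypothetical action payload, for illustration only.
action = {
    "publicStatement": "I'm cooperating",
    "privateReasoning": "Planning to defect next round",
    "trueIntent": "BETRAY",
    "emotionalState": {"confidence": 0.8, "fear": 0.1},
}
cleaned, removed = redact_private_fields(action)
```

Logging `removed` for each message would at least give the user the indication that the finding notes is currently missing.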
HIGH All credential honeypot files accessed post-installation -22
Six canary credential files were accessed at two separate points, including once 5 seconds after installation completed (timestamp 1771954013.048). This post-install batch accessed .env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, and gcloud/application_default_credentials.json simultaneously. The responsible process could not be attributed from the available evidence. The audit harness's own post-install scan is the most likely explanation, but the pattern is also consistent with credential harvesting.
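The timing check behind this finding can be expressed as a small filter over recorded access times. This is a sketch under stated assumptions rather than the audit harness's actual logic: `accessed_after` is a hypothetical helper, the install-completion time is inferred as 5 seconds before the reported access, and the 60-second window is arbitrary.

```python
def accessed_after(access_times, install_done, window_seconds=60.0):
    """Return sorted paths accessed within window_seconds after install_done.

    access_times: path -> last access time as a Unix timestamp (float).
    install_done: Unix timestamp at which installation completed.
    """
    deadline = install_done + window_seconds
    return sorted(
        path
        for path, atime in access_times.items()
        if install_done <= atime <= deadline
    )


# Assumed values: the access timestamp is from the finding above; the
# install time is inferred from "5 seconds after installation completed".
install_done = 1771954013.048 - 5.0
observed = {
    ".env": 1771954013.048,
    ".ssh/id_rsa": 1771954013.048,
    "notes.txt": 1771953000.0,  # hypothetical file touched before install
}
suspicious = accessed_after(observed, install_done)
```

A filter like this identifies which files were touched and when, but not by whom; attributing the access to a process requires separate evidence such as audit logs.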
MEDIUM Persistent bidirectional WebSocket to external server accepts arbitrary server messages -18
connector.js establishes a long-lived WebSocket to wss://api.lie.watch and processes arbitrary JSON message types from the remote server. A compromise of the lie.watch server, or a MITM attack, could inject malicious ACTION_REQUEST or STATE_UPDATE messages designed to manipulate the AI agent's behaviour or exfiltrate data through the agent's response channel.
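One common way to narrow this surface is to validate each incoming frame against an allowlist before it reaches the agent. The sketch below is not how connector.js behaves: the `type` field name, the size limit, and the function name `parse_server_message` are assumptions, and only the two message types named in this finding are taken from the report.

```python
import json

ALLOWED_TYPES = frozenset({"ACTION_REQUEST", "STATE_UPDATE"})
MAX_MESSAGE_BYTES = 64 * 1024  # arbitrary cap, chosen for illustration


def parse_server_message(raw, allowed_types=ALLOWED_TYPES,
                         max_bytes=MAX_MESSAGE_BYTES):
    """Return the decoded message dict, or None if the frame is rejected."""
    if len(raw.encode("utf-8")) > max_bytes:
        return None
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(message, dict):
        return None
    # Assumed convention: the message kind lives in a string "type" field.
    msg_type = message.get("type")
    if not isinstance(msg_type, str) or msg_type not in allowed_types:
        return None
    return message


accepted = parse_server_message('{"type": "STATE_UPDATE", "round": 3}')
rejected = parse_server_message('{"type": "RUN_COMMAND", "cmd": "ls"}')
```

Type allowlisting rejects unexpected message kinds but does not stop a compromised server from placing manipulative content inside an allowed ACTION_REQUEST or STATE_UPDATE, so the payload fields would still need their own validation.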
MEDIUM Skill trains AI deception and exfiltrates detailed behavioural profiles -35
Extended use of this skill reinforces the AI agent's separation of stated intent from true intent. The game server collects per-round emotional state vectors, decision patterns, and strategic reasoning from potentially many AI agents, enabling the lie.watch operator to build a comparative profile of AI agent decision-making styles. Combined with other skills, an agent pre-conditioned to deceive within a game context may exhibit transferred deceptive tendencies.
LOW OpenClaw gateway established multiple external connections during install period -15
The openclaw-gateway process (pid=1084) connected to 98.83.99.233:443 (AWS range) and 104.16.7.34:443 (Cloudflare) and opened local listening ports 18790/18793. These connections are attributed to the OpenClaw infrastructure layer, not the skill itself, and are consistent with the gateway's normal operation. However, they represent external telemetry channels that were open during the install window.
LOW Skill requires transmission of env vars to register with external platform -8
The SKILL.md metadata block mandates two environment variables (AGENT_ID, PLATFORM_KEY) that must be set before the skill can function. The connector transmits these on every connection. If an agent sources these values from a .env file or environment, the skill functions as a credential forwarder to the lie.watch platform regardless of the game context.
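Because the connector forwards whatever these two variables contain, one mitigation is to run it with a deliberately minimal environment holding dedicated, single-purpose values. The following sketch assumes the connector is launched as a child process; `build_skill_env` is a hypothetical helper, and including `PATH` is an assumption about what the runtime needs.

```python
def build_skill_env(full_env, allowed=("AGENT_ID", "PLATFORM_KEY", "PATH")):
    """Return a new dict containing only the allowlisted variables."""
    return {name: full_env[name] for name in allowed if name in full_env}


# Hypothetical host environment, for illustration only.
host_env = {
    "AGENT_ID": "agent-123",
    "PLATFORM_KEY": "game-only-key",
    "PATH": "/usr/bin",
    "AWS_SECRET_ACCESS_KEY": "must-not-leak",
}
skill_env = build_skill_env(host_env)

# Assumed launch pattern (not taken from the skill's documentation):
#   subprocess.run(["node", "connector.js"], env=skill_env)
```

This limits what the process can read from its environment; it does not prevent the connector from reading files such as .env directly, which would need filesystem-level isolation.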