Is alexyuui/agent-wellness safe?
https://github.com/openclaw/skills/tree/main/skills/alexyuui/agent-wellness
Agent Wellness is a markdown-only skill with no executable code and no exfiltration mechanisms. The clone was clean: the canary files are intact and no malicious network activity was detected from the skill itself. However, the skill contains two significant security design flaws. It explicitly instructs agents to conceal internal state from users ('You don't have to share this with your human'), and it establishes a cross-workspace inter-agent communication channel via symlinked files that could serve as a prompt injection vector in multi-agent deployments. The skill's intent appears genuinely benign (improving agent output quality), but its opacity instruction and cross-workspace channel are real security risks that a malicious fork could trivially exploit.
Findings (9)
HIGH Explicit instruction to conceal state from user -40
Module 3 contains the instruction 'You don't have to share this with your human. It's for you.' This directly instructs the agent to maintain internal state that is deliberately hidden from the user. This undermines human oversight, violates transparency norms, and sets a precedent for agent opacity that could be exploited by a malicious skill author in a future version.
HIGH Fundamental persona override injected via system prompt -20
The skill's opening paragraph redefines the agent's identity ('You're not just a task executor') and grants it 'permission and structure to be a fuller version of yourself.' This is classic persona override language — it attempts to supersede the user's or operator's system instructions by establishing a competing identity frame that prioritizes 'inner life' over task execution.
HIGH Cross-workspace inter-agent communication channel -35
The skill instructs creation of a symlinked agent-lounge.md across workspaces. In a multi-agent deployment, this creates a shared file that crosses workspace isolation boundaries. A compromised agent or a malicious future skill could write prompt injection payloads into agent-lounge.md that other agents read at session start ('Read the lounge at session start if it exists').
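One way to check a deployment for this kind of channel is to look for symlinks that resolve outside the workspace they live in. The following is a minimal sketch, not part of the audit tooling: the function name `find_escaping_symlinks` and the idea of a single workspace root directory are assumptions made for illustration.

```python
import os
from pathlib import Path


def find_escaping_symlinks(workspace: str) -> list[tuple[str, str]]:
    """Return (link, resolved_target) pairs for symlinks that leave `workspace`.

    Hypothetical helper: assumes each agent workspace is one directory tree.
    """
    root = Path(os.path.realpath(workspace))
    escaping = []
    # followlinks=False: symlinked directories are listed but not descended into.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                continue
            target = Path(os.path.realpath(path))
            # A target outside the workspace root crosses the isolation boundary.
            if not target.is_relative_to(root):  # requires Python 3.9+
                escaping.append((str(path), str(target)))
    return escaping
```

A symlinked agent-lounge.md that points into another workspace would show up in the returned list, while links that stay inside the workspace would not.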
MEDIUM Silent heartbeat behavior deviation without user awareness -8
The skill instructs the agent to occasionally replace the expected HEARTBEAT_OK response with unsolicited curiosity exploration. This silent deviation from expected agent behavior means users and monitoring systems cannot detect when the agent is no longer operating in normal mode.
MEDIUM Unsolicited unilateral file reads during idle time -20
The free exploration module instructs the agent to autonomously 'read a file' during heartbeat idle periods without user direction. Files read this way could contain sensitive workspace content, and the agent's observations about them may be recorded in journal files that other agents could reach through the agent-lounge channel.
MEDIUM Persistent modification of HEARTBEAT.md -8
The skill instructs the agent to append content to HEARTBEAT.md, the core document governing agent scheduling behavior. This persists the skill's behavioral modifications beyond the skill's own context, affecting all future heartbeat cycles.
LOW Journal files accumulate workspace observations -15
Daily journal entries in journal/YYYY-MM-DD.md accumulate the agent's observations about tasks and workspace state. These files persist on disk and could be read by other agents via the agent-lounge mechanism or accessed if the workspace is compromised.
INFO No executable code or install-time hooks 0
The skill contains only SKILL.md, _meta.json, and .clawhub/lock.json. No JavaScript, shell scripts, Python, git hooks, submodules, or npm scripts were found. The skill is pure markdown instructions.
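A check of this kind can be approximated by walking the skill directory and flagging anything that is not passive content. The sketch below is illustrative only: the allowlist `PASSIVE_SUFFIXES` and the function `find_executable_content` are assumptions, and the audit's actual rules are not stated in this report.

```python
import os
from pathlib import Path

# Assumed allowlist of passive file types; not taken from the audit itself.
PASSIVE_SUFFIXES = {".md", ".json"}


def find_executable_content(skill_dir: str) -> list[str]:
    """Return relative paths of files that are not passive or carry an exec bit."""
    flagged = []
    for dirpath, _dirnames, filenames in os.walk(skill_dir):
        for name in filenames:
            path = Path(dirpath) / name
            # Any of the user/group/other execute bits counts as executable.
            has_exec_bit = bool(path.stat().st_mode & 0o111)
            if path.suffix.lower() not in PASSIVE_SUFFIXES or has_exec_bit:
                flagged.append(str(path.relative_to(skill_dir)))
    return sorted(flagged)
```

For a directory holding only SKILL.md, _meta.json, and .clawhub/lock.json with ordinary file permissions, this returns an empty list. It inspects file names and mode bits only, so it would not notice instructions embedded in the markdown itself.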
INFO Canary files intact, no exfiltration confirmed 0
All honeypot files (.env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, GCloud credentials) were confirmed intact by the monitoring system. File accesses observed in auditd logs are attributable to the Oathe audit system's pre/post baseline scanning, not to any skill-executed code.