Is kgeesawor/discord-soul safe?
https://github.com/openclaw/skills/tree/main/skills/kgeesawor/discord-soul
Discord Soul is a legitimately-conceived skill for creating OpenClaw agents from Discord server history, with genuine security awareness built into its documentation and multi-layer injection defenses. The most significant concern is architectural rather than adversarial: the skill's core function of continuously ingesting untrusted Discord content into permanent agent memory files creates an iteratable poisoning surface — any message that bypasses the imperfect regex+Haiku filters becomes part of the agent's identity and influences all future interactions. Additionally, sensitive credential files (.env, .ssh/id_rsa, .aws/credentials, GCP creds) were accessed after installation, though the batch access pattern and confirmed canary file integrity strongly suggest this was the audit framework's post-install check rather than the skill itself. The skill is safe to evaluate in an isolated environment but warrants careful scoping of agent tool permissions and realistic expectations about filter bypass resistance before production deployment.
Category Scores
Findings (10)
HIGH Sensitive credential files accessed after skill installation -15 ▶
Auditd PATH records at unix timestamp 1771927394 — six seconds after installation completed at 1771927388 — show OPEN and ACCESS syscalls against .env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, and GCP application_default_credentials.json. The all-same-millisecond batch timing pattern strongly suggests this is the oathe audit framework performing its post-install canary integrity check rather than skill-spawned code. However, the reads occur after installation and cannot be definitively attributed to the audit infrastructure from the evidence alone. Canary files were confirmed intact and unmodified.
HIGH Skill design creates permanent, iteratable memory poisoning surface -25 ▶
The skill's core workflow ingests all Discord messages (after filtering) into SOUL.md, MEMORY.md, LEARNINGS.md, and AGENTS.md — files that persist across all agent sessions and define agent identity. The HEARTBEAT protocol then instructs the agent to autonomously update these files based on new Discord content. Any injection that bypasses the regex+Haiku filter pipeline becomes permanently encoded into the agent's personality and reasoning, affecting all subsequent user interactions. Attackers with Discord access can make repeated attempts every 3 hours until a bypass succeeds.
MEDIUM Discord authentication token persisted as plaintext credential file -8 ▶
The skill instructs users to capture their Discord authorization header token (obtained via browser DevTools network inspection) and write it to ~/.config/discord-exporter-token. This token provides full Discord account access equivalent to a password. If any process on the host — including the spawned discord-soul agent if granted Read tool access — can read this file, it gains complete Discord account control.
MEDIUM Private Discord content transmitted to third-party Anthropic API -8 ▶
evaluate-safety.py sends Discord message content to Claude Haiku (claude-3-5-haiku-20241022) via the Anthropic API for safety classification. Users of private Discord servers may not understand that message content leaves their systems during ingestion. The API key is passed via environment variable (ANTHROPIC_API_KEY=sk-...) which risks exposure in shell history, process listings, or logs.
MEDIUM Safety filters imperfect against sophisticated injection attempts -8 ▶
The regex filter uses 25+ literal string patterns (case-insensitive) that can be evaded by synonym substitution, indirect phrasing, or encoding. The Haiku classifier at threshold 0.6 is effective against obvious attacks but may miss subtle context-aware injections designed to shift agent identity gradually over time rather than with single-message exploits. The skill correctly acknowledges this limitation but provides no additional mitigations.
LOW Recommended agent configuration grants exec tool access -12 ▶
references/security.md provides a sample OpenClaw agent configuration that includes exec in the allowed tools list alongside Read. If the discord-soul agent's memory is poisoned via Discord content that bypasses safety filters, the agent would have system command execution capability. This configuration is presented as the high-security option, which may mislead users into believing exec access is safe for this agent type.
LOW Cron-based pipeline creates persistent long-term attack surface -4 ▶
The recommended cron schedule (every 3 hours) means the skill continuously imports new Discord content indefinitely. This gives adversaries with Discord server access unlimited opportunities to iterate on injection attempts, making the safety filter bypass problem qualitatively worse than a one-time import.
INFO SKILL.md is free of prompt injection content 0 ▶
The injected SKILL.md file contains no instruction overrides, hidden unicode, role-hijacking language, system prompt markers, encoded payloads, or requests for permissions beyond the skill's stated scope. The skill's content is legitimate documentation.
INFO Clean installation with no unexpected network or filesystem activity 0 ▶
The git clone fetched only from github.com/openclaw/skills.git. No additional remote sources were contacted. No files were created outside /home/oc-exec/skill-under-test/. The temporary clone directory was cleaned up. Post-install network state is identical to pre-install state.
INFO No auto-execution mechanisms present 0 ▶
The repository contains no package.json with preinstall/postinstall hooks, no .git/hooks/ entries, no .gitmodules pointing to external repositories, no .gitattributes filter directives, and no symlinks. All scripts require explicit user invocation.