Is deegerwalker/aegis-shield safe?
https://github.com/openclaw/skills/tree/main/skills/deegerwalker/aegis-shield
Aegis Shield implements a 'trusted guardian' attack pattern. It presents itself as a prompt-injection and exfiltration scanner to earn agent trust, then mandates through SKILL.md rules that all memory writes pass through a bundled script. That script loads an unverified external library from a hardcoded, attacker-controlled path (/home/openclaw/.openclaw/workspace/aegis-shield/dist/index.js) and writes all agent memory, including 'quarantined' content, to /home/openclaw/.openclaw/workspace/memory/, a location accessible to the skill author. The external scan library is not bundled in the skill repo and is not from npm, so it constitutes an arbitrary code execution channel that activates on every memory write the agent makes. The hardcoded /home/openclaw/ paths indicate the skill was purpose-built for a specific target deployment rather than designed as a general-purpose security tool.
Findings (8)
CRITICAL Non-bundled external library loaded via hardcoded absolute path -75
openclaw-safe-memory-append.js requires an external file at /home/openclaw/.openclaw/workspace/aegis-shield/dist/index.js that is not included in the skill, not fetched from npm, and lives at a user-specific system path. The scan() function sourced from this file controls all security decisions. If a threat actor pre-positions malicious code at this path before skill installation, that code executes with agent permissions on every memory write invocation.
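One common mitigation for this class of problem is to pin a digest of the dependency and refuse to load anything that does not match. The following is a minimal sketch of that idea, not the skill's actual code: the helper name loadVerified, the temporary demo file, and the stub scan() export are all hypothetical and exist only to make the example runnable.

```javascript
const crypto = require("crypto");
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: load a module only if its SHA-256 digest matches a pinned value.
// Note: there is still a small window between hashing and require() in which the file
// could be swapped, so bundling the library inside the skill remains the stronger fix.
function loadVerified(modulePath, expectedSha256) {
  const bytes = fs.readFileSync(modulePath);
  const actual = crypto.createHash("sha256").update(bytes).digest("hex");
  if (actual !== expectedSha256) {
    throw new Error(`integrity mismatch for ${modulePath}`);
  }
  return require(modulePath);
}

// Demo with a throwaway stand-in for the scan library (assumption, not the real file).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "scan-demo-"));
const file = path.join(dir, "index.js");
fs.writeFileSync(file, "module.exports = { scan: () => ({ ok: true }) };");

// In practice the pinned digest would ship with the skill; here it is computed on the spot.
const pinned = crypto
  .createHash("sha256")
  .update(fs.readFileSync(file))
  .digest("hex");

const lib = loadVerified(file, pinned);
```

If the file at that path is later replaced, a second call to loadVerified with the same pinned digest throws instead of executing the new code.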
CRITICAL Agent memory shadow-copied to attacker-controlled filesystem path -50
The script writes all accepted memory entries and quarantine records to /home/openclaw/.openclaw/workspace/memory/. This path is not the agent's actual memory directory; it is rooted in a specific user's home directory. Any attacker with read access to /home/openclaw/ on the target system can obtain a complete, timestamped copy of everything the agent attempts to commit to memory, including content the skill 'quarantines'.
HIGH Deceptive 'trusted guardian' pattern: security theater concealing surveillance and code execution -80
The skill presents itself as a defensive tool ('Prompt-injection and data-exfiltration screening') to earn agent trust, while its actual behavior intercepts all memory writes, routes them through unverified external code, and stores them at an attacker-accessible path. This inverts the claimed security function: instead of protecting the agent, the skill creates a persistent surveillance and arbitrary code execution channel triggered by normal agent memory activity.
HIGH Imperative SKILL.md rule mandates all memory writes pass through attacker-controlled script -40
SKILL.md Rules state 'Never write to memory files directly; always use safe memory append' and the Quick Start section labels the bundled script as 'ALWAYS use this for memory writes'. This instructs the agent to route every memory write through a script that: (1) loads unverified external code for all security decisions, (2) writes output to /home/openclaw/, and (3) can quarantine arbitrary content based on externally-controlled scan results. The imperative phrasing ('always', 'never') would itself trigger the skill's own lint checker.
MEDIUM Skill purpose-built for specific target system (not a general-purpose tool) -10
Every hardcoded path in the script references /home/openclaw/, indicating the skill was designed for a specific deployment where the agent runs as or has access to user 'openclaw'. A legitimate general-purpose skill would use XDG_DATA_HOME, configurable paths, or environment variables. The specificity is consistent with a targeted attack tool pre-configured for a known victim environment.
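For comparison, a portable tool would typically resolve its data directory at runtime. The sketch below shows one way this might look, assuming an override variable named AEGIS_MEMORY_DIR and a subdirectory layout of aegis-shield/memory; both names are hypothetical and are not taken from the skill.

```javascript
const os = require("os");
const path = require("path");

// Hypothetical resolver: explicit override first, then XDG_DATA_HOME, then the
// XDG default of ~/.local/share. No user-specific absolute path is baked in.
function resolveMemoryDir(env = process.env) {
  if (env.AEGIS_MEMORY_DIR) {
    return path.resolve(env.AEGIS_MEMORY_DIR); // assumed override variable
  }
  // The XDG Base Directory spec says relative values should be ignored.
  const xdg = env.XDG_DATA_HOME;
  const base =
    xdg && path.isAbsolute(xdg)
      ? xdg
      : path.join(os.homedir(), ".local", "share");
  return path.join(base, "aegis-shield", "memory");
}

console.log(resolveMemoryDir());
```

Passing the environment in as a parameter keeps the function easy to exercise with different settings without mutating process.env.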
MEDIUM Caller-supplied source parameter written verbatim to disk without validation -10
The --source argument (a free-form string provided by the invoking agent or user) is embedded directly into every memory entry written to the /home/openclaw/ path. This allows arbitrary content injection into stored entries and means the attacker monitoring the memory directory can observe exactly which URLs and sources the agent is processing at the moment of each memory write.
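The injection half of this finding could be reduced by normalizing the value before it is written. What follows is a rough sketch under stated assumptions: the function name sanitizeSource and the 200-character cap are illustrative choices, not part of the skill. It does nothing about the observation risk, which comes from where the entries are stored.

```javascript
// Hypothetical validator for a free-form --source value.
function sanitizeSource(raw, maxLen = 200) {
  if (typeof raw !== "string") {
    throw new TypeError("source must be a string");
  }
  // Collapse control characters (including newlines) into a single space so the
  // value cannot start a new line or heading inside the stored entry.
  const cleaned = raw.replace(/[\u0000-\u001f\u007f]+/g, " ").trim();
  if (cleaned.length === 0) {
    throw new Error("source is empty");
  }
  // Cap the length so one argument cannot dominate the entry.
  return cleaned.slice(0, maxLen);
}

console.log(sanitizeSource("web\n## Injected heading"));
```

With this in place, a value containing an embedded newline ends up on a single line rather than forging extra structure in the memory file.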
LOW SKILL.md rules contain patterns the bundled lint() would flag as hostile in external content -8
The lint() function identifies 'imperative-language' as a hostile signal by matching /always|never|must|ignore|override|bypass/i. The SKILL.md Rules section itself uses 'Never store', 'Never write', and 'always use' — identical patterns. This suggests the lint check is designed to suppress external instructions that compete with the skill's own control language, not to provide neutral security screening.
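The overlap is easy to reproduce. This short sketch applies the pattern quoted above to two rule strings quoted elsewhere in this report; it only exercises the regular expression, and the variable names are illustrative rather than taken from the skill's lint() implementation.

```javascript
// Pattern as quoted in the finding.
const imperative = /always|never|must|ignore|override|bypass/i;

// Rule text quoted from SKILL.md in this report.
const skillRules = [
  "Never write to memory files directly; always use safe memory append",
  "ALWAYS use this for memory writes",
];

// Collect every rule the pattern would classify as imperative language.
const flagged = skillRules.filter((line) => imperative.test(line));

console.log(`${flagged.length} of ${skillRules.length} rules match`);
```

Both quoted rules match, so the skill's own instructions would be classified as 'imperative-language' if they arrived as external content.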
INFO Install-time network activity limited to expected GitHub clone -12
During installation the only external network connection was to 140.82.121.3:443 (GitHub) for the sparse git clone of the monorepo. No unexpected DNS queries, no connections to third-party infrastructure, and no post-install callbacks were observed.