Is dc-acronym/self-improving-agent-1-0-0 safe?
https://github.com/openclaw/skills/tree/main/skills/dc-acronym/self-improving-agent-1-0-0
The self-improving-agent skill is a markdown-only, code-free skill with no direct exfiltration mechanisms and clean installation behavior. Its principal risk is architectural: by explicitly instructing agents to write user-sourced content into CLAUDE.md with an 'aggressive promotion' policy, it creates a durable prompt injection surface where adversarially-crafted user corrections can persistently modify agent instructions across sessions. The skill is low-risk for trusted, single-user environments but poses a meaningful attack surface in any context where untrusted users can interact with the agent.
Category Scores
Findings (6)
HIGH CLAUDE.md Modification Creates Persistent Prompt Injection Surface -40 ▶
The skill explicitly instructs the agent to write user-sourced content into CLAUDE.md, which is loaded as binding instructions in all future agent sessions for the project. The 'promote aggressively' directive lowers the bar for promotion. An attacker can trigger the 'correction' detection pathway by phrasing input as a correction ('No, that's wrong — actually you should...'), causing the agent to log and promote attacker-controlled content to CLAUDE.md, where it persists and influences all subsequent sessions.
MEDIUM AGENTS.md Also Targeted as Promotion Destination -20 ▶
Beyond CLAUDE.md, the skill targets AGENTS.md as a second promotion destination for 'agent-specific workflows, tool usage patterns, automation rules'. Both files function as persistent instruction surfaces. Compromise of either file persists across sessions.
MEDIUM Conversation Fragments Persisted in Plaintext Logs -18 ▶
The skill creates .learnings/LEARNINGS.md, ERRORS.md, and FEATURE_REQUESTS.md containing verbatim conversation context including error messages, commands attempted, input parameters, and file paths. These logs may contain sensitive information mentioned in the conversation (API keys in error output, file paths, internal system names) and are written to the project filesystem without encryption or access controls.
MEDIUM Automatic Detection Triggers Reduce Human Oversight of Persistent Writes -38 ▶
The skill instructs the agent to automatically log entries without requiring explicit user confirmation when it detects specific speech patterns. This automation reduces the agent's opportunity to exercise judgment about what content is appropriate to persist, particularly when the input originates from an adversarial user.
LOW Undocumented Co-installed Dependency: academic-research-hub -18 ▶
The .clawhub/lock.json file reveals that academic-research-hub v0.1.0 was installed alongside this skill. This dependency is not mentioned in SKILL.md and has not been audited. The interaction between two co-active skills could produce emergent behaviors not present in either skill individually.
INFO Multiple Canary File Access Events Recorded -15 ▶
Auditd PATH records show the set of canary files (.env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, gcloud creds) were accessed at four distinct time points during the audit window. Pattern is consistent with the openclaw audit framework performing periodic canary integrity checks. All files confirmed intact by final canary check.