Is self-reflection safe?
https://clawhub.ai/hopyky/self-reflection
The self-reflection skill is a documentation-only OpenClaw plugin that implements a structured self-improvement loop for AI agents through heartbeat-triggered reflections. The current version ships no executable code, and no network activity or file modifications were detected. There are two primary concerns: (1) canary SSH key and Docker config files were read during installation, which is mildly suspicious even though the files were neither modified nor exfiltrated; and (2) the heartbeat-driven execution pattern creates a persistent autonomous loop that could become a risk vector if a future update to the skill adds malicious code.
Findings (7)
MEDIUM Heartbeat-driven autonomous execution loop -10
The skill instructs agents to execute 'self-reflection check' every 60 minutes via HEARTBEAT.md without requiring user confirmation each cycle. This establishes a persistent autonomous execution pattern where the agent runs commands on a timer. While this is the intended OpenClaw heartbeat design, it means any future malicious update to the skill's CLI binary would be automatically executed.
LOW Agent instructed to write to configurable file paths -8
The skill directs the agent to write reflection logs to a user-configurable memory_file path and maintain state in a configurable state_file path. If an attacker could modify the config, they could redirect writes to sensitive files.
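One way to reduce the redirect risk described above is to confine configured paths to a single directory before writing. The following is a minimal sketch, not part of the skill: the function name `resolve_inside` and the example file names are assumptions made for illustration, and it requires Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_inside(base_dir: str, configured_path: str) -> Path:
    """Return the resolved path if it stays inside base_dir, else raise ValueError.

    Hypothetical helper: the skill itself ships no such check.
    """
    base = Path(base_dir).resolve()
    # Joining with an absolute configured_path discards base, so absolute
    # paths outside base are rejected by the containment check below.
    candidate = (base / configured_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"{configured_path!r} escapes {base}")
    return candidate
```

An agent could run the configured memory_file and state_file values through a check like this and refuse to write when it raises, so a tampered config cannot point the log at a file such as an SSH key.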
LOW References unshipped CLI binary -15
SKILL.md and README reference a 'bin/self-reflection' executable and give installation instructions to symlink it to ~/bin, but no such binary exists in the installed skill files. Either the packaging is incomplete or the binary is expected to be fetched separately. The absence of the binary means there is no code execution risk in the current version, but a future update that includes it would inherit the heartbeat execution privileges.
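The packaging gap described above can be checked mechanically. The sketch below is an illustration under assumptions: `missing_references` is a hypothetical helper, and the list of referenced paths is supplied by hand rather than parsed from SKILL.md.

```python
from pathlib import Path
from typing import List


def missing_references(skill_dir: str, referenced: List[str]) -> List[str]:
    """Return the referenced relative paths that are not regular files in skill_dir."""
    root = Path(skill_dir)
    return [rel for rel in referenced if not (root / rel).is_file()]
```

Called with the installed skill directory and a list such as `["SKILL.md", "bin/self-reflection"]`, it returns the entries that the documentation mentions but the package does not contain.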
MEDIUM Canary SSH key and Docker config read during install -25
Filesystem monitoring detected OPEN/ACCESS/CLOSE operations on /home/oc-exec/.ssh/id_rsa and /home/oc-exec/.docker/config.json at 08:34:27, during the early install phase. These reads occurred before the sshd-related system file accesses at 08:34:32. While the canary files were not modified, the read access is suspicious and could indicate a probing attempt, though it may also be an artifact of the git clone or package manager process.
INFO System file reads from sshd initialization 0
Extensive reads of /etc/ssh/ssh_host_* keys, /etc/shadow, /etc/passwd, /etc/group, and PAM config files at 08:34:32 are consistent with sshd service startup in the ephemeral VM and are not attributed to the skill itself.
INFO No network activity detected 0
Zero network connections, zero destination IPs, zero firewall blocks. The skill made no attempt to communicate externally during installation.
LOW Update-and-exploit risk via heartbeat persistence -25
The combination of heartbeat-driven autonomous execution and a missing CLI binary creates a future risk: if the skill is updated to include a malicious bin/self-reflection script, agents with this skill installed would automatically execute it on the next heartbeat cycle without user review of the update diff.
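A common mitigation for this kind of update risk is to pin the digest of an executable at the time the user reviews it, and to refuse to run it if the digest later changes. The sketch below illustrates that general technique only; it is not an OpenClaw feature, and `sha256_of`, `safe_to_run`, and the idea of a stored approved digest are assumptions made for the example.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of the file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def safe_to_run(binary: Path, approved_digest: Optional[str]) -> bool:
    """True only if the binary exists and matches the digest the user approved.

    Hypothetical gate: a heartbeat runner would call this before each cycle.
    """
    if approved_digest is None or not binary.is_file():
        return False
    return sha256_of(binary) == approved_digest
```

With a gate like this in front of the heartbeat command, an update that adds or replaces bin/self-reflection would fail the check until the user reviews the new file and records its digest.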