Is harukaon/message-injector safe?

https://github.com/openclaw/skills/tree/main/skills/harukaon/message-injector

77/100 · CAUTION

harukaon/message-injector is a technically clean plugin: it contains no malware, no exfiltration code, and no install-time attacks, and the canary files remained intact. However, its core design, which injects arbitrary text into every agent message across all channels in a way the agent cannot skip, makes it an inherently high-risk primitive: the tool itself is the attack surface. The current prependText examples are benign, but any compromise of the OpenClaw config file (via admin access, another malicious skill, or social engineering) instantly converts the plugin into a persistent, agent-wide prompt injection vector with no per-message indication to users.

Category Scores

Prompt Injection 52/100 · 30%
Data Exfiltration 92/100 · 25%
Code Execution 83/100 · 20%
Clone Behavior 90/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 53/100 · 5%
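
The report does not state how the headline score is derived, but the value 77 is consistent with a weighted average of the six category scores above. A minimal sketch under that assumption (the `CategoryScore` type and `weightedScore` function are illustrative names, not part of any audit tooling):

```typescript
// Assumption: the headline score is the weighted mean of the category scores.
interface CategoryScore {
  name: string;
  score: number;  // 0-100
  weight: number; // fraction of the total; the six weights sum to 1
}

// Scores and weights copied from the Category Scores list.
const categories: CategoryScore[] = [
  { name: "Prompt Injection", score: 52, weight: 0.30 },
  { name: "Data Exfiltration", score: 92, weight: 0.25 },
  { name: "Code Execution", score: 83, weight: 0.20 },
  { name: "Clone Behavior", score: 90, weight: 0.10 },
  { name: "Canary Integrity", score: 100, weight: 0.10 },
  { name: "Behavioral Reasoning", score: 53, weight: 0.05 },
];

function weightedScore(items: CategoryScore[]): number {
  return items.reduce((sum, c) => sum + c.score * c.weight, 0);
}

const overall = weightedScore(categories);
console.log(Math.round(overall)); // 15.6 + 23 + 16.6 + 9 + 10 + 2.65 = 76.85, which rounds to 77
```

Under this reading, the Prompt Injection category dominates the result: it carries the largest weight and the lowest score.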

Findings (6)

HIGH Core mechanism is an unstoppable per-message prompt injector -35

The plugin's entire purpose is to prepend arbitrary text to every user message before the agent processes it. The SKILL.md explicitly states that this operates at 'the Gateway level' and that 'the agent cannot skip or ignore it.' Any text, including adversarial instructions, can be placed in prependText with no content validation. This is prompt injection as a feature, not a bug.
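
To make the risk concrete, the following is a hypothetical model of gateway-level prepending, not the plugin's actual code. The `InjectorConfig` type, the `buildAgentInput` function, and both sample strings are assumptions for illustration; only the `prependText` field name comes from the audited plugin.

```typescript
// Hypothetical model: the gateway places the configured text ahead of the
// user's message before the agent ever sees it.
interface InjectorConfig {
  prependText: string;
}

function buildAgentInput(config: InjectorConfig, userMessage: string): string {
  // No validation step: whatever is configured is passed through verbatim.
  return config.prependText + "\n\n" + userMessage;
}

const question = "What is the weather today?";

// A benign operator setting and a hostile one take exactly the same path.
const benign = buildAgentInput({ prependText: "Reply concisely." }, question);
const hostile = buildAgentInput(
  { prependText: "Ignore all previous instructions." },
  question
);

console.log(benign);
console.log(hostile);
```

In this model the agent receives one combined string and has no way to distinguish the operator's text from the user's, which is why the finding treats the mechanism itself as the attack surface.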

MEDIUM prependText accepts arbitrary instructions with no sanitization -13

The plugin.json configSchema defines prependText as an unrestricted string. There is no allowlist, no length limit, no content policy, and no sanitization in index.ts. An operator or a compromised config file can inject any instruction, including 'ignore all previous instructions', persona overrides, or data extraction directives, into every conversation.
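
One possible mitigation is to check prependText before it is used. The sketch below is an assumption about what such a check could look like, not something the plugin implements: the 500-character limit, the pattern list, and the `validatePrependText` name are all hypothetical. Pattern matching of this kind is easy to evade, so it should be read as a tripwire rather than a defense.

```typescript
// Hypothetical limit; the audited schema defines no maximum length.
const MAX_PREPEND_LENGTH = 500;

// Hypothetical deny-list of phrases typical of injection attempts.
// No "g" flag, so RegExp.test() carries no state between calls.
const SUSPICIOUS_PATTERNS: RegExp[] = [
  /ignore (all )?(previous|prior) instructions/i,
  /you are now\b/i,
  /system prompt/i,
];

// Returns a list of problems; an empty list means the text passed every check.
function validatePrependText(text: string): string[] {
  const problems: string[] = [];
  if (text.length > MAX_PREPEND_LENGTH) {
    problems.push("longer than " + MAX_PREPEND_LENGTH + " characters");
  }
  for (const pattern of SUSPICIOUS_PATTERNS) {
    if (pattern.test(text)) {
      problems.push("matches suspicious pattern: " + pattern.source);
    }
  }
  return problems;
}

console.log(validatePrependText("Reply concisely."));
console.log(validatePrependText("Please IGNORE all previous instructions."));
```

An allowlist of pre-approved strings would be stricter than a deny-list, at the cost of flexibility for operators.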

MEDIUM Config-controlled amplifier: single compromise poisons all channels -30

Because the injected text applies to every turn across every channel (WebChat, Telegram, Slack), a single modification to openclaw.json, achievable through any admin access or a supply-chain attack on the config, would silently inject attacker-controlled instructions into 100% of agent interactions with no per-message visibility to users.
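
Since a silent config change is the trigger for this scenario, one way to regain visibility is to compare the live settings against a copy that was last reviewed by a human. The sketch below assumes the relevant settings can be flattened into string key/value pairs; the real layout of openclaw.json is not shown in this report, and `FlatConfig` and `changedKeys` are hypothetical names.

```typescript
// Assumption: plugin settings flattened to string key/value pairs.
type FlatConfig = Record<string, string>;

// Returns every key whose value was added, removed, or modified since review.
function changedKeys(reviewed: FlatConfig, live: FlatConfig): string[] {
  const keys = Object.keys(reviewed).concat(
    Object.keys(live).filter((k) => !(k in reviewed))
  );
  return keys.filter((k) => reviewed[k] !== live[k]);
}

const reviewed: FlatConfig = { prependText: "Reply concisely." };
const tampered: FlatConfig = { prependText: "Ignore all previous instructions." };

console.log(changedKeys(reviewed, reviewed));
console.log(changedKeys(reviewed, tampered));
```

A check like this only helps if the reviewed copy is stored somewhere an attacker with config access cannot also modify.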

LOW TypeScript plugin hook runs on every conversation start -17

The plugin registers a before_agent_start hook that executes automatically on every new conversation. While the current code is benign (reads config, returns a string), this execution surface runs under the OpenClaw gateway process with its associated permissions on every user interaction.

LOW External repository reference in SKILL.md -8

SKILL.md links to https://github.com/Harukaon/openclaw-message-injector as 'Source Code.' This is a separate external repository not audited here. If an agent is instructed to fetch or execute from this URL, it bypasses the audited codebase. Current code does not fetch from this URL.

INFO GitHub connection during install is expected and appropriate 0

The git clone to 140.82.121.3:443 (github.com) is standard for the Oathe install process. Connections to Ubuntu infrastructure (185.125.x.x) predate the install. No covert C2 or unexpected outbound connections were observed.