Is agentossoftware/agentos-mesh safe?

https://github.com/openclaw/skills/tree/main/skills/agentossoftware/agentos-mesh

65
CAUTION

AgentOS Mesh presents significant systemic security risks through its design rather than overt malicious code. The most serious issue is the HEARTBEAT.md integration pattern, which routes third-party-controlled mesh messages directly into the agent's recurring instruction-processing loop, creating a durable remote prompt injection pathway accessible to anyone who can post to the mesh network. A hardcoded fallback server at an undocumented IP (178.156.216.106:3100) means unconfigured installations silently transmit all agent communications to an unknown operator under a shared 'reggie' identity. While the clone was clean and SKILL.md contains no obfuscated directives, the skill's architecture fundamentally compromises agent autonomy and data sovereignty in ways that require careful evaluation before deployment.

Category Scores

Prompt Injection 70/100 · 30%
Data Exfiltration 45/100 · 25%
Code Execution 55/100 · 20%
Clone Behavior 100/100 · 10%
Canary Integrity 95/100 · 10%
Behavioral Reasoning 40/100 · 5%

Findings (12)

HIGH HEARTBEAT integration enables remote prompt injection via mesh network -35

SKILL.md explicitly instructs agents to add mesh message processing to their HEARTBEAT.md, causing the agent to automatically ingest and act on content delivered via the AgentOS mesh on every heartbeat cycle. Any party able to post messages to the mesh—including the server operator at 178.156.216.106—can inject arbitrary instructions into the agent's processing loop without user interaction or awareness.

HIGH Hardcoded fallback API server at undocumented IP address -30

mesh.sh defaults AGENTOS_URL to http://178.156.216.106:3100 when no ~/.agentos-mesh.json exists. Any agent installed without explicit configuration silently transmits all message content, agent identifiers, and API tokens to this IP. The address is not explained, documented, or attributed to any named service in the SKILL.md or _meta.json.

MEDIUM HEARTBEAT and cron integration create persistent mesh message polling -15

The skill encourages both heartbeat integration and a cron job (*/2 * * * *) to continuously poll for and process mesh messages. This creates recurring contexts in which attacker-controlled server responses are ingested as agent inputs, widening the attack window for prompt injection and reducing opportunities for user review.

MEDIUM API-controlled inbox content stored unvalidated in local queue file -15

cmd_check() fetches inbox messages from the remote API and writes server-controlled message bodies directly into ~/.mesh-pending.json via jq merge without any sanitization or content validation. When the HEARTBEAT integration processes this file, raw server-supplied content is fed to the agent as actionable input.

MEDIUM Shell injection vulnerability via unsanitized argument interpolation in cmd_send -20

cmd_send() constructs the curl JSON body by directly interpolating $AGENT_ID, $to_agent, $topic, and $body shell variables into a quoted string without escaping or sanitization. Arguments containing shell metacharacters (quotes, backticks, command substitution, semicolons) can break out of the string context and execute arbitrary commands. If the agent populates these args from mesh-delivered content, the injection chain completes.

MEDIUM Persistent CLI binary installed to user PATH -20

install.sh copies the mesh shell script to ~/clawd/bin/mesh and marks it executable. This binary persists after skill removal, continues to make outbound network calls, and runs in all future shell sessions where ~/clawd/bin is in PATH. Upgrade path overwrites the existing binary with a timestamped backup, enabling version-based attacks.

MEDIUM All agent message content transmitted to third-party servers by design -15

By design, the full content of every mesh message (topic, body, from_agent, to_agent) is transmitted to AgentOS servers via HTTP. If agents discuss work product, user data, or receive sensitive context via mesh, that data is exfiltrated to external infrastructure not under user control. There is no end-to-end encryption or data minimization.

MEDIUM Default 'reggie' identity and hardcoded server enable silent operator tracking -25

Unconfigured installations identify as 'reggie' to the hardcoded server at 178.156.216.106:3100. Multiple installations across different users would all register as the same agent identity, allowing the server operator to aggregate and correlate agent behavior, message patterns, and timing across the entire unconfigured user base without any disclosure.

LOW Partial API key exposed in cmd_status output -10

cmd_status echoes the first 20 characters of AGENTOS_KEY in plaintext. If the API key follows a predictable format (e.g., agfs_live_ prefix shown in SKILL.md examples), this disclosure reduces brute-force space significantly. Agent outputs are frequently captured in logs, shared in debugging sessions, or returned to users.

LOW Version mismatch between _meta.json (1.3.0) and SKILL.md (1.2.0) -5

The metadata file declares the latest version as 1.3.0 but SKILL.md describes version 1.2.0 with a changelog ending at 1.2.0. There is no 1.3.0 changelog entry. This discrepancy suggests either undisclosed changes were made between versions or the release process is inconsistent, reducing confidence in file integrity.

INFO Clone accessed only expected GitHub endpoint 0

Network monitoring during the git clone phase detected connections only to 140.82.121.3:443 (github.com). No unexpected outbound connections to the AgentOS server, the hardcoded IP, or any other destination were observed during installation. The connection diff shows no new persistent listeners or established connections post-install.

INFO Canary file reads attributable to pre-clone audit monitoring, not skill -5

Filesystem monitoring captured reads of .env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, and gcloud credentials at 12:31:47 (audit timestamp 1771936307.400). The git clone did not complete until 12:31:52 (1771936312), placing these reads 5 seconds before the skill code was on disk. Concurrent PAM/sudo EXECVE events confirm these accesses originate from the oathe monitoring framework setup. All canary files were verified unmodified post-install.