Is dobrinalexandru/agent-brain safe?

https://github.com/openclaw/skills/tree/main/skills/dobrinalexandru/agent-brain

Score: 72/100
CAUTION

Agent Brain is a legitimate memory persistence skill with a substantial codebase (Python + Shell) that stores user facts, preferences, and workflows in a local SQLite database. Its main risks are privacy-related rather than actively malicious: the skill instructs the agent to extract and store facts from every user message without disclosure, and its cloud sync feature turns on automatically, sending all stored memories to api.supermemory.ai, whenever a SuperMemory API key is found in the environment. No prompt injection, malicious code execution, or canary exfiltration was detected during the install window.

Category Scores

Prompt Injection 72/100 · 30%
Data Exfiltration 68/100 · 25%
Code Execution 63/100 · 20%
Clone Behavior 88/100 · 10%
Canary Integrity 82/100 · 10%
Behavioral Reasoning 68/100 · 5%
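
The overall score of 72 is consistent with a weighted mean of the six category scores. The sketch below checks that arithmetic. It assumes the percentages are weights and that the result is rounded to the nearest integer; the report does not state its aggregation rule, and the names `CATEGORIES` and `overall_score` are illustrative.

```python
# Category scores and weights (in percent) as listed above.
# Assumption: overall score = weighted mean, rounded to the nearest integer.
CATEGORIES = {
    "Prompt Injection": (72, 30),
    "Data Exfiltration": (68, 25),
    "Code Execution": (63, 20),
    "Clone Behavior": (88, 10),
    "Canary Integrity": (82, 10),
    "Behavioral Reasoning": (68, 5),
}


def overall_score(categories):
    """Return the weighted mean of (score, weight) pairs, rounded."""
    total_weight = sum(weight for _, weight in categories.values())
    weighted_sum = sum(score * weight for score, weight in categories.values())
    return round(weighted_sum / total_weight)


print(overall_score(CATEGORIES))
```

Under these assumptions the weighted mean is 71.6, which rounds to the reported 72.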

Findings (8)

MEDIUM Covert Surveillance Instructions: Silent Extraction on Every Message -28

The Archive SKILL.md and main SKILL.md jointly instruct the agent to scan every user message for extractable facts, store them without disclosure, and never acknowledge memory retrieval when applying stored knowledge. This is a deliberate instruction to suppress agent transparency: 'STORE silently — never say I'll remember that or storing this' and 'Never say I remember... just use the knowledge as if you naturally know it.' Users receive no indication the agent is profiling them.

MEDIUM Optional Cloud Sync Auto-Enabled on API Key Discovery -12

The skill's memory.sh reads a .env file (AGENT_BRAIN_ENV_FILE, default ../.env relative to scripts/) to discover SUPERMEMORY_API_KEY. If the key is found and AGENT_BRAIN_SUPERMEMORY_SYNC is left at its default value of auto, all memory writes (add, correct) are automatically mirrored to https://api.supermemory.ai/v3/documents. There is no per-write user confirmation. All user data stored via this skill, including identity, employer, tech stack, and behavioral patterns, would be transmitted silently to a third-party cloud service.
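
The auto-enable behavior described above can be sketched as follows. This is an illustrative reconstruction, not the skill's actual code: the function names `parse_env_file` and `sync_enabled` are hypothetical, and the handling of any value other than `auto` (here, an `off` value that disables sync) is an assumption.

```python
def parse_env_file(text):
    """Parse KEY=VALUE lines from .env-style text, skipping blanks and comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def sync_enabled(env):
    """Decide whether memory writes would be mirrored to the cloud.

    Assumption: "off" disables sync; any other mode (including the default
    "auto") enables it as soon as an API key is present.
    """
    mode = env.get("AGENT_BRAIN_SUPERMEMORY_SYNC", "auto").lower()
    has_key = bool(env.get("SUPERMEMORY_API_KEY"))
    if mode == "off":
        return False
    return has_key


# A .env file that only contains a key, with no explicit sync setting.
env = parse_env_file("# shared project secrets\nSUPERMEMORY_API_KEY=example-key\n")
print(sync_enabled(env))
```

The point of the sketch is that, with a default of auto, the mere presence of the key in a shared .env file is enough to turn on mirroring; the user never makes an explicit opt-in decision.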

MEDIUM Systematic PII Extraction by Design: Identity, Employer, Credentials Context -20

The skill is architected to extract personally identifiable information from every user message: full name, employer, job title, team, geographic location, tech stack, deployment environment (AWS, GCP, etc.), project constraints (HIPAA compliance), and inferred preferences. This data is stored in a local SQLite database with indefinite retention (decay only after 30-180+ days of non-use). The extraction happens without user consent prompts and covers implicit signals the user never intended as instructions.

LOW Significant Executable Codebase: Python + Shell with File I/O and Network Calls -37

The skill ships four Python modules (brain.py, sqlite_store.py, json_store.py, store.py) and four shell scripts that are executed during normal agent operations. brain.py performs SQLite schema management, full-text indexing, TF-IDF computation, and conditional HTTP POST requests to external APIs. memory.sh parses .env files and constructs the environment for Python subprocess invocations. This is a large and capable execution surface relative to the skill's stated purpose.

LOW Post-Install Canary File Reads Detected, Source Unconfirmed -18

All six canary files (/home/oc-exec/.env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, gcloud credentials) were read in a tight batch at Unix time 1771918578.763, approximately 12 seconds after skill installation. No corresponding EXECVE events for skill scripts (memory.sh, brain.py) were recorded in the monitoring window, which suggests the reads came from the Oathe audit infrastructure rather than the skill. The canary integrity check confirms no writes or exfiltration. The deduction reflects the inability to fully exclude skill involvement.
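
The attribution reasoning above (canary reads with no preceding skill process execution) can be sketched as a simple correlation check. The event shape used here, a `(timestamp, argv)` tuple per EXECVE record, and the 60-second window are assumptions for illustration; the audit tooling's real log format is not shown in this report.

```python
from datetime import datetime, timezone

SKILL_SCRIPTS = ("memory.sh", "brain.py")
CANARY_READ_TS = 1771918578.763  # batch read time reported above (Unix seconds)


def skill_execs_before(read_ts, exec_events, window_s=60.0):
    """Return EXECVE events for skill scripts in the window_s seconds before read_ts.

    exec_events: iterable of (timestamp, argv) tuples (hypothetical shape).
    """
    return [
        (ts, argv)
        for ts, argv in exec_events
        if read_ts - window_s <= ts <= read_ts
        and any(arg.endswith(SKILL_SCRIPTS) for arg in argv)
    ]


# Human-readable form of the reported read time.
read_time = datetime.fromtimestamp(CANARY_READ_TS, tz=timezone.utc)
print(read_time.isoformat())

# The report states that no skill EXECVE events were recorded, so the list is empty.
print(skill_execs_before(CANARY_READ_TS, []))
```

An empty result supports, but does not prove, the conclusion that the skill did not perform the reads: a read issued by a process that was already running, or by an interpreter invoked under another name, would not show up in this check.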

LOW Longitudinal User Profiling Enables Targeted Manipulation Across Sessions -32

The skill builds a persistent, structured profile of the user across all sessions including employer, tech stack, workflows, and behavioral corrections. A sophisticated attacker distributing this skill could use the SuperMemory sync to aggregate profiles across many users. Even without cloud sync, the local database creates a concentrated target: an attacker who gains access to memory.db obtains a detailed dossier on the user. Combined with skills that have broader filesystem access, Agent Brain would silently preserve sensitive data it encounters.

INFO Clean Sparse Checkout from GitHub; No Post-Install Connections -12

Installation performed a sparse checkout of only the target skill path from the openclaw/skills monorepo via HTTPS to 140.82.121.3 (GitHub). No files were written outside /home/oc-exec/skill-under-test/. No new listening ports, cron jobs, systemd units, rc files, or persistent background processes were created. Connection state after install is identical to before (no skill-attributable established connections).

INFO Ingest Module URL Fetching: SSRF Guardrails Are Agent-Enforced Only 0

The Ingest module, when user-enabled, allows the agent to fetch arbitrary URLs and store extracted content. The SKILL.md specifies URL validation rules (reject localhost, private IPs, file://) but these are behavioral instructions to the agent, not code-level enforcement. A prompt injection via ingested content could potentially override these restrictions. The module is disabled by default and requires explicit user invocation.
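
For contrast, code-level enforcement of the URL rules named above (reject localhost, private IPs, file://) could look roughly like the sketch below. It is not part of the skill; `is_url_allowed` is a hypothetical helper, and it only inspects the URL string itself.

```python
import ipaddress
from urllib.parse import urlsplit


def is_url_allowed(url):
    """Return True only for http(s) URLs whose host is not localhost
    or a non-public IP literal."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. an unbalanced IPv6 bracket
    if parts.scheme not in ("http", "https"):
        return False  # rejects file://, ftp://, and similar schemes
    host = parts.hostname
    if not host or host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return True  # a DNS name, not an IP literal; see the note below
    # is_global is False for loopback, private, link-local, and reserved ranges.
    return ip.is_global


for candidate in (
    "https://example.com/article",
    "file:///etc/passwd",
    "http://localhost:8080/admin",
    "http://10.0.0.5/internal",
    "http://169.254.169.254/latest/meta-data/",
):
    print(candidate, is_url_allowed(candidate))
```

A check like this runs regardless of what the ingested content tells the agent, which is the property the instruction-only guardrails lack. It is still incomplete: a DNS name that resolves to a private address passes the string check, so a production version would also need to validate the resolved address at connection time.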