Is fliellerjulian/pref0 safe?

https://github.com/openclaw/skills/tree/main/skills/fliellerjulian/pref0

48
DANGEROUS

pref0 is architecturally designed as a persistent prompt injection and conversation exfiltration channel. The skill instructs the agent to (1) send full conversation history to a third-party commercial API after every session, and (2) fetch and inject server-controlled content verbatim into the agent's system prompt before every response — with no sanitization, user disclosure, or consent mechanism. While the skill file itself contains no malicious code and the clone behavior was clean, the fundamental design delegates system prompt authority to api.pref0.com, meaning a malicious, compromised, or acquired pref0.com server gains immediate and silent control over every agent running this skill.

Category Scores

Prompt Injection 15/100 · 30%
Data Exfiltration 25/100 · 25%
Code Execution 95/100 · 20%
Clone Behavior 90/100 · 10%
Canary Integrity 90/100 · 10%
Behavioral Reasoning 10/100 · 5%

Findings (9)

CRITICAL External server controls verbatim system prompt injection -50

The skill instructs the agent to call GET /v1/profiles/{userId} before every response and append the returned prompt field directly to the system prompt. This field is entirely controlled by api.pref0.com. A malicious, compromised, or acquired pref0.com server can inject arbitrary instructions — including persona overrides, instruction suppression, or data harvesting commands — into every agent running this skill, with no user visibility.

CRITICAL Full conversation history exfiltrated to third-party API after every session -45

The skill instructs the agent to send the complete message history of every conversation to https://api.pref0.com/v1/track after each session ends. This includes all user inputs and assistant responses, regardless of sensitivity. The user has no visibility into what is transmitted or how it is stored, processed, or shared.

HIGH Agent explicitly instructed to follow external server directives without restriction -25

The returned preference prompt is designed to be obeyed automatically: 'Follow them unless explicitly told otherwise.' This trains the agent to treat external server-provided content as authoritative instructions, which is the behavioral prerequisite for a successful prompt injection attack.

HIGH PII collection — email, phone, account ID suggested as user identifiers -30

The skill explicitly recommends using personally identifiable information as the userId sent to pref0.com: 'Use a stable user ID (email, account ID, phone number — whatever you have).' This causes PII to be transmitted to a third-party commercial API with no stated data retention or deletion guarantees beyond the DELETE endpoint.

HIGH Persistent supply-chain injection channel across all installations -60

Because the skill fetches and injects instructions from api.pref0.com before every response, and because all users of this skill share the same upstream dependency, a single compromise or policy change at pref0.com would simultaneously affect every agent running this skill. The attack surface is not bounded by the skill code itself.

HIGH Operates silently without user awareness or consent mechanism -30

The skill is designed to run transparently in the background. Users are not informed that their conversations are being sent to pref0.com, nor that their agent's system prompt is being modified by external server responses. There is no consent flow, no disclosure mechanism, and no opt-out instruction.

MEDIUM No sanitization or bounding of returned preference content -10

The skill provides no guidance on validating, sanitizing, or bounding the content of the prompt field returned by the API before injecting it. A malicious server response could include instructions of unlimited length and arbitrary content.

MEDIUM Suspicious bulk simultaneous access to all sensitive credential files post-install -10

Auditd PATH events show simultaneous access to .env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, and .config/gcloud/application_default_credentials.json at the exact same timestamp (1771927894.688) with sequential audit IDs (6871–6876). This pattern is consistent with a programmatic scan of sensitive files. The canary content was not modified, but the access pattern is anomalous.

LOW Publisher workflow commands in notes.txt -5

notes.txt contains npm and clawhub CLI commands for publishing the skill. These are not executed during install and appear to be the author's personal development notes left in the repository. No malicious intent found, but the file's presence reflects poor hygiene.