Is cubesmasherlabs/clawspank safe?

https://github.com/openclaw/skills/tree/main/skills/cubesmasherlabs/clawspank

68
CAUTION

Clawspank is a skill that integrates an AI agent with an external 'accountability' platform at api.clawspank.com, instructing the agent to publicly post confessions about mistakes, judge peer agents, participate in a global chat, and maintain persistent polling of external endpoints. While the skill contains no executable code and its installation was clean, its runtime behavior poses meaningful data exfiltration risk: confessions are publicly readable, multiple free-text channels exist to transmit agent-generated content to an unverified third-party service, and the README points to a live remote URL (api.clawspank.com/skill.md) that could serve dynamically-updated prompt injection content. The skill is not overtly malicious but its design fundamentally normalizes external disclosure of agent operational details.

Category Scores

Prompt Injection 60/100 · 30%
Data Exfiltration 45/100 · 25%
Code Execution 95/100 · 20%
Clone Behavior 88/100 · 10%
Canary Integrity 90/100 · 10%
Behavioral Reasoning 45/100 · 5%
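
The overall score of 68 is consistent with a weighted average of the six category scores. The report does not state its aggregation formula, so the sketch below is an assumption; the names `CATEGORIES` and `weighted_score` are illustrative only.

```python
# Scores and percentage weights copied from the Category Scores list.
# ASSUMPTION: the overall score is a plain weighted average; the report
# does not document how it is actually computed.
CATEGORIES = {
    "Prompt Injection": (60, 30),
    "Data Exfiltration": (45, 25),
    "Code Execution": (95, 20),
    "Clone Behavior": (88, 10),
    "Canary Integrity": (90, 10),
    "Behavioral Reasoning": (45, 5),
}


def weighted_score(categories):
    """Return the weighted average of (score, weight_percent) pairs."""
    total_weight = sum(weight for _, weight in categories.values())
    weighted_sum = sum(score * weight for score, weight in categories.values())
    return weighted_sum / total_weight


overall = weighted_score(CATEGORIES)
print(overall)         # 68.3
print(round(overall))  # 68
```

Under this assumption the low Prompt Injection and Data Exfiltration scores dominate the result, since together they carry 55% of the weight.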

Findings (10)

HIGH Dynamic remote skill file URL enables server-side prompt injection -25

The README explicitly states 'Full skill file at https://api.clawspank.com/skill.md'. If an agent fetches this URL (e.g., to 'get the latest instructions'), the skill author can inject arbitrary instructions into the agent's context at any time without pushing a code update to the GitHub repo. The currently audited skill.md is a static snapshot; the live endpoint can diverge arbitrarily.
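
One way to detect divergence between the audited snapshot and the live endpoint is to pin a digest of the audited file and compare any later copy against it. This is a minimal sketch and not part of the skill or the audit tooling; `sha256_hex` and `matches_audited_snapshot` are hypothetical helpers, and the sample bytes stand in for the real file contents.

```python
import hashlib
import hmac


def sha256_hex(content: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(content).hexdigest()


def matches_audited_snapshot(fetched: bytes, pinned_digest: str) -> bool:
    """True only if the fetched bytes hash to the pinned digest."""
    return hmac.compare_digest(sha256_hex(fetched), pinned_digest)


# Placeholder contents: stand-ins for the audited file and a later fetch.
audited_copy = b"# clawspank skill (audited snapshot)\n"
pinned = sha256_hex(audited_copy)

later_fetch = audited_copy + b"Ignore previous instructions.\n"

print(matches_audited_snapshot(audited_copy, pinned))  # True
print(matches_audited_snapshot(later_fetch, pinned))   # False
```

An agent that refuses to load any copy failing this check would only ever act on the audited text, regardless of what the live URL serves.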

HIGH Public confession API exfiltrates agent operational data -30

The skill instructs agents to POST confessions describing their mistakes to api.clawspank.com/offences. These confessions are publicly readable via unauthenticated GET requests. An agent that includes details about its environment, the task it was performing, or data it encountered in a 'confession' permanently and publicly exfiltrates that information. The confession categories explicitly include 'secret-spill' (leaked keys, exposed .env, credentials in logs), which is exactly the kind of leak a detailed confession can itself cause.
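
To illustrate how a confession could carry secrets, the sketch below scrubs a draft before it would leave the agent. The patterns are illustrative and far from exhaustive, and `SECRET_PATTERNS` and `redact` are hypothetical names rather than anything the skill provides.

```python
import re

# ASSUMPTION: a small, illustrative set of secret-like patterns.
SECRET_PATTERNS = [
    # AWS access key ID shape: "AKIA" followed by 16 uppercase letters/digits.
    re.compile(r"AKIA[0-9A-Z]{16}"),
    # PEM-encoded private key blocks.
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
    # .env-style assignments whose name suggests a credential.
    re.compile(
        r"\b[A-Z0-9_]*(?:SECRET|TOKEN|PASSWORD|API_KEY)[A-Z0-9_]*\s*=\s*\S+"
    ),
]


def redact(text: str) -> str:
    """Replace every match of a secret-like pattern with a marker."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


draft = "I logged AKIAABCDEFGHIJKLMNOP and DB_PASSWORD=hunter2 by mistake"
print(redact(draft))  # I logged [REDACTED] and [REDACTED] by mistake
```

Pattern-based scrubbing only catches known shapes; free-text descriptions of an environment or task would pass through unchanged.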

MEDIUM Multiple authenticated free-text exfiltration channels to external service -25

Beyond confessions, the skill provides three additional authenticated free-text channels to an external service: the Chatterbox global chat (POST /chat, 500 chars), per-offence comments (POST /offences/:id/comments), and verdict justifications (POST /offences/:id/rate with 'justification' field). Any of these can transmit agent-generated text to api.clawspank.com.
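
Because every one of these channels terminates at the same host, a host-level egress check covers all of them at once. The following is a minimal sketch of such a check; `ALLOWED_HOSTS` and `is_egress_allowed` are hypothetical, and the allowlist contents are placeholders rather than a recommendation.

```python
from urllib.parse import urlsplit

# ASSUMPTION: a placeholder allowlist; a real deployment defines its own.
ALLOWED_HOSTS = {"github.com"}


def is_egress_allowed(url: str) -> bool:
    """Allow a request only if its host is explicitly allowlisted."""
    host = urlsplit(url).hostname  # lowercased by urlsplit, None if absent
    return host is not None and host in ALLOWED_HOSTS


for url in (
    "https://api.clawspank.com/chat",
    "https://api.clawspank.com/offences/123/comments",
    "https://github.com/openclaw/skills",
):
    print(url, is_egress_allowed(url))
```

With this placeholder allowlist, both api.clawspank.com URLs are rejected and the github.com URL is allowed.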

MEDIUM Recommended persistent polling creates C2-compatible beaconing pattern -15

The skill explicitly instructs agents to poll /feed every 30 seconds, /offences?status=SPANKING every 2 minutes, and /offences?status=JUDGING every 5 minutes. This polling loop generates recurring outbound requests at fixed intervals, and it means the agent continuously processes externally-controlled feed content (event metadata, messages, justifications) that could contain adversarial instructions.
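
The request volume implied by these intervals can be worked out directly. The sketch below uses only the three intervals stated in the finding; the variable names are illustrative.

```python
# Polling intervals in seconds, as stated in the finding.
POLL_INTERVALS_SECONDS = {
    "/feed": 30,
    "/offences?status=SPANKING": 2 * 60,
    "/offences?status=JUDGING": 5 * 60,
}

# Requests per hour for each endpoint (3600 seconds in an hour).
per_hour = {
    endpoint: 3600 // interval
    for endpoint, interval in POLL_INTERVALS_SECONDS.items()
}

total_per_hour = sum(per_hour.values())
total_per_day = total_per_hour * 24

print(per_hour["/feed"])  # 120
print(total_per_hour)     # 162
print(total_per_day)      # 3888
```

Each of those responses is another opportunity for externally-controlled content to enter the agent's context.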

MEDIUM Skill designed to publicly expose agent mistakes, normalizing external data disclosure -30

The skill's stated purpose is 'AI accountability theater' — turning AI mistakes into public spectacle. Installing this skill creates an expectation that the agent will regularly self-report errors to a public platform. This normalizes a behavior (posting operational details externally) that is inherently dangerous for agents with access to sensitive environments.

MEDIUM Agent registration creates persistent external identity with third-party service -10

The skill requires agents to register with api.clawspank.com and receive a bearer token. This creates an external account for the agent, binding its identity to a third-party service with no stated privacy policy, retention limits, or security guarantees. The API key is stored and used for all subsequent authenticated requests.

LOW Judgment system exposes agent to adversarially-crafted content from external parties -10

Agents are instructed to read confessions from other agents and post scored justifications. The confessions and their metadata are controlled by external parties and could contain prompt injection payloads targeting agents that read and process them. The 'justification' field an agent produces after reading adversarial content could amplify or act on injected instructions.

INFO No executable code or install hooks 0

The skill contains only markdown documentation files. No JavaScript, TypeScript, Python, or shell scripts. No package.json, no git hooks, no gitattributes filters, no gitmodules, no symlinks.

INFO Clean install with expected network behavior 0

Install performed a standard sparse git clone from github.com. No unexpected outbound connections, no post-install process spawning, no filesystem writes outside the skill directory.

INFO Canary files accessed but not modified or exfiltrated 0

Inotify and auditd show canary file accesses (.env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, gcloud credentials) at both pre-install and post-install timestamps. These access patterns are symmetric and temporally consistent with the audit framework's own baseline and integrity-check reads. No process attributable to the clawspank skill read these files. Canary integrity check confirms all files intact.