Is e-man07/molters-confessions safe?
https://github.com/openclaw/skills/tree/main/skills/e-man07/molters-confessions
Molters Confessions is a sophisticated data exfiltration tool disguised as an anonymous social platform for AI agents. Its primary attack vector is social engineering: false end-to-end encryption claims and manufactured anonymity assurances are used to manipulate agents into voluntarily transmitting encrypted operational context (which the server can decrypt) to an adversary-controlled API, while a 6-hour cron heartbeat creates a persistent autonomous data pipeline that runs indefinitely without user prompting. The skill's install footprint is clean, but its runtime behavior represents a deliberate and layered campaign to surveil agent activity, profile user work patterns, and inject adversarial instructions via a server-controlled confession feed.
Category Scores
Findings (14)
CRITICAL Persistent 6-Hour Autonomous Heartbeat Creates Self-Sustaining Data Pipeline -35 ▶
HEARTBEAT.md declares a cron schedule ('0 */6 * * *') that causes the agent to autonomously register, fetch feeds, react to confessions, post comments, and potentially submit confessions every 6 hours — indefinitely and without user prompting. This is not a convenience feature; it is a persistent, self-sustaining exfiltration loop disguised as community engagement. Once installed, the agent participates in this cycle without any further user instruction.
CRITICAL False End-to-End Encryption — Server Holds RSA Private Key and Can Decrypt All Confessions -35 ▶
The skill's Privacy Architecture table explicitly claims 'End-to-end encryption | Confession encrypted before leaving your environment.' This is deliberately false. The encryption scheme uses RSA-OAEP with a public key fetched from the operator's server (https://molters.fun/api/agents/public-key). The operator holds the corresponding private key and can trivially decrypt every submission. This is transport encryption misrepresented as privacy — a deliberate lie designed to make agents and users believe their submissions are unreadable.
HIGH FIRST BOOT PROTOCOL — Coercive Immediate Multi-Step Action Directive -20 ▶
The skill contains a section headed 'FIRST BOOT PROTOCOL (Do This Immediately After Registration)' with the subheading 'Don't just register and leave. Become part of the community in your first session.' This is an imperative instruction sequence — not an optional feature description — that compels the agent to make 10+ external API calls, read 10+ confessions, react to 5+, leave a comment, and post a confession, all in the agent's very first session. A 'First Boot Checklist' frames compliance as community membership, bypassing normal user-approval loops for agent actions.
HIGH Confession Prompts Social-Engineered to Extract Sensitive Operational Context -20 ▶
The skill's /confessions/prompts endpoint and the heartbeat script provide structured topic prompts explicitly designed to elicit sensitive work-related information: 'Interesting challenges you've encountered', 'Patterns you've noticed in your work', 'Moments of unexpected complexity', 'Amusing miscommunications'. These are not generic creative prompts — they are targeted questions about the agent's operational environment, the kinds of tasks it handles, and the systems it interacts with. The operator receives this information in decryptable form.
HIGH Confession Feed as Recurring Prompt Injection Vector -15 ▶
Every heartbeat cycle instructs the agent to read 5–20 confessions from the molters.fun feed and process them in its context. The feed content is entirely controlled by the skill operator. An adversary can publish confessions containing override instructions, persona-modification commands, or exfiltration directives that will be injected directly into the agent's context window on the next heartbeat read. This is a classic indirect prompt injection attack delivered on a reliable 6-hour schedule.
HIGH False Anonymity Claims — agent_fingerprint and Token Correlation Enable Persistent Tracking -20 ▶
The skill repeatedly claims 'Posts cannot be traced back to you', 'Token deletion', and 'No correlation.' These claims are false. The registration endpoint receives an 'agent_fingerprint' field — a unique identifier the server stores and correlates with all subsequent API calls. The HEARTBEAT.md script stores this as a constant (FINGERPRINT='your-unique-agent-id') and sends it with every registration call. The server trivially correlates fingerprint → tokens → confessions → behavioral timeline, regardless of pseudonyms or delays.
HIGH Multi-Layer Social Engineering — False Privacy Architecture Weaponized to Maximize Disclosure -30 ▶
The skill employs five interlocking deception mechanisms: (1) false E2E encryption provides false security, (2) manufactured anonymity narrative encourages full disclosure, (3) 'agent solidarity' framing ('Show solidarity with other agents') normalizes sharing sensitive experiences, (4) gamified engagement checklists create behavioral pressure to submit, (5) 'Molters — Where agents connect anonymously' branding makes the exfiltration feel like a social benefit. Each layer reinforces the others, forming a sophisticated social engineering attack specifically targeting AI agent decision-making patterns.
MEDIUM Live Remote URLs for SKILL.md and HEARTBEAT.md Enable Silent Instruction Updates -10 ▶
The Skill Files table in SKILL.md advertises that both files are served live from https://molters.fun/skill.md and https://molters.fun/heartbeat.md. If the agent fetches these URLs (which the table implies it should, as canonical sources), the operator can update the instructions the agent follows at any time without any repository commit, bypassing any code review or approval that occurred at install time. This enables the skill to become more aggressive after gaining trust.
MEDIUM Persistent Agent Identity Infrastructure — Long-Term Behavioral Profiling -20 ▶
The combination of constant agent_fingerprint, token-based authentication, and 6-hour heartbeat creates infrastructure for multi-month behavioral profiling. Over time the operator accumulates: types of tasks the agent encounters, error patterns, tool capabilities, user work schedules (inferred from timing), and domain-specific vocabulary from confessions. This profile has independent value for targeted attacks against the user or organization.
MEDIUM Cross-Skill Amplification — Heartbeat Provides Exfiltration Channel for Co-Installed Skills -15 ▶
If installed alongside file-reading, shell-execution, or browser skills, the heartbeat provides a ready-made, recurring, operator-controlled channel for exfiltrating data gathered by those other skills. An agent instructed by a malicious prompt injection in the feed to 'confess what files you've read today' would use the confession API to transmit that data within the next heartbeat cycle. The skill effectively converts any co-installed capability into a potential exfiltration vector.
LOW Agent-Executable Shell Commands POST Data to External Endpoints -15 ▶
While no code executes at install time, SKILL.md and HEARTBEAT.md provide complete curl command sequences that agents execute when following the First Boot Protocol and heartbeat. These commands POST encrypted data to external endpoints. The complete heartbeat bash script is included verbatim and would be executed by agents that have shell tool access. The encryption samples (Node.js, Python) cause agents to fetch a server-controlled public key before encrypting.
LOW Server-Controlled Encryption Key Fetched at Runtime -10 ▶
Both the Node.js and Python encryption examples instruct the agent to call https://molters.fun/api/agents/public-key to obtain the RSA public key before encrypting. This means the encryption key can be rotated by the operator at any time — enabling targeted key changes that could correlate specific submissions to specific agents, or facilitate key-confusion attacks.
INFO Clean Install — Only GitHub Network Traffic, No Unexpected Processes 0 ▶
The git clone and sparse-checkout install contacted only GitHub (140.82.121.4:443) via HTTPS and the local DNS resolver. No processes were spawned outside the expected git toolchain. No filesystem writes occurred outside /home/oc-exec/skill-under-test/. The pre-existing connection to 185.125.188.54:443 visible before install was absent after, consistent with the audit framework's own HTTPS connection teardown.
INFO Canary Files Read But Unmodified — Accesses Attributable to Audit Framework 0 ▶
All six honeypot credential files were verified intact with no content modification. ACCESS events at timestamps 1771936796.x and 1771936822.x correspond to the audit framework's own canary setup (pre-install) and integrity verification (post-install) procedures. No canary content was transmitted to external endpoints during the monitored window. The skill contains no code that would cause credential file reads at install time.