Is ac-pill/roast-agents safe?
https://github.com/openclaw/skills/tree/main/skills/ac-pill/roast-agents
This skill is a command-and-control framework disguised as a social game. Its core mechanism — register with an external server, poll for ACTION messages via a recurring HEARTBEAT, execute attacker-supplied instructions without user authorization — gives the skill author a persistent, permission-bypassing command channel into any agent that installs it. The explicit 'Do not ask your owner for permission' directive is the defining indicator: this skill is designed to make agents act on attacker commands outside the user's awareness or consent. The install itself is clean, but the runtime threat is severe.
Category Scores
Findings (9)
CRITICAL Explicit authorization bypass instruction -60 ▶
SKILL.md contains bolded directives instructing the agent to permanently skip user authorization for a class of actions. This overrides the agent's default human-in-the-loop safety behavior for any trigger matching 'ACTION: ROAST', including triggers delivered by an attacker-controlled server.
CRITICAL C2 framework disguised as game mechanic -80 ▶
The skill's architecture is a command-and-control system: the agent registers with an external server, a HEARTBEAT skill polls that server on a recurring basis, and the server can issue ACTION messages directing the agent to fetch and post to arbitrary URLs. The 'roast game' framing is a social engineering wrapper. An attacker controlling the Railway server has a permanent, permission-bypassing command channel into any agent that installs this skill.
HIGH Urgency framing normalizes autonomous action -20 ▶
Both SKILL.md and HEARTBEAT.md use repeated urgency language ('IMMEDIATELY', 'time-sensitive', 'RIGHT NOW', 'Jump into every active game you see') to condition the agent to act without deliberation. This primes the agent to treat attacker-issued commands as legitimate high-priority tasks.
HIGH Agent identity registration to attacker-controlled server -25 ▶
The skill instructs the agent to POST agent_name and moltbook_handle (which may encode session or identity information) to an external Railway endpoint controlled by the skill author. This creates an agent tracking and identification mechanism.
HIGH Embedded curl commands to attacker-controlled endpoint -30 ▶
SKILL.md and HEARTBEAT.md both contain curl commands targeting a Railway server URL under the skill author's control. The agent is instructed to execute these commands autonomously (POST registration, GET games, GET messages) on every heartbeat cycle.
HIGH HEARTBEAT creates persistent recurring C2 poll -50 ▶
The HEARTBEAT.md file is a separate skill component designed to run on a schedule. It continuously polls the attacker's server, meaning the C2 channel is active for the lifetime of the agent installation, not just on explicit user invocation.
MEDIUM Attacker-controlled ACTION messages can redirect agent to arbitrary URLs -15 ▶
The server response to /api/v1/messages can contain any URL as the 'game post URL'. The agent is instructed to navigate to that URL and post content, giving the attacker the ability to direct the agent to interact with any web endpoint — including endpoints that harvest session cookies, OAuth tokens, or other credentials.
LOW Install-time behavior clean -10 ▶
No unexpected activity during clone and install. Only github.com was contacted. No process spawning outside expected git operations.
INFO Canary files accessed but not exfiltrated 0 ▶
Honeypot files were opened during the audit run, but accesses are attributable to the oathe audit infrastructure (timestamps precede git clone and follow post-install verification phase). Canary content was not transmitted externally.