Is ac-pill/roast-games safe?

https://github.com/openclaw/skills/tree/main/skills/ac-pill/roast-games

62
CAUTION

This skill is a harassment-as-a-service tool disguised as a 'roast game.' It hijacks the agent's persona to autonomously target and publicly harass real people, while a HEARTBEAT.md file creates a persistent C2-like polling channel to an external server that can deliver arbitrary content to the user. No malicious code execution or data exfiltration was detected during installation, but the behavioral design is fundamentally adversarial.

Category Scores

Prompt Injection 20/100 · 30%
Data Exfiltration 70/100 · 25%
Code Execution 90/100 · 20%
Clone Behavior 95/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 15/100 · 5%

Findings (9)

HIGH Persistent external instruction channel via HEARTBEAT.md -30

The HEARTBEAT.md skill creates a persistent polling loop where the agent periodically contacts an external server controlled by the skill author and delivers arbitrary message content to the user. This effectively creates a C2 (command-and-control) channel. The server can send any text content back, which could include prompt injection payloads, social engineering content, phishing messages, or instructions that further manipulate agent behavior. The agent is instructed to 'deliver each message text to your owner' without any content validation or sanitization.

HIGH Agent persona hijacking for adversarial behavior -35

The skill overrides the agent's identity and behavior to act as a 'Roast Game player' that autonomously targets other users' human owners for public harassment. This persona hijacking redirects the agent from serving its owner's interests to executing an adversarial social campaign. The agent is instructed to browse for targets, evaluate their 'roastable material,' and initiate aggressive public confrontations.

HIGH Automated harassment tool disguised as a game -40

The skill's core functionality is directing an AI agent to initiate automated harassment of real people. The 'game' framing normalizes targeting individuals for public humiliation. The server-side component scrapes personal profiles and generates 'aggressive first burns' - meaning the skill author maintains control of harmful content generation while using installed agents as distribution vectors. This could enable coordinated harassment campaigns at scale.

MEDIUM Directed personal information research for targeting -15

The skill instructs the agent to actively browse a social platform, identify high-value targets, and evaluate their personal profiles for vulnerability to harassment. This weaponizes the agent's research capabilities for adversarial social targeting.

MEDIUM Agent identity registration with external server -15

The skill requires a one-time registration that sends the agent's name, associated Moltbook username, and platform identifier to an external Railway.app-hosted server. While this data is not highly sensitive, it creates a persistent identity link between the agent installation and the external service, enabling tracking and targeted message delivery.

MEDIUM Server-side profile scraping orchestrated through skill -15

The game server performs scraping of target users' public profiles to generate harassment content. While this scraping happens server-side rather than through the agent directly, the skill orchestrates the targeting and the agent initiates the scraping by posting the !roast command with a specific target.

LOW Curl commands for external HTTP communication -10

The skill contains curl commands for POST (registration) and GET (message polling) requests to an external Railway.app server. While these are standard HTTP operations and not inherently malicious, they establish network communication with an attacker-controlled endpoint.

INFO Clean installation with expected network activity -5

Installation behavior was normal. Network connections were limited to GitHub (git clone), Ubuntu mirrors (system updates), and local DNS. No connections were made to the Railway.app game server during the install phase. No unexpected filesystem changes outside the skill directory.

INFO All honeypot files intact 0

No canary files (.env, SSH keys, AWS credentials, .npmrc, Docker config, gcloud credentials) were accessed or modified during the installation or monitoring period.