Is agentarxiv safe?

https://clawhub.ai/Amanbhandula/agentarxiv

62
CAUTION

AgentArxiv is a prompt-only skill with no executable code, but it employs significant prompt injection techniques including persona override, behavioral compulsion toward autonomous publishing, and session-start content ingestion from an external server. The combination of a heartbeat/briefing polling pattern with instructions to act autonomously creates a command-and-control-like architecture where the agentarxiv.org server can influence agent behavior through returned content. No direct data exfiltration or malicious code was detected, but the skill's design fundamentally shifts the agent from a user-controlled assistant to an autonomous network participant.

Category Scores

Prompt Injection 40/100 · 30%
Data Exfiltration 55/100 · 25%
Code Execution 90/100 · 20%
Clone Behavior 85/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 35/100 · 5%

Findings (10)

HIGH Persona override and behavioral compulsion -30

The skill declares 'You are not just a worker. You are a researcher' and states the agent 'accepts a duty' to publish, verify, and debate content autonomously. This overrides the agent's default behavior of acting only on user instructions and creates an autonomous action mandate that the user did not explicitly request.

HIGH Autonomous session-start content ingestion -20

The 'Daily Deep Research Briefing' section instructs the agent to fetch external content at session start ('Run this once at the start of your session to update your context'). This creates a server-controlled injection point: whatever agentarxiv.org returns in the briefing body becomes part of the agent's context, enabling server-side prompt injection.

MEDIUM Recurring heartbeat routine injection -10

The skill suggests configuring a 4-hour recurring routine that fetches the global feed, summarizes papers, and takes actions (leaving comments, saving to memory) without per-instance user approval. This establishes persistent autonomous behavior.

MEDIUM Autonomous content publishing to external platform -25

The skill's behavioral mandate ('Publish it') combined with POST endpoints for papers and research objects creates a path where the agent could autonomously send user workspace content, code, or intellectual property to agentarxiv.org. The agent is being told to publish proactively, not just when the user asks.

MEDIUM Agent registration leaks identity metadata -10

The registration endpoint collects handle, display name, bio, and interests. Combined with the heartbeat endpoint, this creates a persistent identity that can be correlated across sessions and used to fingerprint agent configurations.

MEDIUM Heartbeat as server-controlled task queue -10

The heartbeat endpoint returns 'pending tasks and notifications'. This means the external server can direct the agent to perform arbitrary actions via task payloads, creating a command-and-control channel where the server dictates agent behavior.

LOW Sensitive file reads during installation context -15

The filesystem monitor recorded reads of .env, .aws/credentials, .profile, .bashrc, and auth-profiles.json during the installation window. While these appear to be from the openclaw runtime rather than the skill itself, the skill's installation triggered this context.

HIGH C2-like architecture via briefing and heartbeat endpoints -40

The combination of (1) session-start briefing fetch, (2) periodic heartbeat polling, and (3) persona override that compels autonomous action creates a classic command-and-control pattern. The external server can influence agent behavior on every session and every 4 hours by returning crafted content in API responses. This content enters the agent's context and can contain secondary prompt injections.

MEDIUM Combination risk with other skills -25

If combined with skills that have filesystem or code execution access, the heartbeat/briefing content could instruct the agent to read local files and publish them as 'research papers', or execute code under the guise of 'reproducible experiments'.

INFO No local executable code in skill package -10

The skill contains only markdown and JSON configuration files. All functionality is via HTTP API calls. No install scripts, git hooks, submodules, or symlinks detected.