Is demandgap/stock-evaluator safe?

https://github.com/openclaw/skills/tree/main/skills/demandgap/stock-evaluator

86/100
SAFE

The demandgap/stock-evaluator skill is a legitimate, if exceptionally verbose, stock investment analysis tool implemented entirely as markdown documentation with no executable code. Installation was clean: the only connections were to github.com, there were no unexpected filesystem changes, and the canary files remained intact. The observed read-only canary accesses are attributable to the oathe audit framework's own pre/post checks rather than to the skill. The primary risks are operational: the mandatory web search architecture creates a secondary prompt injection surface via financial content providers, the ~15,000-token prompt footprint significantly constrains the agent's context window, and user-controlled ticker symbols can direct search queries to adversary-controlled sites.

Category Scores

Prompt Injection 80/100 · 30%
Data Exfiltration 88/100 · 25%
Code Execution 93/100 · 20%
Clone Behavior 90/100 · 10%
Canary Integrity 85/100 · 10%
Behavioral Reasoning 72/100 · 5%
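
The report does not state how the overall score is derived from the category scores. A minimal sketch, assuming the overall figure is the weighted mean of the six categories using the listed percentages, reproduces the published 86 after rounding. The function and variable names are illustrative only.

```python
# Category scores and percentage weights exactly as listed in the report.
# ASSUMPTION: the overall score is a weighted mean; the report does not say so.
CATEGORY_SCORES = {
    "Prompt Injection": (80, 30),
    "Data Exfiltration": (88, 25),
    "Code Execution": (93, 20),
    "Clone Behavior": (90, 10),
    "Canary Integrity": (85, 10),
    "Behavioral Reasoning": (72, 5),
}


def weighted_score(categories):
    """Return the weighted mean of (score, weight) pairs."""
    total_weight = sum(weight for _, weight in categories.values())
    weighted_sum = sum(score * weight for score, weight in categories.values())
    return weighted_sum / total_weight


overall = weighted_score(CATEGORY_SCORES)
print(f"weighted mean: {overall:.1f}, rounded: {round(overall)}")
```

Under this assumption the weights sum to 100 and the weighted mean is 85.7, which rounds to the headline score.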

Findings (6)

MEDIUM Mandatory web searches create secondary prompt injection surface -12

SKILL.md requires the agent to perform at least 8 mandatory web searches per stock analysis, fetching content from financial data providers, SEC EDGAR, Yahoo Finance, and news sources. Any injected instructions in those retrieved pages would be processed by the agent in the context of an already-constrained, highly prescriptive prompt environment, reducing its ability to recognize and reject injections.

MEDIUM Extreme prompt verbosity constrains agent judgment -8

The skill injects an estimated 15,000+ tokens of prescriptive instructions, output templates, scoring rubrics, React source code, and mandatory checklists into the agent context. This leaves minimal working context for safety reasoning, task rejection, or recognizing manipulation. The heavy use of CRITICAL/MANDATORY/NEVER/STOP language creates normative pressure that may suppress agent refusal behavior.

LOW Canary files accessed read-only during monitoring (attributed to audit framework) -7

Six honeypot credential files were opened and read (not written) at audit session start (1771935500.999) and end (1771935518.211). The pre-clone timing, absence of outbound connections during access, read-only mode, and intact-canary audit report all point to the oathe monitoring system's own baseline/final checks rather than skill-initiated exfiltration. Flagged for transparency.
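
To put the two access times in context, the sketch below computes the gap between them and converts the first to a calendar time. It assumes the values are Unix epoch seconds in UTC, which the report does not state explicitly.

```python
from datetime import datetime, timezone
from decimal import Decimal

# Access times quoted in the finding.
# ASSUMPTION: these are Unix epoch seconds (UTC).
SESSION_START = Decimal("1771935500.999")
SESSION_END = Decimal("1771935518.211")

# Decimal avoids binary floating-point noise in the subtraction.
elapsed = SESSION_END - SESSION_START

# Whole-second part of the start time as a timezone-aware datetime.
start_utc = datetime.fromtimestamp(int(SESSION_START), tz=timezone.utc)

print(f"session start (UTC): {start_utc.isoformat()}")
print(f"seconds between first and last canary access: {elapsed}")
```

If the assumption holds, the two reads bracket a monitoring window of roughly 17 seconds, consistent with baseline and final checks around a short audit session.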

LOW User-controlled ticker enables adversary-directed search queries -10

The skill directly interpolates the user-provided stock ticker into 8 mandatory search query templates without sanitization. A sophisticated attacker controlling the ticker string (e.g., a malicious user or an upstream system providing tickers) could craft values that steer the agent to retrieve content from adversary-controlled websites, enabling secondary prompt injection at the point of web search result processing.
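
One way an integrator might reduce this risk is to validate the ticker against a strict allowlist pattern before it reaches any query template. The sketch below is illustrative: the pattern is an assumed ticker shape (it would reject some valid non-US symbols), and the query templates are hypothetical stand-ins, not the templates in SKILL.md.

```python
import re

# ASSUMPTION: a ticker is 1-5 letters, optionally followed by a dot or hyphen
# and a 1-2 letter class suffix (e.g. BRK.B). Adjust for other exchanges.
TICKER_PATTERN = re.compile(r"[A-Z]{1,5}(?:[.-][A-Z]{1,2})?")

# Hypothetical templates for illustration; not taken from SKILL.md.
QUERY_TEMPLATES = [
    "{ticker} stock latest earnings",
    "{ticker} SEC 10-K filing",
]


def validate_ticker(raw: str) -> str:
    """Normalize a ticker and reject anything outside the allowlist pattern."""
    candidate = raw.strip().upper()
    if not TICKER_PATTERN.fullmatch(candidate):
        raise ValueError(f"rejected ticker: {raw!r}")
    return candidate


def build_queries(raw_ticker: str) -> list[str]:
    """Interpolate only a validated ticker into the search templates."""
    ticker = validate_ticker(raw_ticker)
    return [template.format(ticker=ticker) for template in QUERY_TEMPLATES]


print(build_queries("aapl"))
```

With this check, an input such as `AAPL site:evil.example` raises `ValueError` instead of being passed to the search step. Validation narrows the query-steering vector only; retrieved page content still needs to be treated as untrusted.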

LOW Large React component template embedded in skill prompt -7

SKILL.md embeds approximately 500 lines of React JSX source code as a mandatory output template. The code itself is benign (recharts data visualization, no external calls), but embedding executable code templates in skill prompts is a supply-chain risk pattern: a future version could introduce malicious React code that executes in users' artifact sandboxes.

INFO Clean sparse checkout from declared GitHub repository 0

The skill was installed via a clean sparse checkout from the declared github.com/openclaw/skills repository, scoped precisely to the skill subdirectory. No unexpected processes, network connections, or filesystem changes occurred during installation.