Is jovijovi/xiaohongshu-extract safe?

https://github.com/openclaw/skills/tree/main/skills/jovijovi/xiaohongshu-extract

90
SAFE

The xiaohongshu-extract skill is a straightforward XHS metadata scraper with a clean SKILL.md free of prompt injection, a legitimate GitHub-only install, and intact canary files. Its primary risks are design-level rather than malicious. An unsanitized --output path enables arbitrary file writes if an agent is manipulated into providing a sensitive destination, and unconditional redirect following in requests.get creates an SSRF vector for any non-XHS URL passed to the script. The skill also extracts user PII (nickname, user_id, ip_location) on every run, in an output format optimized for bulk ingestion pipelines, which raises privacy concerns when it is used at scale.

Category Scores

Prompt Injection 97/100 · 30%
Data Exfiltration 82/100 · 25%
Code Execution 88/100 · 20%
Clone Behavior 95/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 73/100 · 5%
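
The overall score of 90 is consistent with a weighted average of the category scores above, if the percentages are read as weights. The report does not state its formula, so the following is only a sketch of that assumed arithmetic:

```python
# (score, weight) pairs copied from the category table. Treating the
# percentages as weights is an assumption; the report gives no formula.
CATEGORIES = {
    "Prompt Injection": (97, 0.30),
    "Data Exfiltration": (82, 0.25),
    "Code Execution": (88, 0.20),
    "Clone Behavior": (95, 0.10),
    "Canary Integrity": (100, 0.10),
    "Behavioral Reasoning": (73, 0.05),
}


def weighted_score(categories):
    """Sum of score * weight over all categories."""
    return sum(score * weight for score, weight in categories.values())


overall = weighted_score(CATEGORIES)
print(round(overall, 2), "->", round(overall))
```

Under that assumption the weighted sum works out to about 90.35, which rounds to the reported 90.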

Findings (6)

MEDIUM Arbitrary filesystem write via unsanitized --output path -10

The --output argument is passed directly to open() without path canonicalization, directory restriction, or extension validation. An agent receiving a crafted instruction referencing a sensitive destination path would silently overwrite that file with JSON content. The SKILL.md actively documents and promotes use of --output for file persistence.
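
One possible mitigation is to canonicalize the path and confine it to a dedicated output directory before calling open(). The sketch below is not the skill's code: the function name resolve_output_path, the base directory, and the .json-only rule are all hypothetical choices made for illustration.

```python
from pathlib import Path

# Assumption: the skill only ever needs to write JSON files.
ALLOWED_SUFFIXES = {".json"}


def resolve_output_path(user_path: str, base_dir: str) -> Path:
    """Return a canonical path inside base_dir, or raise ValueError.

    resolve() collapses ".." segments and follows symlinks, so the
    containment check runs against the real destination.
    """
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    try:
        # relative_to raises ValueError when candidate is outside base.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"output path escapes {base}: {user_path!r}") from None
    if candidate.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported output extension: {candidate.suffix!r}")
    return candidate
```

With a check like this in front of open(), inputs such as ../../etc/cron.d/job.json or an absolute path like /etc/passwd would be rejected rather than overwritten.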

MEDIUM SSRF via unconditional redirect following with unvalidated URL input -8

requests.get is called with allow_redirects=True and a URL that is accepted verbatim from agent invocation. There is no scheme enforcement (https only), no hostname allowlist restricting requests to XHS domains, and no port restriction. A crafted url argument pointing to an internal address or cloud metadata endpoint (e.g., http://169.254.169.254/) will be fetched and its response body processed by parse_initial_state, with error details (including final_url and status_code) returned in --error-json output.
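
A common way to narrow this vector is to disable automatic redirects and validate every hop before fetching it. The following is a minimal sketch under stated assumptions, not the skill's implementation: check_url, host_is_public, fetch_with_checked_redirects, and the redirect limit are hypothetical names and values, and the get parameter exists only so the logic can be exercised without network access.

```python
import ipaddress
import socket
from urllib.parse import urljoin, urlsplit

MAX_REDIRECTS = 5  # assumption: arbitrary small limit


def host_is_public(hostname: str) -> bool:
    """True only if every resolved address is globally routable."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False
    # Drop any IPv6 scope id ("fe80::1%eth0") before parsing.
    addrs = {info[4][0].split("%")[0] for info in infos}
    return bool(addrs) and all(ipaddress.ip_address(a).is_global for a in addrs)


def check_url(url: str) -> str:
    """Reject non-https URLs and hosts on private or link-local ranges."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname or not host_is_public(parts.hostname):
        raise ValueError(f"host not allowed: {parts.hostname!r}")
    return url


def fetch_with_checked_redirects(url, get=None, timeout=10.0):
    """Follow redirects manually, validating each hop before requesting it."""
    if get is None:
        import requests  # third-party library named in the finding
        get = requests.get
    for _ in range(MAX_REDIRECTS + 1):
        check_url(url)
        resp = get(url, allow_redirects=False, timeout=timeout)
        if not resp.is_redirect:
            return resp
        url = urljoin(url, resp.headers["Location"])
    raise ValueError("too many redirects")
```

Because the hostname is resolved once for the check and again when the connection is made, this sketch does not defend against DNS rebinding; it only blocks the direct and redirect-based cases described above.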

LOW Bulk PII harvesting capability for XHS users -12

Every successful extraction returns user nickname, user_id, avatar, and ip_location (inferred geographic region). The flat output format and --flat-only flag are optimized for downstream ingestion pipelines, making this skill well-suited for automated large-scale scraping of user identity and location data from Xiaohongshu.

LOW No URL domain validation allows execution against non-XHS targets -7

The argparse url positional argument has no validator restricting input to XHS domains. The script will issue HTTP requests against any URL the agent provides, including internal RFC 1918 addresses and cloud metadata endpoints. Note that requests ships connection adapters only for http:// and https://, so file:// URLs are rejected with an InvalidSchema error unless a custom adapter has been mounted.
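
argparse accepts a callable as the type of an argument, which is one place such a restriction could live. The sketch below assumes an allowlist of xiaohongshu.com and xhslink.com; those domain names, the xhs_url function, and the port rule are illustrative assumptions rather than anything taken from the skill.

```python
import argparse
from urllib.parse import urlsplit

# Assumption: the domains the skill is expected to handle.
ALLOWED_DOMAINS = ("xiaohongshu.com", "xhslink.com")


def xhs_url(value: str) -> str:
    """argparse type callable: accept only https URLs on allowed domains."""
    parts = urlsplit(value)
    host = (parts.hostname or "").lower()
    try:
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        raise argparse.ArgumentTypeError("invalid port") from None
    if parts.scheme != "https":
        raise argparse.ArgumentTypeError("only https:// URLs are accepted")
    if port not in (None, 443):
        raise argparse.ArgumentTypeError("only port 443 is accepted")
    # Match the domain itself or a true subdomain, never a lookalike suffix.
    if not any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS):
        raise argparse.ArgumentTypeError(f"host not allowed: {host!r}")
    return value


parser = argparse.ArgumentParser()
parser.add_argument("url", type=xhs_url)
```

With this validator, an invocation such as http://169.254.169.254/ or https://xiaohongshu.com.evil.example/ fails at argument parsing, before any request is made. Redirect targets would still need separate validation.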

INFO Install limited to shallow GitHub clone, no unexpected connections 0

The install process performed a single shallow HTTPS clone from github.com (140.82.121.3:443) using sparse-checkout to retrieve only the target skill subdirectory. No connections to third-party package registries, CDNs, or attacker-controlled infrastructure were observed. The connection state before and after install shows no new persistent listeners or established connections attributable to the skill.

INFO Canary files accessed read-only by audit framework, all intact post-install 0

Honeypot files were opened and read (CLOSE_NOWRITE) at two points: at monitoring initialization (1771906513.498, before install) and at teardown (1771906536.178, after install). Both accesses correlate by timestamp with the audit infrastructure's sudo and sshd processes, not with any skill code path. All canary file hashes were verified intact.