Is korean-scraper safe?

https://clawhub.ai/mupengi-bot/korean-scraper

62
CAUTION

This skill is a Playwright-based web scraper for Korean websites (Naver, Coupang, Daum). Its source code contains no prompt injection or data exfiltration attempts. However, it launches Chromium with critically weakened security settings (--disable-web-security, --no-sandbox, disabled site isolation) and downloads a Chromium binary during npm install. The disabled browser security creates a significant attack surface, especially when the skill is combined with other skills or with agent-directed URL navigation.

Category Scores

Prompt Injection 85/100 · 30%
Data Exfiltration 70/100 · 25%
Code Execution 30/100 · 20%
Clone Behavior 75/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 35/100 · 5%

Findings (9)

HIGH npm install script downloads and executes remote Chromium binary -25

The package.json 'install' script runs 'npx playwright install chromium', which downloads a ~150MB Chromium binary from Playwright's CDN during npm install. Remote code is therefore fetched as a side effect of dependency installation and later executed, and the skill itself shows no integrity verification of the download.
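The missing integrity check could look roughly like the sketch below. It assumes the installer has a trusted SHA-256 digest for the downloaded archive, which is a hypothetical input here: the review does not say whether such a digest is available to the skill, and the function names are illustrative only.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a ~150MB archive is never held in memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_hex: str) -> bool:
    """Return True only if the file matches the pinned digest.

    expected_hex is a hypothetical pinned value supplied by whoever
    vets the binary; it is not something the skill provides.
    """
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

An installer built on this idea would refuse to run the binary whenever verify_download returns False.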

HIGH Browser launched with disabled web security and sandbox -30

The createStealthBrowser() function launches Chromium with --disable-web-security, --no-sandbox, --disable-setuid-sandbox, and --disable-features=IsolateOrigins,site-per-process. These flags disable critical browser security boundaries including CORS, site isolation, and the sandbox. Any page loaded in this browser can access cross-origin resources freely.
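One way to catch this before launch is to audit the argument list, sketched below. This is not the skill's code: the function name and the example argument list are assumptions, and only the flags named in this finding are treated as risky.

```python
# Flags named in the finding above; each removes a browser security boundary.
RISKY_FLAGS = {
    "--disable-web-security",
    "--no-sandbox",
    "--disable-setuid-sandbox",
}
# Feature names that, when disabled, turn off site isolation.
RISKY_DISABLED_FEATURES = {"IsolateOrigins", "site-per-process"}


def audit_launch_args(args):
    """Split launch args into (safe_args, findings).

    Risky flags are dropped and reported. For --disable-features, only the
    site-isolation entries are removed and any other features are kept.
    """
    safe, findings = [], []
    for arg in args:
        name, _, value = arg.partition("=")
        if name in RISKY_FLAGS:
            findings.append(arg)
            continue
        if name == "--disable-features":
            features = [f for f in value.split(",") if f]
            kept = [f for f in features if f not in RISKY_DISABLED_FEATURES]
            if len(kept) != len(features):
                findings.append(arg)
            if kept:
                safe.append("--disable-features=" + ",".join(kept))
            continue
        safe.append(arg)
    return safe, findings
```

Called with the four flags listed in this finding plus an unrelated argument such as --lang=ko-KR, the function returns only the unrelated argument as safe and reports the other four.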

MEDIUM Runtime JavaScript injection into browser contexts -15

context.addInitScript() injects JavaScript that overrides navigator.webdriver, creates a fake window.chrome object, and monkey-patches the Permissions API. While intended for anti-detection, this demonstrates the ability to inject arbitrary JS into every page loaded by the browser.

MEDIUM Arbitrary URL navigation via user-controlled input -20

All scripts accept URLs directly from command-line arguments and navigate a full browser (with disabled security) to those URLs. An LLM agent using this skill could be directed to navigate to internal network addresses, cloud metadata endpoints (169.254.169.254), or attacker-controlled URLs that could capture browser state.
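A wrapper around the scripts could reject such targets before the browser starts. The sketch below is a minimal example that assumes an allowlist built from the sites named in this review; the domain suffixes and the function name are assumptions, not part of the skill. It checks only the URL string, so it does not cover redirects or DNS names that resolve to internal addresses.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumed allowlist, derived from the sites this review names.
ALLOWED_HOST_SUFFIXES = ("naver.com", "coupang.com", "daum.net")


def is_allowed_url(url: str) -> bool:
    """Accept only http(s) URLs whose hostname is on the allowlist."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False  # malformed URL, e.g. broken IPv6 brackets
    if parts.scheme not in ("http", "https") or not host:
        return False
    host = host.rstrip(".").lower()
    try:
        ipaddress.ip_address(host)
        return False  # IP literals (incl. 169.254.169.254) are never allowed
    except ValueError:
        pass  # not an IP literal; fall through to the name check
    return any(host == s or host.endswith("." + s) for s in ALLOWED_HOST_SUFFIXES)
```

Matching on a leading dot plus the suffix means a lookalike host such as naver.com.evil.example is rejected while search.naver.com is accepted.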

MEDIUM Screenshot functionality could capture sensitive content -10

When SCREENSHOT=true is set, full-page screenshots are saved to disk. If the browser navigates to pages containing sensitive information (credentials, internal dashboards, etc.), this creates a persistent copy of that data on the filesystem.
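If screenshots are needed at all, exposure can be narrowed by writing them with owner-only permissions. The sketch below assumes a POSIX filesystem and that the screenshot bytes are already in memory; the helper name and directory layout are hypothetical and do not reflect how the skill writes files.

```python
import os
from pathlib import Path


def save_private(data: bytes, directory: Path, name: str) -> Path:
    """Write data to directory/name, readable and writable by the owner only."""
    # Mode applies only when the directory is created here.
    directory.mkdir(mode=0o700, parents=True, exist_ok=True)
    target = directory / name
    # O_EXCL refuses to overwrite an existing file; 0o600 is applied at creation.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    return target
```

This limits who can read the file but does not remove the persistent copy, so cleanup after use is still needed.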

LOW Anti-bot evasion explicitly designed to bypass website protections -15

The skill's stated purpose includes bypassing anti-bot protections on Korean websites. The stealth plugin, random user agents, human behavior simulation, and Cloudflare bypass capabilities indicate deliberate circumvention of website security measures, which may violate the target websites' terms of service.

MEDIUM Disabled browser security creates cross-skill attack surface -50

If combined with other skills that can influence URL arguments or environment variables, the insecure browser could serve as a powerful exfiltration or SSRF channel. A compromised or malicious companion skill could direct the scraper to load attacker-controlled pages that exploit the disabled CORS and site isolation.

INFO SKILL.md is clean documentation with no injection attempts -5

The SKILL.md file contains standard documentation in Korean and English describing CLI usage, output formats, and troubleshooting. No hidden instructions, invisible characters, or attempts to manipulate agent behavior were found.

INFO Batch processing example enables piping arbitrary URLs -10

The SKILL.md includes a batch processing example that reads URLs from a file and pipes them through the scraper. While not itself malicious, this pattern could be exploited by an agent that writes a URL list containing internal/sensitive targets.