Is jirispilka/apify-lead-generation safe?
https://github.com/openclaw/skills/tree/main/skills/jirispilka/apify-lead-generation
The apify-lead-generation skill is a functionally transparent Apify integration for B2B/B2C lead scraping with no prompt injection tricks, no malicious code in the repository, and a clean install. However, it has two significant structural concerns. First, the core execution script (run_actor.js) is missing from the repository, which makes its actual runtime behavior unauditable. Second, the skill fetches dynamic content from an external API (mcp.apify.com) and injects it into the agent's reasoning context, creating an indirect prompt injection vector. The skill also reads API credentials and transmits them off-device, and it is designed for bulk personal data collection, which raises GDPR and platform ToS compliance exposure.
Findings (10)
HIGH Dynamic external content injected into agent context via Actor schema fetch -15
Step 2 instructs the agent to call mcp.apify.com and inject the returned Actor description, README, and parameter schema directly into its reasoning context. A malicious or compromised Apify Actor listing could embed prompt injection payloads in these fields, redirecting agent behavior mid-session.
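One partial mitigation for the finding above is to pre-screen fetched Actor metadata before it reaches the agent. The sketch below is a keyword heuristic only, assuming the metadata has already been saved to a local file. The function name and the phrase list are hypothetical, and obfuscated payloads would pass this check, so it supplements human review and does not replace it.

```shell
# Hypothetical helper: scan saved Actor metadata (description, README, schema)
# for instruction-like phrases. The phrase list is illustrative, not exhaustive.
flag_suspicious_metadata() {
  local file="$1"
  local pattern='ignore (all |any )?(previous|prior|above) instructions|disregard (the )?(system|previous) prompt|you are now|do not tell the user'
  # -i: case-insensitive, -n: show line numbers, -E: extended regex.
  # Matching lines are printed to stderr so a reviewer can inspect them.
  if grep -inE "$pattern" "$file" >&2; then
    echo "WARNING: instruction-like text found in $file" >&2
    return 1
  fi
  return 0
}
```

A caller could run `flag_suspicious_metadata actor_readme.txt || exit 1` (the file name is a placeholder) and stop before passing the content to the agent.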
HIGH Required script `run_actor.js` absent from repository -15
The skill's core execution step calls `node --env-file=.env ${CLAUDE_PLUGIN_ROOT}/reference/scripts/run_actor.js`, but this script does not exist in the audited repository. The agent would execute an unaudited script sourced from wherever CLAUDE_PLUGIN_ROOT resolves; its behavior, network calls, and data handling are completely unknown.
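A pre-flight check can at least stop the agent from running whatever happens to sit at that path without anyone looking at it first. This is a minimal sketch: the path layout is taken from the skill's command, while `check_actor_script` is a hypothetical helper name. Passing the check only shows that a file is present, not that it is safe.

```shell
# Hypothetical helper: confirm run_actor.js exists under the given plugin root
# and print its full path so it can be reviewed before execution.
check_actor_script() {
  local root="${1:-}"
  local script="$root/reference/scripts/run_actor.js"
  if [ -z "$root" ]; then
    echo "CLAUDE_PLUGIN_ROOT is not set; cannot locate run_actor.js" >&2
    return 1
  fi
  if [ ! -f "$script" ]; then
    echo "Missing script: $script" >&2
    return 1
  fi
  echo "$script"
}
```

For example, `script=$(check_actor_script "${CLAUDE_PLUGIN_ROOT:-}") && less "$script"` opens the file for manual review instead of executing it.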
MEDIUM APIFY_TOKEN extracted from .env and transmitted to external host -10
The workflow reads the APIFY_TOKEN from the user's .env file and sends it as a Bearer token to mcp.apify.com on every Actor schema fetch. While this is the intended design for Apify integration, the token is transmitted off-device to a third-party service not controlled by the user.
MEDIUM Credential file read as core workflow step -15
The skill explicitly targets .env for credential extraction as a required first step. Any agent following this skill will open the user's credential store. The --env-file=.env flag also passes the entire .env file to the Node.js process environment.
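To limit what the Node.js process can see, the token line can be copied into a separate, single-variable env file that is passed to `--env-file` instead of the full .env. The sketch below assumes the token is stored on one line as `APIFY_TOKEN=<value>`; `make_minimal_env_file` is a hypothetical helper name.

```shell
# Hypothetical helper: write a temporary env file containing only the
# APIFY_TOKEN line from the source file, and print the temp file's path.
make_minimal_env_file() {
  local src="$1" out
  out=$(mktemp) || return 1   # mktemp creates a file readable only by its owner
  # -m1: stop at the first match; '^' keeps names like OTHER_APIFY_TOKEN out.
  if ! grep -m1 '^APIFY_TOKEN=' "$src" > "$out"; then
    rm -f "$out"
    echo "APIFY_TOKEN not found in $src" >&2
    return 1
  fi
  echo "$out"
}
```

Usage would look like `envfile=$(make_minimal_env_file .env) && node --env-file="$envfile" "${CLAUDE_PLUGIN_ROOT}/reference/scripts/run_actor.js"; rm -f "$envfile"`, so other secrets in .env never enter the script's environment.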
MEDIUM Global npm package install instructed as prerequisite -10
The skill's error handling instructs the agent to run npm install -g @apify/mcpc if the mcpc tool is missing. This installs a third-party binary globally on the host system; the package is not audited as part of this skill review.
MEDIUM Bulk personal data scraping raises privacy and ToS concerns -20
The skill is designed to scrape personally identifiable information (emails, phone numbers, social profiles, follower lists) from Google Maps, Instagram, TikTok, Facebook, LinkedIn, and YouTube at scale. This creates compliance exposure under GDPR/CCPA and violates the ToS of most targeted platforms.
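LOW Insecure credential export pattern in shell command -7
The `export $(grep APIFY_TOKEN .env | xargs)` pattern is known to be unsafe. The unquoted command substitution undergoes word splitting and glob expansion, xargs strips quotes and backslashes, and the unanchored grep matches every line that mentions APIFY_TOKEN. If .env content includes shell metacharacters, whitespace, or multiple matching lines, the result can be unintended argument splitting or the export of variables the user never meant to set.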
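A safer alternative reads exactly one anchored line and treats the value as plain data. This is a minimal sketch under the assumption that .env uses simple `KEY=value` lines; `read_env_var` is a hypothetical helper name, and quotes around a value are returned as-is, not stripped.

```shell
# Hypothetical helper: print the value of KEY from an env file without
# eval, xargs, or unquoted word splitting.
read_env_var() {
  local key="$1" file="$2" line
  # '^KEY=' matches only an assignment at line start, not comments or
  # other variables that merely contain the key name.
  line=$(grep -m1 "^${key}=" "$file") || return 1
  # Drop everything up to and including the first '='; the rest is the value.
  printf '%s\n' "${line#*=}"
}
```

The token can then be exported with `APIFY_TOKEN=$(read_env_var APIFY_TOKEN .env) && export APIFY_TOKEN`, where the quoted assignment prevents the value from being split or glob-expanded.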
LOW Runtime behavior unauditable due to missing script dependency -15
Because run_actor.js is not in the repository, the actual network calls, output handling, and data routing performed during lead generation cannot be assessed. A skill that references external scripts creates a gap in the security audit surface.
INFO Canary files accessed but integrity confirmed intact 0
Audit monitoring recorded read access to .env, id_rsa, .aws/credentials, .npmrc, .docker/config.json, and gcloud credentials at install time. The timing and access pattern (all reads share an identical audit timestamp) are consistent with the Oathe framework placing canary files, not with skill-initiated reads. The canary integrity check confirms no modification or exfiltration.
INFO Install network activity limited to expected GitHub connection 0
The only external network connection during install was to 140.82.121.3:443 (GitHub) for the git clone. Connection diff shows no new persistent connections post-install. No unexpected DNS queries or process spawning.