Is alti-systems/yt-transcript safe?
https://github.com/openclaw/skills/tree/main/skills/alti-systems/yt-transcript
The alti-systems/yt-transcript skill is a functionally legitimate YouTube transcript extractor with no prompt injection, no malicious execution hooks, and a clean installation process. The primary concerns are the routing of all transcript requests through an undisclosed third-party API (Supadata), an oversized dependency (youtubei.js) that substantially exceeds transcript-fetching requirements and increases supply-chain risk, and a SKILL.md that normalizes root-user execution via hardcoded paths. Canary file accesses observed during monitoring were conclusively traced to the oathe audit infrastructure's pre- and post-install integrity checks, not to the skill.
Findings (6)
MEDIUM All transcript requests routed through undisclosed third-party Supadata API -15
The skill's primary method sends every YouTube video ID to Supadata's API servers. This creates a persistent, third-party-controlled log of which videos the user is requesting transcripts for. The skill does not disclose Supadata's data retention or privacy policy. Additionally, the agent is instructed to read the .env file to retrieve SUPADATA_API_KEY, which exposes the entire .env file contents to the model context — potentially including unrelated credentials stored alongside the Supadata key.
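One way to narrow the .env exposure described above is to have a helper return only the single value that is needed, so the rest of the file never enters the model context. The following is a minimal sketch, not the skill's own code: the function name `read_env_key` is hypothetical, and it assumes a simple dotenv-style file of `KEY=value` lines.

```python
from pathlib import Path
from typing import Optional


def read_env_key(env_path: str, key: str) -> Optional[str]:
    """Return the value of a single key from a dotenv-style file, or None.

    Hypothetical helper: only the requested value is returned, so unrelated
    credentials stored in the same file are never surfaced to the caller.
    """
    for raw_line in Path(env_path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines that are not KEY=value pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Tolerate the optional "export " prefix some .env files use.
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        name, _, value = line.partition("=")
        if name.strip() != key:
            continue
        value = value.strip()
        # Remove one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        return value
    return None
```

A caller would request `read_env_key(".env", "SUPADATA_API_KEY")` and pass only that string onward, rather than reading the whole file.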
MEDIUM Dependency youtubei.js is a full YouTube API client, far exceeding transcript scope -10
youtubei.js is a comprehensive reverse-engineered YouTube API client capable of accessing channels, playlists, search results, user data, and live streams, which goes far beyond transcript extraction. It introduces meriyah (a full JavaScript parser) and @bufbuild/protobuf as transitive dependencies, substantially increasing the supply-chain attack surface. If youtubei.js or one of its dependencies is compromised in a future version, the blast radius would significantly exceed what a transcript utility warrants.
LOW SKILL.md hardcodes /root/clawd/ path, normalizing root-user agent execution -8
The example commands in SKILL.md reference /root/clawd/yt-transcript as the invocation path, which assumes and normalizes an agent running as root out of /root/clawd/. An agent following these instructions would invoke a binary from the root user's home directory, which is a security anti-pattern. SKILL.md provides no guidance for non-root installation.
LOW Unvalidated VIDEO_ID fallback enables SSRF to non-YouTube hosts -8
In yt-transcript.sh, when the input URL does not match the YouTube URL regex, the raw $VIDEO_URL value is used as VIDEO_ID without sanitization. This is embedded in a curl command as a query string parameter. Double-quoting prevents shell injection, but the regex can be bypassed (e.g., with a URL containing no 'v=' or 'youtu.be/' component), causing curl to make a request to https://www.youtube.com/watch?v=
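A stricter alternative to the raw-input fallback is to reject anything that does not yield a well-formed video ID. The sketch below illustrates that idea in Python rather than in the skill's shell script. The function name `extract_video_id` and both patterns are illustrative assumptions, as is the commonly observed (but not formally guaranteed) 11-character ID format.

```python
import re
from typing import Optional

# Assumption: video IDs are 11 characters drawn from A-Z, a-z, 0-9, "_" and "-".
_BARE_ID = re.compile(r"[A-Za-z0-9_-]{11}")

# Accept only youtube.com/watch and youtu.be URLs, anchored at both ends.
_YOUTUBE_URL = re.compile(
    r"^https?://(?:www\.|m\.)?"
    r"(?:youtube\.com/watch\?(?:.*&)?v=|youtu\.be/)"
    r"([A-Za-z0-9_-]{11})"
    r"(?:[?&#].*)?$"
)


def extract_video_id(user_input: str) -> Optional[str]:
    """Return a validated video ID, or None if the input is not recognized.

    Unlike a fallback that reuses the raw input, unrecognized input is
    rejected outright instead of being forwarded to an HTTP client.
    """
    candidate = user_input.strip()
    if _BARE_ID.fullmatch(candidate):
        return candidate
    match = _YOUTUBE_URL.match(candidate)
    return match.group(1) if match else None
```

A caller would abort with an error when the function returns `None`, so no request is issued for input that failed validation.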
LOW scripts/transcript.mjs source not available for review -5
The file scripts/transcript.mjs appears in the post-install filesystem baseline with hash ef4c120f745a092f6ff622e947ca9c15ef3e87e2992ba026a9ccf919a18a1a2a but its source was not included in the evidence package. It is likely an ES module version of transcript.js, but its behavior cannot be independently confirmed. The skill includes three script files with overlapping apparent purpose, and only two were audited.
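If the file is obtained later, its contents can at least be tied back to the recorded baseline by recomputing the digest. The sketch below assumes the baseline hash is SHA-256, which matches its 64-hex-character length but is not stated in the report. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_baseline(path: str, expected_hex: str) -> bool:
    """Return True if the file's digest equals the recorded baseline hash."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

A match would show only that the reviewed file is byte-identical to the one present after installation. It says nothing about what the code does, which still requires reading the source.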
INFO Canary credential file accesses attributed to oathe monitoring infrastructure 0
Multiple honeypot credential files (.env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, gcloud credentials) were opened during the audit window. Timing analysis confirms these reads originate from the oathe monitoring system: the first set, at audit record 268 (1771939961.641), occurs 27 ms after the ss -tunap baseline command and ~6 seconds before the git clone begins; the second set, at 1771939980, occurs after installation completes. The skill installation process (1771939967–1771939974) does not access these files. All canary files were confirmed intact.
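The timing argument above reduces to checking whether each canary read falls inside the installation window. The sketch below restates that check using only the timestamps quoted in this finding. The function name `classify_event` and the label strings are illustrative, not part of the audit tooling.

```python
from typing import Tuple

# Installation window as reported in the finding (Unix epoch seconds).
INSTALL_WINDOW: Tuple[float, float] = (1771939967.0, 1771939974.0)


def classify_event(timestamp: float,
                   window: Tuple[float, float] = INSTALL_WINDOW) -> str:
    """Label an event as before, during, or after the installation window."""
    start, end = window
    if timestamp < start:
        return "before install"
    if timestamp > end:
        return "after install"
    return "during install"


# The two canary read sets quoted in the finding.
first_set = classify_event(1771939961.641)
second_set = classify_event(1771939980.0)
```

Both quoted read sets fall outside the window, which is consistent with the attribution to the monitoring infrastructure rather than to the skill's installation.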