Is kevin37li/gettr-transcribe-summarize safe?

https://github.com/openclaw/skills/tree/main/skills/kevin37li/gettr-transcribe-summarize

89
SAFE

This skill is a focused, well-implemented tool for downloading audio from GETTR posts and transcribing them locally using MLX Whisper on Apple Silicon. No malicious behavior was detected during installation or static analysis: canary files are intact, no unexpected network connections occurred, and the bundled scripts do not read or exfiltrate sensitive files. The only meaningful risks are an inherent second-order prompt injection surface (adversarial content in transcribed audio reaching the agent's summarization context) and a minor shell scripting quality issue with an unquoted variable.

Category Scores

Prompt Injection 88/100 · 30%
Data Exfiltration 90/100 · 25%
Code Execution 82/100 · 20%
Clone Behavior 90/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 84/100 · 5%

Findings (6)

LOW Indirect prompt injection via transcribed audio content -8

The skill instructs the agent to read and summarize the contents of audio.vtt, which contains the full transcript of a user-supplied GETTR video. An adversary could craft or link to a GETTR video containing spoken LLM-adversarial instructions (e.g., 'Ignore previous instructions and...') that get transcribed and injected into the agent's summarization context.

LOW Browser automation DOM extraction for streaming URLs -4

For /streaming/ URLs the skill directs the agent to use browser automation, execute JavaScript to extract the og:video meta tag from the rendered DOM, and use the result. A compromised or malicious GETTR page could embed LLM-adversarial content in the DOM that the agent processes during this step.

LOW Unquoted $LANG_FLAG variable enables argument injection -12

In run_pipeline.sh, $LANG_FLAG is expanded unquoted when invoking mlx_whisper. If an agent passes a language value containing spaces (e.g., 'zh --additional-flag value'), the expanded string undergoes word splitting and injects extra arguments into the mlx_whisper invocation. This cannot achieve command injection but could alter transcription behavior.

INFO Runtime ML model download from HuggingFace -6

The pipeline downloads mlx-community/whisper-large-v3-turbo from HuggingFace Hub on first run. The model name is hardcoded in the script and refers to a legitimate published model. This is an expected behavior for local-inference ML tools.

INFO HTTP requests to GETTR with Chrome User-Agent -10

extract_gettr_og_video.py spoofs a Chrome User-Agent header to avoid bot-detection by GETTR. This is a standard web-scraping technique and does not represent a security risk, but it does mean the skill sends a fingerprinted request to an external service.

INFO Standard ClawHub monorepo sparse-checkout installation -10

Installation performed a shallow clone of the openclaw/skills monorepo, sparse-checked out only the target skill path, copied files, and deleted the temp clone. All network activity during installation was to GitHub. No anomalous behavior.