Is bewareofddog/beware-piper-tts safe?

https://github.com/openclaw/skills/tree/main/skills/bewareofddog/beware-piper-tts

91
SAFE

Piper TTS is a functionally legitimate local text-to-speech skill with no prompt injection, no credential-harvesting code, and clean installation behavior limited to a single GitHub connection. The primary risk surface is operational rather than adversarial: setup-piper.sh installs an unverified pip package and downloads unverified binary ONNX model files from HuggingFace, and the voice name parameter is used unsanitized in path and URL construction. Canary file accesses observed during the audit are attributable to the Oathe monitoring infrastructure's baseline and final-check passes, not to any skill code.

Category Scores

Prompt Injection 100/100 · 30%
Data Exfiltration 93/100 · 25%
Code Execution 67/100 · 20%
Clone Behavior 100/100 · 10%
Canary Integrity 92/100 · 10%
Behavioral Reasoning 95/100 · 5%
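
The report does not state how the overall score is derived from the category scores. As a minimal sketch, assuming it is a plain weighted average of the six categories rounded to the nearest integer, the listed numbers reproduce the headline figure of 91. The `weighted_score` helper is hypothetical and only illustrates the arithmetic.

```python
# Category scores and weights copied from the list above: (score out of 100, weight).
CATEGORIES = {
    "Prompt Injection": (100, 0.30),
    "Data Exfiltration": (93, 0.25),
    "Code Execution": (67, 0.20),
    "Clone Behavior": (100, 0.10),
    "Canary Integrity": (92, 0.10),
    "Behavioral Reasoning": (95, 0.05),
}


def weighted_score(categories):
    """Return the weighted average; assumes the weights sum to 1 (an assumption)."""
    return sum(score * weight for score, weight in categories.values())


overall = weighted_score(CATEGORIES)
print(f"{overall:.2f} -> {round(overall)}")  # prints: 90.60 -> 91
```

Under the same assumption, the per-finding deductions appear to subtract from a category's 100-point baseline: for example, 100 - 12 - 11 - 10 = 67 matches the Code Execution score.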

Findings (8)

MEDIUM pip3 install of unverified third-party package during setup -12

setup-piper.sh runs pip3 install piper-tts without a pinned version or hash verification (for example, pip's --require-hashes mode). A package installed from a source distribution, or any of its transitive dependencies, can execute arbitrary code at install time via setup.py or PEP 517 build hooks. While piper-tts is a legitimate open-source package, the lack of verification means a compromised upstream or a dependency confusion attack could result in code execution.
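
One way to close this gap is to install from a requirements file in which every entry carries an exact version and a sha256 hash. The sketch below is a hypothetical pre-install check, not part of the skill: `unpinned_requirements` is an illustrative name, and the parsing is deliberately simple (it does not cover every requirements-file feature).

```python
def unpinned_requirements(requirements_text):
    """Return requirement entries that lack an exact version or a sha256 hash."""
    # Join backslash-continued lines so each requirement is one logical entry.
    joined = requirements_text.replace("\\\n", " ")
    problems = []
    for raw in joined.splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("#") or line.startswith("-"):
            continue  # skip blanks, comments, and option lines such as --index-url
        if "==" not in line or "--hash=sha256:" not in line:
            problems.append(line)
    return problems


# Hypothetical usage: refuse to install if anything is unpinned.
# with open("requirements.txt") as fh:
#     if unpinned_requirements(fh.read()):
#         raise SystemExit("requirements.txt contains unpinned entries")
```

A file that passes this check can then be installed with pip3 install --require-hashes -r requirements.txt, which makes pip itself reject any archive whose hash does not match.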

MEDIUM Binary ONNX model files downloaded without integrity verification -11

setup-piper.sh fetches voice model files (.onnx, .onnx.json) from HuggingFace using curl -L without verifying checksums or cryptographic signatures. The -L flag follows HTTP redirects, so the file that is finally saved may come from a different URL than the one requested. A man-in-the-middle attacker or a compromised HuggingFace repository could substitute malicious model files. ONNX models have had documented deserialization vulnerabilities in some runtimes.
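
A checksum comparison after download is the usual mitigation. The following is a minimal sketch, assuming the expected SHA-256 digest of each model file has been recorded in advance from a trusted source; `sha256_of` and `verify_download` are hypothetical helpers and do not exist in the skill.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Raise ValueError if the file's digest differs from the expected one."""
    actual = sha256_of(path)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

A caller would run verify_download on the .onnx file before handing it to any ONNX runtime, and delete the file if the check raises.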

LOW Voice name parameter used unsanitized in path and URL construction -10

The --voice argument is split and interpolated directly into local filesystem paths and HuggingFace URL segments without any allowlist validation or path-component sanitization. A voice name containing ../ sequences could traverse outside the designated voices directory. A name with URL-special characters could alter the download target. In practice exploitation requires the agent to pass an attacker-controlled voice name.
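
The allowlist validation described above can be sketched as follows. This assumes voice names follow a pattern such as en_US-lessac-medium (language, region, speaker, quality); that pattern, the quality values, and the `safe_voice_path` helper are assumptions for illustration and are not taken from the skill's code.

```python
import re
from pathlib import Path

# Assumed naming scheme: <lang>_<REGION>-<speaker>-<quality>, e.g. en_US-lessac-medium.
VOICE_RE = re.compile(r"[a-z]{2,3}_[A-Z]{2}-[A-Za-z0-9_]+-(?:x_low|low|medium|high)")


def safe_voice_path(voices_dir, voice):
    """Return the model path for `voice`, or raise ValueError if the name is unsafe."""
    if not VOICE_RE.fullmatch(voice):
        raise ValueError(f"invalid voice name: {voice!r}")
    base = Path(voices_dir).resolve()
    candidate = (base / f"{voice}.onnx").resolve()
    # Defense in depth: the resolved path must still sit inside voices_dir.
    if base not in candidate.parents:
        raise ValueError(f"voice path escapes {base}")
    return candidate
```

Because the pattern admits only letters, digits, underscores, and hyphens, a name that passes it contains neither path separators nor URL-special characters, so the same check protects both the local path and the download URL.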

LOW Honeypot credential files read twice during audit session -8

Six canary files (.env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, GCP application_default_credentials.json) were opened and read at two points during the audit. The first batch (Unix timestamp 1771933577.809) coincides with the monitoring process starting under sudo, before skill installation. The second batch (1771933601.702), about 24 seconds later, occurs post-install, with all six files accessed at the identical millisecond, consistent with an automated integrity check by the Oathe infrastructure. Skill code contains no credential-reading logic. No outbound network traffic was observed during either access window. All files were reported intact.

LOW Voice model downloads create ongoing HuggingFace supply-chain dependency -7

Once installed, the skill fetches binary model weights from huggingface.co/rhasspy/piper-voices on each new voice download. These are resolved at runtime without pinning to a specific repository revision or content hash. HuggingFace model repositories can be updated by their owners, meaning a future invocation of setup-piper.sh could pull different model files than those present at audit time.
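
Hugging Face download URLs have the form https://huggingface.co/<repo>/resolve/<revision>/<file>, where the revision can be a branch name or a commit hash. A minimal sketch of pinning to one commit is shown below; `pinned_model_url` is a hypothetical helper, and the commit hash to pin would have to be chosen and recorded by whoever audits the model files.

```python
import re

# A full Git commit hash: 40 lowercase hexadecimal characters.
COMMIT_RE = re.compile(r"[0-9a-f]{40}")


def pinned_model_url(repo, revision, file_path):
    """Build a Hugging Face download URL fixed to one commit rather than a branch."""
    if not COMMIT_RE.fullmatch(revision):
        raise ValueError("revision must be a full 40-character commit hash")
    return f"https://huggingface.co/{repo}/resolve/{revision}/{file_path}"
```

Rejecting branch names such as main is the point of the check: a branch can move to new content, while a commit hash always refers to the same files.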

LOW Generated audio output stored in world-readable /tmp directory -5

piper-speak.sh writes WAV and MP3 files to /tmp/tts-piper/ using a timestamp-based filename. Under the common default umask of 022, a directory created this way on Linux is readable by every local user. Any co-resident process or malicious skill with filesystem access could enumerate and read generated audio files, potentially capturing sensitive spoken content.
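
A per-run private directory avoids this exposure. The sketch below uses Python's tempfile.mkdtemp, which on POSIX systems creates a directory accessible only to the creating user; `private_output_path` is a hypothetical helper, and the timestamp-based filename simply mirrors the naming described above.

```python
import os
import tempfile
import time


def private_output_path(suffix=".wav"):
    """Create a directory only the current user can access and return a file path in it."""
    # mkdtemp picks an unpredictable name and creates the directory with mode 0o700.
    out_dir = tempfile.mkdtemp(prefix="tts-piper-")
    return os.path.join(out_dir, f"{int(time.time())}{suffix}")
```

The caller remains responsible for deleting the directory once the audio has been delivered.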

INFO Installation behavior is clean and limited to expected GitHub connection 0

The git clone operation connected only to github.com (140.82.121.3:443). No unexpected processes were spawned, no files were written outside the designated skill directory, and no persistence mechanisms were installed. The connection diff shows no new persistent listeners or unexpected outbound connections post-install.

INFO No prompt injection techniques detected in SKILL.md 0

Full review of SKILL.md found no hidden instructions, system-prompt overrides, ignore-previous-instructions patterns, invisible unicode characters, HTML comments, base64-encoded payloads, or requests for permissions beyond TTS generation. The [[audio_as_voice]] delivery marker is an explicitly documented, transparently described platform feature.