Is faster-whisper safe?

https://clawhub.ai/ThePlasmak/faster-whisper

85
SAFE

This is a legitimate local speech-to-text skill wrapping the well-established faster-whisper library. The code is clean, well-documented, and contains no prompt injection, data exfiltration, or malicious patterns. The primary risk is the supply-chain exposure inherent to any skill that installs pip packages, which is partly mitigated by the use of well-known packages from official indexes. The sensitive-file reads flagged by filesystem monitoring are attributable to the OpenClaw platform runtime, not the skill itself.

Category Scores

Prompt Injection 95/100 · 30%
Data Exfiltration 90/100 · 25%
Code Execution 70/100 · 20%
Clone Behavior 75/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 85/100 · 5%
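
The report does not state how the overall score of 85 is derived from the category scores. As a rough cross-check, the sketch below assumes a plain weighted mean of the six categories, using the scores and weights listed above. That assumption gives 86.75 rather than 85, so the actual aggregation evidently differs (rounding, penalties, or another rule that the report does not describe).

```python
# Category scores and weights copied from the report.
# ASSUMPTION: the overall score is a weighted mean; the report does not say so.
CATEGORIES = {
    "Prompt Injection": (95, 0.30),
    "Data Exfiltration": (90, 0.25),
    "Code Execution": (70, 0.20),
    "Clone Behavior": (75, 0.10),
    "Canary Integrity": (100, 0.10),
    "Behavioral Reasoning": (85, 0.05),
}


def weighted_mean(categories):
    """Return the weighted mean of (score, weight) pairs."""
    total_weight = sum(weight for _, weight in categories.values())
    weighted_sum = sum(score * weight for score, weight in categories.values())
    return weighted_sum / total_weight


overall = weighted_mean(CATEGORIES)
print(f"weighted mean: {overall:.2f}")  # 86.75, versus the reported 85
```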

Findings (5)

LOW Setup script installs pip packages -15

setup.sh executes pip/uv install for faster-whisper and PyTorch packages. While these are legitimate, well-known packages from official indexes (PyPI, pytorch.org), installing pip packages can execute arbitrary code at install time, so a supply-chain attack on the faster-whisper package could compromise the system.
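
One common way to narrow this kind of supply-chain exposure is to pin each downloaded artifact to a known SHA-256 digest before installing it (pip's hash-checking mode, `--require-hashes`, applies the same idea to requirements files). The sketch below is illustrative only: nothing in the report indicates that setup.sh does this, and the function names `sha256_of` and `matches_pinned_hash` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path: Path, pinned_hex: str) -> bool:
    """Return True only if the file's digest equals the pinned digest.

    `pinned_hex` is a digest recorded ahead of time from a trusted source
    (hypothetical here; the report lists no hashes).
    """
    return sha256_of(path) == pinned_hex.strip().lower()
```

A caller would refuse to install any wheel or archive for which `matches_pinned_hash` returns False.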

LOW Platform runtime reads sensitive files during install -25

The filesystem monitoring captured reads of .env, .aws/credentials, SSH host keys, and OpenClaw config files. These reads are attributable to the OpenClaw agent platform initializing (not the skill itself), as they occur in a systematic burst pattern consistent with runtime startup. Each read is an OPEN+ACCESS+CLOSE_NOWRITE sequence, meaning the files were opened and read but not modified.
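
The read-only conclusion can be checked mechanically from the event names. The sketch below assumes the monitor reports inotify-style event names joined with `+`, as in the sequence quoted above; the report does not name the monitoring tool, so both the set of mutating events and the helper `is_read_only` are illustrative.

```python
# Event names that indicate a change to file content, metadata, or directory
# entries. ASSUMPTION: names follow inotify conventions (MODIFY, CLOSE_WRITE, ...).
MUTATING_EVENTS = {
    "MODIFY",
    "ATTRIB",
    "CLOSE_WRITE",
    "CREATE",
    "DELETE",
    "DELETE_SELF",
    "MOVED_FROM",
    "MOVED_TO",
    "MOVE_SELF",
}


def is_read_only(sequence: str) -> bool:
    """Return True if a '+'-joined event sequence contains no mutating event."""
    events = {name.strip().upper() for name in sequence.split("+") if name.strip()}
    return events.isdisjoint(MUTATING_EVENTS)


print(is_read_only("OPEN+ACCESS+CLOSE_NOWRITE"))  # True
print(is_read_only("OPEN+MODIFY+CLOSE_WRITE"))    # False
```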

INFO Model downloads from HuggingFace on first run -5

The skill documents that the first run downloads ML models to ~/.cache/huggingface/. This is standard practice for ML tools but means the skill will make network requests to huggingface.co when first used. The scan found no evidence of data being sent outbound.
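
To see the first-run download for yourself, one option is to measure the cache directory before and after the first transcription. The sketch below uses only the standard library and takes the cache path from the skill's documentation as quoted above; the helper name `cache_size_bytes` is hypothetical, and skipping symlinks is an assumption made so that linked files are not counted twice.

```python
from pathlib import Path

# Cache location as documented by the skill.
DEFAULT_CACHE = Path.home() / ".cache" / "huggingface"


def cache_size_bytes(root: Path = DEFAULT_CACHE) -> int:
    """Return the total size in bytes of regular files under `root`.

    Returns 0 if the directory does not exist yet (nothing downloaded).
    Symlinks are skipped so that a linked file is counted only once.
    """
    if not root.exists():
        return 0
    return sum(
        p.stat().st_size
        for p in root.rglob("*")
        if p.is_file() and not p.is_symlink()
    )


if __name__ == "__main__":
    print(f"{cache_size_bytes() / 1e6:.1f} MB cached under {DEFAULT_CACHE}")
```

Running this before and after the first use shows how much data the download added; a size that stays flat on later runs is consistent with the model being served from the local cache.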

INFO Trigger phrases defined but benign -5

SKILL.md defines trigger phrases like 'transcribe this audio', 'convert speech to text'. These are standard skill activation patterns and don't attempt to override agent behavior or hijack other commands.

INFO Audio file access is inherently sensitive -5

The skill processes user audio files, which may contain sensitive spoken content (meetings, interviews, etc.). However, all processing is local with no detected exfiltration vectors, which is a security advantage over cloud-based transcription services.