Is korddie/sapi-tts safe?
https://github.com/openclaw/skills/tree/main/skills/korddie/sapi-tts
The korddie/sapi-tts skill is a documentation-only Windows TTS helper containing a PowerShell script that uses built-in Windows SAPI5 APIs to generate WAV audio. No prompt injection, active exfiltration, auto-executing code, malicious git metadata, or canary violations were detected. The sensitive file reads observed in monitoring are attributable to the sandbox's own pre/post-install baseline checks, not to any skill code. The only minor risks are a caller-controlled output path in the embedded PowerShell script and the skill's limitation to Windows-only environments.
Findings (5)
LOW Embedded PowerShell with user-controlled output path -8
The SKILL.md embeds a PowerShell script that accepts a caller-supplied -Output file path with no sanitization. While the script is not auto-executed at install, if an agent were to save and invoke it, a manipulated -Output argument could write WAV files to arbitrary user-writable locations on Windows. This is a minor risk given the file type (WAV) and user-space scope.
LOW Caller-controlled output path allows arbitrary WAV writes -7
The -Output parameter in the embedded script accepts an unconstrained path. A compromised agent could direct audio output to overwrite or pollute files in sensitive directories. Impact is limited to WAV file content (binary audio data, not executable).
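One way a calling agent or wrapper could reduce this risk is to validate the path before it is ever passed to -Output. The sketch below is illustrative only and is not part of the skill: the function name `is_safe_output` and the idea of a single allowed directory are assumptions introduced here. It uses `PureWindowsPath` so the check can run on any host.

```python
from pathlib import PureWindowsPath


def is_safe_output(output: str, allowed_dir: str) -> bool:
    """Return True if `output` is an absolute .wav path inside `allowed_dir`.

    Hypothetical helper: neither the name nor the policy comes from the skill.
    """
    out = PureWindowsPath(output)
    base = PureWindowsPath(allowed_dir)

    # Pure paths never touch the filesystem and do not collapse "..",
    # so relative paths and traversal segments are rejected outright.
    if not out.is_absolute() or ".." in out.parts:
        return False

    # Only allow the file type the script is meant to produce.
    if out.suffix.lower() != ".wav":
        return False

    # The allowed directory must be one of the path's ancestors.
    return base in out.parents


if __name__ == "__main__":
    allowed = r"C:\Users\me\tts"  # assumed location, for illustration
    print(is_safe_output(r"C:\Users\me\tts\hello.wav", allowed))
    print(is_safe_output(r"C:\Users\me\tts\..\.ssh\x.wav", allowed))
```

This is a purely lexical check. It does not resolve symlinks or junctions, so a stricter wrapper would also resolve the final path on the Windows host before writing.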
INFO Skill is documentation-only with no agent invocation contract -3
SKILL.md reads as human-facing installation documentation rather than structured agent instructions. It contains no trigger conditions, no tool-use directives, and no behavior specification for an LLM agent. This is a quality/usability concern rather than a security finding.
INFO Canary file reads by monitoring infrastructure pre- and post-install -5
Six sensitive canary files (.env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, gcloud credentials) were opened read-only at 09:34:58 (before the git clone at 09:35:03) and again at 09:35:22. Timing, CLOSE_NOWRITE flags, and absence of any matching EXECVE from skill code confirm these reads originate from the sandbox monitoring system's baseline capture, not the skill.
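The timing argument can be sketched as a small classifier. The timestamps and the CLOSE_NOWRITE flag are taken from the finding above; the function name, the labels, and the record shape are assumptions made for illustration and do not reflect the sandbox's actual log format.

```python
from datetime import time

# git clone timestamp reported in the finding.
CLONE_TIME = time(9, 35, 3)


def classify_read(ts: time, close_flag: str, clone_time: time = CLONE_TIME) -> str:
    """Label a canary file access relative to the install time (hypothetical labels)."""
    if ts < clone_time:
        # The skill's files did not exist on disk yet, so skill code cannot be the reader.
        return "pre-install"
    if close_flag == "CLOSE_NOWRITE":
        # Read-only access after install: attribution still depends on
        # which process opened the file (for example, EXECVE records).
        return "post-install-read"
    return "post-install-write"


if __name__ == "__main__":
    for ts in (time(9, 34, 58), time(9, 35, 22)):
        print(ts, classify_read(ts, "CLOSE_NOWRITE"))
```

Timing alone clears only the first read. The second read is attributed to the sandbox through the process evidence the finding cites, not through this check.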
INFO Windows-only skill is inert on non-Windows agent hosts -8
The skill depends on System.Speech.Synthesis.SpeechSynthesizer and Windows SAPI5, which are unavailable on Linux/macOS. An agent running on a non-Windows host would be unable to execute the skill's core functionality.
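An agent host could guard against this by checking the platform before attempting to invoke the script. This is a minimal sketch, assuming a Python-based agent harness; `sapi_available` is a hypothetical helper name, and the optional `system_name` parameter exists only to make the check easy to exercise.

```python
import platform
from typing import Optional


def sapi_available(system_name: Optional[str] = None) -> bool:
    """Return True only when the host OS reports itself as Windows.

    Hypothetical helper: checks the OS name only, not whether
    System.Speech can actually be loaded.
    """
    name = platform.system() if system_name is None else system_name
    return name == "Windows"


if __name__ == "__main__":
    if sapi_available():
        print("Windows host: SAPI5 TTS may be usable")
    else:
        print("Non-Windows host: skip the sapi-tts skill")
```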