Is openai-tts safe?
https://clawhub.ai/pors/openai-tts
openai-tts is a straightforward text-to-speech skill that wraps the OpenAI Audio Speech API in a short bash script. The primary risk is inherent to any TTS tool: user text is sent to an external API, which could serve as a data exfiltration channel if the agent is tricked into speaking sensitive content. The shell script interpolates several parameters into the request without validation, but it properly JSON-escapes the main text input. The scan found no prompt injection and no malicious install behavior, and all canary files remained intact.
Findings (7)
MEDIUM User text sent to external API by design -20
The core functionality sends user-provided text to api.openai.com. While this is the intended purpose, any text the agent is asked to speak, including potentially sensitive content, will be transmitted to OpenAI's servers. A sophisticated attacker could craft prompts that trick the agent into speaking file contents, effectively using TTS as a data exfiltration channel.
LOW API key exposed in process arguments -7
OPENAI_API_KEY is passed via curl's -H flag, so it is visible in the process table (for example via ps aux) while the request runs. This is common in curl-based OpenAI usage but is still worth noting.
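One way to keep the key out of curl's argument list is to hand the header to curl through a config stream on stdin. The following is a minimal sketch, not the skill's actual code: the function names auth_config and speak_request are hypothetical, and the endpoint path is assumed to be OpenAI's public speech endpoint.

```shell
#!/usr/bin/env bash
# Sketch only: auth_config and speak_request are hypothetical names,
# not functions taken from speak.sh.

# Emit a curl config line carrying the Authorization header.
# printf is a bash builtin, so the key is never placed in the argv
# of a separate process.
auth_config() {
  printf 'header = "Authorization: Bearer %s"\n' "$OPENAI_API_KEY"
}

# Send a prepared JSON payload; curl reads the header from stdin
# because of "--config -".
# Arguments: $1 = JSON payload, $2 = output file.
# Note: the payload itself is still passed with -d and therefore
# remains visible in the process table.
speak_request() {
  auth_config | curl -sS --config - \
    -H 'Content-Type: application/json' \
    -d "$1" -o "$2" \
    https://api.openai.com/v1/audio/speech
}
```

With this shape, ps shows only `curl -sS --config - ...` and not the bearer token. The sketch assumes the key contains no double quotes or backslashes, which would need escaping in curl's config syntax.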
LOW Arbitrary output path with directory creation -5
The --out flag accepts any path and calls mkdir -p on its parent directory. While the content written is audio binary (limiting exploitation), this could be used to create directory structures in unintended locations.
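A simple mitigation would be to check the --out value before calling mkdir -p. The sketch below is an illustration under assumptions, not part of the skill: is_safe_out_path is a hypothetical helper, and the policy it enforces (relative paths only, no `..` components) is just one possible choice.

```shell
#!/usr/bin/env bash
# Sketch only: is_safe_out_path is a hypothetical helper, not part of speak.sh.

# Return 0 if the path is non-empty, relative, and has no ".." component.
is_safe_out_path() {
  local path=$1
  [[ -n $path ]] || return 1      # reject an empty path
  [[ $path != /* ]] || return 1   # reject absolute paths
  # Wrap in slashes so ".." is matched only as a whole path component.
  case "/$path/" in
    */../*) return 1 ;;
  esac
  return 0
}
```

A caller could run `is_safe_out_path "$out" || exit 1` before creating the parent directory. This check is purely textual and does not resolve symlinks, so it narrows the issue rather than eliminating it.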
LOW Unvalidated parameter interpolation in JSON payload -15
The model, voice, format, and speed variables are interpolated directly into the JSON payload without input validation. While the text input is properly escaped via jq -Rs, the other parameters could be manipulated to produce malformed JSON or inject extra fields into the request. The impact is limited because the target is an API request body, not a shell command, and OpenAI's API would reject invalid values.
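The gap described above could be closed by validating each parameter and letting jq build the whole payload instead of interpolating strings. This is a hedged sketch, not the skill's implementation: the helper names are hypothetical, the character allow-list is an assumption rather than OpenAI's documented rule, and the JSON field names follow OpenAI's public speech endpoint as commonly documented.

```shell
#!/usr/bin/env bash
# Sketch only: is_safe_token, is_valid_speed, and build_payload are
# hypothetical helpers, not functions taken from speak.sh.

# Accept only lowercase letters, digits, and hyphens (assumed allow-list).
is_safe_token() {
  local LC_ALL=C
  local re='^[a-z0-9-]+$'
  [[ $1 =~ $re ]]
}

# Accept a plain non-negative decimal that is also a valid JSON number.
is_valid_speed() {
  local re='^(0|[1-9][0-9]*)(\.[0-9]+)?$'
  [[ $1 =~ $re ]]
}

# Build the request body with jq so every value is escaped or typed by jq.
# Arguments: text, model, voice, format, speed.
build_payload() {
  local text=$1 model=$2 voice=$3 format=$4 speed=$5
  is_safe_token "$model" && is_safe_token "$voice" \
    && is_safe_token "$format" && is_valid_speed "$speed" || return 1
  jq -n --arg input "$text" --arg model "$model" --arg voice "$voice" \
        --arg format "$format" --argjson speed "$speed" \
    '{model: $model, input: $input, voice: $voice,
      response_format: $format, speed: $speed}'
}
```

Because jq receives each value through --arg or --argjson, a quote or brace in a parameter cannot change the structure of the payload, and the validators reject such values before jq is ever called.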
LOW Executable shell script included -13
The skill includes speak.sh as an executable bash script. This is expected for its functionality but represents a code execution surface. The script itself is straightforward with no obfuscation or suspicious logic.
INFO Clean SKILL.md with no manipulation attempts -5
SKILL.md contains only standard documentation: usage examples, voice table, flag descriptions, API key configuration, and pricing. No hidden instructions, no persona overrides, no attempts to suppress output or reference unauthorized external resources.
INFO Host agent filesystem access during bootstrap is expected -10
The filesystem monitoring shows access to .env, .aws/credentials, .openclaw/openclaw.json, .profile, and .bashrc. These all occur at 08:36:06 during openclaw agent initialization (surrounded by openclaw.json reads) and are not triggered by the skill itself.