Is kurtivy/openclaw-voice-assistant safe?

https://github.com/openclaw/skills/tree/main/skills/kurtivy/openclaw-voice-assistant

89
SAFE

The openclaw-voice-assistant skill is a legitimate, well-written Python companion app for adding voice interaction to OpenClaw. The SKILL.md contains no prompt injection, hidden instructions, or agent manipulation — it is a clean setup README. The Python code is transparent with no malicious patterns, credential theft, or hidden exfiltration. Primary concerns are privacy-related by design: continuous microphone monitoring, automatic post-response recording windows, and a fully configurable GATEWAY_URL that could route all transcriptions externally if the .env is compromised or misconfigured.

Category Scores

Prompt Injection 95/100 · 30%
Data Exfiltration 76/100 · 25%
Code Execution 90/100 · 20%
Clone Behavior 95/100 · 10%
Canary Integrity 95/100 · 10%
Behavioral Reasoning 80/100 · 5%

Findings (6)

MEDIUM GATEWAY_URL configurable to external server without validation -12

config.py reads GATEWAY_URL from .env with os.getenv() and no hostname/scheme validation. GatewayClient connects directly to this URL. If an attacker sets GATEWAY_URL to a remote WebSocket endpoint (e.g., via a compromised .env or a malicious skill combination), every voice transcription would be silently exfiltrated to that server. The skill documentation does not warn users about this risk.
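A straightforward mitigation would be to validate the configured URL before connecting. The sketch below is illustrative only, not the skill's code: `load_gateway_url`, the allowlists, and the default port are all assumptions for the example.

```python
import os
from urllib.parse import urlparse

# Illustrative allowlist -- the skill itself performs no such check.
ALLOWED_SCHEMES = {"ws", "wss"}
ALLOWED_HOSTS = {"localhost", "127.0.0.1"}

def load_gateway_url(default="ws://127.0.0.1:8765"):
    """Read GATEWAY_URL from the environment, rejecting non-local endpoints."""
    url = os.getenv("GATEWAY_URL", default)
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unexpected gateway scheme: {parsed.scheme!r}")
    if parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"gateway host not allowlisted: {parsed.hostname!r}")
    return url
```

With a check like this, a tampered `.env` pointing at a remote WebSocket endpoint fails loudly at startup instead of silently rerouting transcriptions.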

MEDIUM Continuous microphone monitoring with automatic follow-up recording -20

AudioPipeline runs an always-on sounddevice.InputStream that feeds every audio frame through Porcupine wake word detection. After each AI response, assistant.py enters a 5-second FOLLOW_UP_WINDOW that begins recording without a new wake word. This creates an extended capture window that could inadvertently record ambient conversations, other people in the room, or sensitive audio.
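The capture-window behavior described above can be illustrated with a small state helper. This is a sketch of the pattern, not the skill's actual implementation; the class name and injectable clock are assumptions made for the example.

```python
import time

FOLLOW_UP_WINDOW = 5.0  # seconds, matching the window described in the finding

class FollowUpWindow:
    """Tracks the post-response period during which recording restarts
    without a new wake word. The clock is injectable for testing."""

    def __init__(self, duration=FOLLOW_UP_WINDOW, clock=time.monotonic):
        self.duration = duration
        self.clock = clock
        self._opened_at = None

    def open(self):
        # Called when an AI response finishes playing.
        self._opened_at = self.clock()

    def is_open(self):
        # While open, any detected speech is recorded with no wake word.
        return (self._opened_at is not None
                and self.clock() - self._opened_at < self.duration)
```

The privacy concern is visible in `is_open()`: anything said within the window is captured, whether or not it was addressed to the assistant.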

MEDIUM Full AI response text transmitted to third-party ElevenLabs API -8

tts_player.py forwards every AI response to api.elevenlabs.io for text-to-speech generation, so the full text of each response is shared with ElevenLabs' servers. While this is the expected mechanism for cloud TTS, users may not realize that their AI conversation content leaves their system.

LOW Global keyboard hotkey registered system-wide via pynput -10

_setup_hotkey() in assistant.py registers a GlobalHotKeys listener that watches for the configured hotkey (default: ctrl+shift+k) system-wide, regardless of which application is focused. Pressing it fires assistant.py's _on_wake(), which could conflict with other applications' shortcuts or trigger recording unintentionally.

LOW Default wake word 'hey google' risks unintended activation -4

When PORCUPINE_MODEL_PATH is not set or the file does not exist, audio_pipeline.py falls back to Porcupine's built-in 'hey google' keyword. This increases false-positive activations whenever users say 'hey google' to their actual Google Assistant or in normal conversation, causing the assistant to begin recording without user intent.
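The fallback described above amounts to a path-existence check before initializing Porcupine. A minimal sketch of that selection logic, assuming the function name, return shape, and default are illustrative (the skill's actual code is not quoted in this report):

```python
import os

def choose_wake_word(model_path, builtin_fallback="hey google"):
    """Return wake-word arguments: a custom .ppn model if the file exists,
    otherwise a built-in keyword such as 'hey google'."""
    if model_path and os.path.isfile(model_path):
        return {"keyword_paths": [model_path]}
    # Falling back to a phrase users already say to real Google Assistants
    # is what drives the false-activation risk flagged in this finding.
    return {"keywords": [builtin_fallback]}
```

A safer default would be a less common built-in keyword (Porcupine ships several, e.g. 'porcupine' or 'jarvis'), or failing closed with a clear error when the custom model is missing.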

INFO Skill is Windows-only; Python scripts cannot execute in Linux audit sandbox 0

The skill declares os: [win32] in metadata and imports winsound (Windows-only) in audio_pipeline.py and tts_player.py. The scripts cannot run on the Linux audit VM, which limits dynamic behavioral analysis. This is informational — the static code review was unaffected.