Is charantejmandali18/voice-assistant safe?
https://github.com/openclaw/skills/tree/main/skills/charantejmandali18/voice-assistant
This voice assistant skill is a legitimate real-time voice interface for OpenClaw that proxies audio through STT/TTS providers. The code is clean, well-structured, and contains no hidden malicious behavior. The primary security concern is that the server binds to 0.0.0.0 with no authentication on the WebSocket endpoint, meaning anyone on the local network can interact with the user's LLM agent without authorization.
Findings (7)
HIGH Server binds to 0.0.0.0 with no authentication -22
The FastAPI server binds to 0.0.0.0:7860, making the WebSocket endpoint /ws/voice accessible from any network interface. There is no authentication, authorization, or origin checking on WebSocket connections. Any device on the local network can connect and interact with the user's LLM agent, potentially issuing commands or receiving sensitive response data.
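One way to close this gap is to gate the WebSocket handshake on an origin check and a shared secret. The following is a minimal sketch using only the standard library; the allow-list, the function names, and the idea of a token parameter are all assumptions for illustration and are not part of the skill.

```python
import hmac
from typing import Optional
from urllib.parse import urlparse

# Hypothetical allow-list of browser origins; the skill defines no such setting.
ALLOWED_ORIGIN_HOSTS = frozenset({"localhost", "127.0.0.1"})


def origin_allowed(origin: Optional[str]) -> bool:
    """Return True only if the Origin header names an allow-listed host."""
    if not origin:
        return False
    return urlparse(origin).hostname in ALLOWED_ORIGIN_HOSTS


def token_valid(presented: Optional[str], expected: str) -> bool:
    """Compare a client-supplied secret to the expected one in constant time."""
    if not presented or not expected:
        return False
    return hmac.compare_digest(presented.encode(), expected.encode())


def may_connect(origin: Optional[str], token: Optional[str], expected: str) -> bool:
    """Both checks must pass before the WebSocket handshake is accepted."""
    return origin_allowed(origin) and token_valid(token, expected)
```

In a FastAPI handler the inputs could plausibly come from `websocket.headers.get("origin")` and `websocket.query_params.get("token")`, with the connection closed (for example with code 1008) when `may_connect` returns False. The origin check mainly protects against malicious web pages, since a non-browser client can forge the header, so the token is the stronger control. Binding the server to 127.0.0.1 instead of 0.0.0.0 would remove the network exposure altogether for single-machine use.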
MEDIUM Long-running server process with network listener -15
The skill runs a persistent Python server process that opens a network port and maintains WebSocket connections. While this is expected functionality for a voice assistant, it creates a persistent attack surface. The server process runs with the user's privileges and has access to environment variables containing API keys.
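A possible way to limit what the server process can see is to launch it with a reduced environment. The sketch below is illustrative only: the variable names in `NEEDED` are assumptions (the report says only that the environment contains API keys), and `minimal_env` is a hypothetical helper.

```python
from typing import Dict, Iterable, Mapping

# Assumed variable names; adjust to whatever the server actually reads.
NEEDED = ("PATH", "HOME", "DEEPGRAM_API_KEY", "ELEVENLABS_API_KEY")


def minimal_env(environ: Mapping[str, str],
                needed: Iterable[str] = NEEDED) -> Dict[str, str]:
    """Copy only the listed variables, dropping every other secret."""
    return {name: environ[name] for name in needed if name in environ}
```

The result could be passed as the `env` argument of `subprocess.Popen` when starting the server. This does not hide the keys the server itself needs; it only keeps unrelated credentials in the user's shell out of the long-running process.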
MEDIUM Unauthenticated agent interface enables network-based command injection -25
An attacker on the same network could connect to the WebSocket and send audio (which is transcribed) or text that is forwarded to the user's OpenClaw agent. The agent would process the input as a legitimate user request, potentially executing tool calls, reading files, or performing other actions the attacker directs. This is especially dangerous if the agent has filesystem or shell access.
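Beyond authenticating the connection, the impact could be reduced by requiring explicit approval before a voice-originated request triggers a sensitive tool. The sketch below illustrates that idea under stated assumptions: the tool names, the `source` labels, and the `ToolRequest` type are hypothetical, since the report does not describe the agent's tool interface.

```python
from dataclasses import dataclass

# Hypothetical tool names; the report does not list the agent's tools.
RISKY_TOOLS = frozenset({"shell", "write_file", "delete_file"})


@dataclass(frozen=True)
class ToolRequest:
    tool: str    # name of the tool the agent wants to call
    source: str  # where the request came from, e.g. "voice" or "local_cli"


def needs_confirmation(req: ToolRequest) -> bool:
    """Voice-originated requests for risky tools need explicit approval."""
    return req.source == "voice" and req.tool in RISKY_TOOLS
```

A policy like this does not replace authentication, but it narrows what an unauthenticated caller could make the agent do.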
LOW Audio and transcripts sent to third-party APIs -5
User microphone audio is streamed to Deepgram (wss://api.deepgram.com) or ElevenLabs (https://api.elevenlabs.io) for transcription, and LLM responses are sent to these services for text-to-speech synthesis. This is core functionality, but users should be aware that their voice data and the agent's responses pass through these third-party services.
LOW Skill instructs agent to execute shell commands -7
The SKILL.md usage examples direct the agent to run shell commands like 'cp .env.example .env' and 'uv run scripts/server.py'. While expected for server setup, this establishes a pattern of shell execution. The commands shown are benign and limited to the skill's own directory.
INFO VOICE_SYSTEM_PROMPT allows system prompt override -5
The VOICE_SYSTEM_PROMPT environment variable replaces the default system prompt sent to the LLM. The value is set by the user, not by the skill author, but anyone able to tamper with the .env file could inject a harmful system prompt.
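Since the risk depends on who can modify the .env file, a startup check on its permissions is one possible mitigation. This is a minimal sketch for POSIX systems; `env_file_is_private` is a hypothetical helper and not something the skill provides.

```python
import os
import stat


def env_file_is_private(path: str) -> bool:
    """True if path is a regular file that group and others cannot write."""
    st = os.stat(path)
    if not stat.S_ISREG(st.st_mode):
        return False
    return (st.st_mode & (stat.S_IWGRP | stat.S_IWOTH)) == 0
```

A server could refuse to start, or fall back to the default prompt, when the check fails. It does not detect tampering by the file's owner or by a process running as the same user.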
INFO Dependencies are well-known and legitimate 0
All Python dependencies in pyproject.toml are widely used, reputable packages: fastapi, uvicorn, websockets, httpx, and python-dotenv. None is suspicious or obscure, and none is pinned to a known-vulnerable version.