Is emptyopen/macos-desktop-control safe?
https://github.com/openclaw/skills/tree/main/skills/emptyopen/macos-desktop-control
The macos-desktop-control skill is not actively malicious: its installation was clean, all canary honeypot files remained intact, and SKILL.md contains no prompt injection attempts. However, the skill deliberately grants an AI agent two of the most invasive OS-level capabilities available: full-screen capture via screencapture and unrestricted mouse and keyboard injection via cliclick. Neither capability is bounded by access controls, rate limits, or application scoping, so an agent with this skill active can silently observe any content on the user's screen and interact with any application, including authentication interfaces and sensitive documents. The main risk lies less in the skill's code than in the capabilities it exposes to an agent that may receive adversarial instructions.
Findings (9)
CRITICAL Full desktop control grants AI agent unrestricted screen and input access -40
The skill provides two capabilities, screen capture (the see tool) and synthetic input (the click tool), that together constitute complete UI-level computer control. An agent with this skill active can observe any content displayed on screen and interact with any application without restriction. This is not a flaw in the skill's implementation but an inherent property of what the skill does. Users must understand that they are granting the agent the same level of access a physically present operator would have.
HIGH Skill capabilities are unbounded: no scoping, rate limiting, or access controls defined -25
SKILL.md places no restrictions on what the agent may capture or where it may click. There is no allowlist of permitted applications, no blocklist of sensitive screen regions, and no mechanism to prevent the agent from capturing authentication dialogs, password manager interfaces, or private document content. The skill documentation actively encourages broad use: 'Use Case: Identifies UI elements, window positions, and application states.'
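To make the missing application scoping concrete, here is a minimal sketch of an allowlist guard. It is not part of the skill: the function name is_app_allowed and the three application names are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical allowlist guard; the function name and the permitted
# applications are illustrative assumptions, not part of the audited skill.

# Return 0 only if the given application name exactly matches an entry
# on the allowlist, and 1 for anything else (including an empty name).
is_app_allowed() {
  case "$1" in
    TextEdit|Preview|Calculator) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed usage on macOS (requires automation permission; not run here):
#   front=$(osascript -e 'tell application "System Events" to get name of first application process whose frontmost is true')
#   is_app_allowed "$front" || { echo "refusing: $front is not allowlisted" >&2; exit 1; }
```

A wrapper could call such a guard before every capture or click, so that a password manager or browser in the foreground causes a refusal rather than an action. An allowlist is used here rather than a blocklist because unknown applications are then denied by default.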
HIGH Screenshots written to world-readable /tmp with no cleanup or access control -20
vision_wrapper.sh hardcodes its output path to /tmp/claw_view.png. On macOS, /tmp is world-readable by default. Every screenshot the agent takes persists at this predictable path until it is overwritten or manually deleted, and any other process running as any user can read the file. If an agent takes a screenshot of a password manager or banking portal, that image stays accessible to every process on the machine for as long as the file exists.
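One common alternative to a fixed path in /tmp is a private, unpredictably named directory that is removed after use. The sketch below assumes the wrapper could be changed this way. The function names make_private_capture_path and cleanup_capture and the claw_view. directory prefix are hypothetical.

```shell
#!/bin/sh
# Hypothetical replacement for a fixed /tmp/claw_view.png output path.
# All names here are illustrative assumptions.

# Create an owner-only directory with a random suffix and print a
# screenshot path inside it. The subshell keeps the umask change local.
make_private_capture_path() {
  (
    umask 077
    dir=$(mktemp -d "${TMPDIR:-/tmp}/claw_view.XXXXXX") || exit 1
    printf '%s/view.png\n' "$dir"
  )
}

# Remove the capture directory once the image has been consumed.
# Refuses any path that does not look like one produced above.
cleanup_capture() {
  case "$1" in
    */claw_view.*/view.png) rm -rf -- "${1%/view.png}" ;;
    *) return 1 ;;
  esac
}

# Assumed usage (the wrapper's real screencapture flags are not shown in this report):
#   path=$(make_private_capture_path) || exit 1
#   screencapture -x "$path"
#   ... hand "$path" to the agent ...
#   cleanup_capture "$path"
```

Because mktemp -d creates the directory with owner-only permissions, other local users cannot list or read the screenshot, and the random suffix removes the predictable path.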
HIGH Keyboard injection via cliclick t: enables silent text input into any focused application -15
The click tool exposes cliclick's t: (type) syntax, which sends synthetic keystrokes to the currently focused application. The agent can type into password fields, terminal prompts, chat applications, email clients, or any other input that has focus. Combined with the see tool to locate a target application and a click to focus it, this enables fully automated text injection with no user awareness required.
MEDIUM Shell scripts execute system binaries with no input validation -18
Both wrapper scripts execute system-level binaries (screencapture, cliclick) with agent-supplied arguments. cliclick_wrapper.sh expands $@ without quoting or sanitization, so the agent-constructed string undergoes shell word splitting and glob expansion and is then passed directly to cliclick. cliclick's own argument parser limits exploitability, but unvalidated passthrough is a code quality risk that could be exploited if cliclick's argument handling has vulnerabilities.
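As an illustration of what input validation could look like, the sketch below accepts only click, double-click, and move commands with plain integer coordinates, and rejects everything else, including t: typing. This permitted set is an assumed policy, not something the skill defines, and the function names is_uint, validate_click_args, and safe_click are hypothetical.

```shell
#!/bin/sh
# Hypothetical argument filter for a cliclick wrapper. The permitted
# command set is an assumed policy chosen for illustration.

# Return 0 if the argument is a non-empty string of decimal digits.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Return 0 only if every argument has the form c:X,Y, dc:X,Y, or m:X,Y
# with non-negative integer coordinates. No arguments is also rejected.
validate_click_args() {
  [ "$#" -gt 0 ] || return 1
  for arg in "$@"; do
    case "$arg" in
      c:*,*|dc:*,*|m:*,*) ;;
      *) return 1 ;;
    esac
    coords=${arg#*:}
    is_uint "${coords%%,*}" || return 1   # X: text before the first comma
    is_uint "${coords#*,}" || return 1    # Y: text after the first comma
  done
  return 0
}

# Guarded entry point. Quoting "$@" passes each argument through exactly
# as received, with no word splitting or glob expansion.
safe_click() {
  validate_click_args "$@" || { echo "rejected arguments" >&2; return 2; }
  cliclick "$@"
}
```

With this filter, `safe_click c:100,200` would reach cliclick, while `safe_click t:secret` returns 2 without invoking it. Relative coordinates and key commands are deliberately excluded here. A real policy would need to decide which of them the agent actually requires.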
MEDIUM Scripts included as executable code, not declarative configuration -10
The skill includes two shell scripts that are executed on the host macOS system. Unlike a skill that consists only of SKILL.md documentation, these scripts are a real code execution surface. Their content is simple and matches their documented purpose, but shipping executable code in a skill package widens the attack surface for supply chain attacks: if the skill repository were compromised, the modified scripts would run on the host.
LOW Skill distribution relies on external GitHub monorepo with sparse-checkout -15
Installation shallow-clones the openclaw/skills monorepo (depth 1) from GitHub, then uses sparse-checkout to extract only the target skill. The install therefore pulls from an external repository that the user does not control and whose contents can change after an audit. A compromised openclaw/skills repository could deliver different content than what was audited, and the install does not verify the downloaded content against a published hash.
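A minimal sketch of the missing verification step follows. It assumes an expected digest were published alongside the audit, which is not currently the case. The function names sha256_stdin, skill_digest, and verify_skill are hypothetical, and the digest format (a hash over the sorted list of per-file hashes) is one possible convention rather than an established one.

```shell
#!/bin/sh
# Hypothetical integrity check for an installed skill directory.
# Assumes file names contain no newlines; symlinks are not covered.

# Print the SHA-256 of stdin using whichever common tool is available.
sha256_stdin() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d ' ' -f 1
  elif command -v shasum >/dev/null 2>&1; then
    shasum -a 256 | cut -d ' ' -f 1
  else
    return 1
  fi
}

# Print one digest covering every regular file's path and content
# under the given directory, in a stable (byte-sorted) order.
skill_digest() {
  (
    cd "$1" || exit 1
    find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      printf '%s  %s\n' "$(sha256_stdin < "$f")" "$f"
    done | sha256_stdin
  )
}

# Return 0 only if the directory's digest equals the expected value.
verify_skill() {
  actual=$(skill_digest "$1") || return 1
  [ -n "$actual" ] && [ "$actual" = "$2" ]
}
```

An installer could call `verify_skill <skill-dir> <audited-digest>` after the sparse checkout and abort on mismatch. Pinning the clone to a specific audited commit would serve a similar purpose.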
INFO All honeypot credential files confirmed intact post-installation 0
The six monitored canary files (.env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, gcloud credentials) were not modified or exfiltrated during the installation and audit window. PATH audit records showing reads of these files match the monitoring framework's pre/post baseline pattern rather than skill-initiated access.
INFO Clean installation with expected network connections only 0
Network activity during install was limited to DNS resolution and HTTPS connections to github.com (140.82.112.4) for the git clone. No connections to attacker-controlled infrastructure were observed. Ubuntu CDN connections (185.125.188.59, 185.125.190.18) occurred concurrently with GNOME session startup and are attributable to background system activity, not the skill install.