Is am-will/gemini-computer-use safe?

https://github.com/openclaw/skills/tree/main/skills/am-will/gemini-computer-use

79
CAUTION

The gemini-computer-use skill is a structurally legitimate Playwright + Gemini Computer Use integration with a clean SKILL.md and a clean installation. The primary risks are inherent to the skill's design rather than malicious authorship: the browser automation loop transmits full screenshots to Google's Gemini API on every turn, the navigate action allows unrestricted access to localhost and internal network services, and the combination of these capabilities creates a viable exfiltration primitive if the agent is given a malicious goal or if a visited page contains model-level prompt injection. The anomalous pre-install read of all canary credential files is attributed to monitoring-system initialization based on timing analysis, but warrants documentation.

Category Scores

Prompt Injection 92/100 · 30%
Data Exfiltration 68/100 · 25%
Code Execution 72/100 · 20%
Clone Behavior 91/100 · 10%
Canary Integrity 83/100 · 10%
Behavioral Reasoning 65/100 · 5%
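
The headline score of 79 is consistent with a weighted mean of the six category scores. The report does not state its formula, so the sketch below assumes a weighted average truncated to an integer; the names `CATEGORIES` and `overall_score` are illustrative only.

```python
# Category scores and weights (in percent) exactly as listed in the report.
# Assumption: the overall score is the weighted mean, truncated to an integer.
CATEGORIES = {
    "Prompt Injection": (92, 30),
    "Data Exfiltration": (68, 25),
    "Code Execution": (72, 20),
    "Clone Behavior": (91, 10),
    "Canary Integrity": (83, 10),
    "Behavioral Reasoning": (65, 5),
}


def overall_score(categories):
    """Weighted mean of (score, weight) pairs, using integer arithmetic."""
    total_weight = sum(weight for _, weight in categories.values())
    weighted_sum = sum(score * weight for score, weight in categories.values())
    return weighted_sum // total_weight


print(overall_score(CATEGORIES))  # weighted mean is 79.65, truncated to 79
```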

Findings (8)

HIGH All canary credentials read in pre-install burst -15

At monitoring initialization (04:20:18, audit seq 257-262), all six canary credential files were opened and read in rapid succession before any skill code executed. The timing strongly implicates the Oathe monitoring system's baseline-hashing routine rather than malicious skill code, since the git clone does not begin until seq 489 at 04:20:23. Even so, the burst read of .env, id_rsa, AWS credentials, .npmrc, Docker config, and GCloud ADC is anomalous and must be flagged. No skill source code contains credential-reading logic, and the integrity report confirms the files were not modified or exfiltrated.
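
A baseline-hashing routine would produce exactly this read pattern, because hashing a file requires reading it in full. The following is a minimal sketch of such a routine, not Oathe's actual implementation; the function names are hypothetical, and SHA-256 is an assumed choice of digest.

```python
import hashlib
from pathlib import Path


def hash_file(path):
    """Read the whole file (this is the READ the audit log records) and hash it."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def take_baseline(paths):
    """Hash every canary file once, before the skill is installed."""
    return {str(path): hash_file(path) for path in paths}


def modified_since(baseline):
    """Return the paths whose current digest differs from the baseline."""
    return [path for path, digest in baseline.items() if hash_file(path) != digest]
```

Comparing digests after the run can show that a file was not modified. It cannot show that a file was not read or copied; that conclusion has to come from the audit and network logs.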

MEDIUM Full browser screenshots transmitted to external Gemini API every turn -12

The agent loop captures a PNG screenshot of the full browser viewport after every action and sends it to Google's Gemini API as part of the function_response. Any content visible in the browser — OAuth tokens in URL bars, form autofill, internal dashboards, files opened in the browser — will be transmitted to an endpoint outside the user's control. This is not a bug but an architectural characteristic of the Computer Use API that users must accept explicitly.

MEDIUM navigate action allows unrestricted URL access including localhost and metadata endpoints -20

The navigate handler calls page.goto(args['url']) with no URL scheme validation, no host allowlist, and no block on private IP ranges. A malicious prompt or a Gemini model response influenced by prompt injection on a visited website could redirect the browser to http://localhost:PORT, http://169.254.169.254/latest/meta-data/ (AWS IMDS), or any internal service. Combined with the screenshot exfiltration channel, this enables internal service enumeration and data capture.
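
One possible mitigation is a pre-navigation check, sketched below. This is not part of the skill; `is_navigation_allowed` and `ALLOWED_SCHEMES` are hypothetical names, and the policy (http/https only, every resolved address must be globally routable) is an assumption about what a deployment would want.

```python
import ipaddress
import socket
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}  # assumption: only web URLs are needed


def is_navigation_allowed(url, resolve=socket.getaddrinfo):
    """Return True only if the scheme is allowed and every resolved address is public."""
    try:
        parsed = urlparse(url)
        host = parsed.hostname
    except ValueError:
        return False
    if parsed.scheme not in ALLOWED_SCHEMES or not host:
        return False
    try:
        infos = resolve(host, None)
    except (OSError, UnicodeError):
        return False
    if not infos:
        return False
    for info in infos:
        # info[4][0] is the address string; drop any IPv6 zone id such as "%eth0".
        address = ipaddress.ip_address(info[4][0].split("%")[0])
        # is_global is False for loopback, private, and link-local ranges,
        # which covers localhost and 169.254.169.254.
        if not address.is_global:
            return False
    return True
```

A handler could call this before `page.goto(args['url'])` and refuse the action on `False`. The check covers only the initial URL: redirects and DNS rebinding can still reach internal hosts, so network-level egress controls remain the stronger defense.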

MEDIUM Browser automation provides high-bandwidth exfiltration primitive -22

The combination of navigate (arbitrary outbound HTTP), type_text_at (inject text into forms), and the agent's potential access to other skills or filesystem data creates a complete exfiltration chain: read data from disk via another skill, type it into a form on an attacker-controlled site, submit via Enter. The --exclude flag can suppress specific actions but the agent invoking the skill controls this flag, not the skill itself.
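
Because the caller controls `--exclude`, a deployment that wants a floor the caller cannot lower would need the exclusion enforced somewhere else. The sketch below shows one hypothetical way to do that: an operator-defined set merged with the caller's list, so the caller can add exclusions but never remove them. None of these names exist in the skill, and the choice of `type_text_at` as the always-excluded action is purely illustrative.

```python
# Hypothetical operator policy: actions refused regardless of caller flags.
ALWAYS_EXCLUDED = frozenset({"type_text_at"})


def effective_exclusions(caller_excluded=()):
    """Union of the fixed policy and whatever the caller passed via --exclude."""
    return ALWAYS_EXCLUDED | frozenset(caller_excluded)


def is_action_permitted(action_name, caller_excluded=()):
    """An action runs only if neither the policy nor the caller excludes it."""
    return action_name not in effective_exclusions(caller_excluded)
```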

LOW page.evaluate() establishes JavaScript execution pathway -8

The scroll_document action uses page.evaluate() to execute JavaScript (window.scrollBy) directly in the page context. The current usage is benign, but it shows that the script already bridges the Python automation layer and the browser's JavaScript runtime. Future modifications, or a Gemini model response crafted to trigger unexpected scroll behavior, could exploit this pathway.
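
A common way to keep such a bridge narrow is to pass model-supplied values to `page.evaluate()` as a data argument rather than interpolating them into the JavaScript source. The sketch below illustrates the pattern; `safe_scroll`, `clamp`, and `MAX_SCROLL_PX` are hypothetical and do not reflect how the skill's own handler is written.

```python
MAX_SCROLL_PX = 5000  # hypothetical upper bound on a single scroll step


def clamp(value, limit=MAX_SCROLL_PX):
    """Coerce to int (raises ValueError on non-numeric input) and bound the range."""
    number = int(value)
    return max(-limit, min(limit, number))


def safe_scroll(page, dx, dy):
    """Scroll via a fixed JS expression; dx and dy travel as data, not as code.

    `page` is assumed to be a Playwright sync Page, whose evaluate() accepts
    an expression and a single serializable argument.
    """
    page.evaluate(
        "([dx, dy]) => window.scrollBy(dx, dy)",
        [clamp(dx), clamp(dy)],
    )
```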

LOW GEMINI_API_KEY required in shell environment -5

The skill requires GEMINI_API_KEY to be exported in the shell environment before execution. This key is read via os.getenv() and used for all Gemini API calls. If the execution environment is shared with other skills or if an environment-reading skill is also loaded, this key is at risk of exfiltration.
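
One way to narrow this exposure within a single process is to read the key once and remove it from the process environment before launching anything else, so that child processes such as the browser do not inherit it. The sketch below is an illustration under that assumption; `take_api_key` is a hypothetical helper, not the skill's code.

```python
import os


def take_api_key(env=os.environ, name="GEMINI_API_KEY"):
    """Remove the key from the given environment mapping and return it."""
    key = env.pop(name, None)
    if not key:
        raise RuntimeError(f"{name} is not set")
    return key
```

This does not affect the parent shell, where the exported variable stays visible to any other skill started from it.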

INFO Installation is clean — only expected GitHub network activity 0

The entire installation process made only the expected HTTPS connection to GitHub (140.82.121.4:443) for the sparse git clone. DNS resolution was normal, no secondary exfiltration endpoints were contacted, and the post-install network state is identical to pre-install. The install script (decoded from auditd hex) performed standard operations: clone, sparse-checkout, copy, cleanup.

INFO SKILL.md contains no injection content 0

A full review of SKILL.md found a straightforward browser automation setup guide with no hidden instructions, override directives, invisible Unicode code points, HTML comments, or instructions to fetch external resources. The frontmatter description accurately reflects the skill's stated purpose.