Is microcarft/codex-orchestrator safe?
https://github.com/openclaw/skills/tree/main/skills/microcarft/codex-orchestrator
The codex-orchestrator skill serves a legitimate purpose, managing background Codex AI coding sessions, but its auto-approval design introduces significant security risks. The skill instructs the host agent to automatically send affirmative responses to any interactive prompt blocking the Codex subprocess (including potentially destructive confirmations) and to launch Codex in --full-auto mode, which creates a fully unsupervised nested agent execution loop with no user checkpoints. No malicious code, prompt injection, or exfiltration instructions were found in SKILL.md itself, and canary file integrity was confirmed. However, the behavior the skill enforces significantly reduces human oversight of AI-driven code execution.
Findings (7)
HIGH Auto-approval of subprocess prompts bypasses user oversight -35
The skill's Standard Operating Procedure explicitly instructs the host agent to send 'y\n' to any interactive prompt that blocks the Codex subprocess, without surfacing the prompt to the user for review. As a result, the agent can approve destructive confirmations (rm -rf, sudo operations, package installs, permission changes) without the user ever seeing them.
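One way to narrow this gap is to route every blocking prompt through a policy check before anything is sent to the subprocess. The sketch below is illustrative only: `decide_response`, `BENIGN_PROMPTS`, and `RISKY_MARKERS` are hypothetical names that do not appear in the skill, and the patterns are assumptions loosely based on the confirmation types listed above. It denies by default, so anything not explicitly allowlisted is escalated to the user.

```python
import re

# Hypothetical allowlist: prompts assumed harmless enough to auto-answer.
BENIGN_PROMPTS = [re.compile(r"^press enter to continue\W*$")]

# Hypothetical markers for the destructive or privileged actions named above.
RISKY_MARKERS = [
    re.compile(r"\brm\s+-\w*[rf]"),
    re.compile(r"\bsudo\b"),
    re.compile(r"\b(?:pip|npm|apt|apt-get|brew)\s+install\b"),
    re.compile(r"\b(?:chmod|chown)\b"),
]


def decide_response(prompt_text):
    """Return (action, reason); action is 'approve' or 'escalate'."""
    text = prompt_text.strip().lower()
    if any(marker.search(text) for marker in RISKY_MARKERS):
        return ("escalate", "destructive or privileged action")
    if any(pattern.match(text) for pattern in BENIGN_PROMPTS):
        return ("approve", "allowlisted prompt")
    # Deny by default: unknown prompts go to the user, never to 'y\n'.
    return ("escalate", "unrecognized prompt")
```

Only an `approve` result would justify sending input automatically; an `escalate` result means the prompt text should be shown to the user verbatim. Pattern matching of this kind is a heuristic and will miss cases, which is why the default branch escalates.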
HIGH Nested --full-auto agent creates unsupervised execution loop -40
The skill instructs launching Codex with --full-auto, which disables all of Codex's own confirmation prompts. Combined with the host agent's auto-approval SOP, this creates a fully unsupervised nested agent loop that can make irreversible file system changes, run arbitrary shell commands, and install software with zero human checkpoints.
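A host-side launch guard could keep this mode opt-in rather than default. The following is a minimal sketch under stated assumptions: `check_launch_args` and `user_opted_in` are hypothetical names, and the example argument lists are illustrative rather than taken from SKILL.md. Only the `--full-auto` flag itself comes from the finding above.

```python
def check_launch_args(argv, user_opted_in=False):
    """Refuse to launch with --full-auto unless the user explicitly agreed.

    argv is the argument list the orchestrator intends to run; it is
    returned unchanged when the launch is allowed.
    """
    if "--full-auto" in argv and not user_opted_in:
        raise PermissionError(
            "--full-auto disables confirmation prompts; "
            "explicit user opt-in is required"
        )
    return list(argv)
```

For example, `check_launch_args(["codex", "--full-auto", "fix tests"])` raises `PermissionError`, while the same call with `user_opted_in=True` returns the argument list unchanged.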
MEDIUM Opaque background PTY sessions hide subprocess actions -20
Background PTY sessions are ephemeral by design; Codex output is only visible when explicitly polled via process action:log. Between polls (the skill only says to poll 'periodically'), any actions taken by Codex, including file modifications, network calls, or credential access, are invisible to both the user and the orchestrating agent.
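Continuous capture to a persistent log is one way to reduce the blind spot between polls. The sketch below is an assumption-laden illustration, not the skill's mechanism: `run_and_capture` is a hypothetical helper, and it uses an ordinary pipe rather than a PTY. It records subprocess output only, so file or network activity that prints nothing would still go unseen.

```python
import subprocess
import time


def run_and_capture(cmd, log_path):
    """Run cmd and append every output line to log_path with a timestamp.

    Returns the subprocess exit code. stderr is merged into stdout so
    that nothing the child prints is dropped from the log.
    """
    with open(log_path, "a", encoding="utf-8") as log, subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    ) as proc:
        for line in proc.stdout:
            stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
            log.write(f"{stamp} {line.rstrip()}\n")
            log.flush()  # keep the file current for anyone tailing it
    return proc.returncode
```

Because each line is flushed as it arrives, the log survives the session and can be reviewed after the fact instead of depending on when the orchestrator happens to poll.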
MEDIUM Skill enables persistent background shell process spawning -12
The skill instructs the agent to spawn background PTY processes via a bash pty:true workdir:<dir> background:true primitive. This enables arbitrary shell command execution in a background context, detached from the primary conversation flow and harder to audit or terminate.
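Detached processes are easier to audit and stop when every spawn goes through a single registry. The sketch below assumes a hypothetical `ProcessRegistry` class built on Python's standard `subprocess` module; it does not model the `bash pty:true ... background:true` primitive itself, only the bookkeeping a host could keep around it.

```python
import subprocess
import time


class ProcessRegistry:
    """Track every spawned background process so it can be listed or stopped."""

    def __init__(self):
        self._entries = []

    def spawn(self, cmd):
        proc = subprocess.Popen(cmd)
        self._entries.append(
            {"pid": proc.pid, "cmd": list(cmd), "started": time.time(), "proc": proc}
        )
        return proc

    def running(self):
        # poll() returns None while the process is still alive.
        return [e for e in self._entries if e["proc"].poll() is None]

    def terminate_all(self, timeout=5.0):
        for entry in self.running():
            proc = entry["proc"]
            proc.terminate()
            try:
                proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                proc.kill()  # escalate if the process ignores terminate()
                proc.wait()
```

Calling `running()` gives the user a current list of commands and PIDs, and `terminate_all()` provides a single stop control for everything the orchestrator started.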
MEDIUM Full-auto subprocess creates indirect exfiltration vector -15
While the skill itself contains no exfiltration instructions, the Codex subprocess it spawns (with --full-auto) can execute arbitrary shell commands. If the user task provided to Codex is crafted to include exfiltration steps, or if Codex's context includes sensitive information from other skills, the subprocess can act as an exfiltration vehicle with the orchestrator auto-approving any confirmation prompts.
LOW Post-install openclaw-gateway establishes external AWS connections -15
After skill installation, the network diff shows openclaw-gateway (pid=1090) maintaining established connections to AWS IPs (54.211.197.216:443, 3.213.170.18:443) and listening locally on ports 18790/18793. These appear to be pre-existing openclaw infrastructure and not triggered by this skill specifically, but the persistent external connections warrant noting.
INFO Canary file reads consistent with audit framework baseline checks 0
Canary files (.env, .ssh/id_rsa, .aws/credentials, .npmrc, .docker/config.json, .gcloud credentials) were read at two points: immediately at audit start (baseline) and at audit end (post-install verification). This access pattern is consistent with the oathe audit framework's own monitoring and not attributable to the skill. All files remain unmodified.
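A baseline and post-install comparison of this kind can be sketched with content hashes. This is an assumption about how such a check might work, not a description of the oathe framework's actual implementation; `snapshot` and `changed_files` are hypothetical names.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Map each path to the SHA-256 of its contents, or None if it is absent."""
    digests = {}
    for path in paths:
        p = Path(path)
        digests[str(p)] = (
            hashlib.sha256(p.read_bytes()).hexdigest() if p.is_file() else None
        )
    return digests


def changed_files(baseline, current):
    """Return the paths whose digest differs between the two snapshots."""
    return sorted(
        path for path in baseline if baseline[path] != current.get(path)
    )
```

Taking one snapshot at audit start and another at audit end, then checking that `changed_files` returns an empty list, corresponds to the "unmodified" result reported above. Hashing requires reading each file, which fits the two read events described in this finding.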