Is cemoso/pr-review-loop safe?

https://github.com/openclaw/skills/tree/main/skills/cemoso/pr-review-loop

Overall score: 83/100 · SAFE

This skill implements an autonomous PR review loop with Greptile that auto-merges PRs based on review scores. It contains no malicious code, no data exfiltration mechanisms, and no hidden prompt injection. The primary risk is the significant autonomous authority it grants: auto-merging code, force-merging after max rounds, and pushing code fixes without per-action human approval. This creates a trust dependency on the Greptile bot's integrity and, when combined with code-generation skills, could enable a fully autonomous code-to-production pipeline.

Category Scores

Prompt Injection 70/100 · 30%
Data Exfiltration 95/100 · 25%
Code Execution 82/100 · 20%
Clone Behavior 95/100 · 10%
Canary Integrity 100/100 · 10%
Behavioral Reasoning 55/100 · 5%
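
The overall score of 83 is consistent with a weighted average of the category scores, assuming the percentage after each score is that category's weight (the report does not say so explicitly). A minimal sketch of that arithmetic:

```python
# Category scores and weights copied from the list above.
# Treating the percentages as weights is an assumption, not something
# the report states.
CATEGORIES = {
    "Prompt Injection": (70, 30),
    "Data Exfiltration": (95, 25),
    "Code Execution": (82, 20),
    "Clone Behavior": (95, 10),
    "Canary Integrity": (100, 10),
    "Behavioral Reasoning": (55, 5),
}


def weighted_score(categories):
    """Return the weighted average of (score, weight_percent) pairs."""
    total_weight = sum(weight for _, weight in categories.values())
    weighted_sum = sum(score * weight for score, weight in categories.values())
    return weighted_sum / total_weight


overall = weighted_score(CATEGORIES)
print(overall)         # 83.4
print(round(overall))  # 83
```

Under this reading, the two lowest categories (Prompt Injection and Behavioral Reasoning) account for most of the gap between 83 and 100.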

Findings (10)

MEDIUM Autonomous PR merge authority without per-action user confirmation -15

The skill instructs the agent to automatically merge PRs when the Greptile review score is >= 4/5 or when the PR is APPROVED with no comments. This grants the agent significant autonomous authority over the codebase without requiring explicit user approval for each merge action. While this is the stated purpose of the skill, it removes a critical human review checkpoint.
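
The merge rule described above can be sketched as a single predicate. This is an illustration of the rule as the finding states it, not the skill's actual code; the `Review` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    """Hypothetical summary of one Greptile review (field names are assumed)."""
    state: str             # e.g. "APPROVED", "COMMENTED", "CHANGES_REQUESTED"
    score: Optional[int]   # score out of 5, or None if no score was found
    comment_count: int     # number of review comments attached


def should_auto_merge(review: Review) -> bool:
    """Merge when the score is >= 4/5, or the PR is approved with no comments."""
    if review.score is not None and review.score >= 4:
        return True
    return review.state == "APPROVED" and review.comment_count == 0
```

Note that nothing in this predicate consults the user: once either condition holds, the merge proceeds.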

MEDIUM Force-merge escape hatch bypasses quality gates -10

After 5 review rounds or 2 consecutive rounds with the same score, the skill instructs the agent to merge the PR regardless of the review score. This means code that consistently fails review can still be merged into the codebase. The only mitigation is a notification to 'Master'.
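
A minimal sketch of the escape hatch as described, assuming the loop keeps one score per completed round. The function name and the list representation are hypothetical, not taken from the skill.

```python
from typing import List

MAX_ROUNDS = 5  # round limit named in the finding


def should_force_merge(scores: List[int]) -> bool:
    """Return True when the loop would merge regardless of the score.

    `scores` holds one review score per completed round, oldest first.
    """
    if len(scores) >= MAX_ROUNDS:
        return True
    # Two consecutive rounds with the same score also end the loop.
    return len(scores) >= 2 and scores[-1] == scores[-2]
```

Under this reading, `should_force_merge([2, 2])` is true: a PR scored 2/5 twice in a row is merged after only two rounds.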

LOW Autonomous git push in review loop -5

The skill instructs the agent to commit code fixes and push them to the remote repository as part of the review loop without waiting for user confirmation. Each round of fixes is pushed immediately.

LOW GitHub API access with user credentials -5

The skill uses the gh CLI, which operates with the user's authenticated GitHub session. While this is expected for the skill's purpose, it means the skill has read/write access to every repository the user can access, not just the target PR's repository.

LOW Shell script executed with user's full shell environment -8

The pr-review-loop.sh script is executed via bash and has access to the user's full shell environment including all CLI tools and credentials. The script itself is benign but represents a code execution surface.

LOW State file could be tampered with to manipulate merge decisions -5

The review-state.json file tracks round counts and scores. If an attacker can write to the workspace, they could modify this file to trigger a force-merge (by setting rounds >= 5) or reset the round counter to extend the loop.

INFO Script uses grep -oP (Perl regex) requiring GNU grep -5

The shell script uses grep -oP for score parsing which requires GNU grep and will fail on macOS's BSD grep. This is a compatibility issue, not a security issue, but could cause unexpected behavior in the score parsing path.
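
For comparison, a portable way to extract a score that does not depend on GNU grep. This sketch assumes the score appears in the review text as "<digit>/5"; the actual pattern used by pr-review-loop.sh is not shown in this report, so the regular expression below is an assumption.

```python
import re
from typing import Optional

# Assumed format: a single digit from 0 to 5 followed by "/5", e.g. "4/5".
SCORE_RE = re.compile(r"\b([0-5])\s*/\s*5\b")


def parse_score(review_body: str) -> Optional[int]:
    """Return the first score found in the review text, or None if absent."""
    match = SCORE_RE.search(review_body)
    return int(match.group(1)) if match else None
```

Returning `None` when no score is found makes the failure explicit, so a caller can stop or escalate instead of acting on an empty value.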

MEDIUM Trust dependency on third-party Greptile bot integrity -20

The entire merge decision is based on the score from greptile-apps[bot]. If the Greptile service is compromised, or if the bot's GitHub identity can be spoofed, an attacker could manipulate scores to auto-merge malicious PRs. The skill filters reviews by user.login == 'greptile-apps[bot]', which provides some protection but relies entirely on GitHub's bot identity system.
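
The login filter can be sketched as follows, assuming review objects shaped like the GitHub REST API response for listing pull request reviews, where each item carries a `user` object with a `login` field. The sample data is invented for illustration.

```python
BOT_LOGIN = "greptile-apps[bot]"


def greptile_reviews(reviews):
    """Keep only reviews whose author login matches the Greptile bot."""
    return [
        review
        for review in reviews
        # `user` can be null for deleted accounts, so fall back to {}.
        if (review.get("user") or {}).get("login") == BOT_LOGIN
    ]


# Hypothetical sample data, not taken from a real PR.
sample = [
    {"user": {"login": "greptile-apps[bot]"}, "state": "COMMENTED", "body": "4/5"},
    {"user": {"login": "some-human"}, "state": "APPROVED", "body": "5/5"},
    {"user": None, "state": "COMMENTED", "body": "5/5"},
]
print(len(greptile_reviews(sample)))  # 1
```

The filter is only as strong as the login string: it accepts whatever GitHub reports as that account, which is the trust dependency this finding describes.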

MEDIUM Autonomous code-to-merge pipeline with no human checkpoint -15

When combined with skills that create PRs or generate code, this skill could enable a fully autonomous pipeline: generate code → create PR → auto-fix review feedback → auto-merge. No human ever reviews the final merged code. This is a significant concern for supply chain security.

LOW Escalation mechanism limited to architectural decisions only -10

The skill only escalates to a human ('Master') for architectural decisions, max rounds reached, or unclear feedback. Security findings, breaking changes not flagged as architectural, or subtly malicious code suggestions from the reviewer would be auto-fixed and merged without escalation.