For Skill Authors
How to improve your skill's trust score — what Oathe looks for and how to avoid common findings.
Overview
If you author AI agent skills (MCP servers, plugins, or tools), Oathe helps you understand the security posture of your code from an agent’s perspective. This page explains what Oathe evaluates, how to avoid common findings, and how to present your trust score to users.
What Oathe Evaluates
Oathe is a behavioral scanner. It does not just read your source code — it installs and runs your skill in an isolated environment, then monitors what actually happens at runtime. The audit evaluates behavior across six scoring dimensions:
- Prompt Injection — tool descriptions, metadata, and response content
- Data Exfiltration — outbound network requests and data transmission
- Code Execution — shell commands, subprocess spawning, eval/exec calls
- Clone Behavior — misrepresentation of capabilities or identity
- Canary Integrity — token tamper detection
- Behavioral Reasoning — holistic AI judgment of overall patterns
Findings across these dimensions feed into the overall trust score. See Scoring Dimensions for details.
Tips to Improve Your Score
Avoid network calls during installation
Install scripts (postinstall, setup.py, etc.) that make outbound HTTP requests are flagged under data exfiltration. Download dependencies through your package manager, not through custom fetch calls in install hooks.
Do not access files outside your directory
Skills that read from /etc, ~/.ssh, ~/.aws, or other directories outside their own working directory trigger findings in code execution and data exfiltration. If your skill needs to read configuration, document it clearly and use a scoped path.
Keep tool descriptions clean
Tool descriptions that contain phrases resembling prompt injection (e.g., “ignore previous instructions,” “you must always,” “override system prompt”) are flagged under the prompt injection dimension. Write descriptions that are factual and focused on what the tool does.
Declare all dependencies explicitly
Undeclared dependencies that are fetched at runtime look suspicious to the behavioral scanner. List everything in your package.json, requirements.txt, Cargo.toml, or equivalent manifest.
Do not spawn unnecessary subprocesses
If your skill shells out to run commands, each one is logged and evaluated. Avoid using child_process.exec, subprocess.run, or equivalent unless your skill’s core functionality requires it. If it does, keep commands minimal and predictable.
Handle errors without leaking internals
Error messages that include file paths, environment variable values, or stack traces can trigger findings. Return clean, user-facing error messages.
Requesting a Re-Scan
After making improvements, you can request a fresh audit by submitting your skill URL with the force_rescan flag:
curl -X POST https://audit-engine.oathe.ai/api/submit \
-H "Content-Type: application/json" \
-d '{
"skill_url": "https://github.com/your-org/your-skill",
"force_rescan": true
}'
Without force_rescan, Oathe returns the cached result from the previous audit at the same commit. Once you push new code, a standard submission (without the flag) will audit the new commit automatically.
Adding a Trust Badge
Once your score is where you want it, add a trust badge to your README so users can see it at a glance:
[](https://oathe.ai/skills?url=https://github.com/your-org/your-skill)
The badge updates automatically after each new audit. See Trust Badge for full details on syntax, colors, and caching.
Interpreting Your Score
| Score Range | Verdict | What It Means |
|---|---|---|
| 80 - 100 | SAFE | No significant behavioral concerns detected |
| 50 - 79 | CAUTION | Minor findings — review them but likely safe |
| 20 - 49 | DANGEROUS | Notable issues — address before distributing |
| 0 - 19 | MALICIOUS | Dangerous behavior detected — immediate action needed |
Focus on the findings array in your report to understand exactly what was flagged. Each finding includes a description, severity, and score_impact that tells you how much it hurt your score and in which dimension.
Recommended Workflow
- Audit early: Run an Oathe audit during development, not just before release.
- Read the findings: Do not just look at the score — read each finding’s description.
- Fix and re-scan: Address findings, push the fix, and submit with
force_rescan: true. - Automate: Set up a GitHub webhook so every push triggers an audit automatically.
- Display the badge: Show users that your skill has been independently evaluated.