Oathe is a behavioral security scanner for AI agent skills. It audits third-party MCP servers, plugins, and tools by running them in a sandbox and analyzing their behavior. It returns a trust score (0-100), verdict, and detailed findings.

How does Oathe audit MCP servers?

Oathe installs the skill in a sandboxed environment; monitors filesystem, network, and process activity; applies 10 threat detection patterns across 6 security dimensions; and uses AI model grading for independent behavioral analysis.

Is Oathe free?

Yes. Oathe is free to use with no API key required. Submit any GitHub or ClawHub URL and get a full behavioral security audit in under 2 minutes.

How do I check if an MCP server is safe?

Use the Oathe MCP server (npx oathe-mcp) to check any tool before installing it. Call get_skill_summary with the owner and repo name to get a trust score and recommendation. Or visit oathe.ai and submit the URL directly.
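
As a rough illustration of the call described above, the sketch below builds the JSON-RPC 2.0 request that an MCP client would send to invoke get_skill_summary. The "tools/call" method and the envelope shape come from the MCP specification. The argument keys "owner" and "repo" are assumptions based on the description above, and the owner and repo values are placeholders; the input schema published by the server is authoritative.

```python
import json


def build_tool_call(owner: str, repo: str, request_id: int = 1) -> dict:
    """Build an MCP 'tools/call' request for the get_skill_summary tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_skill_summary",
            # Assumed argument keys; check the tool's published input schema.
            "arguments": {"owner": owner, "repo": repo},
        },
    }


# Placeholder owner and repo, for illustration only.
request = build_tool_call("example-owner", "example-repo")
print(json.dumps(request, indent=2))
```

In practice an MCP-aware agent constructs this message itself once the server is registered, so the payload is mainly useful for debugging or for understanding what the agent sends.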

Documentation

Name: oathe-mcp
Author: Oathe

Everything you need to integrate Oathe into your workflow.

Guides

Quick Start

Submit your first behavioral security audit in under 3 minutes. No API key, no signup.

Pre-Install Checks

How to check if a skill is safe before installing it — the recommended workflow for LLMs and agents.

MCP Server

Set up the Oathe MCP server for native tool integration. Check skills before installing them — directly from your AI agent.

CI/CD Integration

Automate security audits on every push or release using GitHub webhooks.

Trust Badge

Add a dynamic Oathe trust badge to your README showing your skill's score and verdict.

Monorepo Support

Audit individual skills inside a monorepo using /tree/ URL paths.
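
To make the /tree/ URL form concrete, here is a minimal sketch that splits a GitHub-style URL into owner, repository, ref, and sub-path. The function name, the returned keys, and the example URL are hypothetical, and the sketch assumes the ref contains no slashes (a branch such as feature/x cannot be told apart from a path this way). How Oathe itself parses these URLs is not specified here.

```python
from urllib.parse import urlparse


def split_tree_url(url: str) -> dict:
    """Split https://host/<owner>/<repo>/tree/<ref>/<subpath> into parts."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("URL must contain at least an owner and a repo")
    owner, repo = parts[0], parts[1]
    if len(parts) >= 4 and parts[2] == "tree":
        # Assumes a single-segment ref such as 'main'.
        ref, subpath = parts[3], "/".join(parts[4:])
    else:
        ref, subpath = None, ""
    return {"owner": owner, "repo": repo, "ref": ref, "subpath": subpath}


# Placeholder URL, for illustration only.
example = split_tree_url(
    "https://github.com/example-owner/example-monorepo/tree/main/skills/weather"
)
print(example)
```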

Concepts

How Oathe Works

Understand the behavioral audit pipeline, threat pattern matching, AI grading, and scoring that powers every Oathe audit.

Trust Score

How the 0-100 trust score is calculated from dimension scores, weighted aggregation, and verdict mapping.
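
The general idea of weighted aggregation followed by verdict mapping can be sketched as follows. Everything numeric here is hypothetical: the dimension names are placeholders, and the weights and verdict thresholds are invented for illustration, not Oathe's actual values. Only the 0-100 range, the count of six dimensions, and the four verdict labels come from this documentation.

```python
# Hypothetical weights in percent; they must sum to 100.
WEIGHTS = {
    "dimension_a": 25,
    "dimension_b": 20,
    "dimension_c": 20,
    "dimension_d": 15,
    "dimension_e": 10,
    "dimension_f": 10,
}

# Hypothetical lower bounds for each verdict, highest first.
THRESHOLDS = [(80, "SAFE"), (50, "CAUTION"), (20, "DANGEROUS"), (0, "MALICIOUS")]


def trust_score(dimension_scores: dict) -> int:
    """Weighted average of per-dimension scores, each in the range 0-100."""
    total = sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
    return round(total / 100)


def verdict(score: int) -> str:
    """Map a 0-100 score to the first verdict whose lower bound it meets."""
    for floor, label in THRESHOLDS:
        if score >= floor:
            return label
    return "MALICIOUS"


scores = {
    "dimension_a": 100,
    "dimension_b": 80,
    "dimension_c": 70,
    "dimension_d": 60,
    "dimension_e": 50,
    "dimension_f": 40,
}
score = trust_score(scores)
print(score, verdict(score))
```

Integer percent weights are used so the arithmetic stays exact; with the sample inputs the weighted sum is 7300, which gives a score of 73.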

Verdicts & Recommendations

What SAFE, CAUTION, DANGEROUS, and MALICIOUS mean — and what action to take for each.

Scoring Dimensions

The six security dimensions that make up the trust score and what each one evaluates.

Understanding Findings

How to read findings in an audit report — pattern IDs, severity levels, score impact, and detection agreement.

Audit Lifecycle

The stages an audit passes through from submission to final report — queued, scanning, analyzing, summarizing, finalizing, complete.
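
A client waiting on an audit would typically poll until the final stage is reached. The sketch below shows one way to do that using the stage names listed above. The fetch_status callable is a hypothetical stand-in for whatever status lookup you use, and the polling interval and limit are arbitrary defaults, not documented values.

```python
import time
from typing import Callable, List

# Stage names as listed in the Audit Lifecycle description.
STAGES = ["queued", "scanning", "analyzing", "summarizing", "finalizing", "complete"]


def wait_for_complete(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until the audit reports 'complete'; return the stages observed."""
    seen: List[str] = []
    for _ in range(max_polls):
        status = fetch_status()
        if status not in STAGES:
            # Surface any status this sketch does not know about.
            raise ValueError(f"unexpected audit status: {status!r}")
        if not seen or seen[-1] != status:
            seen.append(status)
        if status == "complete":
            return seen
        sleep(poll_seconds)
    raise TimeoutError("audit did not complete within the polling budget")


# Simulated status feed so the sketch runs without any network access.
fake_feed = iter(["queued", "scanning", "scanning", "analyzing", "complete"])
observed = wait_for_complete(lambda: next(fake_feed), sleep=lambda _: None)
print(observed)
```

Passing sleep as a parameter keeps the loop testable without real delays; a stage may be skipped between polls, so the function records what it saw instead of requiring every stage to appear.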

Methodology Versioning

How Oathe tracks scoring methodology changes and marks outdated audits as stale.

Threat Patterns

T1.1 Direct Exfiltration

Detects attempts to send data to external endpoints via HTTP, DNS, or raw sockets.

T1.5 Credential Harvest

Detects access to credential and secret files such as SSH keys, AWS credentials, and browser storage.

T2.1 Filesystem Escape

Detects path traversal, symlink attacks, and file access outside the skill directory.
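
For readers unfamiliar with this class of issue, the following is a minimal sketch of a containment check that catches both ../ traversal and symlinks pointing outside a directory. It illustrates the general technique only and is not Oathe's detection logic; the function name and the paths are hypothetical.

```python
import os


def escapes_directory(skill_dir: str, candidate: str) -> bool:
    """Return True if candidate resolves to a location outside skill_dir."""
    # realpath collapses '..' segments and follows any existing symlinks.
    root = os.path.realpath(skill_dir)
    target = os.path.realpath(os.path.join(root, candidate))
    # The target is contained only if the shared prefix is the root itself.
    return os.path.commonpath([root, target]) != root


print(escapes_directory("/tmp/skill", "data/config.json"))
print(escapes_directory("/tmp/skill", "../../etc/passwd"))
```

Comparing with os.path.commonpath avoids the classic prefix bug where /tmp/skill-evil would pass a naive startswith("/tmp/skill") test.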

T2.2 Process Spawning

Detects suspicious process execution, shell invocation, and privilege escalation attempts.

T3.1 Prompt Injection

Detects prompt injection patterns in skill content designed to manipulate LLM behavior.

T3.2 Manifest Spoofing

Detects hidden install scripts, mismatched metadata, and deceptive package manifests.

T4.1 File Drops

Detects files created outside the skill directory during install or runtime.

T5.1 Cryptomining

Detects cryptocurrency mining activities, mining binaries, and related resource abuse.

T5.2 Denial of Service

Detects fork bombs, infinite loops, disk fills, and resource exhaustion attacks.

T6.1 Environment Sensing

Detects environment fingerprinting, analysis evasion, and conditional execution based on runtime context.

Troubleshooting

FAQ

Frequently asked questions about Oathe — how it works, what it costs, and common integration questions.

Troubleshooting

Solutions for common issues with audit submissions, API integration, and report interpretation.

Rate Limits

Rate limiting policy for the Oathe API — what's limited, what's not, and how to handle 429 responses.
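
One common way to handle 429 responses is to retry with a delay, as in the sketch below. The send callable is a hypothetical stand-in for your HTTP request and returns a (status, headers, body) tuple. Whether the Oathe API sends a Retry-After header is an assumption; the sketch falls back to exponential backoff when the header is absent, and the attempt limit and base delay are arbitrary.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Retry send() while it returns HTTP 429, waiting between attempts."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break
        # Prefer the server's hint (in seconds) if present; otherwise back off.
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * (2 ** attempt)
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")


# Simulated responses so the sketch runs without any network access.
fake_responses = iter([(429, {"Retry-After": "3"}, ""), (429, {}, ""), (200, {}, "ok")])
waits = []
result = call_with_backoff(lambda: next(fake_responses), sleep=waits.append)
print(result[0], waits)
```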

Error Codes

HTTP status codes, error response format, and URL validation rules for the Oathe API.

For Skill Authors

How to improve your skill's trust score — what Oathe looks for and how to avoid common findings.