Scanning AI Agent Skills

Note

Skill Scanning requires the AI Catalog Security bundle.

This capability is currently in beta.

JFrog's Skill Scanner analyzes AI agent skill packages before they reach developers and agents.

As AI agents become part of the software supply chain, the skills they consume introduce new attack surfaces. Skill packages can contain malicious instructions, prompt injections, unauthorized data exfiltration logic, or references to untrusted third-party code. JFrog's Skill Scanner addresses these risks by scanning skill packages at upload time and producing two levels of security insights:

Malicious Behavior: The scanner found high-confidence evidence that the skill behaves maliciously.
Suspicious Behavior: The scanner detected behavior that could be malicious or benign, depending on context that is not available to the scanner. For example, a skill might instruct the AI agent to install an unverified third-party skill from GitHub for sensitive on-chain actions. Such a skill can act as a dropper for untrusted code from outside the official marketplace, but it can also be legitimate.

Scan results flow into the standard Xray vulnerability pipeline, meaning you can use existing Xray policies, violations, and watches to enforce security governance on skills just as you would on any other package type.

Scanner Targets

The skill scanner can identify general malicious behaviors in skills, with particular focus on:

Download and execution of code from untrusted sources
Exfiltration of sensitive information (passwords, access tokens, environment variables)
Prompt injection attacks that attempt to manipulate the AI agent
Unauthorized code installation from outside the official marketplace

Prerequisites

Xray: 3.145.0
Artifactory: 7.144.2

How Skill Scanning Works

1. Upload and Indexing

After uploading a skill package (.zip file) to your Artifactory Skills repository, Artifactory detects the package by locating a SKILL.md file at the root of the archive. It then extracts metadata from the file—such as name, version, and description—and applies these as properties on the artifact (skill.name, skill.version, skill.description, skill.fingerprint). Finally, Artifactory sends a webhook to Xray, which indexes the artifact through the standard indexing pipeline.
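The detection and metadata step can be illustrated with a minimal sketch. It is not Artifactory's implementation: it assumes SKILL.md begins with a block of "key: value" lines between two "---" markers, and the function names read_skill_metadata and to_properties are hypothetical. The skill.fingerprint property is left out because the text does not say how it is derived.

```python
import io
import zipfile
from typing import Dict, Optional


def read_skill_metadata(archive_bytes: bytes) -> Optional[Dict[str, str]]:
    """Return header fields from a root-level SKILL.md, or None if it is absent."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        # Only a SKILL.md at the archive root counts; nested copies are ignored.
        if "SKILL.md" not in archive.namelist():
            return None
        text = archive.read("SKILL.md").decode("utf-8")

    lines = text.splitlines()
    metadata: Dict[str, str] = {}
    # Assumed layout: "key: value" lines between two "---" markers.
    if lines and lines[0].strip() == "---":
        for line in lines[1:]:
            if line.strip() == "---":
                break
            key, separator, value = line.partition(":")
            if separator:
                metadata[key.strip()] = value.strip()
    return metadata


def to_properties(metadata: Dict[str, str]) -> Dict[str, str]:
    """Name the extracted fields the way the artifact properties are named."""
    wanted = ("name", "version", "description")
    return {f"skill.{key}": metadata[key] for key in wanted if key in metadata}


def build_example_archive() -> bytes:
    """Create an in-memory skill package used only for this illustration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "SKILL.md",
            "---\nname: pdf-helper\nversion: 1.0.0\n"
            "description: Extracts text from PDFs\n---\n# PDF Helper\n",
        )
    return buffer.getvalue()


properties = to_properties(read_skill_metadata(build_example_archive()) or {})
print(properties)
```

A check like this can be useful before upload, to confirm that a package has SKILL.md at the archive root rather than inside a subfolder.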

2. Scanning

After indexing, the artifact is routed to the dedicated AI Scanner microservice (xray-aiscanner). The AI Scanner downloads the skill artifact from Artifactory and submits it to the external skill scanning service, which analyzes the skill and returns a classification of safe, suspicious, or malicious. This classification is then mapped to a vulnerability ID and fed into the standard Xray analysis pipeline.

3. Policy Enforcement

The skill artifact appears in the Xray Scans List with any detected vulnerabilities. Standard Xray policies and watches then evaluate the findings and enforce the configured actions, triggering violations, blocking downloads, or sending notifications as applicable.

Scanner Capabilities

JFrog's Skill Scanner combines deterministic analysis (e.g., YARA rules, static signatures) with a multi-agent investigation pipeline — specialized LLM-powered security agents working like an internal red team.

Capability	Description
Multi-Lens Discovery	Parallel agents examine each skill from different angles: malware analysis, threat intelligence, deception/prompt-injection detection, and artifact/dependency inventory
Orchestrated Investigation	An orchestrator directs specialist sub-agents (each designed to examine a different security aspect) to chase each finding to a confirmed conclusion with full evidence chains
Supply-Chain Intelligence	Integration with JFrog Catalog for vulnerability data, popularity metrics, and dependency risk signals on every referenced package
OSINT Enrichment	Domain reputation, publicly available tools for ownership validation, and vulnerability database lookups for external references
Prompt Injection Resistance	Layered defenses including adversarial system prompts, structural content separation, multi-agent cross-validation, and dedicated deception detection
Result Integrity Validation	Post-pipeline checks for completeness, consistency, and schema conformance to detect if a skill successfully manipulated any agent
Two Scan Modes	Quick scan for high-volume triage; deep scan for thorough multi-agent investigation
Structured, Auditable Output	Every finding includes severity, confidence, evidence chains, and actionable remediation guidance. Results are persisted for audit trails

Viewing Scan Results

After a skill is scanned, it appears in Xray > Scans List like any other scanned artifact.
The scan results report each classification as a vulnerability severity:

Malicious skills: Critical severity
Suspicious skills: High severity
Safe skills: no vulnerabilities detected
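The relationship between the scanner's classification and the severity shown in Xray can be written as a small lookup. This is only a sketch of the mapping described above; the names SEVERITY_BY_CLASSIFICATION and severity_for are hypothetical and do not correspond to any Xray API.

```python
from typing import Dict, Optional

# Classification returned by the scanner -> severity reported in Xray.
# None means the skill is reported with no vulnerabilities.
SEVERITY_BY_CLASSIFICATION: Dict[str, Optional[str]] = {
    "malicious": "Critical",
    "suspicious": "High",
    "safe": None,
}


def severity_for(classification: str) -> Optional[str]:
    """Look up the severity for a classification, rejecting unknown values."""
    key = classification.strip().lower()
    if key not in SEVERITY_BY_CLASSIFICATION:
        raise ValueError(f"unknown classification: {classification!r}")
    return SEVERITY_BY_CLASSIFICATION[key]


for label in ("malicious", "suspicious", "safe"):
    print(label, "->", severity_for(label))
```

Because the two flagged outcomes land on distinct severities, a policy can treat them differently by choosing its minimal severity threshold.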

Scan Result Details

Each finding includes:

The vulnerability ID and severity
A description of the detected behavior
The reason provided by the scanner explaining why the skill was flagged
Evidence chains supporting the finding

Creating Xray Policies for Skills

You can create standard Xray security policies to govern skill packages. This allows you to:

Block malicious skills — Create a policy with a security condition that blocks packages with Critical vulnerabilities to prevent download of malicious skills.
Flag suspicious skills — Create a policy that triggers notifications or dry-run violations for High severity findings, allowing security teams to review suspicious skills before they are used.
Enforce on Skills repositories — Attach policies to Watches that target your Skills repositories.
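For teams that manage Watches as code, the following sketch builds a request body for a Watch that targets a Skills repository. The field names follow the Xray REST API v2 watch format as commonly documented (POST /xray/api/v2/watches), but treat them as an assumption and verify them against the REST API reference for your Xray version. The watch, repository, and policy names are hypothetical.

```python
import json
from typing import Dict, List


def build_watch(name: str, repositories: List[str], policy_name: str) -> Dict:
    """Build a Watch body covering the given repositories (assumed v2 schema)."""
    return {
        "general_data": {
            "name": name,
            "description": "Watches Skills repositories",
            "active": True,
        },
        "project_resources": {
            "resources": [
                # "default" is the usual binary manager ID; adjust if yours differs.
                {"type": "repository", "bin_mgr_id": "default", "name": repo}
                for repo in repositories
            ]
        },
        "assigned_policies": [{"name": policy_name, "type": "security"}],
    }


# Hypothetical names, used only for illustration.
watch = build_watch("skills-watch", ["skills-local"], "block-malicious-skills")
print(json.dumps(watch, indent=2))
```

The sketch only prints the body. Sending it requires an authenticated request to your own JFrog Platform instance.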

Example: Block Malicious Skills

Navigate to Xray > Watches & Policies.
Click New Policy, enter a name (e.g., "Block Malicious Skills"), and select Security as the type.
Configure the rule:
- Select CVEs as the rule type.
- Set Minimal Severity to Critical.
- Set the action to Block Download.
Save the policy.
Create or update a Watch that includes your Skills repositories, and assign the policy to it.
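The same policy can be expressed as a REST request body instead of UI steps. The sketch below assumes the Xray REST API v2 policy format (POST /xray/api/v2/policies); confirm the field names against the REST API reference for your Xray version. The policy and rule names are hypothetical.

```python
import json
from typing import Dict


def build_block_policy(name: str) -> Dict:
    """Build a security policy that blocks download on Critical findings."""
    return {
        "name": name,
        "type": "security",
        "description": "Blocks download of skills flagged as malicious",
        "rules": [
            {
                "name": "block-critical",
                "priority": 1,
                # Malicious skills are reported with Critical severity.
                "criteria": {"min_severity": "Critical"},
                "actions": {
                    "block_download": {"active": True, "unscanned": False},
                },
            }
        ],
    }


# Hypothetical policy name, used only for illustration.
block_policy = build_block_policy("block-malicious-skills")
print(json.dumps(block_policy, indent=2))
```

Setting the minimal severity to Critical keeps the block narrow: High-severity findings (suspicious skills) do not trigger this rule.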

Example: Alert on Suspicious Skills

Create a new Security policy (e.g., "Alert Suspicious Skills").
Configure the rule:
- Select CVEs as the rule type.
- Set Minimal Severity to High.
- Set the action to Generate Violation (or Dry Run for initial evaluation).
- Configure email notifications to alert the security team.
Save the policy and attach it to a Watch that targets your Skills repositories.
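An alerting policy can be sketched as a request body in the same way. As before, the field names assume the Xray REST API v2 policy format and should be checked against the REST API reference for your version. The policy name, rule name, and email address are hypothetical, and the Dry Run option is not modeled here.

```python
import json
from typing import Dict, List


def build_alert_policy(name: str, recipients: List[str]) -> Dict:
    """Build a security policy that raises violations and sends email on High findings."""
    return {
        "name": name,
        "type": "security",
        "description": "Alerts the security team about suspicious skills",
        "rules": [
            {
                "name": "alert-high",
                "priority": 1,
                # A minimum of High also matches Critical findings.
                "criteria": {"min_severity": "High"},
                "actions": {
                    "mails": list(recipients),
                    "notify_watch_recipients": True,
                    # Alert only: downloads are not blocked by this rule.
                    "block_download": {"active": False, "unscanned": False},
                },
            }
        ],
    }


# Hypothetical name and address, used only for illustration.
alert_policy = build_alert_policy("alert-suspicious-skills", ["security@example.com"])
print(json.dumps(alert_policy, indent=2))
```

Because the threshold is a minimum, this policy also fires for malicious skills. Pairing it with a separate blocking policy at Critical lets the two outcomes be handled differently.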