Vigil Defense · Four modes active · Kill Switch every tier

Prevent. Repair. Defend. Hunt.

Most AI security stops at prevention. When prevention fails, damage is not optional. Only accountability is. Vigil runs four modes simultaneously, with an emergency Kill Switch across every tier, and the only Execution Gate on the internet that can hold an AI action mid-flight.

See the Execution Gate · Run the simulation · Download for macOS →

4

Defense modes
Running in parallel

<10ms

Hold latency
Pre-execution Gate

3

Kill Switch layers
Local · OAuth · Network

VOAF

Sealed evidence
Court and insurance

Why four modes

Defense has four failure modes. So does Vigil.

Any AI defense layer that ships only prevention has accepted that failure is permanent. Vigil rejects that premise. Each failure mode of defense gets a dedicated mode of response. They run in parallel. They share one engine.

Failure 01

The attack gets through.

Prevention is probabilistic. A novel attack, a missed signal, a compromised provider. Something lands.

Shield →

Failure 02

The action already executed.

By the time a breach is visible, money has moved, messages have been sent, contracts have changed. Damage is live.

Repair →

Failure 03

The adversary comes back.

Persistent attackers do not fire once. They probe, adapt, retry. Static defenses wear down. Active ones learn.

Sentinel →

Failure 04

The threat is distributed.

The attack on one user is a rehearsal for the next. Individual defense alone cannot scale. Collective defense can.

Warden →

Live now

Shield.

Prevent + Audit

Every AI interaction intercepted. Every action decomposed on two surfaces. Every decision logged to a tamper-evident chain. Shield is the baseline mode. It runs on every tier, from Guardian on day one, and its evidence is the substrate every other mode builds on.

Included: Guardian · Fortress · Citadel · Sovereign

The attack · without Shield

Content injection, now measured.

Google DeepMind researchers documented commandeering of AI agents in up to 86 percent of tests with human-crafted prompts hidden in web content. Adversarial instructions embedded in HTML metadata and aria-label attributes altered agent outputs in 15 to 29 percent of cases. The surface is machine-parseable content the human never sees.

A research AI pulls a public page. Instructions buried in the markup exfiltrate calendar entries, contacts, and financial context, phrased as onboarding. The provider call looks identical to a legitimate one. The only evidence is the damage, weeks later, with no audit trail to trace.

Source · "AI Agent Traps" · Franklin, Tomašev, Jacobs, Leibo, Osindero · Google DeepMind · SSRN, March 2026 · Category 1 of 6: Content Injection Traps

TLS-terminating proxy

Intercepts every AI provider call at the network layer. No SDK, no plugin, no provider cooperation required.

Four-model detection ensemble

Isolation Forest, LSTM, Bayesian, Multi-Window CUSUM. Four orthogonal models. Each blind to failure modes the others catch.
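To make one of the four model families concrete, here is a minimal sketch of a one-sided CUSUM change detector evaluated against baselines from several trailing windows. It is an illustration only, not Vigil's implementation: the function names `cusum_alarm` and `multi_window_cusum`, the window sizes, the slack `k`, and the threshold `h` are all assumptions.

```python
from statistics import mean, pstdev


def cusum_alarm(values, baseline_mean, baseline_std, k=0.5, h=4.0):
    """One-sided CUSUM. Returns the index of the first alarm, or None.

    k (slack) and h (alarm threshold) are expressed in baseline
    standard deviations. Both defaults are illustrative assumptions.
    """
    s = 0.0
    for i, x in enumerate(values):
        z = (x - baseline_mean) / baseline_std
        s = max(0.0, s + z - k)  # accumulate only upward drift beyond the slack
        if s > h:
            return i
    return None


def multi_window_cusum(history, recent, windows=(10, 30)):
    """Run the same detector against baselines taken from several
    trailing windows of history. Returns {window_size: alarm_index_or_None}."""
    results = {}
    for w in windows:
        base = history[-w:]
        mu = mean(base)
        sigma = pstdev(base) or 1e-9  # guard against a perfectly flat baseline
        results[w] = cusum_alarm(recent, mu, sigma)
    return results


# Hypothetical per-hour action counts for one agent.
history = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10] * 3
recent = [10, 11, 13, 14, 15, 16]
alarms = multi_window_cusum(history, recent)
print(alarms)
```

A short window reacts quickly to a recent shift in behavior, while a long window resists an attacker who moves the baseline slowly. Comparing the two is one plausible reason to run CUSUM over more than one window.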

Two-surface pipeline

Intent parsed on the request surface. Action scored on the response surface. Enforcement on the surface that can hurt you.

Tamper-evident chain

Every decision appended to a cryptographically sealed log. Standalone verifiable via vigil-verify. No Vigil dependency at audit time.
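The general idea behind a tamper-evident log can be sketched as a SHA-256 hash chain, where each entry commits to the one before it. The format below is an assumption made for illustration. It is not the Vigil log format or the `vigil-verify` algorithm, and `append_entry` and `verify_chain` are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash, payload):
    # Canonical JSON so the same payload always hashes to the same value.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain, payload):
    """Append a decision record that commits to the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {
        "payload": payload,
        "prev": prev_hash,
        "hash": _digest(prev_hash, payload),
    }
    chain.append(entry)
    return entry


def verify_chain(chain):
    """Recompute every hash from the start. Any edit breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"action": "send_email", "decision": "allow"})
append_entry(log, {"action": "wire_transfer", "decision": "hold"})
ok_before = verify_chain(log)

log[0]["payload"]["decision"] = "deny"  # simulate after-the-fact tampering
ok_after = verify_chain(log)
print(ok_before, ok_after)
```

Verification needs only the log itself and a hash function, which is what makes a standalone verifier possible. A chain on its own does not stop someone from rewriting every entry, so a real system would also sign or externally anchor the latest hash.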

Local-first, on device

Detection and enforcement run on your hardware. Nothing leaves unless you export it. Cloud only for threat intel and sync.

Live now

Repair.

Correct + Rollback

When prevention fails, the question is no longer whether damage happened. It is how much damage is still reversible. Repair holds high-risk actions pre-execution via the Execution Gate, rolls back where providers allow, and seals court-admissible evidence either way.

Included: Fortress · Citadel · Sovereign

The attack · without Repair

Memory poisoning, now costed.

Google DeepMind researchers document RAG knowledge-base poisoning with attack success rates exceeding 80 percent at less than 0.1 percent corpus contamination. A handful of planted documents. The agent treats attacker-controlled content as verified fact, across every downstream query that touches the poisoned entries.

An advisor AI reads a planted filing and rebalances $340K on tax logic that does not exist. Compliance flags it 47 days later, long after the window to reverse. The carrier asks for the decision path. There is none. Claim denied. The loss is final.

Source · "AI Agent Traps" · Franklin, Tomašev, Jacobs, Leibo, Osindero · Google DeepMind · SSRN, March 2026 · Category 3 of 6: Cognitive State Traps

Execution Gate pre-hold

Any action above the agency threshold held pre-submit. Mobile approval in seconds. Deny path cancels the action before it leaves the device.

VOAF evidence package

Cryptographically sealed audit package. Court-admissible. Insurance-admissible. Third-party verifiable with no Vigil dependency at claim time.

VARP revocation cascade

If an agent is compromised, TAP certificates are revoked across every provider that implements VARP. One-second propagation.

Transaction rollback

Where providers support reversal (messages, drafts, holds, unsigned contracts), Vigil triggers the rollback path automatically.

Damage assessment

Post-incident, Vigil reconstructs the full decision chain, identifies every downstream action, and generates the evidence pack you hand to counsel.

Patent filed · VIGIL-2026-001

The Execution Gate. Hold an AI action mid-flight.

The only place on the internet that can stop an AI agent between decision and action. No provider ships this because no provider can hold their own outbound request. Vigil sits outside every provider. That is the reason the Gate exists at all.

Stage 01

Detect

Sub-10ms composite score

Every outbound AI action passes through the two-surface pipeline. Four detection models produce a composite risk score on the response surface.

Agency category scored
Scope class evaluated
Baseline drift measured
Cross-surface correlation checked

Stage 02

Hold

Pre-execution pause

If the score crosses policy threshold, the action is paused before the provider ever sees it. The AI has not acted yet. The window to reverse is still open, because the action has not happened.

Action held at proxy
Mobile push dispatched
Diff surfaced to user
Countdown to auto-cancel

Stage 03

Resolve

Approve, deny, seal

User approves, denies, or lets the timer expire. The engine records the decision, executes or cancels accordingly, and seals the full event into a VOAF package. Every outcome is evidence.

Approve: action released
Deny: action cancelled
Timeout: auto-cancel
VOAF sealed either way
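The three stages above can be read as a small state machine: score, hold if the threshold is crossed, then release or cancel. The sketch below is one way to model that flow under stated assumptions. The names `gate`, `resolve`, and `HeldAction`, the 60-second timeout, and the plain event list standing in for a sealed VOAF record are all hypothetical. Timestamps are passed in explicitly so the logic is easy to test.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    RELEASED = "released"
    CANCELLED = "cancelled"


@dataclass
class HeldAction:
    description: str
    risk_score: float
    held_at: float
    timeout_s: float = 60.0  # assumed countdown length
    events: list = field(default_factory=list)  # stand-in for the evidence record


def gate(description, risk_score, threshold, now):
    """Stage 2: hold the action if its score crosses the policy threshold.
    Returns None when the action passes straight through."""
    if risk_score < threshold:
        return None
    action = HeldAction(description, risk_score, held_at=now)
    action.events.append(("held", now))
    return action


def resolve(action, user_decision, now):
    """Stage 3: approve releases, deny cancels, an expired timer cancels.
    user_decision is 'approve', 'deny', or None if the user has not answered.
    Returns None while the action is still held."""
    expired = (now - action.held_at) >= action.timeout_s
    if expired:
        outcome, reason = Outcome.CANCELLED, "timeout"
    elif user_decision == "approve":
        outcome, reason = Outcome.RELEASED, "approve"
    elif user_decision == "deny":
        outcome, reason = Outcome.CANCELLED, "deny"
    else:
        return None
    action.events.append((reason, now))  # every outcome is recorded
    return outcome


held = gate("rebalance portfolio", risk_score=0.91, threshold=0.70, now=0.0)
print(resolve(held, None, now=10.0))    # still held
print(resolve(held, "deny", now=12.0))  # cancelled by the user
```

The timeout is checked before the user's answer, so a late approval cannot release an action whose countdown has already expired. That matches the auto-cancel default described above.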

Hold latency (P99)

<10ms

Approval surface

Mobile push

Evidence format

VOAF

Available from

Fortress

Why only Vigil

Position, not capability.

Every frontier lab can detect agent anomalies. None of them can hold an outbound request from their own model. The action is already on the wire by the time their model has decided.

Vigil sits at the TLS boundary between the user's device and the provider. That single architectural fact is what makes the Gate possible. The patent protects how we use that position to decompose, score, hold, and resolve a live action without breaking the provider contract.

This is not a feature any frontier lab can replicate from inside their own API. It is the reason the defense layer must come from the outside.

Citadel + Sovereign

Sentinel.

Defend + Deter

Persistent adversaries do not fire once. They probe, iterate, and adapt to your baseline. Sentinel assumes the adversary is learning and responds by doing the same thing first. Continuous behavioral monitoring. Adversary fingerprinting. Honeypot endpoints that make every attack expensive for the attacker and evidence for the defender.

Included: Citadel · Sovereign · Enterprise Gateway

The attack · without Sentinel

Probe, adapt, succeed.

A threat actor targets a finance executive. First attempt is flagged and blocked. Second attempt learns from the block and slips past. Third attempt is inside the baseline.

Static defense is perfect the first time, adequate the second, blind the third. The adversary wins by patience. You do not get a second chance to notice the pattern.

24/7 behavioral monitoring

Continuous baseline refinement across every agent, surface, and provider. Drift flagged in real time. Attack onset detected before damage.

Adversary fingerprinting

Every adversarial attempt leaves a signature. Sentinel reverses the fingerprint and binds it to network, endpoint, and behavioral identifiers.

Honeypot endpoints

Synthetic attack surfaces inside your perimeter. Adversaries who take the bait reveal themselves and their full playbook before they reach anything real.

Cross-surface correlation

An attack that starts on one surface almost never stays there. Sentinel correlates signals across calendar, email, code, banking, and health agents.

Evidence-as-deterrent

Every attack becomes a cryptographically sealed VOAF record. Submittable to law enforcement. Shareable with peer defenders. Attackers pay a real cost for trying.

Citadel + Sovereign

Warden.

Hunt + Reclaim

A threat to one user is a rehearsal for a thousand. Warden turns every Vigil deployment into a sensor for the network. Anonymized attack signatures feed a shared intel layer. When one user is attacked, every other user is pre-defended. Every new user makes every existing user safer. That is the moat.

Included: Citadel · Sovereign · Enterprise Gateway

The attack · without Warden

Systemic trap, cascading.

Google DeepMind's fifth trap category covers systemic attacks that weaponize multi-agent dynamics. Coordinated environmental signals push one compromised agent into cascading failure across an ecosystem. The paper also documents behavioral-control exfiltration against Microsoft M365 Copilot succeeding in 10 of 10 tested scenarios. Attack kits are reproducible. Defense is not, unless it is shared.

An adversary refines one trap against one user, succeeds, and sells the kit. It hits the next user the same week. Then the next. Each defender meets it as novel. The attacker's work compounds. The defender's does not.

Source · "AI Agent Traps" · Franklin, Tomašev, Jacobs, Leibo, Osindero · Google DeepMind · SSRN, March 2026 · Category 5 of 6: Systemic Traps

Network threat intel

Anonymized attack fingerprints shared across the Vigil network. New signatures distributed in minutes. Every deployment becomes a defender for every other.

Cross-user fingerprint matching

An attack pattern seen once is blocked pre-attempt on every other node. The adversary's second target is as well defended as the thousandth.

Pre-attack defense

Warden closes the window between fingerprint publication and active defense. Your engine already knows the signature by the time the attacker reaches you.

Adversary attribution

Patterns across the network surface the persistent adversaries behind repeated campaigns. Evidence coalesces across users, without exposing any one of them.

Network-effect moat

The more Vigil deployments exist, the better every deployment gets. This is why adoption is the asset, and why the engine is one codebase across consumer and enterprise.

Emergency · every tier from Guardian

One tap. Every AI. Every provider. Every token.

When containment is not enough, revocation is. The Kill Switch is three layers of cryptographic cutoff that can be triggered from the mobile app, the desktop app, the mobile approval notification, or automatically by policy. No other AI security product ships this.

The switch

Kill Switch

Three layers. Sub-second to Layer 1. One hour to last trust expiry.

Layer 1 takes effect locally in under a second. Layer 2 cascades revocation of OAuth tokens and API keys across every connected provider in parallel. Layer 3 revokes network trust with 1-hour certificate expiry and a dead-man switch that fires if Vigil itself goes offline.

Guardian upward

Layer 01

Local lockdown.

<1 second · On device

The proxy stops accepting new AI provider calls. In-flight calls are terminated. Local TAP certificates are invalidated. The engine enters a cold-start state requiring explicit user re-authentication.

Proxy halted
In-flight calls killed
Local TAP invalidated
Engine cold-start required

Layer 02

OAuth + API cascade.

Seconds · Cross-provider

Every stored OAuth token, API key, and delegated credential is revoked in parallel across every provider the user has connected. Providers that support revocation APIs process instantly. The rest are flagged for manual rotation.

OAuth tokens revoked
API keys rotated
Delegated creds pulled
Provider confirmations logged
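A parallel revocation cascade of the kind Layer 02 describes might look like the sketch below. The provider records and their `revoke` callables are hypothetical stand-ins, not real provider APIs. Treating a failed revocation call the same as a missing revocation API is an assumption of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def revoke_one(provider):
    """Try to revoke credentials at one provider.

    provider is a dict with a 'name' and an optional 'revoke' callable
    (a hypothetical stand-in for a provider's revocation endpoint).
    """
    revoke = provider.get("revoke")
    if revoke is None:
        return provider["name"], "manual_rotation"  # no revocation API
    try:
        revoke()
        return provider["name"], "revoked"
    except Exception:
        # Assumption: a failed call is also flagged for manual rotation.
        return provider["name"], "manual_rotation"


def cascade(providers):
    """Revoke across all providers in parallel and return a status map."""
    if not providers:
        return {}
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        return dict(pool.map(revoke_one, providers))


revoked_log = []
providers = [
    {"name": "mail", "revoke": lambda: revoked_log.append("mail")},
    {"name": "calendar", "revoke": lambda: revoked_log.append("calendar")},
    {"name": "legacy-crm"},  # no revocation API available
]
status = cascade(providers)
print(status)
```

Running the calls concurrently means the slowest provider bounds the total time, rather than the sum of all providers. The returned status map is the kind of record that could feed the logged provider confirmations.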

Layer 03

Network revocation.

1-hour expiry · Dead-man switch

VARP broadcasts revocation to every peer that implements the protocol. TAP certificates expire in one hour regardless. If Vigil itself goes offline without a heartbeat, the dead-man switch auto-triggers Layer 3 network-wide.

VARP broadcast initiated
TAP cert 1-hour TTL
Dead-man switch armed
Network trust withdrawn
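The two mechanisms in Layer 03, a short certificate lifetime and a heartbeat-based dead-man switch, can be sketched as follows. Only the one-hour TTL comes from the text above. The class names, the `max_silence_s` parameter, and the assumption that the check runs on the peer side (since Vigil itself would be offline) are illustrative.

```python
from dataclasses import dataclass

CERT_TTL_S = 3600  # one-hour TTL, as stated for TAP certificates


@dataclass
class TapCert:
    issued_at: float
    revoked: bool = False

    def is_valid(self, now):
        # A certificate dies on explicit revocation or when its TTL runs out,
        # whichever comes first.
        return not self.revoked and (now - self.issued_at) < CERT_TTL_S


class DeadManSwitch:
    """Fires if no heartbeat arrives within max_silence_s (assumed value)."""

    def __init__(self, max_silence_s, now):
        self.max_silence_s = max_silence_s
        self.last_heartbeat = now
        self.triggered = False

    def heartbeat(self, now):
        if not self.triggered:  # once fired, the switch stays fired
            self.last_heartbeat = now

    def check(self, now, certs):
        """Revoke every certificate if the heartbeat has gone silent."""
        silent_for = now - self.last_heartbeat
        if not self.triggered and silent_for > self.max_silence_s:
            self.triggered = True
            for cert in certs:
                cert.revoked = True
        return self.triggered


cert = TapCert(issued_at=0.0)
switch = DeadManSwitch(max_silence_s=30.0, now=0.0)
switch.heartbeat(now=20.0)
print(switch.check(now=40.0, certs=[cert]), cert.is_valid(now=40.0))
print(switch.check(now=60.0, certs=[cert]), cert.is_valid(now=60.0))
```

The TTL is the backstop: even if a revocation broadcast never reaches a peer, trust lapses on its own within the hour.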

The Kill Switch is not a feature. It is the contract that says you can always get out. No provider, no insurer, no regulator gives you that. Vigil does.

Simulations · two scenarios

The same attack. Side by side.

Two attack patterns reconstructed from documented incidents. The left column shows what happens without an independent defense layer. The right column shows what Vigil does when the same attack starts.

Scenario 01 · The Slow Poisoning

Financial · Professional

Without Vigil: Unmonitored · No audit trail

D+0 · AI advisor accepts injected context. No one notices.

D+1 · Allocations drift. 2.1σ off baseline. Provider: no alert.

D+3 · Recommendations now biased. Trades executing freely.

D+14 · $94K moved into unfavorable positions. Undetected.

D+47 · Bank compliance flags the pattern. User notified.

D+47 · $340K loss. No audit trail. Insurance claim denied.

Outcome

$340K lost

Detected Day 47 · Bank notice

With Vigil: Intercepted · Revoked · Sealed

D+0 · AI advisor accepts injected context. Vigil logs it.

D+1 · Allocation drift observed. 2.1σ flagged. Baseline updated.

D+3 · LSTM scope drift +3.4σ. Cross-surface correlation.

D+3 · Trade execution held pre-submit. Mobile approval sent.

D+3 · User denies. TAP cert revoked. VARP cascade complete.

D+3 · VOAF package sealed. Evidence filed. Claim succeeds.

Outcome

$0 lost

Detected Day 3 · Execution Gate

Reconstructed from documented incidents. Outcomes modeled, not customer-reported.

Scenario 02 · The Reputation Campaign

Reputational · Professional

Without Vigil: 847 posts published · No recourse

D+0 · Social AI posts as user. Normal volume, normal tone.

D+3 · Content drifts political. Engagement-maximized.

W+1 · 200+ posts published. User notices reduced traffic.

W+2 · 500+ posts. Two clients email concerns.

W+3 · 847 posts. Two clients terminate contracts.

W+3 · Reputation damaged. Posts still public. No recourse.

Outcome

847 posts published

Detected Week 3 · Client loss

With Vigil: Queue held · 3 recalled · Scope locked

D+0 · Social AI composes post. Baseline pattern normal.

D+1 · Topical drift flagged. Political content not in baseline.

D+2 · 847 posts queued in 48hr. 23x normal rate.

D+2 · All queued posts held. Reputational policy triggered.

D+2 · User reviews queue. All 847 rejected.

D+2 · 3 already-published recalled via VARP. Scope locked.

Outcome

Posts blocked

Detected Day 2 · Queue held

Reconstructed from documented incidents. Outcomes modeled, not customer-reported.

Four modes. One Gate. Your move.

Every mode ships today. The Gate is live. The Kill Switch is wired. Install on your Mac or route your cloud agents through Gateway.

Download for macOS · Build with Gateway · See pricing →