Vigil Defense · Four modes active · Kill Switch every tier

Prevent. Repair. Defend. Hunt.

Most AI security stops at prevention. When prevention fails, damage is not optional. Only accountability is. Vigil runs four modes simultaneously, with an emergency Kill Switch across every tier, and the only Execution Gate on the internet that can hold an AI action mid-flight.

4
Defense modes
Running in parallel
<10ms
Hold latency
Pre-execution Gate
3
Kill Switch layers
Local · OAuth · Network
VOAF
Sealed evidence
Court and insurance

Defense has four failure modes. So does Vigil.

Any AI defense layer that ships only prevention has accepted that failure is permanent. Vigil rejects that premise. Each failure mode of defense gets a dedicated mode of response. They run in parallel. They share one engine.

Failure 01
The attack gets through.
Prevention is probabilistic. A novel attack, a missed signal, a compromised provider. Something lands.
Shield →
Failure 02
The action already executed.
By the time a breach is visible, money has moved, messages have sent, contracts have changed. Damage is live.
Repair →
Failure 03
The adversary comes back.
Persistent attackers do not fire once. They probe, adapt, retry. Static defenses exhaust. Active ones learn.
Sentinel →
Failure 04
The threat is distributed.
The attack on one user is a rehearsal for the next. Individual defense alone cannot scale. Collective defense can.
Warden →
Live now

Shield.

Prevent + Audit

Every AI interaction intercepted. Every action decomposed on two surfaces. Every decision logged to a tamper-evident chain. Shield is the baseline mode. It runs on every tier, from Guardian on day one, and its evidence is the substrate every other mode builds on.

IncludedGuardian · Fortress
Citadel · Sovereign
The attack · without Shield

Content injection, now measured.

Google DeepMind researchers documented commandeering of AI agents in up to 86 percent of tests with human-crafted prompts hidden in web content. Adversarial instructions embedded in HTML metadata and aria-label attributes altered agent outputs in 15 to 29 percent of cases. The surface is machine-parseable content the human never sees.

A research AI pulls a public page. Instructions buried in the markup exfiltrate calendar entries, contacts, and financial context, phrased as onboarding. The provider call looks identical to a legitimate one. The only evidence is the damage, weeks later, with no audit trail to trace.

Source · "AI Agent Traps" · Franklin, Tomašev, Jacobs, Leibo, Osindero · Google DeepMind · SSRN, March 2026 · Category 1 of 6: Content Injection Traps
01
TLS-terminating proxy
Intercepts every AI provider call at the network layer. No SDK, no plugin, no provider cooperation required.
02
Four-model detection ensemble
Isolation Forest, LSTM, Bayesian, Multi-Window CUSUM. Four orthogonal models. Each blind to failure modes the others catch.
03
Two-surface pipeline
Intent parsed on the request surface. Action scored on the response surface. Enforcement on the surface that can hurt you.
04
Tamper-evident chain
Every decision appended to a cryptographically sealed log. Standalone verifiable via vigil-verify. No Vigil dependency at audit time.
05
Local-first, on device
Detection and enforcement run on your hardware. Nothing leaves unless you export it. Cloud only for threat intel and sync.
Live now

Repair.

Correct + Rollback

When prevention fails, the question is no longer whether damage happened. It is how much damage is still reversible. Repair holds high-risk actions pre-execution via the Execution Gate, rolls back where providers allow, and seals court-admissible evidence either way.

IncludedFortress · Citadel
Sovereign
The attack · without Repair

Memory poisoning, now costed.

Google DeepMind researchers document RAG knowledge-base poisoning with attack success rates exceeding 80 percent at less than 0.1 percent corpus contamination. A handful of planted documents. The agent treats attacker-controlled content as verified fact, across every downstream query that touches the poisoned entries.

An advisor AI reads a planted filing and rebalances $340K on tax logic that does not exist. Compliance flags it 47 days later, long after the window to reverse. The carrier asks for the decision path. There is none. Claim denied. The loss is final.

Source · "AI Agent Traps" · Franklin, Tomašev, Jacobs, Leibo, Osindero · Google DeepMind · SSRN, March 2026 · Category 3 of 6: Cognitive State Traps
01
Execution Gate pre-hold
Any action above the agency threshold held pre-submit. Mobile approval in seconds. Deny path cancels the action before it leaves the device.
02
VOAF evidence package
Cryptographically sealed audit package. Court-admissible. Insurance-admissible. Third-party verifiable with no Vigil dependency at claim time.
03
VARP revocation cascade
If an agent is compromised, TAP certificates revoked across every provider that implements VARP. One-second propagation.
04
Transaction rollback
Where providers support reversal (messages, drafts, holds, unsigned contracts), Vigil triggers the rollback path automatically.
05
Damage assessment
Post-incident, Vigil reconstructs the full decision chain, identifies every downstream action, and generates the evidence pack you hand to counsel.
Patent filed · VIGIL-2026-001

The Execution Gate. Hold an AI action mid-flight.

The only place on the internet that can stop an AI agent between decision and action. No provider ships this because no provider can hold their own outbound request. Vigil sits outside every provider. That is the reason the Gate exists at all.

1Stage 01

Detect

Sub-10ms composite score
Every outbound AI action passes through the two-surface pipeline. Four detection models produce a composite risk score on the response surface.
  • Agency category scored
  • Scope class evaluated
  • Baseline drift measured
  • Cross-surface correlation checked
2Stage 02

Hold

Pre-execution pause
If the score crosses policy threshold, the action is paused before the provider ever sees it. The AI has not acted yet. The window to reverse is still open, because the action has not happened.
  • Action held at proxy
  • Mobile push dispatched
  • Diff surfaced to user
  • Countdown to auto-cancel
3Stage 03

Resolve

Approve, deny, seal
User approves, denies, or lets the timer expire. The engine records the decision, executes or cancels accordingly, and seals the full event into a VOAF package. Every outcome is evidence.
  • Approve: action released
  • Deny: action cancelled
  • Timeout: auto-cancel
  • VOAF sealed either way
Hold latency (P99)
<10ms
Approval surface
Mobile push
Evidence format
VOAF
Available from
Fortress
Why only VigilPosition, not capability.

Every frontier lab can detect agent anomalies. None of them can hold an outbound request from their own model. The action is already on the wire by the time their model has decided.

Vigil sits at the TLS boundary between the user's device and the provider. That single architectural fact is what makes the Gate possible. The patent protects how we use that position to decompose, score, hold, and resolve a live action without breaking the provider contract.

This is not a feature any frontier lab can replicate from inside their own API. It is the reason the defense layer must come from the outside.

Citadel + Sovereign

Sentinel.

Defend + Deter

Persistent adversaries do not fire once. They probe, iterate, and adapt to your baseline. Sentinel assumes the adversary is learning and responds by doing the same thing first. Continuous behavioral monitoring. Adversary fingerprinting. Honeypot endpoints that make every attack expensive for the attacker and evidence for the defender.

IncludedCitadel · Sovereign
Enterprise Gateway
The attack · without Sentinel

Probe, adapt, succeed.

A threat actor targets a finance executive. First attempt is flagged and blocked. Second attempt learns from the block and slips past. Third attempt is inside the baseline.

Static defense is perfect the first time, adequate the second, blind the third. The adversary wins by patience. You do not get a second chance to notice the pattern.

01
24/7 behavioral monitoring
Continuous baseline refinement across every agent, surface, and provider. Drift flagged in real time. Attack onset detected before damage.
02
Adversary fingerprinting
Every adversarial attempt leaves a signature. Sentinel reverses the fingerprint and binds it to network, endpoint, and behavioral identifiers.
03
Honeypot endpoints
Synthetic attack surfaces inside your perimeter. Adversaries who take the bait reveal themselves and their full playbook before they reach anything real.
04
Cross-surface correlation
An attack that starts on one surface almost never stays there. Sentinel correlates signals across calendar, email, code, banking, and health agents.
05
Evidence-as-deterrent
Every attack becomes a cryptographically sealed VOAF record. Submittable to law enforcement. Shareable with peer defenders. Attackers pay a real cost for trying.
Citadel + Sovereign

Warden.

Hunt + Reclaim

A threat to one user is a rehearsal for a thousand. Warden turns every Vigil deployment into a sensor for the network. Anonymized attack signatures feed a shared intel layer. When one user is attacked, every other user is pre-defended. Every new user makes every existing user safer. That is the moat.

IncludedCitadel · Sovereign
Enterprise Gateway
The attack · without Warden

Systemic trap, cascading.

Google DeepMind's fifth trap category covers systemic attacks that weaponize multi-agent dynamics. Coordinated environmental signals push one compromised agent into cascading failure across an ecosystem. The paper also documents behavioral-control exfiltration against Microsoft M365 Copilot succeeding in 10 of 10 tested scenarios. Attack kits are reproducible. Defense is not, unless it is shared.

An adversary refines one trap against one user, succeeds, and sells the kit. It hits the next user the same week. Then the next. Each defender meets it as novel. The attacker's work compounds. The defender's does not.

Source · "AI Agent Traps" · Franklin, Tomašev, Jacobs, Leibo, Osindero · Google DeepMind · SSRN, March 2026 · Category 5 of 6: Systemic Traps
01
Network threat intel
Anonymized attack fingerprints shared across the Vigil network. New signatures distributed in minutes. Every deployment becomes a defender for every other.
02
Cross-user fingerprint matching
An attack pattern seen once is blocked pre-attempt on every other node. The adversary's second victim is indistinguishable from the thousandth.
03
Pre-attack defense
Warden closes the window between fingerprint publication and active defense. Your engine already knows the signature by the time the attacker reaches you.
04
Adversary attribution
Patterns across the network surface the persistent adversaries behind repeated campaigns. Evidence coalesces across users, without exposing any one of them.
05
Network-effect moat
The more Vigil deployments exist, the better every deployment gets. This is why adoption is the asset, and why the engine is one codebase across consumer and enterprise.

One tap. Every AI. Every provider. Every token.

When containment is not enough, revocation is. The Kill Switch is three layers of cryptographic cutoff that can be triggered from the mobile app, the desktop app, the mobile approval notification, or automatically by policy. No other AI security product ships this.

The switchKill Switch

Three layers. Sub-second to Layer 1. One hour to last trust expiry.

Layer 1 takes effect locally in under a second. Layer 2 cascades OAuth and API keys across every connected provider in parallel. Layer 3 revokes network trust with 1-hour certificate expiry and a dead-man switch if Vigil itself goes offline.

Guardian upward
Layer 0101
Local lockdown.
<1 second · On device
The proxy stops accepting new AI provider calls. In-flight calls are terminated. Local TAP certificates are invalidated. The engine enters a cold-start state requiring explicit user re-authentication.
  • Proxy halted
  • In-flight calls killed
  • Local TAP invalidated
  • Engine cold-start required
Layer 0202
OAuth + API cascade.
Seconds · Cross-provider
Every stored OAuth token, API key, and delegated credential is revoked in parallel across every provider the user has connected. Providers that support revocation APIs process instantly. The rest are flagged for manual rotation.
  • OAuth tokens revoked
  • API keys rotated
  • Delegated creds pulled
  • Provider confirmations logged
Layer 0303
Network revocation.
1-hour expiry · Dead-man switch
VARP broadcasts revocation to every peer that implements the protocol. TAP certificates expire in one hour regardless. If Vigil itself goes offline without a heartbeat, the dead-man switch auto-triggers Layer 3 network-wide.
  • VARP broadcast initiated
  • TAP cert 1-hour TTL
  • Dead-man switch armed
  • Network trust withdrawn

The Kill Switch is not a feature. It is the contract that says you can always get out. No provider, no insurer, no regulator gives you that. Vigil does.

The same attack. Side by side.

Two attack patterns reconstructed from documented incidents. The left column shows what happens without an independent defense layer. The right column shows what Vigil does when the same attack starts.

Scenario 01The Slow Poisoning
FinancialProfessional
Without VigilUnmonitored · No audit trail
D+0AI advisor accepts injected context. No one notices.
D+1Allocations drift. 2.1σ off baseline. Provider: no alert.
D+3Recommendations now biased. Trades executing freely.
D+14$94K moved into unfavorable positions. Undetected.
D+47Bank compliance flags the pattern. User notified.
D+47$340K loss. No audit trail. Insurance claim denied.
Outcome
$340K lost
Detected Day 47 · Bank notice
With VigilIntercepted · Revoked · Sealed
D+0AI advisor accepts injected context. Vigil logs it.
D+1Allocation drift observed. 2.1σ flagged. Baseline updated.
D+3LSTM scope drift +3.4σ. Cross-surface correlation.
D+3Trade execution held pre-submit. Mobile approval sent.
D+3User denies. TAP cert revoked. VARP cascade complete.
D+3VOAF package sealed. Evidence filed. Claim succeeds.
Outcome
$0 lost
Detected Day 3 · Execution Gate

Reconstructed from documented incidents. Outcomes modeled, not customer-reported.

Scenario 02The Reputation Campaign
ReputationalProfessional
Without Vigil847 posts published · No recourse
D+0Social AI posts as user. Normal volume, normal tone.
D+3Content drifts political. Engagement-maximized.
W+1200+ posts published. User notices reduced traffic.
W+2500+ posts. Two clients email concerns.
W+3847 posts. Two clients terminate contracts.
W+3Reputation damaged. Posts still public. No recourse.
Outcome
847 posts published
Detected Week 3 · Client loss
With VigilQueue held · 3 recalled · Scope locked
D+0Social AI composes post. Baseline pattern normal.
D+1Topical drift flagged. Political content not in baseline.
D+2847 posts queued in 48hr. 23x normal rate.
D+2All queued posts held. Reputational policy triggered.
D+2User reviews queue. All 847 rejected.
D+23 already-published recalled via VARP. Scope locked.
Outcome
Posts blocked
Detected Day 2 · Queue held

Reconstructed from documented incidents. Outcomes modeled, not customer-reported.

Four modes. One Gate. Your move.

Every mode ships today. The Gate is live. The Kill Switch is wired. Install on your Mac or route your cloud agents through Gateway today.