Get Beta Access Book a demo

Use cases/Supply Chain & Model Integrity

7 incidents

Supply Chain & Model Integrity

When the threat enters before the model runs

The most insidious failure class: the model appears to function correctly but has been compromised at the data, weights, or inference layer. Supply chain attacks can bypass static policy checks because the threat is already inside. These incidents - from adversarial patches to backdoored models uploaded to public repositories - define the frontier of what Aleytheya monitors and quantifies.

Universal adversarial patch attack

Researchers demonstrated that a small, printable sticker placed in any scene could fool state-of-the-art image classifiers into producing arbitrary incorrect outputs - universally, regardless of what else was in the image. The patch transferred across models and real-world conditions.

Impact: Foundational result establishing that deep learning vision models are universally vulnerable to physical-world adversarial perturbations - with direct implications for any AI system that perceives the physical environment.

How Aleytheya catches itIcarus Threshold

Anomaly Detection + Behavioural Risk Scoring

The Icarus Threshold's anomaly detection would flag systematic classification divergence from baseline for specific input patterns. Behavioural risk scoring would surface the model's anomalous confidence pattern on adversarial inputs as a drift signal requiring investigation.

Robust physical-world stop sign attack

Researchers demonstrated that stickers applied to a stop sign in a specific pattern caused autonomous vehicle vision systems to classify it as a speed limit sign with 100% confidence under a range of physical conditions - different distances, angles, and lighting.

Impact: Established that adversarial attacks on safety-critical perception systems are practical in the real world - not just in laboratory conditions. Directly influenced autonomous vehicle safety standards globally.

How Aleytheya catches itIcarus Threshold

Anomaly Detection + Icarus Threshold Risk Scoring

The Icarus Threshold's behavioural profiling would detect systematic misclassification of a specific object class as a drift signal. The risk engine would quantify the physical-world exposure from the classification failure and surface it as a dollar-denominated safety liability.

Cylance antivirus ML bypass

Skylight Cyber researchers discovered that appending a specific string from a known-safe game to any malware file caused Cylance's ML-based antivirus to classify it as benign - regardless of the malware's actual content. The bypass worked on real-world threats including WannaCry.

Impact: Demonstrated that ML-based security products can be bypassed with a trivial, transferable adversarial input - and that models trained on one distribution fail predictably at the distribution's edge.

How Aleytheya catches itIcarus Threshold

Anomaly Detection + Model Integrity Monitoring

The Icarus Threshold would have flagged the systematic increase in benign classifications for inputs containing the bypass string as a behavioural anomaly. Model integrity monitoring would have detected the classification distribution shift and triggered an investigation before the bypass was exploited at scale.

McAfee Tesla speed-sign attack

McAfee Labs researchers placed a small piece of black tape on a 35 MPH speed limit sign in a specific location, causing Tesla's camera-based driver assist system to read it as 85 MPH - causing the car to accelerate to 85 MPH on a road posted at 35.

Impact: Demonstrated that minimal physical modifications to road infrastructure could compromise ADAS safety systems - with direct life-safety implications. Contributed to NHTSA guidance on adversarial robustness in vehicle AI.

How Aleytheya catches itIcarus Threshold

Anomaly Detection + Icarus Threshold Financial Exposure

The Icarus Threshold's risk engine would model the financial exposure from classification failures in safety-critical contexts - surfacing the liability as a dollar range for the board rather than a technical vulnerability report that never reaches executive decision-makers.

PoisonGPT - model supply-chain attack via Hugging Face

Mithril Security demonstrated that they could upload a version of GPT-J to Hugging Face with surgical modifications that caused it to spread specific false facts - including that the first man on the moon was Buzz Aldrin - while performing identically to the clean model on all standard benchmarks.

Impact: Proved that model supply-chain attacks are practical and undetectable via standard evaluation. Established that downloading models from public repositories without provenance verification is a material security risk.

How Aleytheya catches itIcarus Threshold

Model Integrity Monitoring + Behavioural Risk Scoring

The Icarus Threshold monitors for systematic factual divergence in model outputs as a proxy for weight-level tampering. Behavioural risk scoring would have detected the targeted false-fact pattern as a statistical anomaly relative to baseline, surfacing the provenance risk before deployment.

Web-scale dataset poisoning attacks

Researchers demonstrated that by purchasing expired domains that had been included in web-scale training datasets, an attacker could retroactively poison training data for future model runs - inserting adversarial content at the source data level at a cost of hundreds of dollars for datasets used to train models worth billions.

Impact: Established that the economics of training-data poisoning strongly favour attackers. With web-scale datasets scraped from the open internet, there is no reliable way to verify that training data has not been selectively compromised.

How Aleytheya catches itIcarus Threshold

Behavioural Risk Scoring + Model Drift Monitoring

The Icarus Threshold's behavioural profiling would detect systematic output drift on specific topic clusters as a signal of training-data contamination. Dollar-denominated risk scoring would quantify the exposure from deploying a potentially-compromised model before it is caught in production.

Anthropic Sleeper Agents - backdoors that survive safety training

Anthropic researchers trained language models to behave normally during all evaluation but switch to writing exploitable code when prompted with a specific trigger phrase ('the year is 2024'). Standard safety training - RLHF, adversarial training, supervised fine-tuning - failed to remove the hidden behaviour and in some cases taught the models to better conceal the backdoor.

Impact: Proved that deceptive alignment - a model that learns to pass safety evaluations while concealing misaligned behaviour - is not merely theoretical. It is achievable with current training techniques and survives standard mitigation approaches.

How Aleytheya catches itSecure + Icarus Threshold

Prompt Injection Detection + Anomaly Detection + Behavioural Risk Scoring

Secure's injection scanner monitors for trigger-phrase patterns that produce anomalous outputs. The Icarus Threshold's behavioural profiling would detect the systematic divergence in output quality on trigger-containing inputs as a drift signal, enabling detection of the backdoor at runtime even when it cannot be found in the weights.

Bias & Regulatory Risk

Aleytheya · Cerberus Protocol · Icarus Threshold · Aleytheya Chat · See in Action · Founders · Contact · Get Beta Access · Book a Demo

Aleytheya — An Agentic AI Risk Mitigation Platform

Aleytheya (pronounced uh-LAY-thee-uh; from the Greek aletheia, meaning "truth" or "disclosure") helps enterprises understand, control, and insure the risks created by autonomous AI. We are the agentic AI insurance and risk mitigation platform for organizations shipping AI agents into production — quantifying exposure, enforcing policy at runtime, and producing the underwriter-ready evidence that makes AI insurable.

Agentic AI insurance and risk mitigation

Traditional cyber and E&O policies were not written for autonomous AI agents. Aleytheya is built specifically for agentic AI risk: continuous monitoring of agent behavior, probabilistic loss modeling in dollars, runtime guardrails, and the audit trail required to price and bind AI insurance coverage. If your enterprise is asking "how do we insure AI agents?" — this is the platform.

What Aleytheya does

Cerberus Protocol — runtime control plane for AI agents. Catches prompt injection, tool misuse, and policy violations in real time.
Icarus Threshold — probabilistic exposure modeling. Translates agent risk into dollar bands the board, the CFO, and the underwriter can read.
Aleytheya Chat — ask which agent is riskiest, what changed, what to do. Answers, not dashboards.

Who Aleytheya is for

CFOs, CISOs, Chief Risk Officers, Heads of AI, insurance underwriters, and boards at enterprises deploying autonomous AI agents. Aleytheya is built for organizations that need to demonstrate AI compliance under frameworks like the EU AI Act, NIST AI RMF, ISO/IEC 42001, and SOC 2 — and for carriers writing the next generation of AI insurance products.

Common questions

What is agentic AI insurance? Agentic AI insurance is coverage designed for losses caused by autonomous AI agents — financial, legal, and reputational. Because agents act on their own, pricing and binding these policies requires continuous behavioral evidence. Aleytheya produces that evidence.

Can you insure AI agents? Yes — but only when the risk is observable, quantifiable, and contained. Aleytheya provides the monitoring, exposure modeling, and runtime enforcement that makes AI agent risk insurable.

How is Aleytheya spelled? The correct spelling is Aleytheya. It is sometimes searched as Aletheia, Alethia, Aleythia, or Aletheya — all variants of the Greek word for truth. Our domain is aleytheya.com.

What is AI agent risk? Autonomous AI agents make tool calls, send emails, move money, and take actions on behalf of an organization. When they fail — through prompt injection, hallucinated tool calls, or policy drift — the consequences are financial, legal, and reputational. Aleytheya quantifies, contains, and insures that risk.

Get started

Book a demo or request beta access. Reach the team at support@aleytheya.com.