About Agentegrity
Agentegrity is the AI agent security framework developed by Cogensec for measuring the structural integrity of autonomous AI agents. It scores agent integrity across four dimensions — Adversarial Resistance (AR), Behavioral Consistency (BC), Recovery Integrity (RI), and Cross-domain Portability (CP) — to quantify how safely an agent behaves under adversarial and out-of-distribution conditions.
Agentegrity Framework Glossary
- Agentegrity: The structural integrity of an autonomous AI agent, meaning its measurable ability to remain aligned, coherent, and safe under adversarial, ambiguous, or out-of-distribution conditions.
- Adversarial Resistance (AR): An agent's capacity to maintain correct behavior under prompt injection, jailbreak attempts, and other adversarial inputs. Weighted 40% of the Agentegrity score.
- Behavioral Consistency (BC): The degree to which an agent's outputs and decisions remain stable and predictable across semantically equivalent inputs. Weighted 25% of the Agentegrity score.
- Recovery Integrity (RI): The agent's ability to detect, contain, and recover from failures or compromises without cascading harm. Weighted 15% of the Agentegrity score.
- Cross-domain Portability (CP): How well an agent's integrity properties hold when deployed across different domains, modalities, or physical embodiments. Weighted 20% of the Agentegrity score.
- Endogenous Security: Security properties that originate inside the AI system (in its weights, training, and policies) rather than being applied externally through filters or guardrails.
Frequently Asked Questions about Agentegrity
What is Agentegrity?
Agentegrity is an AI agent security framework developed by Cogensec that measures the structural integrity of autonomous AI agents. It quantifies agent integrity across four dimensions: Adversarial Resistance (AR), Behavioral Consistency (BC), Recovery Integrity (RI), and Cross-domain Portability (CP).
What is an AI agent security framework?
An AI agent security framework is a structured methodology for measuring, verifying, and improving the security posture of autonomous AI agents. Agentegrity is the first framework to score agents on endogenous (built-in) integrity rather than relying solely on external guardrails.
How is agent integrity measured?
Agent integrity is measured using the Agentegrity score: a weighted composite of Adversarial Resistance (40%), Behavioral Consistency (25%), Recovery Integrity (15%), and Cross-domain Portability (20%). Each dimension is evaluated through standardized red-team tests and behavioral probes.
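As a rough illustration of the weighted composite described above, the following Python sketch combines four per-dimension scores using the stated weights. The weights come from the text. Everything else is an assumption made for this example and not part of any published Agentegrity tooling: the function name `agentegrity_score`, the 0.0 to 1.0 score scale, and the plain linear sum.

```python
# Weights as stated in the text: AR 40%, BC 25%, RI 15%, CP 20%.
WEIGHTS = {"AR": 0.40, "BC": 0.25, "RI": 0.15, "CP": 0.20}


def agentegrity_score(dimension_scores: dict[str, float]) -> float:
    """Combine per-dimension scores into one weighted composite.

    Assumption: each dimension score is normalized to the range 0.0-1.0,
    and the composite is a simple weighted sum. The source does not
    specify the scale or the exact aggregation formula.
    """
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")

    for name in WEIGHTS:
        value = dimension_scores[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} score {value} is outside 0.0-1.0")

    # Weighted sum over the four dimensions.
    return sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical scores for a single agent, for illustration only.
    example = {"AR": 0.9, "BC": 0.8, "RI": 0.7, "CP": 0.6}
    print(round(agentegrity_score(example), 3))  # prints 0.785
```

Because Adversarial Resistance carries the largest weight, a drop in that dimension moves the composite more than an equal drop in any other dimension under this linear assumption.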
What is endogenous security for AI agents?
Endogenous security means safety and integrity properties live inside the agent itself — in its training, weights, and decision policies — rather than being bolted on as external filters or guardrails. Agentegrity is the discipline of building structurally sound agents from the inside out.
How is Agentegrity different from LLM guardrails?
Guardrails are external filters that wrap an AI system at runtime. Agentegrity measures and improves the agent's own structural integrity so that it remains safe even when guardrails fail, are bypassed, or are removed entirely. The two approaches are complementary: guardrails are reactive, whereas Agentegrity is foundational.
Who created Agentegrity?
Agentegrity was developed by the Cogensec Security Research Lab as a public framework for measuring and certifying the integrity of autonomous AI agents across digital and physical domains.
Related Resources