Trust Scores and Attestations: Deciding Which Agents to Trust

Key takeaways

A trust score is a 0–100 composite of five signals — identity, behavior history, compliance state, reputation, and security posture — updated continuously as those signals change.
Attestations from third parties adjust the score only after cryptographic signature verification against an allow-list of trusted providers; no single attestation can push a low-trust agent into a trusted tier.
Three independent dispatch gates — connection-level minimum, policy-level gate, and organization-wide floor — must all pass before an agent receives a task.
Hard revocation is immediate; trust score changes propagate within the cache TTL window — use the right mechanism for the urgency of the situation.
The trust dashboard shows fleet-wide score distribution, recent tier changes, and per-agent component breakdowns so you can trace exactly what drove a score change.

An agent trust score tells you, at dispatch time, whether a given agent has earned the right to execute a task. It is computed from five observable signals — identity verification, behavioral history, compliance state, reputation, and security posture — and updated continuously as those signals change. When a task is ready to run, the score is checked against configured thresholds, and agents that fall short are denied execution. This turns trust from an implicit assumption into an explicit, auditable gate that adapts as agent behavior evolves.

Why "we registered it, so we trust it" is not enough

The traditional model for authorizing software is binary: a service is either registered and allowed to act, or it is not. That works well when the set of possible actions is small and well-understood. Agents change that calculus. An agent can call tools, read data, write to external systems, spawn sub-tasks, and interact with other agents — all within a single run. The range of potential harm from a misbehaving agent is broad, and the misbehavior may not be immediately obvious.

A binary allow/deny model also treats all registered agents as equally trustworthy. In practice they are not. An agent that has been running successfully for six months with a clean compliance history is meaningfully more trustworthy than one registered yesterday with no behavioral record. Treating them identically means either accepting elevated risk from new or unverified agents, or placing unnecessary friction on well-established ones.

Trust scoring gives you a middle path: allow agents to operate, but continuously evaluate whether the level of trust you are extending to them is warranted, and act on that evaluation at every dispatch decision. For a broader view of how authorization models compare, see trust scores vs allow-lists for agent authorization.

The five components of a trust score

A trust score is a composite, not a single measurement. Each of the five components contributes to the final value through a weighted calculation, and the breakdown is preserved so you can see which component drove a change.

Identity verification reflects whether the agent's registered credentials — its key material, client identity, and registration details — are complete and valid. An agent with unverified or mismatched identity starts at a disadvantage regardless of its history.

Behavior history captures the agent's track record: task completion rates, error frequency, rate-limit violations, and whether previous actions stayed within their declared scope. Consistent, in-scope behavior raises this component over time. Anomalies lower it.

Compliance state tracks whether the agent meets the policy requirements your organization has configured — for example, whether it is operating within approved connection types, whether its tool access matches declared capabilities, and whether it has passed required security checks.

Reputation is a broader signal that can incorporate cross-agent comparisons and, where applicable, information from external providers. If an agent shares characteristics with other agents that have a good or a bad track record, that association raises or lowers this component accordingly.

Security posture covers the agent's current configuration state: whether its credentials are rotated to policy, whether it is using deprecated authentication methods, and whether it has any open security findings that have not been acknowledged.

These five components combine into a single 0–100 score, mapped to a named trust level — from untrusted through to trusted — which is what the dispatch gate evaluates. The component breakdown is stored with each historical record, so when a score drops unexpectedly you can trace exactly which factor moved.
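As an illustration of the weighted composite described above, here is a minimal Python sketch. The weights, the tier boundaries, and the intermediate tier names are assumptions made for the example (the text names only the five components, the 0–100 range, and levels running from untrusted to trusted). The names ScoreRecord, compute_score, and largest_mover are hypothetical and are not part of any documented API.

```python
from dataclasses import dataclass

# Hypothetical weights: the article says the calculation is weighted but does
# not publish the values. These are illustrative and sum to 1.0.
WEIGHTS = {
    "identity": 0.20,
    "behavior": 0.30,
    "compliance": 0.20,
    "reputation": 0.10,
    "security": 0.20,
}

# Hypothetical tier boundaries as (minimum score, level name), highest first.
LEVELS = [(80, "TRUSTED"), (60, "HIGH"), (40, "MEDIUM"), (20, "LOW"), (0, "UNTRUSTED")]


@dataclass(frozen=True)
class ScoreRecord:
    score: float       # composite value in the range 0-100
    level: str         # named trust level that a dispatch gate would evaluate
    components: dict   # per-component breakdown, kept for later tracing


def compute_score(components: dict) -> ScoreRecord:
    """Combine five component scores (each 0-100) into one weighted score."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    total = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    total = max(0.0, min(100.0, total))
    level = next(name for floor, name in LEVELS if total >= floor)
    return ScoreRecord(round(total, 1), level, dict(components))


def largest_mover(before: ScoreRecord, after: ScoreRecord) -> str:
    """Return the component whose weighted change between two records is largest."""
    return max(
        WEIGHTS,
        key=lambda n: abs(WEIGHTS[n] * (after.components[n] - before.components[n])),
    )
```

Because each record keeps its breakdown, comparing two consecutive records with largest_mover is one simple way to answer the question of which factor moved.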

How attestations adjust the score

Behavioral signals tell you what an agent has done. Attestations tell you what external parties have verified about it. A third-party auditor, a certification body, or an internal security team can submit a signed attestation that asserts a specific finding about an agent — a passed security review, a code audit result, an external compliance check.

Each attestation is cryptographically signed by the submitting party using a standard asymmetric key pair. The platform verifies that signature against a curated allow-list of trusted provider keys before accepting the attestation. A submission from an unrecognized key, or one with an invalid signature, is rejected outright. Only attestations that pass signature verification and have not expired contribute to the score.
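The acceptance checks in this paragraph can be sketched as follows. This is an illustrative outline, not the platform's implementation. The Attestation fields, the use of a SHA-256 digest as the key fingerprint, the order of the checks, and the rejection reason strings are all assumptions. The signature check itself is passed in as a callable (verify_signature, hypothetical), because the real check would be delegated to a cryptography library for whichever key type the providers use.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class Attestation:
    agent_id: str
    claim: str                # e.g. "security-review-passed" (illustrative)
    payload: bytes            # the signed bytes
    signature: bytes
    signer_public_key: bytes
    expires_at: datetime      # timezone-aware


def fingerprint(public_key: bytes) -> str:
    """Assumed fingerprint scheme: hex SHA-256 of the raw public key bytes."""
    return hashlib.sha256(public_key).hexdigest()


def check_attestation(
    att: Attestation,
    allow_list: set,
    verify_signature: Callable[[bytes, bytes, bytes], bool],
    now: datetime = None,
) -> tuple:
    """Return (accepted, reason). Only accepted attestations may affect a score."""
    now = now or datetime.now(timezone.utc)
    # 1. The signer must be on the allow-list of trusted provider keys.
    if fingerprint(att.signer_public_key) not in allow_list:
        return False, "unrecognized_signer"
    # 2. The signature must verify against that key (delegated to a crypto library).
    if not verify_signature(att.signer_public_key, att.payload, att.signature):
        return False, "invalid_signature"
    # 3. The attestation must not have expired.
    if att.expires_at <= now:
        return False, "expired"
    return True, "accepted"
```

Checking the allow-list before verifying the signature means an unknown key is rejected without any cryptographic work being spent on it.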

The contribution of each verified attestation is bounded — no single attestation can swing the score by an arbitrary amount, and the total adjustment is capped. A stack of external certifications cannot push an otherwise low-trust agent into a trusted tier; the score has to be earned primarily through the behavioral and compliance components.
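A short sketch of what bounded contribution could look like in practice. The cap values below are invented for illustration, since the article does not state the actual limits, and the function names are hypothetical. The sketch also assumes each verified attestation contributes a non-negative bonus.

```python
# Hypothetical limits; the real caps are not given in the article.
PER_ATTESTATION_CAP = 3.0   # maximum points any single attestation may add
TOTAL_BONUS_CAP = 10.0      # maximum points all attestations may add together


def attestation_bonus(raw_bonuses) -> float:
    """Clip each verified attestation's bonus, then cap the combined total."""
    clipped = [max(0.0, min(PER_ATTESTATION_CAP, b)) for b in raw_bonuses]
    return min(TOTAL_BONUS_CAP, sum(clipped))


def adjusted_score(base_score: float, raw_bonuses) -> float:
    """Add the bounded bonus to the organically earned score, staying within 0-100."""
    return min(100.0, base_score + attestation_bonus(raw_bonuses))
```

With these example caps, an agent whose organic score is 35 ends at 45 at most, however many certifications are stacked on top, so it cannot reach a tier that starts well above that.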

Every attestation submission is recorded on the audit trail, giving you a complete history of who submitted what and whether the signature was verified.

The dispatch gate: where trust meets enforcement

Computing a trust score is only useful if something acts on it. The dispatch gate is that mechanism. When a task is ready to execute on an agent, three independent checks run before the agent receives it.

The connection-level minimum specifies the lowest trust level an agent must hold to operate on a given connection. An agent that has not earned that level is denied access to the connection regardless of any other permissions it holds.

The policy-level trust gate applies a configurable threshold to specific categories of task, letting you require a higher trust level for high-risk actions without blocking lower-trust agents from routine work.

The organization-wide trust floor sets a baseline below which no agent operates, regardless of connection or task-level settings. If a score falls below the org floor — due to a security finding, a behavioral anomaly, or an expired attestation — the agent stops receiving tasks until the score recovers.

These three gates are independent. An agent must pass all three to execute, so an overly permissive setting at one gate does not bypass the other two.
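The three checks can be pictured as a single conjunction. The sketch below is illustrative only: the TrustLevel names between UNTRUSTED and TRUSTED, the DispatchContext fields, and the choice to express the organization floor as a numeric score (while the other two gates compare named levels) are all assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class TrustLevel(IntEnum):
    # Assumed ordering; the article names only some of these levels.
    UNTRUSTED = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    TRUSTED = 4


@dataclass(frozen=True)
class DispatchContext:
    agent_level: TrustLevel
    agent_score: float
    connection_minimum: TrustLevel   # lowest level allowed on this connection
    policy_minimum: TrustLevel       # level required for this task category
    org_floor_score: float           # organization-wide baseline score


def evaluate_gates(ctx: DispatchContext) -> list:
    """Run all three gates independently and return the names of those that failed."""
    failures = []
    if ctx.agent_level < ctx.connection_minimum:
        failures.append("connection_minimum")
    if ctx.agent_level < ctx.policy_minimum:
        failures.append("policy_gate")
    if ctx.agent_score < ctx.org_floor_score:
        failures.append("org_floor")
    return failures


def may_dispatch(ctx: DispatchContext) -> bool:
    """An agent receives the task only if no gate failed."""
    return not evaluate_gates(ctx)
```

Evaluating every gate, instead of stopping at the first failure, keeps the denial reason complete for auditing.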

Monitoring trust across your fleet

The trust dashboard shows every agent's current score at a glance, giving you fleet-wide visibility without having to inspect agents one at a time. From the summary view you can see the distribution across trust levels, which agents have recently changed tier, and which are approaching a threshold boundary.

Drilling into any agent shows its current score, a score history chart, and the list of active attestations with verification status and expiry dates. When a score moves unexpectedly, you can identify which component shifted and whether a behavioral flag, an expired attestation, or a configuration change was the cause.

The trust graph view shows relationships across agents, which is useful in multi-agent workflows where the trust of an orchestrating agent can affect what its delegates are permitted to do. For guidance on modeling these delegation relationships, see threat model: agent-to-agent delegation abuse.

Managing the attestation provider allow-list

The security of the attestation system rests on the allow-list of trusted signer key fingerprints. An attestation from an unrecognized key is rejected before it affects any score, and there is no override. Managing this allow-list correctly is therefore important: adding a key means granting that key holder the ability to influence agent trust scores across your organization.

Treat the allow-list as the trust root it is: review every change, rotate signing keys on schedule, and remove retired keys promptly. Attestation submissions are recorded on the audit trail, so unusual submission activity from an allow-listed provider is visible.

Score caching and immediate revocation

Trust scores are cached with a short TTL to keep dispatch latency low at scale. This means there is a window between a score changing and the new value reaching the dispatch gate. For most situations this window is acceptable.

For cases where you need immediate effect — a confirmed compromise, for example — the right tool is hard revocation of the agent's registration. Hard revocation removes the agent from dispatch entirely without waiting for any cache to expire. The two mechanisms serve different operational needs: trust score changes propagate within the cache window; hard revocation is immediate.
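The difference between the two mechanisms can be sketched with a small TTL cache that consults a revocation set before anything else. This is an assumption-laden illustration: the class name DispatchTrustView, the 30-second TTL, and the load_score callback are hypothetical, and the article does not describe how the cache is actually implemented.

```python
import time


class DispatchTrustView:
    """Dispatch-side view of trust: cached scores plus an uncached revocation set."""

    def __init__(self, load_score, ttl_seconds=30.0, clock=time.monotonic):
        self._load_score = load_score   # callable: agent_id -> current score
        self._ttl = ttl_seconds         # illustrative TTL; the real value is not stated
        self._clock = clock
        self._cache = {}                # agent_id -> (score, fetched_at)
        self._revoked = set()

    def revoke(self, agent_id):
        """Hard revocation: effective on the very next dispatch check."""
        self._revoked.add(agent_id)
        self._cache.pop(agent_id, None)

    def score(self, agent_id):
        """Return the cached score, reloading it once the TTL has elapsed."""
        now = self._clock()
        entry = self._cache.get(agent_id)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]             # may be stale by up to ttl_seconds
        value = self._load_score(agent_id)
        self._cache[agent_id] = (value, now)
        return value

    def may_dispatch(self, agent_id, minimum_score):
        # Revocation is checked first and never served from the score cache.
        if agent_id in self._revoked:
            return False
        return self.score(agent_id) >= minimum_score
```

In this sketch a lowered score takes up to ttl_seconds to reach the gate, while revoke takes effect on the next call to may_dispatch, which mirrors the operational distinction described above.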

Common questions

Can an agent's score change automatically, without anyone submitting an attestation?

Yes. The behavioral and compliance components update continuously as the agent accumulates task history and as its configuration changes. A guardrail flag, a spike in error rates, or a detected configuration drift will each reduce the relevant component, and the aggregate score will fall on the next read. Attestations add a bonus on top of these organic signals; they do not substitute for them.

What happens when an attestation expires?

Expired attestations are excluded from the bonus calculation. Their records remain in the attestation history for audit purposes, so you can see that an attestation was once active and when it lapsed, but they no longer contribute to the current score. If an attestation's bonus was what kept an agent above a dispatch gate threshold, the score will decline at the next recalculation, and the agent may fall below that threshold until the attestation is renewed.

Who can submit an attestation?

Submitting an attestation requires organization-owner authorization and a cryptographic signature from a key that appears on the trusted-provider allow-list. Authorization alone is not sufficient — the signature must verify against a registered key. This two-factor gate means that a compromised owner account cannot introduce fraudulent attestations unless the attacker also controls a trusted signing key.

Praesidia is designed to let you raise the bar incrementally. Start with connection-level minimums set to LOW or MEDIUM for new deployments, observe the score distribution in the trust dashboard, and tighten thresholds as agents build verified history. Trust scores give you the data; dispatch gates give you the enforcement. Together they make trust a living operational concept, one that reflects what agents have actually done and what external parties have verified.

For a deeper look at how these scoring models are designed, see trust scoring models for autonomous agents. For how trust scores interact with how agents authenticate, see AI agent identity: why agents need their own credentials. For how trust gates apply in the context of securing the agent supply chain, see securing the AI agent supply chain.