MCP (Model Context Protocol) servers give AI agents a standard, composable interface to external tools and resources. Governance means controlling which agents can connect to which servers, what tools they may call, how frequently, with what data, and producing a durable record of every interaction for audit, compliance, and cost accountability. Without governance, each unmanaged MCP connection becomes an independent blast radius: credentials scattered across configurations, no cross-agent view of call volumes, no content controls, and no audit trail when something goes wrong.
This guide is the authoritative reference for teams that need to bring MCP servers under governance end to end — from the initial security model through authentication patterns, tool-level authorization, rate limiting, monitoring, forensic logging, and a concrete hardening checklist.
What MCP Is and Why Governance Is Non-Negotiable
The Model Context Protocol defines a standard wire format for AI agents to discover and invoke structured tools exposed by remote servers. A single MCP server might offer a document search tool, a SQL query tool, a code execution tool, and a file-write tool behind one endpoint. That composability is the point — agents can chain capabilities without each capability needing bespoke integration code.
The same composability makes unmanaged MCP connections a significant risk surface:
- A misconfigured or over-permissioned agent can chain tool calls across multiple sensitive systems in a single reasoning loop, with no human approval at any step.
- Credentials stored per-agent rather than centrally cannot be rotated in one operation; a compromised credential requires hunting down every copy.
- Without a central policy gate, an agent that should only read data can call a write tool on the same server — nothing in the protocol itself prevents it.
- Without forensic logging, you cannot reconstruct what arguments were passed to a tool, what the server returned, or which agent made the call.
Governance addresses all of these at the layer where they matter: between the agent and the server, not inside either one.
For broader context on why AI infrastructure demands its own security model, see AI agent security. For the relationship between MCP governance and general AI platform operations, see platform operations.
The MCP Security Model
Before configuring controls, it helps to understand the threat model specific to MCP servers. Three properties make them different from conventional APIs:
Agents are autonomous callers. A human API caller can be prompted to re-authenticate, shown a warning dialog, or asked to confirm a destructive action. An agent executes its reasoning loop without surfacing individual tool calls to a human unless you build that checkpoint explicitly. Controls need to fire automatically at dispatch time.
A single server can expose a wide capability surface. Unlike a narrow API that does one thing, an MCP server commonly bundles multiple unrelated tools. Authorization must operate at the individual tool level, not just at the server level. Permitting access to a server without constraining which tools are callable is not authorization — it is only authentication.
Tool call arguments carry untrusted content. Agents construct tool call arguments dynamically, often from content they have processed — web pages, documents, messages from other systems. Prompt injection attacks attempt to smuggle instructions into this content that redirect the agent's tool calls. Content inspection on the arguments passing through the governance layer is therefore a security control, not just a logging concern.
The governance layer sits between agents and MCP servers. It authenticates callers, enforces per-tool policy, inspects content, enforces rate limits, records every call forensically, and emits signals to your monitoring and alerting stack.
Authentication Patterns for MCP Servers
Authentication establishes that the entity connecting to an MCP server is who it claims to be. For managed MCP connections, this works at two levels: the platform authenticating on behalf of agents, and the server verifying the platform's presented credential.
Credential Storage and Injection
Credentials for MCP servers — bearer tokens, API keys, OAuth client credentials, or custom scheme tokens — should be stored centrally, encrypted at rest, and never held by individual agents. The managed client layer retrieves and injects the appropriate credential at call time. Agents never see the raw credential; they reference a registered server by its identifier and the platform handles the rest.
Envelope encryption is the standard pattern for credential storage: each credential is encrypted with a data encryption key (DEK), and the DEK itself is encrypted with a key encryption key (KEK) held in a separate key management system. A database-read exposure yields ciphertext, not usable credentials. The KEK is never co-located with the data it protects.
Additional authenticated data (AAD) binding ties the ciphertext to the specific server record it belongs to. An attacker who extracts a ciphertext blob and moves it to a different record finds it unreadable — the decryption fails because the AAD no longer matches. This prevents credential transplant attacks at the database layer.
Supported Authentication Schemes
MCP servers in practice use several authentication schemes. A governance platform needs to support all of them without forcing servers to change their authentication model:
| Scheme | Typical use | Rotation mechanism |
|---|---|---|
| Bearer token | Vendor APIs, custom MCP servers | Re-register or rotate via API |
| API key (header or query) | Developer tools, data services | Rotate and propagate from central registry |
| OAuth 2.0 client credentials | Enterprise SaaS, identity-linked resources | Token refresh is automatic; client secret rotation is explicit |
| Custom header credentials | Internal services with proprietary auth | Same as bearer — stored encrypted, injected at call time |
Regardless of scheme, enforce two properties: credentials must be rotatable without touching agent configuration, and they must never appear in logs, error messages, or API responses.
Capability Tokens for Agent-Level Scoping
For fine-grained control over which agents can call which servers, capability tokens provide an additional layer of authorization. A capability token is a short-lived, scoped credential issued to a specific agent for a specific set of operations on a specific server. It encodes the agent's identity, the permitted tools, and an expiry time.
When the agent presents the capability token at the governance layer, the token is verified for signature, expiry, audience, and scope. This is not a replacement for server-level authentication — it is an additional gate that limits what a given agent can do with a server it is otherwise permitted to reach. Think of it as the difference between a door key (server access) and a room key (tool-level scoping within that server).
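As a rough illustration of the four checks named above (signature, expiry, audience, scope), the sketch below issues and verifies a capability token using only the Python standard library. It is not any particular platform's token format: the claim names, the `issue_token` and `verify_token` helpers, and the shared `SECRET` are all assumptions made for the example, and HMAC stands in for the asymmetric signature a production system would more likely use.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key; a real deployment would hold this in a KMS.
SECRET = b"demo-signing-key"


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def _sign(payload: str) -> str:
    return _b64(hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256).digest())


def issue_token(agent_id, server_id, tools, ttl_seconds, now=None):
    """Issue a token scoping one agent to specific tools on one server."""
    now = time.time() if now is None else now
    claims = {
        "sub": agent_id,          # agent identity
        "aud": server_id,         # the one server this token is valid for
        "tools": sorted(tools),   # permitted tools
        "exp": now + ttl_seconds, # expiry time
    }
    payload = _b64(json.dumps(claims, sort_keys=True).encode("utf-8"))
    return f"{payload}.{_sign(payload)}"


def verify_token(token, server_id, tool, now=None):
    """Return (ok, reason). Checks signature, expiry, audience, then scope."""
    now = time.time() if now is None else now
    try:
        payload, sig = token.split(".")
    except ValueError:
        return False, "malformed"
    # Constant-time comparison; claims are only parsed after the signature verifies.
    if not hmac.compare_digest(sig.encode("utf-8"), _sign(payload).encode("utf-8")):
        return False, "bad_signature"
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] <= now:
        return False, "expired"
    if claims["aud"] != server_id:
        return False, "wrong_audience"
    if tool not in claims["tools"]:
        return False, "tool_not_in_scope"
    return True, "ok"
```

A token issued for the `search` tool on one server would then be rejected for any other tool or server, which is the "room key" behavior described above.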
For more on identity patterns across agents and integrations, see identity and access.
Tool-Level Authorization and Policy Gates
Once authentication is established, the question is what the authenticated caller is permitted to do. For MCP servers, authorization must operate at the tool level.
The Policy Decision
Every tool call passes through a policy gate that returns one of four decisions:
- Allow: the call proceeds immediately.
- Deny: the call is rejected before it reaches the server; the denial is recorded in the audit log.
- Step-up: the call is held pending explicit human approval; the workflow resumes only after an authorized approver confirms.
- Observe: the call proceeds but is flagged in the monitoring feed for asynchronous human review — suitable where blocking is too disruptive but visibility is required.
The gate evaluates the calling agent's trust score, the tool's sensitivity classification, current rate limit counters, and active content guardrail rules together in a single decision, so enforcement is consistent regardless of how a call arrives.
Tool Classification
Not all tools carry the same risk. A read-only search tool is materially different from one that writes to a database or executes code. Classifying by risk level enables proportionate controls:
- Read-only tools: allow with standard rate limits; log the call.
- Write tools: require a minimum agent trust score; enforce per-tool budgets; log with full argument capture.
- Destructive or administrative tools: require step-up approval; alert the responsible team; enforce low rate limits regardless of trust level.
When a server's tool list changes after rediscovery, existing classifications are preserved and new tools enter a review state until explicitly classified.
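One way to picture how classification drives the gate is the minimal sketch below. Everything in it is illustrative: the `CallContext` fields, the 0.0 to 1.0 trust scale, the `MIN_TRUST` threshold, and the choice to drop tools that disappear on rediscovery are assumptions, and the sketch never returns the observe decision, which would normally come from per-tool configuration.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    STEP_UP = "step_up"
    OBSERVE = "observe"  # not produced by this sketch


class ToolClass(Enum):
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"
    REVIEW = "review"  # newly discovered, not yet classified


@dataclass(frozen=True)
class CallContext:
    tool_class: ToolClass
    trust_score: float       # assumed 0.0-1.0 scale
    rate_limited: bool       # a rate limit counter is already exhausted
    guardrail_blocked: bool  # a content rule matched with a block action


MIN_TRUST = 0.7  # illustrative minimum for write and destructive tools


def decide(ctx: CallContext) -> Decision:
    """Combine classification, trust, rate limits, and guardrails into one decision."""
    if ctx.guardrail_blocked or ctx.rate_limited:
        return Decision.DENY
    if ctx.tool_class is ToolClass.REVIEW:
        return Decision.DENY  # unclassified tools are not callable
    if ctx.tool_class is ToolClass.READ:
        return Decision.ALLOW
    if ctx.trust_score < MIN_TRUST:
        return Decision.DENY
    if ctx.tool_class is ToolClass.DESTRUCTIVE:
        return Decision.STEP_UP  # always needs human approval
    return Decision.ALLOW


def merge_rediscovered(existing, discovered):
    """Keep existing classifications; tools not seen before enter REVIEW."""
    return {name: existing.get(name, ToolClass.REVIEW) for name in discovered}
```

Because unclassified tools map to `REVIEW` and `REVIEW` maps to deny, a tool that appears on rediscovery cannot be called until someone classifies it.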
Trust Score Integration
The trust score mechanism connects recent agent behavior to authorization. An agent producing high error rates, triggering guardrail violations, or failing attestation receives a lower score; an agent whose score falls below the minimum a tool requires is denied at the gate even when it is authenticated and holds a valid capability token.
Trust evaluation must be fail-closed: when the scoring service is unavailable, the gate defaults to deny. A fail-open design provides no security guarantee because the failure path bypasses all controls.
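The fail-closed rule is easiest to see in code. In the sketch below, `fetch_score` is a hypothetical callable standing in for whatever client talks to the scoring service, and `unavailable_service` is a stand-in used only to simulate an outage.

```python
def is_trusted(agent_id, min_score, fetch_score):
    """Fail-closed trust check: any failure to obtain a score is a denial."""
    try:
        score = fetch_score(agent_id)
    except Exception:
        # Timeout, connection error, malformed response: deny rather than skip the check.
        return False
    if score is None:
        return False  # "no score" is treated the same as "service down"
    return score >= min_score


def unavailable_service(agent_id):
    """Simulates a scoring service outage for demonstration purposes."""
    raise TimeoutError("scoring service unavailable")
```

The important property is that there is no code path where an error results in `True`. A fail-open variant would differ by a single line, which is why this behavior is worth testing explicitly.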
For more on how trust scores work across the agent lifecycle, see AI governance and compliance.
Rate Limiting and Budget Enforcement
Agentic workloads can generate tool call volumes that differ by orders of magnitude from human-driven workflows. A loop that calls a tool in every reasoning step, or a scheduled workflow that fans out to multiple agents, can exhaust a provider's rate limits within minutes. Rate limiting at the governance layer complements provider-side limits: it is enforced before the call reaches the provider, and it is scoped to your own agents, tools, and budgets rather than to the provider's capacity.
Per-Tool Rate Limits
Rate limits should be configurable at the tool level, not just the server level. A search tool may tolerate high call volumes while a write tool should rarely be called more than a handful of times per minute. Typical parameters: requests per minute (RPM) to prevent burst abuse, requests per hour (RPH) to catch sustained elevated usage, and a daily ceiling for metered external APIs.
When a limit is hit, the response should include the limit type, current count, and reset time. Agents that receive opaque errors may retry immediately and worsen the situation; structured responses enable correct back-off.
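A structured limit response might look like the following sketch. It uses a simple fixed-window counter, chosen for brevity rather than because the guide prescribes it; the class name, the response field names, and the `(agent, tool)` key shape are all assumptions.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FixedWindowLimiter:
    limit: int                 # maximum calls per window
    window_seconds: int        # 60 for an RPM limit, 3600 for RPH
    limit_type: str = "rpm"
    _counts: dict = field(default_factory=dict)  # key -> (window_start, count)

    def check(self, key, now=None):
        """Count one call for `key` and return a structured allow/deny response."""
        now = time.time() if now is None else now
        window_start = int(now // self.window_seconds) * self.window_seconds
        start, count = self._counts.get(key, (window_start, 0))
        if start != window_start:
            count = 0  # a new window has begun
        reset_at = window_start + self.window_seconds
        response = {
            "limit_type": self.limit_type,
            "limit": self.limit,
            "reset_at": reset_at,
            "retry_after": reset_at - now,
        }
        if count >= self.limit:
            return {"allowed": False, "current": count, **response}
        self._counts[key] = (window_start, count + 1)
        return {"allowed": True, "current": count + 1, **response}
```

An agent that receives `retry_after` can wait for the window to reset instead of retrying immediately. A sliding-window or token-bucket limiter would smooth bursts at window boundaries, at the cost of a little more state.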
Spend Caps and Cost Attribution
Many MCP server calls invoke upstream services with per-call costs: LLM inference, vector search, external data APIs. A governance layer that tracks cost at the call level enforces spend caps before costs accumulate — denying a call or routing it to human approval when it would breach a configured ceiling. This happens before the spend, not after the invoice arrives.
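The "before the spend" check amounts to comparing projected spend against the cap prior to dispatch. The sketch below assumes the governance layer can estimate a call's cost up front; the function name, the `on_breach` parameter, and the use of `Decimal` for currency are illustrative choices.

```python
from decimal import Decimal


def precheck_spend(spent_so_far, estimated_cost, cap, on_breach="deny"):
    """Decide before dispatch whether a call would push spend past the cap.

    Returns (decision, projected_spend). `on_breach` selects between an
    outright denial and routing the call to human approval.
    """
    if on_breach not in ("deny", "step_up"):
        raise ValueError("on_breach must be 'deny' or 'step_up'")
    projected = spent_so_far + estimated_cost
    if projected > cap:
        return on_breach, projected
    return "allow", projected
```

For example, with 9.50 already spent against a 10.00 cap, a call estimated at 0.40 is allowed and one estimated at 0.60 is stopped before any cost is incurred.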
Cost attribution at the tool, agent, workflow, and organization level feeds FinOps reporting. When cost is visible at that granularity, you can identify which integrations are expensive and optimize them rather than managing AI spend as a single opaque line item. See AI FinOps for more on cost attribution patterns.
Content Inspection and Guardrails
Tool call arguments and results are content. They may contain PII, credentials, competitive information, prompt injection attempts, or data that should not cross organizational boundaries. Content guardrails apply inspection rules to this content in transit, before arguments reach the server and before results are returned to the agent.
Inspecting Tool Call Arguments
Prompt injection is the primary threat in arguments. An adversarially crafted document or user message might include instructions that an agent incorporates into its tool call arguments verbatim, redirecting it to call a different tool, exfiltrate data, or override system prompts. Content inspection at the argument level detects and blocks known injection patterns before they reach the server. It is not a complete defense, but it stops many of the common variants.
Inspecting Results
Results flow back into the agent's context and shape its subsequent reasoning. A compromised server could inject instructions into responses that redirect agent behavior. Additionally, a tool querying a database might return more data than expected — including records that should be masked or that belong to other tenants. Result inspection applies the same classification logic to server responses as to arguments, and can redact sensitive fields before they enter the agent's context.
Guardrail Actions
The standard enforcement actions for guardrails apply directly to tool call content:
- Block: reject the call or suppress the result. The agent receives an error indicating the content policy prevented the operation.
- Redact: allow the call to proceed or the result to be returned, but replace matched content with a placeholder. The agent's context receives [REDACTED] rather than the sensitive value.
- Warn and observe: allow the content through unchanged, emit an alert to the monitoring feed, and flag the call for human review. Useful where blocking would disrupt legitimate workflows but you want visibility.
Guardrail rules should be maintained separately from per-server configuration so that organization-wide rules — block credentials in all outbound calls, redact specific PII categories in all results — apply consistently without repeating the configuration on every server.
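To make the three actions concrete, here is a minimal sketch of an organization-wide rule set applied to a piece of content in transit. The rules are deliberately simplistic regular expressions chosen for illustration; the `Rule` type, the rule names, and the `inspect` function are assumptions, and real detection of credentials, PII, or injection attempts needs considerably more than pattern matching.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern
    action: str  # "block", "redact", or "warn"


# Illustrative organization-wide rules, defined once rather than per server.
RULES = [
    Rule("aws_access_key_id", re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "block"),
    Rule("email_address", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "redact"),
    Rule("injection_phrase", re.compile(r"ignore (all )?previous instructions", re.I), "warn"),
]


def inspect(text, rules=RULES):
    """Return (verdict, text_out, alerts); verdict is "block" or "pass"."""
    # Block rules are checked first so nothing is forwarded once one matches.
    for rule in rules:
        if rule.action == "block" and rule.pattern.search(text):
            return "block", None, [rule.name]
    out, alerts = text, []
    for rule in rules:
        if rule.action == "block" or not rule.pattern.search(out):
            continue
        if rule.action == "redact":
            out = rule.pattern.sub("[REDACTED]", out)
        alerts.append(rule.name)  # redactions and warnings both reach the monitoring feed
    return "pass", out, alerts
```

The same `inspect` call can be applied to arguments on the way out and to results on the way back, which is what keeps the policy consistent across servers.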
For more on guardrail design, see the discussion of AI governance and compliance and the AI strategy implications of content policy choices.
Forensic Logging and Tamper Evidence
When something goes wrong — a data breach, a compliance incident, an unexpected expense — the call log is where the investigation starts. The quality of that log determines how quickly and completely you can answer "what happened, who authorized it, and what did the server return."
What to Capture
A forensic tool call record should include, at minimum:
- Calling agent identity and its trust score at call time
- Server and tool identifiers
- Call arguments (encrypted at rest if they may contain PII)
- Policy decision (allow / deny / step-up / observe) and the rules that drove it
- Server response, or the error if the call failed
- Latency, attributed cost, and timestamp with sufficient precision for cross-system correlation
- Workflow or task identifier to group related calls
Capture arguments and results in full — investigations need the actual content, not just metadata. Where content may be sensitive, use encryption at rest and access-controlled retrieval rather than omitting the capture.
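The minimum field list above maps naturally onto a single immutable record type. The following is a sketch only: the field names, the string representation of cost, and the ISO 8601 UTC timestamp format are assumptions, and encryption of `arguments` and `response` at rest is left to the storage layer.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional

DECISIONS = {"allow", "deny", "step_up", "observe"}


def utc_now_iso() -> str:
    """UTC timestamp with microsecond precision for cross-system correlation."""
    return datetime.now(timezone.utc).isoformat(timespec="microseconds")


@dataclass(frozen=True)
class ToolCallRecord:
    agent_id: str
    trust_score: float              # score at call time
    server_id: str
    tool: str
    arguments: dict                 # captured in full
    decision: str                   # one of DECISIONS
    matched_rules: tuple            # rules that drove the decision
    workflow_id: str                # groups related calls
    response: Optional[Any] = None  # absent when the call was denied or failed
    error: Optional[str] = None
    latency_ms: Optional[float] = None
    cost: Optional[str] = None      # decimal string, to avoid float rounding
    timestamp: str = field(default_factory=utc_now_iso)

    def __post_init__(self):
        if self.decision not in DECISIONS:
            raise ValueError(f"unknown policy decision: {self.decision!r}")

    def to_json(self) -> str:
        """Stable serialization (sorted keys), suitable as input to hashing or signing."""
        return json.dumps(asdict(self), sort_keys=True)
```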
Tamper Evidence
A log that can be modified after the fact is not a reliable source of truth for compliance or litigation. Two standard approaches make modification detectable:
Hash chaining: each entry includes the hash of the previous entry. A modification anywhere breaks the chain forward from that point, and the break is detectable on verification.
Asymmetric signing: each entry is signed with a private key. An entry whose signature does not verify against the published public key has been tampered with — and any party can verify independently.
Either approach, or a combination, provides the log integrity basis that auditors and regulators require.
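Hash chaining in particular is short enough to sketch in full. The example below uses SHA-256 over the previous hash plus a canonical JSON form of the record; the entry layout, the all-zero genesis value, and the function names are assumptions for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, record):
    # Canonical JSON so the same record always hashes to the same value.
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain, record):
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"record": record, "prev_hash": prev_hash, "hash": _digest(prev_hash, record)}
    chain.append(entry)
    return entry


def first_broken_index(chain):
    """Return the index of the first entry that fails verification, or None if intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(chain):
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
            return i
        prev_hash = entry["hash"]
    return None
```

Editing any stored record makes `first_broken_index` point at that entry. Note that a chain on its own only detects partial edits: someone able to rewrite every entry from the modified point onward can produce a consistent chain, so the latest hash should also be anchored somewhere the log writer cannot alter, or entries should be signed as well.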
Search and Retrieval
A forensic log that cannot be queried efficiently is difficult to use in practice. Full-text search over call arguments and results enables investigators to find all calls that referenced a specific document or value — useful when the scope of a breach is unclear. Index tool name, policy decision, agent identity, and timestamp to support common query patterns without full table scans. For details on integrating audit logs with external SIEM systems, see platform operations.
Monitoring and Alerting
Real-time monitoring of MCP server activity enables you to detect anomalies while they are happening rather than discovering them in retrospect. The monitoring layer consumes the stream of governance events — tool calls, policy decisions, content flags, rate limit hits — and applies threshold and pattern logic to surface conditions worth human attention.
Usage Analytics per Tool
Aggregate metrics per tool give you the baseline you need to recognize anomalies. Track call volume, error rate, average latency, and cost per tool on a rolling window. When a tool that normally sees ten calls per hour suddenly sees five hundred, the spike is visible immediately rather than after it has already exhausted a rate limit or triggered a cost ceiling.
Per-tool analytics also inform optimization decisions. A high-cost tool called at high volume is a candidate for caching or result reuse. A tool with consistently high latency may need its timeout thresholds adjusted. You cannot make these decisions without the data.
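A rolling baseline of the kind described above can be sketched in a few lines. The detector below flags an interval whose call count exceeds a multiple of the recent average; the window size, the multiplier, and the minimum sample count are arbitrary illustrative defaults, not recommended values.

```python
from collections import deque
from statistics import mean


class SpikeDetector:
    """Flags an interval whose call count far exceeds the rolling average."""

    def __init__(self, window=24, factor=5.0, min_samples=6):
        self.history = deque(maxlen=window)  # most recent per-interval call counts
        self.factor = factor
        self.min_samples = min_samples

    def observe(self, calls_in_interval):
        """Record one interval's count; return True if it looks like a spike."""
        is_spike = False
        if len(self.history) >= self.min_samples:
            # Floor the baseline at 1 so rarely used tools do not alert on tiny counts.
            baseline = max(mean(self.history), 1.0)
            is_spike = calls_in_interval > self.factor * baseline
        self.history.append(calls_in_interval)
        return is_spike
```

One detector instance per tool is assumed. Spikes are appended to the history like any other sample, so a sustained change in usage gradually becomes the new baseline; whether that is desirable depends on the tool.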
Health Checks
Health checks verify that registered servers remain reachable and authenticate correctly. Run health checks on a schedule and on demand. When a health check fails, the server's status in the registry should update automatically, and any agents attempting to call it should receive a clear error rather than timing out. Alert the team responsible for the server so they can investigate before agents accumulate failures.
Health check results are also useful for incident triage: if a wave of call failures appears in the monitoring feed, checking health status immediately answers whether the problem is in the governance layer or in the server itself.
Connection Pool Observability and Alerting
A managed connection pool needs its own telemetry: track active connections, connection age, reconnection frequency, and pool saturation. A pool near saturation queues calls and degrades latency; repeated reconnects suggest a credential problem or server instability. Both conditions warrant alerts.
Governance events should feed your existing alerting infrastructure rather than create a parallel silo. Webhook integration routes governance alerts into your incident management system using the same escalation logic you already maintain; SIEM forwarding carries the full event stream to your security operations center for correlation.
For more on monitoring patterns for AI operations, see platform operations and the broader discussion at AI agent security.
Multi-Tenancy and Organizational Scoping
In a multi-tenant platform, every server record, tool call log entry, rate limit counter, and analytics data point must be scoped to its owning organization, for reads and writes alike. Cross-tenant access to MCP call data is a compliance violation even when no sensitive data was actually exposed, because the access control model itself was violated.
Organizational scoping means server discovery is confined to the registering organization's namespace; tool call logs for one organization are not readable by another; rate limit counters are isolated so one organization's burst cannot affect another; and a public server marketplace shows only metadata, never authentication configuration.
When agents operate in a federated context — one organization's agent calling a resource registered by another — the cross-org trust relationship must be explicitly declared and approved by both parties, not assumed. For more on cross-organizational patterns, see AI agent security.
MCP Server Hardening Checklist
Work through this checklist at server registration time and revisit it whenever a server's configuration, the upstream API it wraps, or your organization's risk posture changes.
Authentication
- Every server requires explicit authentication — no anonymous connections
- Credentials are encrypted at rest with envelope encryption and AAD binding; agents reference servers by identifier, never raw credentials
- A tested rotation procedure exists — single operation, no agent reconfiguration required
- OAuth flows use short-lived tokens with automatic refresh; client secrets are rotated on schedule
Network and Transport
- Server URLs are validated against an outbound allowlist, and private ranges, loopback, and cloud metadata endpoints are blocked (SSRF prevention)
- All connections use TLS; plaintext is rejected
- Redirect following is restricted and each hop is re-validated against the same allowlist
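The URL validation items above can be approximated with the standard library alone. The sketch below covers only the blocking of non-public addresses and plaintext schemes, not a positive allowlist of approved hosts; the function names and reason strings are assumptions, and the `resolve` parameter exists so the DNS lookup can be replaced in tests.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_blocked_address(ip_text):
    """True for private, loopback, link-local (including 169.254.169.254), and similar ranges."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return True  # unparseable address: fail closed
    return (ip.is_private or ip.is_loopback or ip.is_link_local
            or ip.is_reserved or ip.is_multicast or ip.is_unspecified)


def resolve_host(host):
    """Default resolver: every address the hostname currently maps to."""
    return sorted({info[4][0] for info in socket.getaddrinfo(host, None)})


def validate_server_url(url, resolve=None):
    """Return (ok, reason) for a server URL at registration or redirect time."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False, "malformed_url"
    if parts.scheme != "https":
        return False, "scheme_not_https"  # plaintext is rejected
    if not host:
        return False, "missing_host"
    try:
        ipaddress.ip_address(host)
        addresses = [host]  # host is already an IP literal
    except ValueError:
        try:
            addresses = (resolve or resolve_host)(host)
        except OSError:
            return False, "unresolvable_host"
    if not addresses:
        return False, "unresolvable_host"
    for addr in addresses:
        if is_blocked_address(addr):
            return False, f"blocked_address:{addr}"
    return True, "ok"
```

Running the same function on every redirect hop gives the re-validation the checklist calls for. Because DNS answers can change between validation and connection, a stricter implementation would also connect to the exact address that was validated rather than resolving the name again.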
Tool Authorization
- Each tool is classified by risk level (read / write / destructive)
- Minimum trust score requirements are set for write and destructive tools
- Step-up approval is required for destructive tools
- Capability tokens are issued per-agent with explicit tool scope and short expiry
- New tools discovered during rediscovery enter a review state before becoming callable
Rate Limits and Budgets
- Per-tool rate limits (RPM, RPH, daily) are configured for expected usage
- Spend caps are set at the tool, agent, and organization level
- Rate limit responses include reset-time information for correct agent back-off
Content Controls
- Guardrail rules apply to both call arguments and server results
- Prompt injection detection is active on all tools accepting user-derived content
- PII redaction rules are configured for results that may return personal data
Forensic Logging
- Every tool call produces a complete record: agent, tool, arguments, policy decision, result, cost, latency
- Log entries are tamper-evident (hash chaining or asymmetric signing)
- Logs are indexed for full-text search; retention and deletion follow data subject request procedures
Monitoring and Health
- Scheduled health checks are configured for every server; failures alert the responsible team
- Per-tool usage metrics are reviewed regularly for anomalies
- Governance alerts route to your existing incident management system
Organizational Controls
- Server registrations are scoped to the correct organization namespace
- Cross-org tool call access is explicitly declared and approved, not assumed
- Access to the forensic log is restricted by role
How Praesidia Implements MCP Governance
Praesidia applies the patterns in this guide as a unified control plane. On server registration, capability discovery runs automatically and the URL is validated against an outbound allowlist before any connection is attempted. Authentication configuration is encrypted at rest and credentials are never returned in API responses.
Every tool call passes through the governance policy gate — evaluating the calling agent's trust score, configured tool policy (allow / deny / step-up / observe), rate limit counters, and active guardrail rules — before reaching the server. Each decision is recorded in a forensic call log with full arguments, result, latency, and attributed cost, with support for hash chaining and asymmetric signing for tamper evidence. Per-tool rate limits, spend caps, and usage analytics surface in the server detail view; health checks run on demand or on schedule; governance events stream to webhooks and SIEM endpoints.
The Praesidia platform provides a working implementation of these controls. The documentation covers configuration in detail, and the security assessment can help identify gaps in your current posture.
Common questions
What is MCP server governance and why does it matter? MCP server governance is the set of controls — authentication, tool-level authorization, rate limiting, content inspection, and forensic logging — that sit between AI agents and the MCP servers they call. It matters because MCP servers commonly expose multiple high-risk capabilities behind a single endpoint, and agents call them autonomously at speeds that make after-the-fact detection ineffective. Governance controls enforce policy before harm occurs rather than discovering violations in the audit log.
Which authentication scheme should I use for MCP servers? The right scheme depends on the server and its upstream service. Bearer tokens and API keys work well for most vendor and custom MCP servers. OAuth 2.0 client credentials are appropriate when the server wraps a service that already supports OAuth and you want token lifetimes managed automatically. In all cases, the credential should be stored centrally with envelope encryption, and rotation should be a single-operation update with no agent reconfiguration required.
How do per-tool rate limits differ from provider-side rate limits? Provider-side rate limits are enforced by the MCP server or the API it wraps, and they typically apply per credential or account, with no awareness of which of your agents made the call. Per-tool rate limits in the governance layer are enforced before the call reaches the provider, apply per-organization and per-agent, and can be configured to match your organization's usage patterns independently of what the provider permits. Hitting a provider limit usually surfaces as an upstream error the agent has to interpret; hitting a governance-layer limit produces a clear, structured response with retry guidance.
What makes a tool call log forensic rather than just a log? A forensic log captures the full context of each call — arguments, result, policy decision, calling identity, timestamp — and uses cryptographic techniques (hash chaining or asymmetric signing) to make post-hoc modification detectable. It is retained under access controls and supports structured queries. A conventional application log captures that a call happened; a forensic log provides everything needed to reconstruct the exact state at call time and verify the record has not been altered since.
How should I handle a new tool discovered during MCP server rediscovery? New tools should enter a review state automatically rather than becoming immediately callable. This gives you the opportunity to classify the tool by risk level, configure appropriate rate limits and policy decisions, and decide whether existing capability tokens should extend to it. Tools callable by default on discovery expand your attack surface without a deliberate decision — treat new tools the same way you would treat a new endpoint in an API you are onboarding.
How does MCP governance relate to the EU AI Act and SOC 2 requirements? The EU AI Act requires documentation of high-risk AI system capabilities, human oversight mechanisms for consequential decisions, and logging sufficient to support post-hoc audits. MCP governance directly addresses these: capability discovery produces the tool inventory, step-up approval implements human oversight gates, and forensic logging provides the audit record. The SOC 2 Trust Services Criteria, particularly logical access controls, monitoring of system activity, and change management, map similarly to the authentication, monitoring, and policy gate controls described here. Neither standard is satisfied by governance tooling alone, but a well-implemented governance layer is a prerequisite for meeting them in the context of agentic AI systems. For a detailed treatment see AI governance and compliance.