SOC 2 for AI Platforms: What Auditors Look For

SOC 2 for AI platforms follows the same Trust Services Criteria as any other service, but the attack surface is broader, the data flows are more complex, and the access patterns are harder to explain to an auditor unfamiliar with autonomous agents. The controls that matter most are the ones that answer a single question: can you prove that every action on the platform was authorized, recorded, and traceable to an accountable principal? For a broader compliance framing, see What Is AI Agent Governance? and the AI Agent Compliance Checklist for 2026.

What SOC 2 Actually Measures

SOC 2 is built on the AICPA Trust Services Criteria. Every engagement includes Security, which is assessed against the common criteria (CC1 through CC9); for AI platforms, most of the scrutiny tends to land on CC6 through CC9. Availability, Confidentiality, and Processing Integrity are optional categories that are increasingly requested for AI workloads that make decisions or handle sensitive data.

For auditors, the core question under Security is whether you have reasonable controls over logical and physical access. For AI platforms, that expands quickly: not just "which humans can log in," but "which agents can call which tools, with what scope, with what recorded evidence." Auditors who have worked through a few AI platform assessments are starting to ask exactly those questions.

Access Control: Further Than You Might Think

The CC6 criteria on logical access control are where most AI platforms struggle in their first assessment. The standard asks you to demonstrate that access is restricted to authorized users, that access is reviewed periodically, and that it is removed promptly on termination.

For a human-only SaaS product, this maps to SSO, user provisioning, and offboarding scripts. For an AI platform, the same criteria apply to every principal that can take a privileged action: human operators, application credentials, and agents themselves. An agent holding a long-lived API key and calling a database tool is a principal. If that credential is unscoped, unreviewed, and cannot be revoked quickly, it fails the spirit of CC6 even if every human account is perfectly managed.

The practical requirements that emerge:

Unique credentials per principal. Shared API keys across agents or teams make attribution impossible. Each agent or application needs its own credential with a documented scope tied to an identifiable purpose.
Role-based access with documented mapping. The mapping from job function (or agent capability) to permission set must exist in writing and match system configuration. If a role has access it shouldn't, an auditor will ask why.
Access reviews. Quarterly or annual reviews of who and what has access to sensitive functions. Automated provisioning via SCIM is a strong evidence point because it generates a machine-readable record of every access change. See Automating User Lifecycle with SCIM 2.0 for how that side of the lifecycle works.
MFA for administrative functions. Human operators accessing sensitive configuration should be behind phishing-resistant MFA — auditors increasingly ask about this for AI control plane access specifically.
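
The first three requirements above lend themselves to an automated check. The following is a minimal sketch, assuming a hypothetical credential inventory; the `Credential` fields, the `review_findings` helper, and the 90-day review interval are illustrative choices, not part of SOC 2 or of any particular platform.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Credential:
    credential_id: str
    principals: list                 # agents, apps, or humans using this credential
    scopes: list                     # documented permissions, e.g. "invoices:read"
    last_reviewed: Optional[date] = None


def review_findings(cred: Credential, today: date, max_age_days: int = 90) -> list:
    """Return the reasons this credential would likely draw an auditor's question."""
    findings = []
    if len(cred.principals) > 1:
        findings.append("shared")            # actions cannot be attributed to one principal
    if not cred.scopes or "*" in cred.scopes:
        findings.append("unscoped")          # no documented, limited permission set
    if cred.last_reviewed is None or today - cred.last_reviewed > timedelta(days=max_age_days):
        findings.append("review_overdue")    # outside the assumed review interval
    return findings


today = date(2026, 1, 15)
good = Credential("cred-agent-billing", ["agent:billing"], ["invoices:read"], date(2025, 12, 1))
bad = Credential("cred-legacy", ["agent:a", "agent:b"], ["*"], None)

print(review_findings(good, today))  # []
print(review_findings(bad, today))   # ['shared', 'unscoped', 'review_overdue']
```

Running a check like this on a schedule and keeping its output is one way to turn "we review access" into dated evidence.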

Audit Logging: The Backbone of Every SOC 2 Finding

CC7 (system operations, which covers monitoring, detection, and incident response) is where audit logging becomes central. Auditors will ask to see that you are collecting logs, that the logs are complete, that you can produce them for a defined period, and that someone is actually reviewing them.

For an AI platform, completeness is the key challenge. A traditional application might log sign-ins, privilege changes, and data exports. An AI platform generates far more events that matter: agent task submissions, tool calls, workflow triggers, configuration changes, budget threshold breaches, and guardrail evaluations. If those events are not collected, you cannot respond to an incident query with anything other than "we don't have that data."

Three properties distinguish audit logs that hold up under scrutiny from logs that look good on a slide deck:

Tamper evidence

A log that can be altered after the fact is not evidence of anything. The standard approach is hash chaining: each log entry includes a cryptographic reference to the previous entry, so any modification breaks the chain in a detectable way. More robust implementations add periodic Merkle roots that summarize batches of records, with the root signed by a key held separately from the logging system itself. Some implementations anchor those roots to public transparency logs, providing external, time-stamped evidence of the log's state at a given point.

When an auditor asks "how do you know these logs haven't been modified," the answer needs to be more than "we restrict access to the database." A verifiable chain that can be checked offline is a much stronger answer. For a detailed look at how this works in practice, see Tamper-Evident Audit Logs with Cryptographic Proofs.
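
Hash chaining as described above can be sketched in a few lines. This is an illustrative example using only the standard library; the entry layout, the `GENESIS` value, and the function names are assumptions made for the sketch, not a description of any specific product's log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the same payload always hashes the same.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev_hash": prev_hash,
                  "hash": entry_hash(prev_hash, payload)})


def verify(chain: list) -> int:
    """Return the index of the first broken entry, or -1 if the chain is intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(chain):
        expected = entry_hash(prev_hash, entry["payload"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return i
        prev_hash = entry["hash"]
    return -1


log = []
append(log, {"actor": "agent:billing", "action": "tool.call", "tool": "db.query"})
append(log, {"actor": "user:alice", "action": "role.grant", "target": "agent:billing"})
append(log, {"actor": "agent:billing", "action": "export.create"})

intact = verify(log)                          # -1: nothing has been altered
log[1]["payload"]["target"] = "agent:other"   # simulate after-the-fact tampering
first_broken = verify(log)                    # 1: the modified entry no longer matches its hash
```

A chain on its own does not stop someone who can rewrite every later hash or drop entries from the tail, which is why the separately signed roots and external anchoring mentioned above matter.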

Completeness and retention

Auditors will ask what retention period you apply and whether there are gaps. A 90-day log window for a Type II engagement covering a 12-month period is a problem, because roughly nine months of the period have no evidence behind them. At minimum, match your retention period to your audit window. For financial and compliance-sensitive events, budget for a longer retention period.

Gaps are especially problematic for AI platforms where asynchronous jobs and webhook-driven events can fail silently. If your logging pipeline drops events under load or during deployment windows, that will surface as a finding.
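
One way to make silent drops visible is to have each event source stamp a monotonically increasing sequence number and to scan for holes. The sketch below assumes that scheme; the function names are hypothetical, and sequence numbering is only one of several possible completeness checks.

```python
def find_gaps(sequence_numbers: list) -> list:
    """Return (first_missing, last_missing) ranges within a per-source sequence."""
    gaps = []
    ordered = sorted(set(sequence_numbers))
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


def retention_covers(retention_days: int, audit_window_days: int) -> bool:
    """True if the retention period is at least as long as the audit window."""
    return retention_days >= audit_window_days


received = [1, 2, 3, 7, 8, 10]          # events 4-6 and 9 never reached the store
print(find_gaps(received))              # [(4, 6), (9, 9)]
print(retention_covers(90, 365))        # False: 90-day retention, 12-month window
```

This only detects holes between the lowest and highest number seen; events lost at the very end of a stream need a separate heartbeat or expected-count check.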

Structured, queryable records

During a SOC 2 audit, you will receive requests like "produce all records of privileged access changes between March 1 and June 30." If your audit logs are free-form text, answering those requests takes days and introduces manual error. Structured entries with consistent field names, indexed by action type and principal, make responses to sample requests fast and accurate.
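
With structured entries, that sample request becomes a filter rather than a text search. The following sketch assumes a hypothetical `AuditEvent` shape and a `role.` prefix for privileged access changes; the year is arbitrary, and a real store would run this as an indexed query rather than a list comprehension.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    timestamp: datetime
    principal: str
    action: str      # e.g. "role.grant", "role.revoke", "tool.call"
    resource: str


def query(events, action_prefix: str, start: datetime, end: datetime):
    """Events whose action starts with the prefix and whose time is in [start, end)."""
    return [e for e in events
            if e.action.startswith(action_prefix) and start <= e.timestamp < end]


def utc(y, m, d, h=0):
    return datetime(y, m, d, h, tzinfo=timezone.utc)


events = [
    AuditEvent(utc(2025, 2, 20), "user:alice", "role.grant", "agent:billing"),
    AuditEvent(utc(2025, 3, 15), "user:alice", "role.grant", "agent:support"),
    AuditEvent(utc(2025, 4, 2), "agent:billing", "tool.call", "db.query"),
    AuditEvent(utc(2025, 6, 30, 23), "user:bob", "role.revoke", "agent:support"),
    AuditEvent(utc(2025, 7, 1), "user:bob", "role.grant", "agent:ops"),
]

# "March 1 through June 30" expressed as a half-open range ending July 1.
sample = query(events, "role.", utc(2025, 3, 1), utc(2025, 7, 1))
for e in sample:
    print(e.timestamp.date(), e.principal, e.action, e.resource)
```

The half-open range avoids the common mistake of dropping events from the final day of the requested period.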

Change Management and Configuration Evidence

CC8 covers change management: the controls around how changes to the system are authorized, tested, and deployed. For AI platforms, this extends to changes in agent configuration, workflow definitions, guardrail rules, and budget policies, not just code deployments.

Auditors will ask how changes to agent permissions are approved, whether guardrail rule changes go through a review step, and whether you can produce a history of configuration changes with the identity of whoever made them. If your audit log captures configuration changes as first-class events with before/after values and a principal identifier, you have a strong answer to all three. If changes happen via direct database edits or scripts that bypass your application layer, you likely have a gap.
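
A configuration change recorded as a first-class event might look like the sketch below. The field names (`approved_by`, `changes`) and the `config.change` action are assumptions chosen for illustration.

```python
def config_change_event(principal: str, resource: str, before: dict, after: dict,
                        approved_by=None) -> dict:
    """Build an audit event that records only the keys whose values changed.

    A key that is absent on one side is reported with a value of None.
    """
    changes = {
        key: {"before": before.get(key), "after": after.get(key)}
        for key in sorted(set(before) | set(after))
        if before.get(key) != after.get(key)
    }
    return {
        "action": "config.change",
        "principal": principal,      # who made the change
        "approved_by": approved_by,  # who reviewed it, if a review step exists
        "resource": resource,
        "changes": changes,
    }


old_rule = {"max_rows": 1000, "block_pii": True}
new_rule = {"max_rows": 50000, "block_pii": True, "notify": "secops"}

event = config_change_event("user:alice", "guardrail:export-limit",
                            old_rule, new_rule, approved_by="user:bob")
print(event["changes"])
# {'max_rows': {'before': 1000, 'after': 50000}, 'notify': {'before': None, 'after': 'secops'}}
```

An event like this answers the three questions in one record: who changed it, who approved it, and what the values were before and after.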

Vendor and Integration Risk

CC9 covers risk mitigation, and CC9.2 specifically addresses vendor and business partner risk. For AI platforms, the vendor surface is unusually large: model providers, tool integrations, infrastructure providers, and potentially third-party agents or MCP servers.

Auditors want to see that you have a process for evaluating each vendor relationship and that the data shared with each vendor is documented. For model API providers, this means understanding what data leaves your environment when an agent submits a prompt and what contractual protections are in place. The same logic applies to MCP servers and external tool endpoints: if an agent can call an external tool that processes customer data, that endpoint is a vendor in SOC 2 terms. A registry of those connections with documented scope limits per connection is far easier to defend in an audit than a network diagram and a best-efforts explanation. See Registering and Governing MCP Servers for how a structured connection registry works.
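
A connection registry with per-connection scope limits can be quite small. The sketch below is hypothetical throughout: the connection names, vendors, and data classes are invented for illustration, and a real registry would live in a database with its own change history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    name: str
    vendor: str
    allowed_data_classes: frozenset   # what may leave the environment via this connection


REGISTRY = {
    "model-api": Connection("model-api", "ExampleModelCo", frozenset({"public", "internal"})),
    "crm-mcp": Connection("crm-mcp", "ExampleCRM", frozenset({"public", "internal", "customer"})),
}


def check_call(connection_name: str, data_classes: set):
    """Return (allowed, reason) for an outbound call carrying the given data classes."""
    conn = REGISTRY.get(connection_name)
    if conn is None:
        return False, "unregistered connection"
    excess = set(data_classes) - conn.allowed_data_classes
    if excess:
        return False, "data classes out of scope: " + ", ".join(sorted(excess))
    return True, "ok"


print(check_call("crm-mcp", {"customer"}))       # (True, 'ok')
print(check_call("model-api", {"customer"}))     # (False, 'data classes out of scope: customer')
print(check_call("unknown-tool", {"public"}))    # (False, 'unregistered connection')
```

Denying unregistered connections by default means the registry itself becomes the documented vendor list an auditor can sample from.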

Incident Response and Detection

The CC7 monitoring criteria ask: can you detect a problem, and do you have a tested process for responding to it?

For AI platforms, detection means more than failed login alerts. You need the ability to detect anomalous agent behavior — unusually large data exports, unexpected tool call patterns, budget overruns that don't correspond to any authorized workflow — and route those signals to an operator who can investigate. Alert rules that fire on audit log events and deliver to a documented on-call process give auditors clear evidence of a functioning detection capability. What they will probe is whether those alerts are actually reviewed and whether there is a defined escalation path.
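
As one example of an alert rule over audit log events, the sketch below flags principals whose total exported rows exceed a threshold. The event fields, the `export.create` action name, and the threshold are assumptions; routing the alert to an on-call process is out of scope here.

```python
from collections import defaultdict


def export_volume_alerts(events: list, max_rows_per_principal: int) -> list:
    """Return one alert per principal whose total exported rows exceed the threshold."""
    totals = defaultdict(int)
    for event in events:
        if event["action"] == "export.create":
            totals[event["principal"]] += event["rows"]
    return [
        {"rule": "large_export", "principal": principal, "rows": rows}
        for principal, rows in sorted(totals.items())
        if rows > max_rows_per_principal
    ]


window = [
    {"action": "export.create", "principal": "agent:billing", "rows": 400},
    {"action": "tool.call", "principal": "agent:billing", "tool": "db.query"},
    {"action": "export.create", "principal": "agent:billing", "rows": 700},
    {"action": "export.create", "principal": "agent:support", "rows": 50},
]

alerts = export_volume_alerts(window, max_rows_per_principal=1000)
print(alerts)  # [{'rule': 'large_export', 'principal': 'agent:billing', 'rows': 1100}]
```

The rule is deliberately simple. What an auditor will want alongside it is a record that each alert was acknowledged and by whom.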

How Audit Architecture Maps to SOC 2 Controls

A well-designed audit architecture addresses these requirements directly. Every sensitive action should emit a structured audit event that is cryptographically signed, hash-chained to the preceding entry, and periodically summarized into signed Merkle roots. Operators should be able to export a self-contained evidence bundle for a date range and verify individual record integrity offline without needing access to the live system.
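
The batch summarization step can be sketched as follows. This is a simplified illustration: the tree construction (pairing nodes and carrying an unpaired node up unchanged) is one common choice, not a specific standard, and HMAC-SHA256 stands in for the asymmetric signature a production system would more likely use so that verifiers never hold the signing key. All function names are hypothetical.

```python
import hashlib
import hmac


def _h(prefix: bytes, data: bytes) -> bytes:
    # Distinct prefixes for leaves and interior nodes keep the two from being confused.
    return hashlib.sha256(prefix + data).digest()


def merkle_root(records: list) -> bytes:
    """Compute a Merkle root over a batch of serialized audit records (bytes)."""
    if not records:
        return hashlib.sha256(b"").digest()
    level = [_h(b"\x00", record) for record in records]
    while len(level) > 1:
        nxt = [_h(b"\x01", level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])   # unpaired node is carried up unchanged
        level = nxt
    return level[0]


def sign_root(root: bytes, key: bytes) -> str:
    # Stand-in for a real signature; the key should live outside the logging system.
    return hmac.new(key, root, hashlib.sha256).hexdigest()


def verify_batch(records: list, signature: str, key: bytes) -> bool:
    return hmac.compare_digest(sign_root(merkle_root(records), key), signature)


key = b"demo-key-held-by-a-separate-service"
batch = [b'{"seq":1,"action":"tool.call"}',
         b'{"seq":2,"action":"role.grant"}',
         b'{"seq":3,"action":"export.create"}']

signature = sign_root(merkle_root(batch), key)
ok = verify_batch(batch, signature, key)                  # True

tampered = [batch[0], b'{"seq":2,"action":"role.revoke"}', batch[2]]
still_ok = verify_batch(tampered, signature, key)         # False
```

This sketch verifies a whole batch at once. Checking a single record offline, as described above, would additionally require an inclusion proof (the sibling hashes along the path to the root), which is omitted here.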

The access control layer maps directly to SOC 2 expectations: automated provisioning records every access change, role assignments tie to a fine-grained permission model, and every principal — including agents and API consumers — carries a unique, scoped credential. Alert rules over the audit corpus let you define the detection logic your auditors expect to see.

For a broader view of how compliance requirements map to platform capabilities, the AI Agent Compliance Checklist for 2026 covers the key control families across frameworks.

Common Questions

Does SOC 2 require cryptographic integrity on audit logs? SOC 2 does not prescribe specific technical mechanisms. What the criteria require is that you have controls in place to detect unauthorized modification of logs and that those controls are tested. Hash chaining and signing are the most defensible technical answers to that requirement, because they allow independent verification rather than relying on access controls alone.

How far back do audit logs need to go for a Type II assessment? A Type II audit covers a defined period, typically six to twelve months. You need to produce logs for the full period under review. Retain logs for at least as long as your longest anticipated audit period, plus a buffer. If unsure, check with your auditor before the engagement begins rather than discovering a gap mid-review.

Do AI agents count as users for access control purposes? From a SOC 2 perspective, anything that can authenticate and take actions on your system is a principal that needs to be covered by your access control policies. Whether your auditor treats agents the same as human users depends on the sophistication of the engagement, but the trend is clearly toward holding agent credentials to the same standard: unique identifiers, documented scope, periodic review, and prompt revocation when no longer needed.

What is the difference between SOC 2 Type I and Type II for AI platforms? A Type I report attests that controls are suitably designed at a point in time. A Type II report attests that those controls operated effectively over a period — typically six to twelve months. For AI platforms handling sensitive workloads, enterprise customers almost always require Type II, since it provides evidence of consistent operation rather than a design snapshot. Type I is useful as an initial milestone while accumulating the Type II observation window.

Which Trust Services Criteria are most relevant for AI platforms beyond Security? Processing Integrity is increasingly relevant because AI agents make decisions and take actions on behalf of users — auditors want to see that processing is complete, accurate, and authorized. Confidentiality is relevant wherever agents handle customer data. Availability matters if your platform is on the critical path for customer-facing workflows. Most first-time engagements focus on Security; expanding scope to Processing Integrity or Confidentiality is a natural second-year step.