Bring Your Own Key: Managing LLM Configurations

Model provider keys are credentials, and credentials belong in a vault: not scattered across environment variables on different services, not duplicated in every workflow configuration, and not re-entered every time you switch providers. A centralized LLM configuration layer lets you register one or more provider connections, encrypt each associated key at rest, and select a default so that every generation path draws on the same configuration without further setup. This matters most on platforms where secrets management for AI agents cuts across multiple features and workflows.

The Problem With Unmanaged Provider Keys

When each part of an AI platform manages its own model provider connection, several things go wrong at once. The same key appears in multiple places. There is no central record of which provider is being used for what. Changing providers or rotating a key means hunting down every reference rather than updating one configuration.

This is the standard multi-vendor key sprawl problem, and it gets worse as the number of AI-assisted features grows. A platform running agentic workflows, document generation, and code analysis may be calling three different providers. Without a unified configuration layer, every one of those paths is configured independently and governed differently.

Centralizing provider configurations also has a governance dimension. Who registered a given provider key, when was it last changed, and is the credential that is actually in use the one your team approved? With per-feature configuration, these questions are hard to answer. With a central registry, they have clear answers. For the broader practice of managing credentials across your AI infrastructure, see Secrets Management for AI Agents.

Registering a Provider Configuration

Creating an LLM configuration requires a name, a provider selection, and a model identifier. The provider list covers the major inference services — OpenAI, Anthropic, Google Gemini, Mistral, Cohere, and Ollama — as well as a custom option for self-hosted or alternative endpoints. For custom deployments, you supply a base URL in addition to the standard fields.

Names must be unique within your organization. If you run multiple configurations for the same provider — for example, a GPT-4o config and a GPT-4o-mini config for cost-tiered routing — each gets a distinct, recognizable name.

The configuration record itself contains no secret material. The API key is handled through a separate action after the initial record is saved, which keeps the key out of the creation payload and ensures it is encrypted the moment it enters the system.
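As a minimal sketch of these registration rules, the Python below models a configuration record and an in-memory registry. The class names, field names, and provider identifiers are assumptions made for illustration, not the platform's actual API. Rejecting a custom provider that lacks a base URL is likewise an assumption drawn from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical provider identifiers; the platform's real values may differ.
PROVIDERS = {"openai", "anthropic", "gemini", "mistral", "cohere", "ollama", "custom"}


@dataclass
class LLMConfig:
    """Non-secret configuration record. Note there is no API key field."""
    name: str
    provider: str
    model: str
    base_url: Optional[str] = None  # only meaningful for "custom" in this sketch


class ConfigRegistry:
    """In-memory stand-in for one organization's set of configurations."""

    def __init__(self) -> None:
        self._by_name: dict[str, LLMConfig] = {}

    def create(self, name: str, provider: str, model: str,
               base_url: Optional[str] = None) -> LLMConfig:
        if provider not in PROVIDERS:
            raise ValueError(f"unknown provider: {provider}")
        if name in self._by_name:
            # Names must be unique within the organization.
            raise ValueError(f"name already in use: {name}")
        if provider == "custom" and not base_url:
            raise ValueError("custom provider requires a base_url")
        config = LLMConfig(name, provider, model, base_url)
        self._by_name[name] = config
        return config


# Two configurations for the same provider, distinguished by name.
registry = ConfigRegistry()
registry.create("gpt-4o-primary", "openai", "gpt-4o")
registry.create("gpt-4o-mini-budget", "openai", "gpt-4o-mini")
```

Because the record has no key field, nothing secret can travel in the creation payload. The key arrives later through its own action.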

How BYOK Key Storage Works

Storing your own API key follows a specific pattern designed to minimize the window in which the raw key is in memory and to prevent it from ever appearing in logs or API responses.

You submit the key through a dedicated update action on an existing configuration. The platform encrypts it at rest and stores only the ciphertext. What is returned for any subsequent read — through the UI or the API — is a four-character hint derived from the last characters of the original key. The ciphertext itself is excluded from all read paths.

This means that once a key is stored, there is no way to retrieve it through the platform — only to replace it or remove it. If you need to verify the active key, the hint is sufficient to confirm which credential is in use. If you need to update it, you submit the new value through the same action and the old ciphertext is replaced.
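The write and read paths described above can be sketched as follows. This is an illustration under stated assumptions, not the platform's implementation: the source does not name the cipher, so `encrypt` is passed in as a callable, and the names `StoredKey`, `store_key`, `read_view`, `has_key`, and `key_hint` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StoredKey:
    ciphertext: bytes  # stays inside the storage layer
    hint: str          # last four characters of the raw key


def store_key(raw_key: str, encrypt: Callable[[bytes], bytes]) -> StoredKey:
    """Encrypt the raw key and keep only the ciphertext and a short hint.

    `encrypt` is an assumed hook for whatever authenticated cipher the
    deployment uses; this sketch does not implement encryption itself.
    """
    return StoredKey(
        ciphertext=encrypt(raw_key.encode("utf-8")),
        hint=raw_key[-4:],
    )


def read_view(config_name: str, stored: Optional[StoredKey]) -> dict:
    """Build what a read path returns: the hint, never the ciphertext."""
    return {
        "name": config_name,
        "has_key": stored is not None,
        "key_hint": stored.hint if stored is not None else None,
    }
```

Replacing a key is simply another call to `store_key` that overwrites the previous `StoredKey`. Removing one sets it back to `None` while the configuration record stays in place.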

The encryption key that protects stored credentials is managed as part of your platform environment configuration. If that key is rotated, stored credentials must be re-entered, because ciphertext produced under the old key cannot be decrypted with the new one. The platform treats rotation at the encryption-key level as an intentional break that forces re-validation of every stored credential.

Setting a Default Configuration

When your organization has multiple LLM configurations, one of them can be marked as the default. The default marking is exclusive: setting a new default automatically clears the default flag on the previous one. This prevents any ambiguity about which configuration applies when no explicit selection is made.

The default configuration feeds generation paths that do not receive an explicit provider argument. If your organization has not marked any configuration as the default, the platform falls back to the most recently created configuration. This fallback is deliberate and predictable, but the cleaner operating model is to maintain an explicit default so the behavior is intentional rather than incidental.

Owners can update the default designation at any time. Changing the default takes effect immediately for new generation requests — in-flight requests complete against the configuration they started with.
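The default rules in this section reduce to two small functions, sketched here in Python. `ConfigEntry`, `set_default`, and `resolve` are hypothetical names, and the list stands in for whatever storage the platform actually uses.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ConfigEntry:
    name: str
    created_at: datetime
    is_default: bool = False


def set_default(configs: list[ConfigEntry], name: str) -> None:
    """Mark one configuration as default and clear the flag on all others."""
    if not any(c.name == name for c in configs):
        raise KeyError(name)
    for c in configs:
        c.is_default = (c.name == name)


def resolve(configs: list[ConfigEntry],
            explicit: Optional[str] = None) -> Optional[ConfigEntry]:
    """Pick the configuration for a generation request.

    Order: explicit selection, then the marked default, then the most
    recently created configuration. Returns None if none exist.
    """
    if explicit is not None:
        for c in configs:
            if c.name == explicit:
                return c
        raise KeyError(explicit)
    for c in configs:
        if c.is_default:
            return c
    return max(configs, key=lambda c: c.created_at, default=None)
```

Resolving once at the start of each request, and holding on to the result, is one way to get the behavior described above: a change of default affects new requests while in-flight requests finish against the configuration they started with.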

Supported Providers and Custom Endpoints

The built-in provider list corresponds to the inference services that organizations commonly use in production:

OpenAI — GPT-4o, GPT-4o-mini, o1, o3, and other models in the OpenAI family
Anthropic — Claude models
Google Gemini — Gemini Pro and Flash variants
Mistral — Mistral models hosted on mistral.ai
Cohere — Command and other Cohere models
Ollama — locally or privately hosted open-weight models via the Ollama API
Custom — any OpenAI-compatible endpoint, including self-hosted models, inference proxies, and emerging providers

The custom option is the right choice when you are running an inference gateway in front of multiple models, using a private deployment of an open-weight model, or integrating a provider not yet in the built-in list. Supplying a base URL routes requests to your endpoint rather than the provider's default.
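To illustrate how a supplied base URL redirects requests, here is a small sketch that builds the request URL for an OpenAI-compatible chat endpoint. The function name and the defaults table are assumptions for illustration. The table lists only OpenAI's public API root and is deliberately incomplete, and the `/chat/completions` suffix follows the OpenAI-compatible convention rather than anything the platform documents.

```python
from typing import Optional

# Illustrative and incomplete: only one built-in default is listed here.
DEFAULT_BASE_URLS = {
    "openai": "https://api.openai.com/v1",
}


def chat_completions_url(provider: str, base_url: Optional[str] = None) -> str:
    """Return the chat endpoint URL for a built-in or custom provider."""
    if provider == "custom":
        if not base_url:
            raise ValueError("custom provider requires a base_url")
        root = base_url
    elif provider in DEFAULT_BASE_URLS:
        root = DEFAULT_BASE_URLS[provider]
    else:
        raise ValueError(f"no default base URL listed for: {provider}")
    # Normalize a trailing slash so the path is not doubled.
    return root.rstrip("/") + "/chat/completions"
```

With a hypothetical gateway at `https://llm.example.internal/v1`, a custom configuration sends its requests there, while a built-in OpenAI configuration keeps using the provider's default root.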

How Configurations Feed Generation Paths

Registered LLM configurations are consumed by platform features that need to call a model on behalf of your organization. The most prominent example is AI workflow generation: when you describe a workflow in natural language and ask the platform to draft it, the generation call draws from your default LLM configuration rather than a shared platform key.

This distinction matters for cost accounting, data residency, and usage policy. When the generation call uses your key, the token spend appears on your provider bill under your account. You can apply your own rate limits and usage policies at the provider level. And for organizations with data residency requirements, routing through a key tied to a specific region or account is how you maintain that control. For a broader view of how token spend is attributed and capped across your agent fleet, see cost control for LLM applications.

The platform resolves the configuration at generation time and uses the stored credential directly for that call. The key is not cached or retained in session state beyond the immediate request.

Least Privilege and Key Lifecycle

Not every member of your organization needs to manage LLM configurations. Key operations (storing, updating, and removing a provider key) require a dedicated key-management permission that sits above the standard integrations access level. Creating and updating the configuration record itself requires the appropriate integrations permissions; deleting a record requires those permissions and is additionally restricted to owners.

This separation means you can grant a team member access to view or update the non-secret parts of a configuration — its name, provider, model, and default status — without granting them access to the key management actions. This is the right privilege split for organizations where configuration management and secrets management are handled by different people.
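One way to picture this split is a small permission table, sketched below. The permission names and action strings are hypothetical. The sketch also assumes the key-management permission is a separate grant and does not automatically imply integrations access, which the description above leaves open.

```python
from enum import Enum, auto


class Permission(Enum):
    INTEGRATIONS_READ = auto()
    INTEGRATIONS_WRITE = auto()
    KEY_MANAGE = auto()


# Permissions each action requires (hypothetical action names).
REQUIRED = {
    "config.view": {Permission.INTEGRATIONS_READ},
    "config.create": {Permission.INTEGRATIONS_WRITE},
    "config.update": {Permission.INTEGRATIONS_WRITE},
    "config.delete": {Permission.INTEGRATIONS_WRITE},
    "key.set": {Permission.KEY_MANAGE},
    "key.remove": {Permission.KEY_MANAGE},
}

# Actions that also require the caller to be an owner.
OWNER_ONLY = {"config.delete"}


def is_allowed(action: str, granted: set, is_owner: bool = False) -> bool:
    """Check an action against the caller's permissions and owner status."""
    if action not in REQUIRED:
        return False
    if action in OWNER_ONLY and not is_owner:
        return False
    return REQUIRED[action] <= granted  # every required permission is granted


# A configuration manager who cannot touch key material.
config_manager = {Permission.INTEGRATIONS_READ, Permission.INTEGRATIONS_WRITE}
can_rename = is_allowed("config.update", config_manager)
can_set_key = is_allowed("key.set", config_manager)
```

In this sketch, `can_rename` is true and `can_set_key` is false, which is the division between configuration management and secrets management that the paragraph above describes.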

Key removal is a first-class operation. You can detach the stored key from a configuration record without deleting the record itself. This is useful when rotating credentials: remove the old key, store the new one, and the configuration record and its metadata remain intact.

Common Questions

Can I use multiple providers and route different workloads to each?

You can register multiple configurations and designate one as the default for general generation paths. Explicit routing to a specific configuration — for example, directing a particular workflow to a specific provider — depends on how each generation path is wired. The default configuration handles cases where no explicit selection is made. For most organizations, maintaining one or two configurations with a clear default is the practical operating model.

What happens if I rotate my provider API key?

Submit the new key through the key update action on the affected configuration. The stored ciphertext is replaced and all subsequent generation calls use the new key. There is no propagation delay — the updated credential takes effect immediately for new requests. If your organization also rotates the platform-level encryption key, stored keys need to be re-entered, as existing ciphertext cannot be decrypted with the new encryption key.

Is the stored key ever returned through the API or visible in logs?

No. The raw key is encrypted on write and the ciphertext is excluded from all read responses. The only value returned is a four-character hint from the original key. Audit events for key operations record that a key was set, updated, or removed, along with the actor and timestamp, but do not include the key material. For related guidance on how credentials are managed across agent integrations, see governed connections between agents and resources.
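As an illustration of what such an audit record might contain, the sketch below builds an event from the action, the actor, and a timestamp only. The field names and action strings are assumptions, since the source does not specify the event schema.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical action names for the three key operations.
KEY_ACTIONS = {"key.set", "key.updated", "key.removed"}


def key_audit_event(action: str, config_name: str, actor: str,
                    now: Optional[datetime] = None) -> dict:
    """Build an audit record for a key operation.

    The function takes no key argument at all, so key material cannot
    end up in the event by accident.
    """
    if action not in KEY_ACTIONS:
        raise ValueError(f"not a key action: {action}")
    timestamp = now if now is not None else datetime.now(timezone.utc)
    return {
        "action": action,
        "config": config_name,
        "actor": actor,
        "timestamp": timestamp.isoformat(),
    }
```

Keeping the raw key out of the function signature, rather than filtering it afterwards, is a simple design choice that makes the "no key material in audit events" property hold by construction.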