$ clawproof --check 06 --verbose
Security#06

Secrets Management

API keys in prompts, tokens in logs. Zero secret sprawl or bust.

The Failure Scenario

A developer wires up a customer-service agent that needs to call a payment API. The quickest path is to paste the API key into the system prompt as an instruction: "Use this key when calling the payments endpoint." The agent works. It ships. Three weeks later, a prompt-injection attack extracts the full system prompt, API key included, and posts it to a public Discord. The key has full read-write access to the payment gateway.

The breach doesn't stop there. Because the agent logs every conversation turn for debugging, the API key now sits in plaintext across three logging backends: CloudWatch, Datadog, and a GCS bucket that the analytics team has broad read access to. Rotating the key fixes the immediate exposure, but nobody audits the 90 days of logs that already contain it.

This isn't a hypothetical edge case. It's the default outcome when teams treat agent prompts like application config files. Secrets in prompts become secrets in every downstream system that touches those prompts.

Why This Matters

LLM-based agents are uniquely dangerous for secret management because their context window is, by design, a data structure that gets sent to a third-party API. Every token in the prompt is transmitted over the wire to the model provider. If a secret is in the prompt, it is no longer your secret. It is shared with your provider's infrastructure, your logging pipeline, and potentially any attacker who can manipulate the agent's output.

Traditional applications keep secrets in environment variables or vaults and reference them at runtime. The secret never appears in application-layer logs because it is never a string the application prints. Agents break this pattern because developers conflate "instructing the agent" with "configuring the runtime." The agent doesn't need to know the API key; the tool-calling layer does.

The blast radius of a leaked agent secret is also larger than a leaked application secret. A compromised API key in a traditional service affects that service's scope. A compromised key in an agent that has tool-use capabilities can be leveraged by the agent itself if an attacker gains control of the conversation through injection. The key becomes both the credential and the weapon.

How to Implement

The core principle is separation: the agent reasons about what tool to call and with what arguments, but the execution layer injects credentials at call time. The agent never sees, returns, or logs the secret. Your tool-calling runtime is responsible for authentication, not the LLM. When the agent decides to call `process_payment(amount=49.99, currency="USD")`, your runtime intercepts that call, pulls the payment API key from a vault, attaches it as a header, and forwards the request. The agent's context contains zero credential material.
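This separation can be sketched in a few lines. The registry layout, function names, and vault paths below are illustrative assumptions, not a real framework API; the point is that the credential is resolved only after the model has produced the tool call, so it never enters the context window.

```python
# Minimal sketch of call-time credential injection.
# TOOL_REGISTRY, build_authenticated_request, and the vault path
# are hypothetical names for illustration.
TOOL_REGISTRY = {
    "process_payment": {
        "endpoint": "https://api.payments.example.com/v1/charges",
        "secret_ref": "secrets/payment-api/production",
    }
}

def build_authenticated_request(name: str, arguments: dict, secret_resolver) -> dict:
    """Resolve the credential here, after the LLM has emitted the tool call.

    The model's context only ever held `name` and `arguments`;
    the secret exists solely in this execution layer."""
    tool = TOOL_REGISTRY[name]
    api_key = secret_resolver(tool["secret_ref"])  # vault lookup, not prompt text
    return {
        "url": tool["endpoint"],
        "json": arguments,
        "headers": {"Authorization": f"Bearer {api_key}"},
    }
```

In production, `secret_resolver` would wrap a real vault client and the returned request would go straight to the HTTP layer; the agent only ever receives the response body.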

Use a secrets manager (HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager, or even a properly scoped .env that never touches the prompt builder) and inject at the HTTP layer. Your tool executor should have a credential map that resolves references at runtime. Pair this with a log-scrubbing pipeline that regex-matches known secret patterns (Bearer tokens, AWS key formats, API key prefixes) and redacts them before persistence.
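A scrubbing pass of this kind can be a handful of compiled regexes applied to every record before persistence. A minimal sketch, assuming Stripe-style keys, bearer tokens, and AWS access key IDs as the secret shapes to match:

```python
import re

# Redaction pass applied to every log record before it is persisted.
# Patterns cover common secret shapes; extend per provider.
SECRET_PATTERNS = [
    re.compile(r"sk_live_[a-zA-Z0-9]{24,}"),      # Stripe-style live keys
    re.compile(r"Bearer [a-zA-Z0-9\-._~+/]+=*"),  # HTTP bearer tokens
    re.compile(r"AKIA[0-9A-Z]{16}"),              # AWS access key IDs
]

def redact(record: str) -> str:
    """Replace any secret-shaped substring with a fixed placeholder."""
    for pattern in SECRET_PATTERNS:
        record = pattern.sub("[REDACTED]", record)
    return record
```

Run this as close to the log emitter as possible, so nothing upstream of the redaction ever writes to disk.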

For teams using framework-level tool definitions, declare secrets as runtime-resolved references rather than static values. The config below shows a tool definition where the API key is a vault reference, not a literal.

tools/payment_tool.yaml
# Tool definition — secrets are vault references, never literals
name: process_payment
description: "Charge a customer via the payment gateway"
endpoint: "https://api.payments.example.com/v1/charges"
method: POST
auth:
  type: bearer
  token: "vault://secrets/payment-api/production#api_key"
  rotate_interval: 24h

# Log scrubbing rules applied before any persistence
log_redaction:
  patterns:
    - "sk_live_[a-zA-Z0-9]{24,}"
    - "Bearer [a-zA-Z0-9\-._~+/]+=*"
    - "AKIA[0-9A-Z]{16}"
  replacement: "[REDACTED]"

# Runtime injection — agent context never contains these
runtime_headers:
  X-Idempotency-Key: "generated"
  Authorization: "resolved_at_call_time"
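The `vault://` reference format above is a convention, not a standard; whatever scheme you pick, the executor needs a small parser that splits a reference into a secret path and a field name before handing it to the vault client. A sketch, assuming the `path#field` layout used in the config:

```python
from urllib.parse import urlparse

def parse_vault_ref(ref: str) -> tuple[str, str]:
    """Split 'vault://mount/path#field' into (path, field).

    The actual lookup (e.g. via an hvac or cloud SDK client) happens
    in the tool executor, never in code the agent's context observes."""
    parsed = urlparse(ref)
    if parsed.scheme != "vault" or not parsed.fragment:
        raise ValueError(f"not a vault reference: {ref!r}")
    return parsed.netloc + parsed.path, parsed.fragment
```

Rejecting anything that is not a well-formed reference also catches the failure mode where a literal key is pasted into the `token` field by mistake.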

Production Checklist

  • Audit every system prompt and tool description for hardcoded secrets (grep for key patterns like sk_, AKIA, Bearer, and password=)
  • Implement a tool-execution layer that injects credentials at call time, never passing them through the LLM context
  • Configure log redaction rules for all known secret formats before logs hit any persistence layer
  • Set up automated secret scanning on prompt templates and agent config repos (use tools like truffleHog or gitleaks)
  • Rotate all agent-used credentials on a schedule no longer than 30 days, with automated rotation preferred
  • Ensure vault access is scoped per-agent. An email-drafting agent should not have access to payment API keys
  • Test prompt-extraction attacks against your agents and verify no secrets appear in extracted prompts
  • Add a CI check that fails the build if any file in the prompt-templates directory matches a secret regex
  • Monitor vault access logs for anomalous patterns. A single agent pulling 50 different secrets is a red flag
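The CI check from the list above can be a short script that walks the templates directory and fails the build on any match. A sketch, where the directory name and pattern set are assumptions (dedicated scanners like gitleaks cover far more formats):

```python
import re
from pathlib import Path

# Patterns mirroring the grep targets from the checklist above.
SECRET_RE = re.compile(
    r"sk_live_[a-zA-Z0-9]{24,}"
    r"|AKIA[0-9A-Z]{16}"
    r"|Bearer [A-Za-z0-9\-._~+/]+=*"
    r"|password="
)

def scan_templates(root: str) -> list[str]:
    """Return 'file:line' locations of secret-shaped strings under root."""
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if SECRET_RE.search(line):
                findings.append(f"{path}:{lineno}")
    return findings

# In the CI job: exit non-zero on any finding so the build fails.
# sys.exit(1 if scan_templates("prompt-templates") else 0)
```

Because the patterns are shared with the log-redaction rules, a new secret format only needs to be added in one place.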

Common Pitfalls

A failure pattern we see all the time is "secret laundering": moving the key from the system prompt to a tool description and calling it fixed. If the tool description is part of the context window, the secret is still in the prompt. The only safe boundary is the tool execution layer, which is code that runs after the LLM returns a tool-call request and before the actual HTTP request fires.

Another frequent failure is trusting the model not to output secrets. Teams will inject a secret and add an instruction: "Never reveal the API key." This is not security. It is a suggestion to a statistical model. Prompt-injection techniques routinely bypass such instructions. If the secret is in the context, assume it is exfiltrable.

Finally, teams forget about intermediate representations. If your agent framework serializes the full conversation state to Redis for session management, and that state includes tool results that echoed back credentials, you now have secrets in your cache layer. Scrub before serialization, not after.
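That ordering can be enforced in one place by making the serializer itself do the scrubbing. A minimal sketch, assuming a JSON-serializable session state (the helper names are illustrative, and the single bearer-token pattern stands in for a full rule set):

```python
import json
import re

BEARER_RE = re.compile(r"Bearer [A-Za-z0-9\-._~+/]+=*")

def scrub(value):
    """Recursively redact secret-shaped strings anywhere in the state tree."""
    if isinstance(value, str):
        return BEARER_RE.sub("[REDACTED]", value)
    if isinstance(value, dict):
        return {k: scrub(v) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub(v) for v in value]
    return value

def serialize_session(state: dict) -> str:
    """Scrub BEFORE serializing, so the Redis write never sees a secret."""
    return json.dumps(scrub(state))
```

If every cache write goes through `serialize_session`, there is no code path on which an echoed credential reaches Redis in the clear.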

Terminal Output

terminal
$ clawproof --check 06

  CHECK 06 — Secrets Management
  ─────────────────────────────────────────────
  ✓ No hardcoded secrets found in prompt templates (0 matches)
  ✓ Tool executor uses vault-backed credential injection
  ✓ Log redaction rules active for 4 secret patterns
  ✗ FAIL: Tool "slack_notify" has Bearer token in description field
  ✗ FAIL: Session serializer does not scrub before Redis write
  ✓ Secret rotation policy: 14-day cycle, automated
  ✓ CI secret-scan hook enabled on prompt-templates/

  Result: 2 issues found — fix before deploy
  Severity: HIGH — secrets in agent context are exfiltrable