Five subtle ways your LLM audit log can still contain PII

Posted at 2026-04-24

You've stripped PII from prompts before they reach the model. You have audit logs proving it. And yet - the logs might still contain PII, just through a side door you didn't think to close.

This post walks through five specific gaps we found and fixed in CloakLLM v0.6.4. Each one is a real path where original PII could reach a log that's supposed to be clean.

1. Exception messages in server logs

The most common one. Your tool handler catches an exception and logs it:

logger.error("sanitize failed: %s: %s", type(e).__name__, e)

The second %s calls str(e) on the exception. If the underlying library raises with context - a validation detail, a malformed input excerpt - that context lands in your log.

The fix is simple: log the type, not the message. Add an opt-in flag for debugging:

import os

# Default: type only. CLOAKLLM_DEBUG=1: full detail (may include PII).
if os.getenv("CLOAKLLM_DEBUG") == "1":
    logger.error("sanitize failed: %s: %s", type(e).__name__, e)
else:
    logger.error("sanitize failed: %s", type(e).__name__)
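
To see why the message is the risky part, here is a minimal self-contained sketch. The sanitize function and its error text are hypothetical stand-ins, not CloakLLM code; the only point is that str(e) can carry an excerpt of the input while the type name cannot.

```python
import io
import logging


def sanitize(text: str) -> str:
    # Hypothetical stand-in for a library call that echoes input in its error.
    raise ValueError(f"could not parse input near: {text[:40]!r}")


def run_and_log(text: str, logger: logging.Logger) -> None:
    try:
        sanitize(text)
    except Exception as e:
        # Log only the exception type; str(e) may contain the raw input.
        logger.error("sanitize failed: %s", type(e).__name__)


# Capture log output in memory so the result can be inspected.
stream = io.StringIO()
demo_logger = logging.getLogger("cloak_demo")
demo_logger.addHandler(logging.StreamHandler(stream))
demo_logger.propagate = False

run_and_log("contact alice@example.com about the invoice", demo_logger)
logged = stream.getvalue()
```

The captured line names the exception type and nothing else, so the email address in the input never reaches the handler.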

2. Drift between your schema validator and your data model

CloakLLM validates every audit entry against an allow-list before writing it. The allow-list was defined as a frozenset - maintained by hand, independently of the AuditEntry dataclass it was supposed to reflect.

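The problem: add a field to the dataclass without updating the frozenset, and writes silently fail. Add to the frozenset without updating the dataclass, and you get a TypeError at log time. The first failure is silent and the second only surfaces when an entry is written; neither is caught at the moment the code changes, and a validator that has drifted from the data model could let a malformed entry through.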

# Before - two sources of truth, drift guaranteed
_ENTRY_ALLOWED_KEYS = frozenset({
    "timestamp", "entry_hash", "prev_hash", ...  # hand-maintained
})

# After - one source of truth
import dataclasses

_ENTRY_ALLOWED_KEYS = frozenset(
    f.name for f in dataclasses.fields(AuditEntry)
)

3. MCP clients that pass PII in metadata fields

In an MCP tool call, model and provider are just strings. A client could pass anything:

{ "model": "alice@example.com", "provider": "123-45-6789" }

Those strings flow through to the audit log's model and provider fields, which the schema validator permits. The no-PII guarantee breaks through a field that looks like infrastructure metadata.

The fix: scan those fields with the same PII patterns used for the content itself before they touch the audit logger:

for field_name, value in (("model", model), ("provider", provider)):
    err = _validate_short_string(value, field_name)
    if err:
        return {"error": err}
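
The body of _validate_short_string is not shown here, so the following is only a guess at what such a helper might do. The regexes, the 128-character limit, and the error strings are all assumptions; a real deployment would reuse whatever patterns its content scanner already applies.

```python
import re
from typing import Optional

# Illustrative patterns only; not the patterns CloakLLM ships.
_EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
_SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
_MAX_LEN = 128  # assumed limit, not taken from CloakLLM


def validate_short_string(value: object, field_name: str) -> Optional[str]:
    """Return an error message, or None if the value looks safe to log."""
    if not isinstance(value, str):
        return f"{field_name} must be a string"
    if len(value) > _MAX_LEN:
        return f"{field_name} exceeds {_MAX_LEN} characters"
    if _EMAIL_RE.search(value) or _SSN_RE.search(value):
        # Do not echo the value back; the error may itself be logged.
        return f"{field_name} appears to contain PII"
    return None
```

Note that the error message names the field but never repeats the offending value, since the rejection path can end up in a log too.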

4. Timing side-channels on hash comparison

verify_chain compared SHA-256 hashes with !=. An ordinary != can return as soon as it hits the first differing character, so for deployments that expose audit verification over a public API, the comparison time leaks a small signal about how much of the hash matched.

Python's hmac.compare_digest is designed to run in time that does not depend on where the inputs diverge. JS needs a length pre-check before crypto.timingSafeEqual, which throws if the two buffers differ in byte length:

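// timingSafeEqual throws on unequal byte lengths, so compare lengths first.
// The length of a hash is not secret, so the early return leaks nothing useful.
const a = Buffer.from(stored, 'hex');
const b = Buffer.from(recomputed, 'hex');
if (a.length !== b.length) return false;
return crypto.timingSafeEqual(a, b);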
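
On the Python side, the comparison can be sketched as below. hashes_match is a hypothetical helper name and the payload is made up; only hmac.compare_digest and hashlib.sha256 are real standard-library calls.

```python
import hashlib
import hmac


def hashes_match(stored_hex: str, recomputed_hex: str) -> bool:
    # Encode to bytes so compare_digest accepts any input string,
    # then compare without short-circuiting on the first difference.
    return hmac.compare_digest(
        stored_hex.encode("utf-8"), recomputed_hex.encode("utf-8")
    )


# Made-up payload for the demo.
payload = b'{"event": "sanitize"}'
good = hashlib.sha256(payload).hexdigest()
tampered = hashlib.sha256(payload + b" ").hexdigest()
```

Unlike the Node call, compare_digest returns False for inputs of different length instead of raising, so no separate length check is needed.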

5. Token maps evicted mid-session

This one doesn't leak PII - it breaks the desanitization path entirely, which in some failure modes could result in tokenized text reaching downstream systems unchanged.

The JS middleware was evicting token maps by creation time. A map created at T=0 expired at T=300s even if it was actively used at T=299s. Multi-turn conversations longer than 5 minutes would silently lose their token maps.

The fix is to reset the clock on each use, not just at creation:

// On each access, refresh the TTL
entry.lastAccessed = Date.now();

// Evict by idle time, not age
if (Date.now() - entry.lastAccessed > MAP_TTL_MS) evict();
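
The same idea as a small runnable sketch in Python. This is not the JS middleware itself: IdleTTLStore and its methods are hypothetical names, and the clock is injectable only so the behaviour can be checked without waiting five minutes.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class IdleTTLStore:
    """Keeps values until they have been idle for longer than ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (last_accessed, value)
        self._items: Dict[str, Tuple[float, dict]] = {}

    def put(self, key: str, value: dict) -> None:
        self._items[key] = (self._clock(), value)

    def get(self, key: str) -> Optional[dict]:
        item = self._items.get(key)
        if item is None:
            return None
        last_accessed, value = item
        now = self._clock()
        if now - last_accessed > self._ttl:
            # Idle too long: evict and report a miss.
            del self._items[key]
            return None
        # Still live: refresh the idle clock on every access.
        self._items[key] = (now, value)
        return value
```

With a 300-second TTL, a map read at T=299s and again at T=400s stays alive, because each read restarts the idle window; it is only dropped once more than 300 seconds pass with no access.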

The common thread

All five gaps share the same shape: the primary protection was correct, but a secondary path bypassed it. The exception logger bypassed the no-PII log guarantee. The hand-maintained frozenset drifted from the dataclass. The metadata fields bypassed the content scanner. The timing comparison bypassed constant-time equality. The creation-time TTL bypassed session continuity.

Good security invariants need secondary path audits, not just primary path correctness.

Try it

pip install -U cloakllm cloakllm-mcp
npm install cloakllm@latest

Source: github.com/cloakllm
Docs: cloakllm.dev
