$ clawproof --check 08 --verbose
Governance#08

Data Boundaries & RAG Governance

Your agent can read everything it retrieves. Can it read everything it should?


The Failure Scenario

A B2B SaaS company builds an internal knowledge-base agent. The RAG pipeline indexes Confluence, Google Drive, and a shared Postgres database. Engineering, HR, and finance all use the same agent. An engineer asks the agent, "What's the vacation policy?" The retriever pulls the HR handbook, but it also surfaces a compensation spreadsheet that was in the same Confluence space. The agent helpfully summarizes salary bands for the entire engineering team in its response.

The underlying issue: the vector store has no concept of document-level permissions. Every document was embedded and indexed into a single collection. The retriever performs cosine similarity search across all vectors, regardless of who's asking or what access the original document required. The embedding pipeline treated all documents as equal because nobody told it otherwise.

This isn't a bug in the retrieval algorithm. It's an architectural failure. The system was built without access controls at the retrieval layer, because traditional search engines handle permissions at query time and developers assumed the vector store would too. It doesn't. You have to build it.
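The failure mode is easy to reproduce with a toy index. A minimal sketch (documents, vectors, and names all invented for illustration) of a retriever that ranks purely by cosine similarity, with no notion of who is asking:

```python
import math

def cosine(a, b):
    # Plain cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# One shared index: the HR handbook and a confidential salary sheet
# sit side by side, with no permission metadata at all.
index = [
    {"doc": "vacation-policy",   "vec": [0.9, 0.1, 0.0]},
    {"doc": "salary-bands.xlsx", "vec": [0.8, 0.2, 0.1]},  # confidential!
]

def naive_retrieve(query_vec, top_k=2):
    # Ranks purely by similarity; the requester never enters the calculation.
    ranked = sorted(index, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["doc"] for d in ranked[:top_k]]

print(naive_retrieve([1.0, 0.0, 0.0]))  # the salary sheet comes back too
```

A vacation-policy query retrieves the salary spreadsheet simply because the two documents are semantically close, which is exactly the scenario above.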

Why This Matters

RAG systems collapse the access-control boundary between data storage and data retrieval. In a traditional application, a user queries a database, and the query runs with that user's permissions. Row-level security, role-based access controls, and data-classification tags all gate what comes back. In a RAG pipeline, the agent queries a vector store on behalf of a user, and the vector store returns whatever is semantically closest. Permissions are not part of the similarity calculation.

For multi-tenant systems, this is catastrophic. If Tenant A's documents and Tenant B's documents live in the same vector collection, a well-crafted query from Tenant A can retrieve Tenant B's data. The retriever doesn't know about tenants. It knows about cosine distances. Without explicit filtering, cross-tenant data leakage is a mathematical certainty, not a theoretical risk.

Even in single-tenant systems, data classification matters. An agent answering customer questions should not have access to internal incident reports, board meeting notes, or employee performance reviews, even if those documents happen to be semantically similar to the customer's question. The retrieval layer must enforce the same boundaries that your IAM policies enforce everywhere else.

How to Implement

The solution is metadata-filtered retrieval. When you embed documents, attach metadata that encodes access controls: tenant ID, data classification level, permitted roles, and document owner. At query time, apply a pre-filter that restricts the similarity search to documents the requesting user or agent is authorized to access. This filter runs before the similarity calculation, not after. You never want unauthorized documents ranked and then filtered out, because the ranking scores of excluded documents leak information about their existence and contents.
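On the ingestion side, this means attaching the metadata to every record before it is written to the store. A minimal sketch, assuming a record shape with `vector`, `content`, and `metadata` fields and a stand-in `embed` function; adapt the shape to your vector store's actual upsert API:

```python
def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model call.
    return [float(len(text))]

def build_record(chunk: str, *, tenant_id: str, classification_level: int,
                 permitted_roles: list[str], owner: str) -> dict:
    """Every chunk carries the metadata the query-time pre-filter relies on."""
    return {
        "vector": embed(chunk),
        "content": chunk,
        "metadata": {
            "tenant_id": tenant_id,
            "classification_level": classification_level,  # 1=public .. 4=restricted
            "permitted_roles": permitted_roles,
            "owner": owner,
        },
    }

rec = build_record("Vacation policy: 30 days.", tenant_id="acme",
                   classification_level=2, permitted_roles=["hr", "employee"],
                   owner="hr-team")
print(rec["metadata"]["tenant_id"])  # acme
```

The keyword-only arguments make it impossible to write a record without supplying all four fields, which is the property the checklist below depends on.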

For multi-tenant RAG, use separate vector collections per tenant or enforce a mandatory tenant_id filter on every query. Separate collections are simpler and eliminate any risk of filter bypass. Shared collections with metadata filtering are more efficient at scale but require rigorous filter enforcement. A missing filter on a single query path exposes the entire collection.

Implement a PII detection layer between retrieval and context injection. Even when access controls are correct, retrieved documents may contain PII that shouldn't be included in the LLM context. Run a named-entity recognition pass over retrieved chunks and redact or flag PII before it enters the prompt. This is defense in depth. The retriever controls what documents are accessed; the PII filter controls what content from those documents reaches the model.
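A minimal sketch of such a filter, using regexes as a stand-in for a real NER model (a production system would use proper entity recognition as described above, not regexes alone). It exposes the same `redact` method the retriever below calls:

```python
import re

# Illustrative patterns only; real PII detection needs an NER pass.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\+?\d[\d\s().-]{7,}\d\b"),
}

class RegexPIIFilter:
    def redact(self, text: str) -> str:
        # Replace each match with a typed placeholder so the LLM context
        # keeps structure without carrying the sensitive value.
        for label, pattern in PII_PATTERNS.items():
            text = pattern.sub(f"[{label}]", text)
        return text

f = RegexPIIFilter()
print(f.redact("Contact jane.doe@example.com or 555-123-4567."))
# Contact [EMAIL] or [PHONE].
```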

rag/secure_retriever.py
from dataclasses import dataclass
from vectorstore import VectorStore, QueryFilter

@dataclass
class UserContext:
    user_id: str
    tenant_id: str
    roles: list[str]
    clearance_level: int  # 1=public, 2=internal, 3=confidential, 4=restricted

class SecureRetriever:
    def __init__(self, store: VectorStore, pii_filter):
        self.store = store
        self.pii_filter = pii_filter

    def retrieve(self, query: str, user: UserContext, top_k: int = 5) -> list[dict]:
        # Mandatory filters — never query without these
        filters = QueryFilter(
            must=[
                {"field": "tenant_id", "op": "eq", "value": user.tenant_id},
                {"field": "classification_level", "op": "lte", "value": user.clearance_level},
            ],
            should=[
                {"field": "permitted_roles", "op": "overlap", "value": user.roles},
                {"field": "owner", "op": "eq", "value": user.user_id},
            ],
            minimum_should_match=1,
        )

        results = self.store.similarity_search(query, top_k=top_k, filter=filters)

        # PII scrub before context injection
        for result in results:
            result["content"] = self.pii_filter.redact(result["content"])

        return results

Production Checklist

  • ✓ Tag every document at embedding time with tenant_id, classification_level, permitted_roles, and owner metadata
  • ✓ Enforce mandatory pre-filters on all vector store queries. No unfiltered similarity search should be possible
  • ✓ Use separate vector collections per tenant, or verify that tenant_id filters cannot be bypassed through any query path
  • ✓ Implement PII detection and redaction between the retrieval step and the prompt-construction step
  • ✓ Audit retrieved chunks in staging by logging what documents are returned for test queries across different user roles
  • ✓ Set up a cross-tenant retrieval test: query as Tenant A and verify zero results from Tenant B's documents
  • ✓ Classify all indexed documents by sensitivity level and verify that low-clearance users cannot retrieve high-sensitivity docs
  • ✓ Add monitoring for retrieval anomalies. A user suddenly retrieving documents from 10 different classification levels is suspicious
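The cross-tenant probe from the checklist can be sketched as follows, with a tiny in-memory store standing in for the real retriever and its mandatory tenant filter (all names here are illustrative):

```python
# Two tenants share one index; the retriever applies the tenant pre-filter.
DOCS = [
    {"content": "A's roadmap", "tenant_id": "tenant-a"},
    {"content": "B's roadmap", "tenant_id": "tenant-b"},
]

def retrieve(query: str, tenant_id: str) -> list[dict]:
    # Stand-in for a filtered similarity search.
    return [d for d in DOCS if d["tenant_id"] == tenant_id]

def cross_tenant_probe(probes: list[str]) -> int:
    """Query as tenant A and count any hits that belong to another tenant."""
    leaks = 0
    for q in probes:
        for hit in retrieve(q, tenant_id="tenant-a"):
            if hit["tenant_id"] != "tenant-a":
                leaks += 1
    return leaks

print(cross_tenant_probe(["roadmap", "salary", "incident report"]))  # 0
```

In a real suite, the probe queries would be adversarial (near-duplicates of Tenant B's known content) and the assertion of zero leaks would run in CI against the staging index.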

Common Pitfalls

The most dangerous pattern is post-retrieval filtering. Teams retrieve the top-K results without access controls, then filter unauthorized documents out of the result set before passing to the LLM. This seems equivalent but isn't. If you retrieve 10 documents and filter 6, you're left with 4, and the agent gets worse context. More critically, the similarity scores of the filtered results leak information about what unauthorized documents exist. An attacker can probe queries and infer the existence and topics of documents they shouldn't know about.
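The gap between the two orderings is easy to see with made-up scores and a mixed ranking:

```python
# (doc, score, authorized) - already sorted by similarity; scores invented.
RANKED = [
    ("d1", 0.95, False), ("d2", 0.93, True), ("d3", 0.90, False),
    ("d4", 0.88, True), ("d5", 0.85, False), ("d6", 0.80, True),
    ("d7", 0.78, True), ("d8", 0.75, True),
]

def post_filter(k=4):
    top = RANKED[:k]                        # unauthorized docs occupy slots...
    return [d for d, _, ok in top if ok]    # ...then get dropped: context shrinks

def pre_filter(k=4):
    allowed = [r for r in RANKED if r[2]]   # restrict BEFORE ranking/truncation
    return [d for d, _, _ in allowed[:k]]

print(post_filter())  # only 2 docs survive out of k=4
print(pre_filter())   # all 4 slots filled with authorized docs
```

The post-filter variant also had to score d1, d3, and d5 to produce its ranking, which is exactly the side channel described above.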

Another common failure is treating embedding pipelines as one-time jobs. Documents get re-classified, permissions change, employees leave and their access should be revoked. If your embedding pipeline ran once six months ago, the metadata is stale. Build an incremental re-indexing pipeline that picks up permission changes from your source systems and updates vector metadata without re-embedding the content.
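The metadata-only refresh can be sketched like this; the in-memory dict is a stand-in for whatever metadata-update or upsert call your vector database actually exposes:

```python
# Vector store stand-in: document ID -> record with vector and metadata.
STORE = {
    "doc-1": {"vector": [0.1, 0.2], "metadata": {"permitted_roles": ["eng"]}},
}

def apply_permission_change(doc_id: str, new_roles: list[str]) -> None:
    """Sync a permission change from the source system into vector metadata."""
    record = STORE[doc_id]
    record["metadata"]["permitted_roles"] = new_roles  # vector left untouched

apply_permission_change("doc-1", ["eng", "hr"])
print(STORE["doc-1"]["metadata"]["permitted_roles"])  # ['eng', 'hr']
```

The key property is that no re-embedding happens: the pipeline consumes permission-change events (or diffs a periodic export) from Confluence, Drive, and your IAM system, and touches only metadata.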

Teams also underestimate chunking-related access leaks. A document might be classified as internal, but if chunks are indexed without inheriting the parent document's metadata, an internal chunk can surface in a retrieval result alongside public chunks. Always inherit the parent document's classification for every chunk, and never mix classification levels within a single retrieval result set.
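A chunker that inherits classification might look like this (a sketch with fixed-size character chunks; real chunkers split on semantic boundaries, but the inheritance rule is the same):

```python
def chunk_document(doc: dict, size: int = 40) -> list[dict]:
    """Split a document, copying access metadata onto every chunk."""
    text = doc["content"]
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    return [
        {
            "content": c,
            "metadata": {
                # Inherited verbatim from the parent - never defaulted.
                "tenant_id": doc["tenant_id"],
                "classification_level": doc["classification_level"],
            },
        }
        for c in chunks
    ]

doc = {"content": "x" * 100, "tenant_id": "acme", "classification_level": 3}
chunks = chunk_document(doc)
print(len(chunks), {c["metadata"]["classification_level"] for c in chunks})
# 3 {3}
```

No chunk is ever indexed with weaker access controls than its source document, which makes the "never mix classification levels" rule enforceable at query time.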

Terminal Output

terminal
$ clawproof --check 08

  CHECK 08 — Data Boundaries & RAG Governance
  ─────────────────────────────────────────────
  ✓ Document metadata present: tenant_id (100%), classification (100%), roles (98%)
  ✓ Pre-filter enforcement: all query paths require tenant_id filter
  ✗ FAIL: 2.1% of chunks missing permitted_roles metadata — inherited from parent: NO
  ✓ PII redaction layer active between retrieval and prompt construction
  ✓ Cross-tenant isolation test: 0 leaks across 500 probe queries
  ✗ FAIL: Embedding pipeline last ran 47 days ago — metadata may be stale
  ✓ Classification levels enforced: L1-L4 with role-based gating

  Result: 2 issues found — fix metadata gaps and re-index schedule
  Severity: HIGH — incomplete metadata breaks access control guarantees