LLM Integration Without Leaking the Crown Jewels

Most LLM integrations leak more data than intended. Here's how to enforce data boundaries, scope retrieval, and keep sensitive data out of model context.

In spring 2023, Samsung engineers pasted proprietary source code into ChatGPT for debugging help. Under the consumer terms in effect at the time, those inputs could be retained and used for model training. Samsung banned generative AI tools on company devices in May 2023, but the data had already left its control.

Samsung had money, lawyers, and a sophisticated IT organization. They still got caught flat-footed. If you are integrating an LLM into your business processes today, the question is not whether to build guardrails. It is how to build them before the incident that forces your hand.

Why LLM Integrations Leak More Than You Think

The root cause is usually not malice — it is architectural naivety. Most organizations connect their LLM to a knowledge base, give it broad retrieval access, and assume the model will "know" what to share. It will not. The model shares whatever lands in its context window.

Three failure modes show up repeatedly:

Over-permissioned retrieval. A customer service bot connected to a vector database of all internal documents can retrieve HR policies, pricing negotiations, and technical architecture notes — and surface them to any user who asks the right question.

Implicit data leakage through prompts. When a system prompt includes sensitive business logic — "you are the assistant for AcmeCorp, our margin threshold is 23%, never go below it" — that logic can often be extracted through prompt injection or creative user queries.

Third-party model training. Free and low-cost LLM tiers often include training data rights by default. Any input sent to those models may be used to improve them. The Samsung incident is the canonical example, but the same exposure exists at smaller scale in any organization whose staff use those tiers for real work.

The Architecture That Prevents It

Secure LLM integration is not a single control — it is a stack of overlapping defenses.

Data Boundary Enforcement

Start with retrieval scoping. Every document, chunk, or record in your knowledge base needs a classification label: public, internal, confidential, restricted. The retrieval pipeline must filter by user role before returning results to the model. This is tenant isolation applied to RAG (retrieval-augmented generation).

A user in the sales role asking the assistant a pricing question should retrieve only documents tagged for sales access. They should never retrieve documents tagged as board-only or HR-confidential — even if those documents contain relevant keywords.
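The role-based scoping described above can be sketched as a filter that runs on retrieval candidates before anything is placed in the model's context. This is a minimal illustration, not a reference implementation: the Chunk type, the allowed_roles field, and the ROLE_CLEARANCE mapping are hypothetical names introduced here, and the candidate list stands in for whatever your vector search returns.

```python
from dataclasses import dataclass

# Classification labels from the text, ordered from least to most sensitive.
LEVELS = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}

# Hypothetical mapping: the highest classification each role may read.
ROLE_CLEARANCE = {"sales": "internal", "hr": "confidential", "board": "restricted"}


@dataclass(frozen=True)
class Chunk:
    text: str
    classification: str        # one of the keys in LEVELS
    allowed_roles: frozenset   # roles this chunk is tagged for (assumed field)


def scope_results(candidates, role):
    """Drop every chunk the role may not see, regardless of relevance."""
    # Unknown roles fall back to public-only access.
    clearance = LEVELS[ROLE_CLEARANCE.get(role, "public")]
    return [
        c for c in candidates
        if LEVELS[c.classification] <= clearance
        and (c.classification == "public" or role in c.allowed_roles)
    ]


candidates = [
    Chunk("Q3 price list", "internal", frozenset({"sales"})),
    Chunk("Board pricing strategy", "restricted", frozenset({"board"})),
    Chunk("Salary bands", "confidential", frozenset({"hr"})),
    Chunk("Product brochure", "public", frozenset()),
]
for chunk in scope_results(candidates, "sales"):
    print(chunk.text)
```

In this sketch the filter runs after the search. Many vector stores accept a metadata filter as part of the query itself, which is usually preferable because disallowed chunks never leave the store.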

PII and Credential Redaction

Before any data reaches the LLM context window, it should pass through a redaction layer. This layer identifies and strips:

Personally identifiable information (names, SSNs, email addresses)
Credentials and API keys
Regulated data (PHI, PCI data, ITAR-controlled technical data)
Internal reference numbers and account identifiers

EchoLeak (CVE-2025-32711), a vulnerability in Microsoft 365 Copilot, demonstrated why this matters: an attacker could send a crafted email that, once retrieved into Copilot's context, caused it to exfiltrate other data the user had access to. A redaction layer would not have prevented the injection itself, but it would have reduced the blast radius by limiting what sensitive data was available to leak.
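A redaction layer of the kind described in this section might start as a set of typed regular expressions applied to every chunk before it enters the context window. The patterns below are illustrative assumptions, and the ACCT- identifier format in particular is invented for the example. Regexes cover only structured values; names and free-text PHI generally need a dedicated entity-recognition step that this sketch does not attempt.

```python
import re

# Illustrative patterns only. Each label becomes part of the placeholder.
PATTERNS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "AWS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),   # AWS access key ID shape
    "ACCOUNT_ID": re.compile(r"\bACCT-\d{6,}\b"),     # hypothetical internal format
}


def redact(text):
    """Replace each match with a typed placeholder and count what was removed."""
    counts = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}_REDACTED]", text)
        if n:
            counts[label] = n
    return text, counts


clean, counts = redact("Reach jane.doe@example.com about ACCT-0012345.")
print(clean)
print(counts)
```

Returning the counts alongside the cleaned text makes it possible to log how much was stripped per request without logging the sensitive values themselves.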

Output Filtering

Redacting inputs is necessary but not sufficient. Outputs also need inspection. An output filter should catch:

Direct regurgitation of sensitive fields (credentials, PII)
Content that reveals confidential business logic
Responses that answer questions the user's role should not be able to ask

This is not about censorship — it is about ensuring the model does not inadvertently leak information that arrived in its context from a different access tier.
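One way to picture an output filter is as a final check that looks at two things: whether the response contains secret-shaped strings, and whether any chunk in the context came from a tier above the user's clearance. The sketch below assumes the same four classification labels used earlier in this article; the function name, the refusal message, and the decision to block the whole response are illustrative choices, not a prescribed design.

```python
import re

LEVELS = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}

# Shapes that should never appear in a response, whatever the user's role.
SENSITIVE_OUTPUT = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),               # SSN-shaped strings
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                # AWS access key ID shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]

REFUSAL = "I can't share that information."


def filter_output(response, context_labels, user_clearance):
    """Return (text, blocked). Block when the answer may carry higher-tier data."""
    limit = LEVELS[user_clearance]
    from_higher_tier = any(LEVELS[label] > limit for label in context_labels)
    has_secret = any(p.search(response) for p in SENSITIVE_OUTPUT)
    if from_higher_tier or has_secret:
        return REFUSAL, True
    return response, False


print(filter_output("Our standard plan is $20 per month.", ["public"], "internal"))
print(filter_output("The key is AKIAABCDEFGHIJKLMNOP", ["public"], "internal"))
```

Blocking the entire response is deliberately conservative. If retrieval scoping is working, the higher-tier check should never fire, so a hit here is also a useful signal that an upstream control has failed.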

Model Selection and Data Residency

Not all LLM APIs are equivalent on data handling. Enterprise offerings from the major platforms (Azure OpenAI Service, Amazon Bedrock, Google Vertex AI) provide contractual data protection, do not use customer inputs for training by default, and offer regional data residency options. For regulated industries such as healthcare, financial services, and legal, this tier is the minimum acceptable baseline.

Free and consumer-grade API tiers are appropriate for development and testing, not for production workloads with real customer or employee data.

Tenant Isolation for Multi-Tenant Deployments

If you are building an AI product that serves multiple customers from a shared infrastructure, tenant isolation is the single most important security control you need to get right before launch.

Each tenant's data should be:

Stored in logically or physically separate namespaces
Retrieved only when the authenticated request belongs to that tenant
Never commingled in shared context windows

The failure mode is not theoretical. RAG implementations that share a single vector namespace across tenants can leak data between tenants under adversarial query conditions, where a user crafts a query specifically designed to retrieve documents that belong to another tenant.
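The isolation rules above can be illustrated with a toy in-memory store in which each tenant has its own namespace and the tenant identifier is taken from the authenticated session, never from the query text. The class and function names are hypothetical, and keyword matching stands in for vector similarity so the sketch has no dependencies.

```python
from collections import defaultdict


class TenantScopedStore:
    """Toy store: one namespace per tenant, searched in isolation."""

    def __init__(self):
        self._namespaces = defaultdict(list)

    def add(self, tenant_id, text):
        self._namespaces[tenant_id].append(text)

    def search(self, tenant_id, query):
        # Only the caller's namespace is scanned; other tenants' documents
        # are unreachable no matter how the query is worded.
        words = query.lower().split()
        return [
            doc for doc in self._namespaces.get(tenant_id, [])
            if any(word in doc.lower() for word in words)
        ]


def handle_request(store, session, query):
    # The tenant comes from the authenticated session (assumed to be set by
    # your auth layer), not from anything the user typed.
    return store.search(session["tenant_id"], query)


store = TenantScopedStore()
store.add("acme", "Acme renewal pricing 2024")
store.add("globex", "Globex renewal pricing 2024")

# An Acme user naming another tenant in the query still sees only Acme data.
print(handle_request(store, {"tenant_id": "acme"}, "globex renewal pricing"))
```

The design point is that the namespace is selected before relevance is computed. A shared index with a tenant filter applied afterwards depends on that filter never being skipped or bypassed.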

Practical Checklist Before You Go Live

Use this as a pre-launch review:

[ ] Every data source connected to your LLM has a classification label
[ ] Retrieval is scoped by user role, not just by keyword relevance
[ ] A PII/credential redaction layer sits between your data store and the model
[ ] Output filtering is in place for sensitive field regurgitation
[ ] You are using an enterprise API tier with contractual data protections
[ ] Audit logging captures every query, retrieved context, and model response
[ ] You have tested for prompt injection against your system prompt
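For the audit logging item, a minimal sketch might write one JSON line per interaction that records the query, the identifiers of the retrieved chunks, and the response. The field names are assumptions made for this example. Note that a log like this holds the same sensitive material as the conversation, so it needs the same access controls and retention policy.

```python
import hashlib
import json
import sys
import time


def audit_record(user_id, role, query, retrieved_ids, response):
    """Build one audit entry covering query, retrieved context, and response."""
    return {
        "ts": time.time(),
        "user_id": user_id,
        "role": role,
        "query": query,
        "retrieved_ids": list(retrieved_ids),  # references to the context chunks
        "response": response,
        # Hash lets you later verify the stored response was not altered.
        "response_sha256": hashlib.sha256(response.encode("utf-8")).hexdigest(),
    }


def write_audit(record, stream):
    """Append the record as a single JSON line to any writable text stream."""
    stream.write(json.dumps(record, sort_keys=True) + "\n")


record = audit_record("u-17", "sales", "What is our list price?",
                      ["doc-7", "doc-12"], "See the current price list.")
write_audit(record, sys.stdout)
```

Logging chunk identifiers rather than full chunk text keeps the log smaller, on the assumption that the chunks can be looked up by identifier later. If chunks are mutable, storing a content hash per chunk would be a reasonable addition.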

LLM integration done well is a genuine competitive advantage. Done poorly, it is a data breach waiting for a news cycle.

Building an LLM integration and want a security architecture review before it goes live? Talk to JP Stratton.

