USER-LEVEL ACCESS CONTROL ARCHITECTURES FOR SECURE MULTI-TENANT RAG SYSTEMS
Executive Summary
RAG systems moving into production face serious data security issues because vector search ignores document ownership and access control.
The core challenge: ensure User A cannot retrieve User B’s content in semantic search, while still scaling from thousands to millions of users.
The document compares physical isolation (database-per-user) and logical isolation (RLS, payload filters, partitions, tenants) across PostgreSQL/pgvector, Qdrant, Milvus, and Weaviate, and covers orchestration-layer and authorization patterns.
Decisions
Decision: User-level security in RAG must be enforced before retrieval (pre-filtered search); post-filtering is considered unsafe for multi-user/multi-tenant scenarios.
Decision: Architectural choices depend on scale and compliance:
PostgreSQL + pgvector + Row-Level Security for strict compliance and moderate scale.
Qdrant with payload filtering and tiered multi-tenancy for high-performance, large B2C workloads.
Milvus partition keys and Weaviate tenants for massive-scale, multi-tenant environments.
Decision: User identity and authorization must be propagated end-to-end (from API to retriever to vector DB); “global retriever” patterns are rejected as insecure.
Action Items
Owner? — Choose an access-control architecture (physical vs. logical isolation) appropriate to expected user count, compliance needs, and performance targets.
Owner? — If using PostgreSQL/pgvector:
Design tables with explicit owner/user_id columns.
Enable Row-Level Security and define policies tied to session context (e.g., app.current_user_id or auth.uid()).
Owner? — If using Qdrant:
Store user/tenant metadata in payloads.
Create payload indices (e.g., on user_id) and enforce filters in all search queries.
Owner? — If using Milvus:
Define user_id (or similar) as a partition key in the schema.
Ensure queries always include expressions like user_id == '<user>'.
Owner? — If using Weaviate:
Enable multi-tenancy and create tenants per user/tenant.
Ensure all queries specify .with_tenant("<tenant_id>").
Owner? — Update LangChain/LlamaIndex-based services to:
Avoid global, context-free retrievers.
Use runtime-configurable retrievers (e.g., configurable_fields) to inject per-user filters at request time.
Owner? — For complex enterprise permissions, integrate an external authorization system (e.g., OpenFGA, Permit.io) and adopt a scalable pattern (e.g., group-based metadata denormalization).
Open Questions
How many users and documents are expected, and what are the concrete latency/SLA targets? (Determines whether PostgreSQL is sufficient or if Qdrant/Milvus/Weaviate are required.)
What are the regulatory/compliance requirements (e.g., SOC 2, HIPAA, GDPR) that might favor engine-enforced RLS over app-enforced filters?
Will the system primarily need simple ownership checks (user_id == owner) or complex, relationship-based permissions (teams, roles, folders, document labels)?
How will “global” or shared documents (e.g., public knowledge base) be modeled across isolation schemes (especially in database-per-user architectures)?
Main Ideas
1. Security Paradox in Vector Search
Traditional databases: access control is a deterministic pre-filter (ACLs, RBAC, RLS).
Vector search (approximate nearest neighbor indexes such as HNSW) is agnostic to ownership; it only optimizes for semantic similarity.
Naive RAG:
Single shared index.
Queries like “financial projections” or “salary bands” can return other users’ confidential docs.
Leads to data leakage and “context poisoning” in responses.
Multi-tenancy vs. user-level access:
Multi-tenant (B2B): org-level isolation (e.g., Coca-Cola vs. Pepsi); usually fewer tenants, stronger isolation (separate DBs/containers).
User-level (B2C/intra-org): identity-level isolation; many overlapping permissions, high user counts; heavy physical isolation per user does not scale easily.
2. Risk Types
Unauthorized retrieval (data leakage): user gets private documents they shouldn’t see.
Noisy neighbor: heavy use from one user degrades performance for others; considered a security/availability issue.
3. Access Control Models for RAG
RBAC: permissions by role, stored as metadata (e.g., visible_to_roles).
ABAC: permissions based on attributes (department, location, etc.), evaluated at query time.
ReBAC: permissions derived from relationships (owner, manager, group membership); powerful but harder to implement efficiently in vector search.
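The three models can be contrasted as metadata predicates. This is a minimal sketch, not a production authorization engine: `visible_to_roles` comes from the text above, while `required_attributes`, the `viewer`/`owner` relation names, and the tuple layout are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: str
    roles: set
    attributes: dict
    groups: set


def rbac_allows(user, doc_meta):
    """RBAC: the user holds at least one role listed on the document."""
    return bool(user.roles & set(doc_meta.get("visible_to_roles", [])))


def abac_allows(user, doc_meta):
    """ABAC: every attribute the document requires matches the user's value.

    Documents without requirements are denied (fail closed); that default
    is an assumption of this sketch.
    """
    required = doc_meta.get("required_attributes", {})
    return bool(required) and all(
        user.attributes.get(key) == value for key, value in required.items()
    )


def rebac_allows(user, doc_meta, relations):
    """ReBAC: a (subject, relation, doc_id) tuple links the user, or one of
    the user's groups, to the document."""
    subjects = {user.user_id} | user.groups
    doc_id = doc_meta["doc_id"]
    return any(
        (subject, relation, doc_id) in relations
        for subject in subjects
        for relation in ("owner", "viewer")
    )
```

RBAC and ABAC translate directly into metadata filters a vector DB can evaluate. The ReBAC check needs a relationship lookup first, which is why it is harder to push into the search itself.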
4. Pre- vs. Post-Retrieval Filtering
Pre-filtering / native filtered search:
Restricts candidate set before similarity search.
Required for safe user-level isolation.
Post-filtering:
First finds nearest neighbors, then removes disallowed results.
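Deemed unsafe for multi-user scenarios: other users’ documents are fetched into the application before being discarded (leakage risk), and the surviving result set can shrink well below the requested number of results, or to nothing.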
Consensus: Pre-filtered search is mandatory for secure user-level access control.
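The difference can be shown with a toy brute-force search. This sketch uses an exact cosine ranking over an in-memory list in place of a real ANN index; the document IDs, embeddings, and function names are made up for illustration.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, docs, k):
    """Exact nearest neighbors; stands in for the ANN index."""
    return sorted(docs, key=lambda d: cosine(query, d["embedding"]), reverse=True)[:k]


def pre_filtered_search(query, docs, user_id, k):
    # Restrict the candidate set first, then rank only what the user may see.
    allowed = [d for d in docs if d["user_id"] == user_id]
    return top_k(query, allowed, k)


def post_filtered_search(query, docs, user_id, k):
    # Rank everything first: other users' documents are loaded into the
    # application here, and may crowd out the caller's own documents.
    candidates = top_k(query, docs, k)
    return [d for d in candidates if d["user_id"] == user_id]


DOCS = [
    {"id": "b1", "user_id": "bob", "embedding": [1.0, 0.0]},
    {"id": "b2", "user_id": "bob", "embedding": [0.9, 0.1]},
    {"id": "a1", "user_id": "alice", "embedding": [0.6, 0.4]},
    {"id": "a2", "user_id": "alice", "embedding": [0.0, 1.0]},
]

query = [1.0, 0.0]
pre = pre_filtered_search(query, DOCS, "alice", k=2)
post = post_filtered_search(query, DOCS, "alice", k=2)
```

With this data, Bob's two documents are the global top 2, so the post-filtered search returns nothing for Alice, while the pre-filtered search returns her two documents.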
Architectural Patterns
5. Physical Isolation: Database-per-User
Each user gets a separate DB/schema (e.g., Neon serverless Postgres).
Workflow:
On signup: provision new DB/branch.
Store embeddings per user in that DB.
Middleware routes user requests to the correct DB connection.
Pros:
Strong isolation, minimal risk of cross-user leakage.
Noisy neighbor problem is confined to each user’s DB.
Easy deletion (drop user’s DB).
Cons:
Operational complexity at large scale (migrations, backups, connection overhead).
Handling shared/global documents is awkward (duplication or secondary DB).
Best suited for: High-value B2B, strong data sovereignty requirements; less ideal for millions of B2C users.
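The provision/route/drop workflow can be sketched with the standard library. Here each in-memory SQLite database stands in for a per-user Postgres database or branch; the `UserDatabaseRouter` class and the `chunks` table are hypothetical names, and a real deployment would call the provider's provisioning API instead.

```python
import sqlite3


class UserDatabaseRouter:
    """Maps each authenticated user to a dedicated database connection."""

    def __init__(self):
        self._connections = {}

    def provision(self, user_id):
        # On signup: create an isolated database for this user.
        if user_id in self._connections:
            raise ValueError(f"database already provisioned for {user_id!r}")
        conn = sqlite3.connect(":memory:")  # each call yields a separate database
        conn.execute(
            "CREATE TABLE chunks (id INTEGER PRIMARY KEY, content TEXT, embedding TEXT)"
        )
        self._connections[user_id] = conn
        return conn

    def connection_for(self, user_id):
        # Middleware: route the request to the caller's own database only.
        try:
            return self._connections[user_id]
        except KeyError:
            raise PermissionError(f"no database provisioned for {user_id!r}") from None

    def drop(self, user_id):
        # Deletion: removing the user's data means dropping their database.
        self._connections.pop(user_id).close()
```

Because the router is keyed by the authenticated user ID, a request can only reach its own connection. The per-user connection map is also where the operational cost (connection overhead, migrations) appears at scale.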
6. Logical Isolation with Shared Infrastructure
6.1 PostgreSQL + pgvector + Row-Level Security
Use mature Postgres security model with vector search.
Key elements:
Table includes user_id (owner).
Enable RLS and define policies restricting access by session context (e.g., current_setting('app.current_user_id')).
Application sets the session variable based on authenticated user before running vector query.
Benefits:
Engine-enforced; protects even if queries are missing explicit WHERE user_id = ....
Aligns with compliance-heavy environments.
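A minimal sketch of the pattern, assuming a `documents` table with a text `user_id` column and a pgvector `embedding` column (the table, policy, and function names are hypothetical). The connection is assumed to be a DB-API connection such as psycopg's, using `%s` placeholders and connecting as a role that is not a superuser and lacks BYPASSRLS, since such roles skip RLS.

```python
# One-time setup, run by a migration. FORCE makes the policy apply to the
# table owner as well.
RLS_SETUP_SQL = [
    "ALTER TABLE documents ENABLE ROW LEVEL SECURITY",
    "ALTER TABLE documents FORCE ROW LEVEL SECURITY",
    """CREATE POLICY documents_owner_only ON documents
       USING (user_id = current_setting('app.current_user_id'))""",
]

# Note: no explicit user filter; the policy restricts the visible rows.
# `<=>` is pgvector's cosine distance operator.
SEARCH_SQL = """
SELECT id, content
FROM documents
ORDER BY embedding <=> %s::vector
LIMIT %s
"""


def search_as_user(conn, user_id, query_embedding, k=5):
    """Run a vector search inside a transaction scoped to one user."""
    vector_literal = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"
    cur = conn.cursor()
    # Third argument true = setting lasts only for the current transaction,
    # so it cannot leak to another request on a pooled connection.
    cur.execute("SELECT set_config('app.current_user_id', %s, true)", (user_id,))
    cur.execute(SEARCH_SQL, (vector_literal, k))
    rows = cur.fetchall()
    conn.commit()
    return rows
```

If the session variable is never set, `current_setting` raises an error rather than returning all rows, so a forgotten identity fails closed.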
Integrations:
Supabase maps HTTP auth tokens to RLS (auth.uid()).
When permissions change, compute and store access attributes (e.g., allowed groups) in document metadata.
At query time:
Ask FGA which groups the user belongs to.
Filter on small group set (e.g., allowed_groups IN [...]) in vector DB.
Scales better, because filters depend on groups (small), not document counts (large).
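The two phases can be sketched as follows. The `memberships` dictionary stands in for the call to the authorization service, and the filter shape is a made-up, engine-neutral structure rather than any vector DB's real query syntax. Similarity ranking is omitted to keep the focus on the filter.

```python
def denormalize_permissions(doc_id, grants):
    """Ingestion / permission-change time: store the groups allowed to read
    the document directly in its metadata."""
    return {"doc_id": doc_id, "allowed_groups": sorted(grants.get(doc_id, set()))}


def groups_for_user(user_id, memberships):
    """Query time: stand-in for asking the FGA service for the user's groups."""
    return memberships.get(user_id, set())


def build_group_filter(user_groups):
    """Build a filter over the small group set, or None if the user has no groups."""
    if not user_groups:
        return None
    return {"allowed_groups": {"any_of": sorted(user_groups)}}


def search_visible(docs, user_id, memberships):
    """Return only documents sharing at least one group with the user."""
    group_filter = build_group_filter(groups_for_user(user_id, memberships))
    if group_filter is None:
        return []  # fail closed: no groups means no results
    wanted = set(group_filter["allowed_groups"]["any_of"])
    return [d for d in docs if wanted & set(d["allowed_groups"])]


grants = {"d1": {"finance"}, "d2": {"finance", "legal"}, "d3": {"hr"}}
docs = [denormalize_permissions(doc_id, grants) for doc_id in ("d1", "d2", "d3")]
memberships = {"alice": {"finance"}, "carol": {"hr", "legal"}}
```

The filter size tracks the number of groups a user belongs to, not the number of documents those groups can read, which is the scaling property described above.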
Performance Considerations
11. Impact of Filtering on Vector Search
With pre-filtering, latency does not change monotonically with filter selectivity:
High selectivity (tiny subset): often fast; may brute-force small candidate sets.
Medium selectivity: can be slowest; HNSW traversal is disrupted by many filtered-out nodes.
Low selectivity (most of data visible): graph behaves normally; fast.
DB-specific optimizations:
Qdrant: cardinality estimation and switching strategies (payload index scans vs. HNSW).
Elasticsearch: bitset caching for fast repeated filtered queries.
Milvus: partition keys to limit search space.
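The strategy switch can be illustrated with a toy planner. This is a simplified sketch of the general idea (estimate how many points match the filter, then pick a search path), not any engine's actual heuristic; the function name and the `exact_scan_limit` threshold are hypothetical.

```python
def plan_filtered_search(matching_count, total_count, exact_scan_limit=10_000):
    """Pick a search path from the estimated number of points passing the filter.

    - exact_scan: the filter leaves a small subset, so scoring every matching
      vector directly is cheap and gives exact results.
    - filtered_graph: too many matches to scan, so traverse the ANN graph
      while skipping points that fail the filter.
    """
    if total_count <= 0 or matching_count <= 0:
        return {"strategy": "empty", "visible_fraction": 0.0}
    visible_fraction = matching_count / total_count
    if matching_count <= exact_scan_limit:
        strategy = "exact_scan"
    else:
        strategy = "filtered_graph"
    return {"strategy": strategy, "visible_fraction": visible_fraction}


single_user = plan_filtered_search(matching_count=500, total_count=1_000_000)
large_tenant = plan_filtered_search(matching_count=400_000, total_count=1_000_000)
```

In this sketch a single user's 500 documents take the exact-scan path, while a filter matching 400,000 of 1,000,000 points falls into the graph-traversal path, the medium-selectivity regime described above as potentially the slowest.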
Comparative Overview
12. Tool Comparison (User-Level Access Control)
PostgreSQL + pgvector
Isolation: Row-Level Security (logical).
Security: Engine-enforced, high assurance.
Complexity: Low (SQL policies).
Scalability: Moderate to high with proper indexing; less effective for extreme-scale, high-traffic workloads.
Best for: Compliance-heavy systems, B2B/B2C hybrids where SQL familiarity is strong.
Qdrant
Isolation: Payload filtering; tiered multi-tenancy via shards.
Security: High, but app-enforced (filters must always be applied).
Scalability: Very high; good for high-throughput B2C.
Best for: Real-time, large-scale systems needing dynamic filtering and noisy-neighbor mitigation.
Milvus
Isolation: Partition keys; logical/physical via hashed partitions.
Security: High when filters are used consistently.
Scalability: Very high for large user counts.
Best for: Massive-scale multi-tenant knowledge bases.
Weaviate
Isolation: Tenants mapped to physical shards.
Security: High; shard-based separation.
Scalability: High, aided by tenant cold/offload states.
Best for: Enterprise SaaS platforms needing fine-grained multi-tenancy with resource management.
Conclusion
User-level access control in RAG is an architectural concern spanning ingestion, storage, retrieval, and orchestration layers.
Post-filtering is inadequate for secure multi-user setups; pre-filtered, identity-aware retrieval is mandatory.
PostgreSQL RLS is a strong starting point for secure RAG; specialized vector databases become important at larger scales or stricter latency requirements.
Future architectures will likely combine:
An authorization “agent” to determine allowed data scope.
A retrieval “agent” operating over a pre-filtered, identity-constrained vector space.