GreatAIPrompts, the day before yesterday, 20:20
Building a Multi-Tenant AI Platform: Weam's Data Isolation and Security Strategy

 

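This article takes a close look at the data isolation challenges faced when building the multi-tenant AI platform Weam, and at the concrete solutions. It stresses that company-level isolation must be strictly enforced at every layer of the application, from database queries and file storage to API access, real-time connections, and even vector embeddings. The core principle is to treat `companyId` as the primary isolation boundary. The article details how session management, mandatory query-level filtering, metadata filtering in the vector database, a per-company directory structure for file storage, Socket.IO isolation for real-time communication, and internal role-based permissions combine to secure data end to end. It also shares key strategies for testing multi-tenancy and common pitfalls, stresses the importance of monitoring and auditing, and concludes that multi-tenant design should run through the whole of product development.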

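🔑 **Company ID as the core isolation boundary**: Weam associates every data entity (users, conversations, documents, agents, and so on) with a unique `companyId`, which serves as the most fundamental basis for data isolation. Every database model is required to include a `companyId` field, and queries must filter on it, so that data only moves within a single company. Schema design and index optimization back this up.

🛡️ **Multi-layer security mechanisms**: Beyond database-level isolation, Weam strengthens security in the following ways:
1. **Session management**: `iron-session` stores the user's session data, including `companyId`, and every API request must pass session validation.
2. **Mandatory query filtering**: Developers must include a `companyId` filter in every database operation, or have it injected automatically through a Repository wrapper.
3. **Vector database isolation**: Metadata filtering in vector databases such as Pinecone uses `companyId` as a query condition, so vector search results never mix tenants.
4. **File storage and access control**: Files are stored in a per-company directory structure, and `companyId` is verified when file URLs are generated and accessed.
5. **Real-time communication isolation**: Socket.IO rooms ensure that WebSocket messages are broadcast only within the same company.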

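🚀 **Role permissions, testing, and auditing**: On top of company-level isolation, Weam implements role-based permissions for users inside each company (User, Manager, Admin). The platform verifies its multi-tenant isolation through strict cross-company data access tests, missing-filter tests, and simulated session hijacking. Detailed access logging and auditing track all data access, which helps with troubleshooting and compliance requirements.

⚠️ **Avoiding common pitfalls and improving continuously**: The article warns developers against common mistakes such as omitting the `companyId` filter in aggregation pipelines, failing to validate user-supplied IDs, and leaking sensitive information in error messages. Weam treats multi-tenant design as the foundation of the platform and, when building new features, always considers whether they respect company boundaries, to ensure security and isolation.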

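💡 **Multi-tenant design philosophy**: The article stresses that multi-tenant architecture is not something added after the fact; it should be built into the data model and development process from the start of the project. Company-scoped repositories, metadata filtering in the vector store, and session-based access control provide a solid multi-tenant foundation, and these patterns become second nature to developers.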

When you’re building an AI platform that serves multiple companies, you can’t just throw everyone’s data into the same bucket and hope for the best. Company A shouldn’t see Company B’s conversations, documents, or custom agents. Ever.

This sounds obvious, but getting it right is tricky. You need to think about isolation at every layer: database queries, file storage, API access, real-time connections, and even vector embeddings. Miss one spot and you’ve got a security nightmare.

Let me explain how we built Weam, a multi-tenant AI platform, starting from the database and moving up through the application stack.

The Foundation for a Multi-Tenant AI Platform

Here’s the core principle: every piece of data in Weam belongs to a company. Not a user, not a workspace, but a company. This companyId becomes your primary isolation boundary.

When you sign up for Weam, we create a company for you. This identifier is tagged on every user, brain, document, agent, and chat message. It’s not optional, and it’s not nullable.

```javascript
// MongoDB Schema Example
// Assumes Mongoose, which the Schema/ObjectId usage below suggests
const { Schema } = require('mongoose');
const { ObjectId } = Schema.Types;

const ChatMessageSchema = new Schema({
  content: { type: String, required: true },
  messageType: { type: String, enum: ['user', 'assistant', 'system'] },
  companyId: { type: ObjectId, required: true, index: true },
  createdBy: { type: ObjectId, required: true },
  session: { type: ObjectId, required: true },
  // ... other fields
});

// Critical: Index on companyId for query performance
ChatMessageSchema.index({ companyId: 1, session: 1 });
```

Notice the index on companyId. Every query that touches this collection will filter by company, so you want that lookup to be fast.

Session-Based Access Control

We use iron-session for managing user sessions. When you log in, your session stores your companyId along with your user ID and role. This session data becomes the source of truth for every API request.

Note

iron-session is a lightweight session management library for Next.js that stores encrypted session data in cookies, which suits server-side environments that have no external session store.

```javascript
// Session Structure
{
  _id: "user-id-here",
  email: "user@company.com",
  roleCode: "USER",
  companyId: "company-id-here"
}
```

Before any operation happens, we check the session. No session means no access. Wrong companyId means no access. It’s that simple.

Here’s what the middleware looks like:

```javascript
// Middleware for Protected Routes
async function checkAccess(req, res, next) {
  const session = await req.session.get();

  // Reject requests with no session, no user, or no company context
  if (!session || !session._id || !session.companyId) {
    return res.status(401).json({ error: 'Unauthorized' });
  }

  // Attach company context to request
  req.user = {
    userId: session._id,
    companyId: session.companyId,
    role: session.roleCode
  };
  next();
}
```

Every protected endpoint uses this middleware. No exceptions.
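Some routes also carry a company identifier in the URL. As a minimal sketch, assuming a route parameter named `companyId` (a hypothetical name, not something the Weam codebase is known to use), a second guard could compare that parameter against the company attached by `checkAccess`:

```javascript
// Hypothetical guard: rejects requests whose :companyId route parameter
// does not match the company that checkAccess attached to req.user.
function requireCompanyMatch(paramName = 'companyId') {
  return function (req, res, next) {
    const requested = req.params ? req.params[paramName] : undefined;
    const owned = req.user ? req.user.companyId : undefined;

    // Compare as strings so ObjectId-like values and plain strings both work.
    if (!owned || !requested || String(requested) !== String(owned)) {
      // 404 rather than 403, so the response does not confirm the company exists.
      return res.status(404).json({ error: 'Not found' });
    }
    next();
  };
}
```

It would run after `checkAccess` in the middleware chain, since it depends on `req.user` being populated.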

Query-Level Isolation

The session gives you the companyId, but you still need to use it correctly in every query. This is where developers often mess up. They write a query that forgets to filter by company, and suddenly, there’s a data leak.

We enforce this pattern everywhere:

```javascript
// WRONG - Missing company filter
const chats = await ChatMessage.find({ session: sessionId });

// RIGHT - Always filter by company
const chats = await ChatMessage.find({
  companyId: req.user.companyId,
  session: sessionId
});
```

For extra safety, we built repository classes that automatically inject the companyId:

```javascript
class ChatRepository {
  constructor(companyId) {
    this.companyId = companyId;
  }

  async findMessages(sessionId) {
    return await ChatMessage.find({
      companyId: this.companyId,
      session: sessionId
    });
  }

  async createMessage(data) {
    return await ChatMessage.create({
      ...data,
      companyId: this.companyId
    });
  }
}

// Usage in route handler
const chatRepo = new ChatRepository(req.user.companyId);
const messages = await chatRepo.findMessages(sessionId);
```

Note

Repository classes, a common clean-architecture practice, wrap direct database access and enforce consistent data rules such as always including companyId filters.

This pattern makes it harder to accidentally write an unscoped query.
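The same idea can be generalized beyond chat messages. The sketch below is an assumption about how one might do it, not Weam's actual code: `scopedFilter` and `ScopedRepository` are hypothetical names, and `model` stands for anything exposing `find(filter)` and `create(doc)`, such as a Mongoose model.

```javascript
// Hypothetical helper: merges a caller-supplied filter with the tenant scope.
// The spread order matters: companyId is written last, so it always wins
// even if the caller passes its own companyId.
function scopedFilter(companyId, filter = {}) {
  if (!companyId) {
    throw new Error('companyId is required for every query');
  }
  return { ...filter, companyId };
}

// A generic company-scoped repository built on that helper.
class ScopedRepository {
  constructor(model, companyId) {
    this.model = model;
    this.companyId = companyId;
  }

  find(filter) {
    return this.model.find(scopedFilter(this.companyId, filter));
  }

  create(data) {
    return this.model.create(scopedFilter(this.companyId, data));
  }
}
```

Failing loudly when `companyId` is missing is a deliberate choice here: an unscoped query becomes an exception in development instead of a silent leak in production.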

Vector Database Isolation

Documents in Weam get chunked and stored in Pinecone for semantic search. But a shared vector index does not keep tenants apart on its own. You need to handle it yourself, and we do that with metadata filters.

When we store embeddings, we attach the companyId and agentId as metadata:

```javascript
// Storing Vectors with Metadata
await pinecone.upsert({
  vectors: [{
    id: `chunk-${chunkId}`,
    values: embedding,
    metadata: {
      companyId: companyId,
      agentId: agentId,
      fileId: fileId,
      chunkIndex: index,
      content: chunkText
    }
  }]
});
```

When querying, we filter by this metadata:

```javascript
// Querying with Company Isolation
const results = await pinecone.query({
  vector: queryEmbedding,
  topK: 5,
  filter: {
    companyId: { $eq: req.user.companyId },
    agentId: { $eq: agentId }
  }
});
```

Without that filter, you’d get results from other companies. That’s a major security problem.

Note

Vector databases like Pinecone store numerical embeddings of text for semantic search. A shared index does not separate tenants by itself, so we rely on metadata filters to scope queries to each company. Pinecone namespaces are another way to partition an index per tenant.
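Because a forgotten filter is the whole risk here, it may help to build the filter in one place. The following is a minimal sketch under the assumption that `companyId` arrives as a non-empty string (as in the session structure shown earlier); `tenantVectorFilter` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds a metadata filter that is always company-scoped.
// `extra` holds additional equality conditions, for example { agentId }.
function tenantVectorFilter(companyId, extra = {}) {
  if (typeof companyId !== 'string' || companyId.length === 0) {
    throw new Error('Refusing to build a vector filter without a companyId');
  }

  const filter = {};
  for (const [key, value] of Object.entries(extra)) {
    filter[key] = { $eq: value };
  }

  // Written last so a caller-supplied "companyId" key cannot override the scope.
  filter.companyId = { $eq: companyId };
  return filter;
}
```

A query would then pass `filter: tenantVectorFilter(req.user.companyId, { agentId })`, so there is no code path that reaches the vector store with an empty filter.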

File Storage and Access

We use either MinIO or S3 for file storage. Files get organized by company in the bucket structure:

```
bucket-name/
  company-abc123/
    files/
      document1.pdf
      document2.docx
  company-xyz789/
    files/
      report.pdf
```

When generating presigned URLs or serving files, we verify the requesting user's companyId matches the file's company:

```javascript
async function getFileUrl(fileId, req) {
  const file = await File.findOne({
    _id: fileId,
    companyId: req.user.companyId  // Verify ownership
  });

  if (!file) {
    throw new Error('File not found');
  }

  // Generate presigned URL
  return await s3.getSignedUrl('getObject', {
    Bucket: process.env.AWS_S3_BUCKET,
    Key: file.s3Key,
    Expires: 3600
  });
}
```

No company check means no file access.
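The bucket layout can be enforced in code as well. This is a sketch only: `buildObjectKey` and `keyBelongsToCompany` are hypothetical helpers, and the allowed character set for `companyId` is an assumption chosen to match the `company-abc123` style shown in the tree.

```javascript
// Hypothetical helper mirroring the company-<id>/files/<name> layout.
function buildObjectKey(companyId, fileName) {
  if (!/^[A-Za-z0-9_-]+$/.test(String(companyId))) {
    throw new Error('Invalid companyId');
  }

  // Keep only the last path segment so "../" cannot escape the company prefix.
  const baseName = String(fileName).split(/[\\/]/).pop();
  if (!baseName || baseName === '.' || baseName === '..') {
    throw new Error('Invalid file name');
  }

  return `company-${companyId}/files/${baseName}`;
}

// Defense in depth: check a stored key against the requesting company's prefix
// before signing a URL for it.
function keyBelongsToCompany(key, companyId) {
  return String(key).startsWith(`company-${companyId}/files/`);
}
```

The prefix check includes the trailing `/files/` segment on purpose, so that `company-abc123` can never match keys under `company-abc1234`.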

Real-Time Isolation with Socket.IO

Chat responses stream over WebSockets using Socket.IO. When a user connects, we authenticate their socket connection and store the companyId in the socket’s metadata:

```javascript
io.use(async (socket, next) => {
  const session = await getSession(socket.request);

  if (!session || !session.companyId) {
    return next(new Error('Authentication failed'));
  }

  socket.companyId = session.companyId;
  socket.userId = session._id;
  next();
});
```

When emitting events, we can filter by company:

```javascript
// Emit to all sockets in a company
io.to(`company-${companyId}`).emit('notification', data);

// Or just to a specific user in that company
io.to(`user-${userId}`).emit('message', data);
```

This prevents cross-company event leakage in real-time communication.

Note

Each authenticated socket automatically joins a company-{id} room after validation, so messages stay scoped to that company.
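The room join itself is not shown in the snippets above, so here is one way it could look. This is a sketch under assumptions: `joinTenantRooms` is a hypothetical function name, and it relies only on `socket.join(room)` plus the `companyId` and `userId` fields set by the authentication middleware.

```javascript
// Room names follow the company-<id> and user-<id> convention used when emitting.
function joinTenantRooms(socket) {
  if (!socket.companyId || !socket.userId) {
    // The auth middleware should have set both; refuse to join otherwise.
    throw new Error('Socket was not authenticated');
  }
  socket.join(`company-${socket.companyId}`);
  socket.join(`user-${socket.userId}`);
}

// Assumed wiring, with `io` being the Socket.IO server instance:
// io.on('connection', (socket) => joinTenantRooms(socket));
```

Keeping room names derived from server-side session data, and never from anything the client sends, is what makes the rooms a usable isolation boundary.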

Role-Based Access Within Companies

Multi-tenancy handles company isolation, but you also need role-based access control within each company. Weam has three roles: User, Manager, and Admin.

The check-access endpoint validates both company membership and role permissions:

```javascript
async function checkAccess(userId, resourceType, requiredRole) {
  const user = await User.findById(userId);

  if (!user) {
    return { allowed: false, reason: 'User not found' };
  }

  // Check if user has required role
  const hasPermission = hasRequiredPermission(user.role, requiredRole);

  return {
    allowed: hasPermission,
    companyId: user.companyId,
    role: user.role
  };
}
```
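The `hasRequiredPermission` helper is not shown in the article. A minimal sketch could look like the following, assuming the three roles form a simple hierarchy (USER below MANAGER below ADMIN) and use upper-case codes like the `roleCode` in the session; both points are assumptions, not confirmed details of Weam.

```javascript
// Assumed hierarchy: USER < MANAGER < ADMIN. The article does not spell this out.
const ROLE_RANK = { USER: 1, MANAGER: 2, ADMIN: 3 };

function hasRequiredPermission(userRole, requiredRole) {
  const userRank = ROLE_RANK[String(userRole).toUpperCase()];
  const requiredRank = ROLE_RANK[String(requiredRole).toUpperCase()];

  // Unknown roles on either side are denied instead of defaulting to allowed.
  if (userRank === undefined || requiredRank === undefined) {
    return false;
  }
  return userRank >= requiredRank;
}
```

Denying unknown roles by default means a typo in a role name locks a feature instead of opening it.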

Testing Multi-Tenancy

You can’t just assume your isolation works. You need to test it. Here’s what we test:

- Cross-company data access attempts: Try to query data with a different companyId
- Missing company filters: Deliberately remove filters and verify queries fail
- Session hijacking: Attempt to modify session data to access other companies
- Vector search leakage: Query vectors without metadata filters

```javascript
// Example test case
describe('Multi-tenant isolation', () => {
  it('should not return data from other companies', async () => {
    const company1 = await createCompany();
    const company2 = await createCompany();

    const user1 = await createUser({ companyId: company1._id });
    const user2 = await createUser({ companyId: company2._id });

    const chat1 = await createChat({
      companyId: company1._id,
      createdBy: user1._id
    });

    // User2 should not see Company1's chat
    const result = await ChatMessage.find({
      companyId: company2._id  // User2's company
    });

    expect(result).not.toContainEqual(chat1);
  });
});
```

Common Pitfalls

Here are the mistakes we’ve seen (and fixed):

Forgetting to filter aggregation pipelines: Aggregations need the companyId filter in the first stage:

```javascript
// WRONG
const stats = await Message.aggregate([
  { $group: { _id: '$session', count: { $sum: 1 } } }
]);

// RIGHT
const stats = await Message.aggregate([
  { $match: { companyId: new ObjectId(companyId) } },
  { $group: { _id: '$session', count: { $sum: 1 } } }
]);
```
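One way to make the first-stage `$match` hard to forget is to build pipelines through a helper. This is a sketch with a hypothetical name, `scopedPipeline`; the caller is assumed to pass `companyId` already converted to the stored type.

```javascript
// Hypothetical helper: prepends the tenant $match to any aggregation pipeline.
// Mongoose does not cast values inside aggregate(), so pass an ObjectId here
// when the field is stored as one.
function scopedPipeline(companyId, stages = []) {
  if (!companyId) {
    throw new Error('companyId is required for every aggregation');
  }
  return [{ $match: { companyId } }, ...stages];
}
```

A call such as `Message.aggregate(scopedPipeline(companyId, [{ $group: ... }]))` then always starts from the company's own documents. A `$lookup` into another collection would still need its own companyId condition, since the first-stage match only scopes the source collection.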

Using user-provided IDs without validation: Never trust user input for cross-references:

```javascript
// Validate that the resource belongs to the user's company
const brain = await Brain.findOne({
  _id: req.body.brainId,
  companyId: req.user.companyId
});

if (!brain) {
  return res.status(403).json({ error: 'Access denied' });
}
```

Leaking data in error messages: Don’t reveal whether resources exist in other companies:

```javascript
// BAD - Reveals that the resource exists
if (!resource) {
  return res.status(404).json({ error: 'Resource not found' });
}
// Compare as strings: an ObjectId is never === to a session string
if (String(resource.companyId) !== String(req.user.companyId)) {
  return res.status(403).json({ error: 'Access denied' });
}

// GOOD - Same response for both cases
if (!resource || String(resource.companyId) !== String(req.user.companyId)) {
  return res.status(404).json({ error: 'Resource not found' });
}
```

Monitoring and Auditing

We log all access attempts with company context:

```javascript
logger.info('Data access', {
  userId: req.user.userId,
  companyId: req.user.companyId,
  resource: 'chat-messages',
  action: 'read',
  timestamp: new Date()
});
```

This audit trail helps catch isolation bugs in production and provides compliance documentation.
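To keep those log entries uniform across handlers, the call can be wrapped. The sketch below assumes a logger with an `info(message, meta)` method, as in the snippet above; `logAccess` and the `outcome` field are hypothetical additions for illustration.

```javascript
// Hypothetical wrapper: every audit entry carries the same fields in the same shape.
function logAccess(logger, user, resource, action, outcome = 'allowed') {
  const entry = {
    userId: user.userId,
    companyId: user.companyId,
    resource,
    action,
    outcome, // assumed field: lets denied attempts be logged too
    timestamp: new Date().toISOString()
  };
  logger.info('Data access', entry);
  return entry;
}
```

Recording denied attempts alongside successful ones is what lets the trail surface probing for other companies' resources, not only normal reads.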

The Bottom Line

Multi-tenancy isn’t something you bolt on later. It needs to be in your data model from day one. Every query, every file access, every WebSocket message needs company scoping.

The good news is that once you get the patterns right, they become second nature. Company-scoped repositories, metadata filtering in vector stores, and session-based access control give you a strong foundation.

Just remember: every new feature needs to answer the question “how does this respect company boundaries?” If you can’t answer that, you’re not ready to ship it.

Frequently Asked Questions

1. What does “multi-tenancy” mean in an AI platform?

Multi-tenancy is a software architecture where a single application serves multiple customers, called tenants, from a shared infrastructure. Each tenant’s data, users, and configurations are logically separated, even though they’re using the same underlying codebase and servers.

In SaaS or AI platforms like Weam, this architecture allows efficient scaling and centralized updates, but it introduces a major security responsibility: isolation.

2. Why is isolation important in multi-tenant AI systems?

If company data isn’t properly isolated in a multi-tenant system, it can lead to:

- Compliance violations – Breaches of GDPR or SOC 2 requirements.
- Data leaks – One company’s users see another’s data.
- Unauthorized access – Sessions or APIs allow cross-tenant access.
- File exposure – Incorrect file validation reveals private documents.
- Vector leaks – AI embeddings mix data between companies.

3. How do you enforce isolation at the database layer?

Every schema carries a required, indexed companyId, and every query filters on it, either directly or through company-scoped repository classes.

4. How does Weam implement company-level data isolation?

Weam tags every entity (users, agents, chats, documents) with a companyId.

5. How is user access controlled per company?

Through session-based authentication (iron-session), middleware checks, and request validation.

The post Building a Multi Tenant AI Platform: How Weam Handles Isolation and Security appeared first on Weam - AI For Digital Agency.
