🏗️ feat: bulkWrite isolation, pre-auth context, strict-mode fixes (#12445)

* fix: wrap seedDatabase() in runAsSystem() for strict tenant mode

seedDatabase() was called without tenant context at startup, causing every Mongoose operation inside it to throw when TENANT_ISOLATION_STRICT=true. Wrapping in runAsSystem() gives it the SYSTEM_TENANT_ID sentinel so the isolation plugin skips filtering, matching the pattern already used for performStartupChecks and updateInterfacePermissions.

* fix: chain tenantContextMiddleware in optionalJwtAuth

optionalJwtAuth populated req.user but never established ALS tenant context, unlike requireJwtAuth which chains tenantContextMiddleware after successful auth. Authenticated users hitting routes with optionalJwtAuth (e.g. /api/banner) had no tenant isolation.

* feat: tenant-safe bulkWrite wrapper and call-site migration

Mongoose's bulkWrite() does not trigger schema-level middleware hooks, so the applyTenantIsolation plugin cannot intercept it. This adds a tenantSafeBulkWrite() utility that injects the current ALS tenant context into every operation's filter/document before delegating to native bulkWrite.

Migrates all 8 runtime bulkWrite call sites:
- agentCategory (seedCategories, ensureDefaultCategories)
- conversation (bulkSaveConvos)
- message (bulkSaveMessages)
- file (batchUpdateFiles)
- conversationTag (updateTagsForConversation, bulkIncrementTagCounts)
- aclEntry (bulkWriteAclEntries)

systemGrant.seedSystemGrants is intentionally not migrated: it uses explicit tenantId: { $exists: false } filters and is exempt from the isolation plugin.

* feat: pre-auth tenant middleware and tenant-scoped config cache

Adds preAuthTenantMiddleware that reads X-Tenant-Id from the request header and wraps downstream in tenantStorage ALS context. Wired onto /oauth, /api/auth, /api/config, and /api/share, the unauthenticated routes that need tenant scoping before JWT auth runs.

The /api/config cache key is now tenant-scoped (STARTUP_CONFIG:${tenantId}) so multi-tenant deployments serve the correct login page config per tenant. The middleware is intentionally minimal: no subdomain parsing, no OIDC claim extraction. The private fork's reverse proxy or auth gateway sets the header.

* feat: accept optional tenantId in updateInterfacePermissions

When tenantId is provided, the function re-enters inside tenantStorage.run({ tenantId }) so all downstream Mongoose queries target that tenant's roles instead of the system context. This lets the private fork's tenant provisioning flow call updateInterfacePermissions per-tenant after creating tenant-scoped ADMIN/USER roles.

* fix: tenant-filter $lookup in getPromptGroup aggregation

The $lookup stage in getPromptGroup() queried the prompts collection without tenant filtering. While the outer PromptGroup aggregate is protected by the tenantIsolation plugin's pre('aggregate') hook, $lookup runs as an internal MongoDB operation that bypasses Mongoose hooks entirely. Converts from simple field-based $lookup to pipeline-based $lookup with an explicit tenantId match when tenant context is active.

* fix: replace field-level unique indexes with tenant-scoped compounds

Field-level unique:true creates a globally-unique single-field index in MongoDB, which would cause insert failures across tenants sharing the same ID values.

- agent.id: removed field-level unique, added { id, tenantId } compound
- convo.conversationId: removed field-level unique (compound at line 50 already exists: { conversationId, user, tenantId })
- message.messageId: removed field-level unique (compound at line 165 already exists: { messageId, user, tenantId })
- preset.presetId: removed field-level unique, added { presetId, tenantId } compound

* fix: scope MODELS_CONFIG, ENDPOINT_CONFIG, PLUGINS, TOOLS caches by tenant

These caches store per-tenant configuration (available models, endpoint settings, plugin availability, tool definitions) but were using global cache keys. In multi-tenant mode, one tenant's cached config would be served to all tenants.

Appends :${tenantId} to cache keys when tenant context is active. Falls back to the unscoped key when no tenant context exists (backward compatible for single-tenant OSS deployments).

Covers all read, write, and delete sites:
- ModelController.js: get/set MODELS_CONFIG
- PluginController.js: get/set PLUGINS, get/set TOOLS
- getEndpointsConfig.js: get/set/delete ENDPOINT_CONFIG
- app.js: delete ENDPOINT_CONFIG (clearEndpointConfigCache)
- mcp.js: delete TOOLS (updateMCPTools, mergeAppTools)
- importers.js: get ENDPOINT_CONFIG

* fix: add getTenantId to PluginController spec mock

The data-schemas mock was missing getTenantId, causing all PluginController tests to throw when the controller calls getTenantId() for tenant-scoped cache keys.

* fix: address review findings (migration, strict-mode, DRY, types)

Addresses all CRITICAL, MAJOR, and MINOR review findings:

F1 (CRITICAL): Add agents, conversations, messages, presets to SUPERSEDED_INDEXES in tenantIndexes.ts so dropSupersededTenantIndexes() drops the old single-field unique indexes that block multi-tenant inserts.
F2 (CRITICAL): Unknown bulkWrite op types now throw in strict mode instead of silently passing through without tenant injection.
F3 (MAJOR): Replace wildcard export with named export for tenantSafeBulkWrite, hiding _resetBulkWriteStrictCache from the public package API.
F5 (MAJOR): Restore AnyBulkWriteOperation<IAclEntry>[] typing on bulkWriteAclEntries; the unparameterized wrapper accepts parameterized ops as a subtype.
F7 (MAJOR): Fix config.js tenant precedence: JWT-derived req.user.tenantId now takes priority over the X-Tenant-Id header for authenticated requests.
F8 (MINOR): Extract scopedCacheKey() helper into tenantContext.ts and replace all 11 inline occurrences across 7 files.
F9 (MINOR): Use simple localField/foreignField $lookup for the non-tenant getPromptGroup path (more efficient index seeks).
F12 (NIT): Remove redundant BulkOp type alias.
F13 (NIT): Remove debug log that leaked raw tenantId.

* fix: add new superseded indexes to tenantIndexes test fixture

The test creates old indexes to verify the migration drops them. Missing fixture entries for agents.id_1, conversations.conversationId_1, messages.messageId_1, and presets.presetId_1 caused the count assertion to fail (expected 22, got 18).

* fix: restore logger.warn for unknown bulk op types in non-strict mode

* fix: block SYSTEM_TENANT_ID sentinel from external header input

CRITICAL: preAuthTenantMiddleware accepted any string as X-Tenant-Id, including '__SYSTEM__'. The tenantIsolation plugin treats SYSTEM_TENANT_ID as an explicit bypass, skipping ALL query filters. A client sending X-Tenant-Id: __SYSTEM__ to pre-auth routes (/api/share, /api/config, /api/auth, /oauth) would execute Mongoose operations without tenant isolation.

Fixes:
- preAuthTenantMiddleware rejects SYSTEM_TENANT_ID in header
- scopedCacheKey returns the base key (not key:__SYSTEM__) in system context, preventing stale cache entries during runAsSystem()
- updateInterfacePermissions guards tenantId against SYSTEM_TENANT_ID
- $lookup pipeline separates $expr join from constant tenantId match for better index utilization
- Regression test for sentinel rejection in preAuthTenant.spec.ts
- Remove redundant getTenantId() call in config.js

* test: add missing deleteMany/replaceOne coverage, fix vacuous ALS assertions

bulkWrite spec:
- deleteMany: verifies tenant-scoped deletion leaves other tenants untouched
- replaceOne: verifies tenantId injected into both filter and replacement
- replaceOne overwrite: verifies a conflicting tenantId in the replacement document is overwritten by the ALS tenant (defense-in-depth)
- empty ops array: verifies graceful handling

preAuthTenant spec:
- All negative-case tests now use the capturedNext pattern to verify getTenantId() inside the middleware's execution context, not the test runner's outer frame (which was always undefined regardless)

* feat: tenant-isolate MESSAGES cache, FLOWS cache, and GenerationJobManager

MESSAGES cache (streamAudio.js):
- Cache key now uses scopedCacheKey(messageId) to prefix with tenantId, preventing cross-tenant message content reads during TTS streaming.

FLOWS cache (FlowStateManager):
- getFlowKey() now generates ${type}:${tenantId}:${flowId} when tenant context is active, isolating OAuth flow state per tenant.

GenerationJobManager:
- tenantId added to SerializableJobData and GenerationJobMetadata
- createJob() captures the current ALS tenant context (excluding SYSTEM_TENANT_ID) and stores it in job metadata
- SSE subscription endpoint validates job.metadata.tenantId matches req.user.tenantId, blocking cross-tenant stream access
- Both InMemoryJobStore and RedisJobStore updated to accept tenantId

* fix: add getTenantId and SYSTEM_TENANT_ID to MCP OAuth test mocks

FlowStateManager.getFlowKey() now calls getTenantId() for tenant-scoped flow keys. The 4 MCP OAuth test files mock @librechat/data-schemas without these exports, causing TypeError at runtime.

* fix: correct import ordering per AGENTS.md conventions

Package imports sorted shortest to longest line length, local imports sorted longest to shortest; fixes ordering violations introduced by our new imports across 8 files.

* fix: deserialize tenantId in RedisJobStore (cross-tenant SSE guard was no-op in Redis mode)

serializeJob() writes tenantId to the Redis hash via Object.entries, but deserializeJob() manually enumerates fields and omitted tenantId. Every getJob() from Redis returned tenantId: undefined, causing the SSE route's cross-tenant guard to short-circuit (undefined && ... → false).

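The RedisJobStore round-trip problem described above can be illustrated with a small sketch. The `JobData` shape and both function bodies are assumptions made for illustration; only the names serializeJob and deserializeJob and the Object.entries / manual-enumeration split come from the message. The real store handles more fields and talks to Redis.

```typescript
// Hypothetical, trimmed-down job shape; the real SerializableJobData has more fields.
interface JobData {
  streamId: string;
  userId: string;
  tenantId?: string;
}

// Flatten the job to string fields, as one would before writing a Redis hash.
// Undefined values are skipped so they are not stored as the string "undefined".
function serializeJob(job: JobData): Record<string, string> {
  const hash: Record<string, string> = {};
  for (const [key, value] of Object.entries(job)) {
    if (value !== undefined) {
      hash[key] = String(value);
    }
  }
  return hash;
}

// Manual enumeration: every field must be listed here, or it is silently lost
// on the way back out of the hash.
function deserializeJob(hash: Record<string, string>): JobData {
  return {
    streamId: hash.streamId,
    userId: hash.userId,
    // The field that the write path stored but the read path used to omit.
    tenantId: hash.tenantId || undefined,
  };
}

const roundTripped = deserializeJob(
  serializeJob({ streamId: 's1', userId: 'u1', tenantId: 'tenant-a' }),
);
const legacy = deserializeJob(serializeJob({ streamId: 's2', userId: 'u2' }));
```

With the tenantId line removed from deserializeJob, `roundTripped.tenantId` would come back undefined and any guard of the form `job.tenantId && ...` would never fire, which is the failure mode this fix addresses.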
* test: SSE tenant guard, FlowStateManager key consistency, ALS scope docs

SSE stream tenant tests (streamTenant.spec.js):
- Cross-tenant user accessing another tenant's stream → 403
- Same-tenant user accessing own stream → allowed
- OSS mode (no tenantId on job) → tenant check skipped

FlowStateManager tenant tests (manager.tenant.spec.ts):
- completeFlow finds flow created under same tenant context
- completeFlow does NOT find flow under different tenant context
- Unscoped flows are separate from tenant-scoped flows

Documentation:
- JSDoc on getFlowKey documenting ALS context consistency requirement
- Comment on streamAudio.js scopedCacheKey capture site

* fix: SSE stream tests hang on success path, remove internal fork references

The success-path tests entered the SSE streaming code which never closes, causing timeout. Mock subscribe() to end the response immediately. Restructured assertions to verify non-403/non-404.

Removed "private fork" and "OSS" references from code and test descriptions, replaced with "deployment layer", "multi-tenant deployments", and "single-tenant mode".

* fix: address review findings (test rigor, tenant ID validation, docs)

F1: SSE stream tests now mock subscribe() with correct signature (streamId, writeEvent, onDone, onError) and assert 200 status, verifying the tenant guard actually allows through same-tenant users.
F2: completeFlow logs the attempted key and ALS tenantId when flow is not found, so reverse proxy misconfiguration (missing X-Tenant-Id on OAuth callback) produces an actionable warning.
F3/F10: preAuthTenantMiddleware validates tenant ID format: rejects colons, special characters, and values exceeding 128 chars. Trims whitespace. Prevents cache key collisions via crafted headers.
F4: Documented cache invalidation scope limitation in clearEndpointConfigCache: only the calling tenant's key is cleared; other tenants expire via TTL.
F7: getFlowKey JSDoc now lists all 8 methods requiring consistent ALS context.
F8: Added dedicated scopedCacheKey unit tests: base key without context, base key in system context, scoped key with tenant, no ALS leakage across scope boundaries.

* fix: revert flow key tenant scoping, fix SSE test timing

FlowStateManager: Reverts tenant-scoped flow keys. OAuth callbacks arrive without tenant ALS context (provider redirects don't carry X-Tenant-Id), so completeFlow/failFlow would never find flows created under tenant context. Flow IDs are random UUIDs with no collision risk, and flow data is ephemeral (TTL-bounded).

SSE tests: Use process.nextTick for onDone callback so Express response headers are flushed before res.write/res.end are called.

* fix: restore getTenantId import for completeFlow diagnostic log

* fix: correct completeFlow warning message, add missing flow test

The warning referenced X-Tenant-Id header consistency which was only relevant when flow keys were tenant-scoped (since reverted). Updated to list actual causes: TTL expiry, missing flow, or routing to a different instance without shared Keyv storage.

Removed the getTenantId() call and import; no longer needed since flow keys are unscoped.

Added test for the !flowState branch in completeFlow; verifies return false and logger.warn on nonexistent flow ID.

* fix: add explicit return type to recursive updateInterfacePermissions

The recursive call (tenantId branch calls itself without tenantId) causes TypeScript to infer circular return type 'any'. Adding explicit Promise<void> satisfies the rollup typescript plugin.

* fix: update MCPOAuthRaceCondition test to match new completeFlow warning

* fix: clearEndpointConfigCache deletes both scoped and unscoped keys

Unauthenticated /api/endpoints requests populate the unscoped ENDPOINT_CONFIG key. Admin config mutations clear only the tenant-scoped key, leaving the unscoped entry stale indefinitely. Now deletes both when in tenant context.

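The cache-key scoping and the "delete both keys" behaviour described in the entries above can be sketched as follows. This is a minimal illustration under stated assumptions: `tenantStorage`, `getTenantId`, and `SYSTEM_TENANT_ID` are local stand-ins for the @librechat/data-schemas exports, `keysToClear` is a hypothetical helper, and the literal 'ENDPOINT_CONFIG' is a placeholder for whatever the real cache key constant resolves to.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Local stand-ins for the package exports; names mirror the commit message,
// bodies are assumptions.
const SYSTEM_TENANT_ID = '__SYSTEM__';
const tenantStorage = new AsyncLocalStorage<{ tenantId?: string }>();

function getTenantId(): string | undefined {
  return tenantStorage.getStore()?.tenantId;
}

// Returns the base key when there is no tenant context or when running in
// system context; otherwise appends :tenantId.
function scopedCacheKey(baseKey: string): string {
  const tenantId = getTenantId();
  if (!tenantId || tenantId === SYSTEM_TENANT_ID) {
    return baseKey;
  }
  return `${baseKey}:${tenantId}`;
}

// Hypothetical helper: which keys an invalidation should delete. In tenant
// context that is the scoped key plus the unscoped key populated by
// unauthenticated requests.
function keysToClear(baseKey: string): string[] {
  const scoped = scopedCacheKey(baseKey);
  return scoped === baseKey ? [baseKey] : [scoped, baseKey];
}

const outside = scopedCacheKey('ENDPOINT_CONFIG');
const inTenant = tenantStorage.run({ tenantId: 'acme-corp' }, () =>
  scopedCacheKey('ENDPOINT_CONFIG'),
);
const inSystem = tenantStorage.run({ tenantId: SYSTEM_TENANT_ID }, () =>
  scopedCacheKey('ENDPOINT_CONFIG'),
);
const cleared = tenantStorage.run({ tenantId: 'acme-corp' }, () =>
  keysToClear('ENDPOINT_CONFIG'),
);
```

Because the tenant ID is read from AsyncLocalStorage at call time, the key has to be computed inside the request's context; computing it once at module load would always yield the unscoped key.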
* fix: tenant guard on abort/status endpoints, warn logs, test coverage

F1: Add tenant guard to /chat/status/:conversationId and /chat/abort matching the existing guard on /chat/stream/:streamId. The status endpoint exposes aggregatedContent (AI response text) which requires tenant-level access control.
F2: preAuthTenantMiddleware now logs warn for rejected __SYSTEM__ sentinel and malformed tenant IDs, providing observability for bypass probing attempts.
F3: Abort fallback path (getActiveJobIdsForUser) now has tenant check after resolving the job.
F4: Test for strict mode + SYSTEM_TENANT_ID: verifies runAsSystem bypasses tenantSafeBulkWrite without throwing in strict mode.
F5: Test for job with tenantId + user without tenantId → 403.
F10: Regex uses idiomatic hyphen-at-start form.
F11: Test descriptions changed from "rejects" to "ignores" since middleware calls next() (not 4xx).

Also fixes MCPOAuthRaceCondition test assertion to match updated completeFlow warning message.

* fix: test coverage for logger.warn, status/abort guards, consistency

A: preAuthTenant spec now mocks logger and asserts warn calls for __SYSTEM__ sentinel, malformed characters, and oversized headers.
B: streamTenant spec expanded with status and abort endpoint tests: cross-tenant status returns 403, same-tenant returns 200 with body, cross-tenant abort returns 403.
C: Abort endpoint uses req.user.tenantId (not req.user?.tenantId) matching stream/status pattern; requireJwtAuth guarantees req.user.
D: Malformed header warning now includes ip in log metadata, matching the sentinel warning for consistent SOC correlation.

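The tenant guard behaviour that these tests pin down (cross-tenant → 403, same tenant → allowed, job without tenantId → check skipped, job with tenantId but user without → 403) can be expressed as a single predicate. The helper name hasTenantMismatch is borrowed from a later entry in this message; the signature and body below are assumptions, not the committed implementation.

```typescript
// Assumed signature: the job's stored tenantId and the authenticated user's tenantId.
function hasTenantMismatch(jobTenantId?: string, userTenantId?: string): boolean {
  // Untenanted legacy jobs skip the tenant check; access is then decided by
  // the separate userId check.
  if (!jobTenantId) {
    return false;
  }
  // A tenanted job is only visible to a user carrying the same tenantId.
  // A user without any tenantId therefore also counts as a mismatch.
  return jobTenantId !== userTenantId;
}

const crossTenant = hasTenantMismatch('tenant-a', 'tenant-b');
const sameTenant = hasTenantMismatch('tenant-a', 'tenant-a');
const legacyJob = hasTenantMismatch(undefined, 'tenant-a');
const userWithoutTenant = hasTenantMismatch('tenant-a', undefined);
```

A route handler would respond with 403 when the predicate returns true and continue otherwise.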
* fix: assert ip field in malformed header warn tests

* fix: parallelize cache deletes, document tenant guard, fix import order

- clearEndpointConfigCache uses Promise.all for independent cache deletes instead of sequential awaits
- SSE stream tenant guard has inline comment explaining backward-compat behavior for untenanted legacy jobs
- conversation.ts local imports reordered longest-to-shortest per AGENTS.md

* fix: tenant-qualify userJobs keys, document tenant guard backward-compat

Job store userJobs keys now include tenantId when available:
- Redis: stream:user:{tenantId:userId}:jobs (falls back to stream:user:{userId}:jobs when no tenant)
- InMemory: composite key tenantId:userId in userJobMap

getActiveJobIdsByUser/getActiveJobIdsForUser accept optional tenantId parameter, threaded through from req.user.tenantId at all call sites (/chat/active and /chat/abort fallback).

Added inline comments on all three SSE tenant guards explaining the backward-compat design: untenanted legacy jobs remain accessible when the userId check passes.

* fix: parallelize cache deletes, document tenant guard, fix import order

Fix InMemoryJobStore.getActiveJobIdsByUser empty-set cleanup to use the tenant-qualified userKey instead of bare userId; prevents orphaned empty Sets accumulating in userJobMap for multi-tenant users.

Document cross-tenant staleness in clearEndpointConfigCache JSDoc: other tenants' scoped keys expire via TTL, not active invalidation.

* fix: cleanup userJobMap leak, startup warning, DRY tenant guard, docs

F1: InMemoryJobStore.cleanup() now removes entries from userJobMap before calling deleteJob, preventing orphaned empty Sets from accumulating with tenant-qualified composite keys.
F2: Startup warning when TENANT_ISOLATION_STRICT is active; reminds operators to configure reverse proxy to control X-Tenant-Id header.
F3: mergeAppTools JSDoc documents that tenant-scoped TOOLS keys are not actively invalidated (matching clearEndpointConfigCache pattern).
F5: Abort handler getActiveJobIdsForUser call uses req.user.tenantId (not req.user?.tenantId), consistent with stream/status handlers.
F6: updateInterfacePermissions JSDoc clarifies SYSTEM_TENANT_ID behavior: falls through to caller's ALS context.
F7: Extracted hasTenantMismatch() helper, replacing three identical inline tenant guard blocks across stream/status/abort endpoints.
F9: scopedCacheKey JSDoc documents both passthrough cases (no context and SYSTEM_TENANT_ID context).

* fix: clean userJobMap in evictOldest (same leak as cleanup())
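The userJobMap leak fixed in the last few entries comes down to adding and removing under the same tenant-qualified key. A minimal sketch, assuming a hypothetical `userJobKey` builder and a hypothetical `UserJobIndex` class standing in for the bookkeeping inside InMemoryJobStore; only the tenantId:userId key format and the userJobMap name come from the message.

```typescript
// Hypothetical key builder. The tenantId:userId format follows the InMemory
// description in this commit message; the function name is an assumption.
function userJobKey(userId: string, tenantId?: string): string {
  return tenantId ? `${tenantId}:${userId}` : userId;
}

// Hypothetical stand-in for the userJobMap bookkeeping inside a job store.
class UserJobIndex {
  private readonly userJobMap = new Map<string, Set<string>>();

  add(streamId: string, userId: string, tenantId?: string): void {
    const key = userJobKey(userId, tenantId);
    const jobs = this.userJobMap.get(key) ?? new Set<string>();
    jobs.add(streamId);
    this.userJobMap.set(key, jobs);
  }

  remove(streamId: string, userId: string, tenantId?: string): void {
    const key = userJobKey(userId, tenantId);
    const jobs = this.userJobMap.get(key);
    if (!jobs) {
      return;
    }
    jobs.delete(streamId);
    // Clean up under the same qualified key that add() used. Deleting by bare
    // userId would miss the entry and leave an empty Set behind.
    if (jobs.size === 0) {
      this.userJobMap.delete(key);
    }
  }

  activeIds(userId: string, tenantId?: string): string[] {
    return Array.from(this.userJobMap.get(userJobKey(userId, tenantId)) ?? []);
  }

  get entryCount(): number {
    return this.userJobMap.size;
  }
}

const index = new UserJobIndex();
index.add('stream-1', 'user-1', 'tenant-a');
const scopedIds = index.activeIds('user-1', 'tenant-a');
const unscopedIds = index.activeIds('user-1');
index.remove('stream-1', 'user-1', 'tenant-a');
const entriesAfterRemove = index.entryCount;
```

Routing every map access through one key builder is what keeps the eviction and cleanup paths from drifting apart.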
2026-04-03 14:27:20 +02:00 · 2026-03-28 16:43:50 -04:00 · 2026-03-28 16:43:50 -04:00 · 877c2efc85
commit 877c2efc85
parent 935288f841
47 changed files with 1224 additions and 83 deletions
--- a/packages/api/src/app/permissions.ts
+++ b/packages/api/src/app/permissions.ts
@ -1,4 +1,4 @@
-import { logger } from '@librechat/data-schemas';
+import { logger, tenantStorage, SYSTEM_TENANT_ID } from '@librechat/data-schemas';
 import {
  SystemRoles,
  Permissions,
@ -54,6 +54,7 @@ export async function updateInterfacePermissions({
  appConfig,
  getRoleByName,
  updateAccessPermissions,
+  tenantId,
 }: {
  appConfig: AppConfig;
  getRoleByName: (roleName: string, fieldsToSelect?: string | string[]) => Promise<IRole | null>;
@ -63,7 +64,19 @@ export async function updateInterfacePermissions({

    roleData?: IRole | null,
  ) => Promise<void>;
-}) {
+  /**
+   * Optional tenant ID for scoping role updates to a specific tenant.
+   * When provided (and not SYSTEM_TENANT_ID), runs inside `tenantStorage.run({ tenantId })`.
+   * When omitted or SYSTEM_TENANT_ID, uses the caller's existing ALS context.
+   */
+  tenantId?: string;
+}): Promise<void> {
+  if (tenantId && tenantId !== SYSTEM_TENANT_ID) {
+    return tenantStorage.run({ tenantId }, async () =>
+      updateInterfacePermissions({ appConfig, getRoleByName, updateAccessPermissions }),
+    );
+  }
+
  const loadedInterface = appConfig?.interfaceConfig;
  if (!loadedInterface) {
    return;
--- a/packages/api/src/flow/manager.tenant.spec.ts
+++ b/packages/api/src/flow/manager.tenant.spec.ts
@ -0,0 +1,49 @@
+import { Keyv } from 'keyv';
+import { logger, tenantStorage } from '@librechat/data-schemas';
+import { FlowStateManager } from './manager';
+
+jest.mock('@librechat/data-schemas', () => ({
+  ...jest.requireActual('@librechat/data-schemas'),
+  logger: {
+    info: jest.fn(),
+    warn: jest.fn(),
+    error: jest.fn(),
+    debug: jest.fn(),
+  },
+}));
+
+describe('FlowStateManager flow keys are not tenant-scoped', () => {
+  let manager: FlowStateManager;
+
+  beforeEach(() => {
+    jest.clearAllMocks();
+    const store = new Keyv({ store: new Map() });
+    manager = new FlowStateManager(store, { ci: true, ttl: 60_000 });
+  });
+
+  it('completeFlow finds a flow regardless of tenant context (OAuth callback compatibility)', async () => {
+    await tenantStorage.run({ tenantId: 'tenant-a' }, async () => {
+      await manager.initFlow('flow-1', 'oauth', {});
+    });
+
+    const found = await manager.completeFlow('flow-1', 'oauth', { token: 'abc' });
+    expect(found).toBe(true);
+  });
+
+  it('completeFlow works when both creation and completion have the same tenant', async () => {
+    await tenantStorage.run({ tenantId: 'tenant-a' }, async () => {
+      await manager.initFlow('flow-2', 'oauth', {});
+      const found = await manager.completeFlow('flow-2', 'oauth', { token: 'abc' });
+      expect(found).toBe(true);
+    });
+  });
+
+  it('completeFlow returns false and logs when flow does not exist', async () => {
+    const found = await manager.completeFlow('ghost-flow', 'oauth', { token: 'x' });
+    expect(found).toBe(false);
+    expect(logger.warn).toHaveBeenCalledWith(
+      expect.stringContaining('ghost-flow'),
+      expect.objectContaining({ flowId: 'ghost-flow', type: 'oauth' }),
+    );
+  });
+});
--- a/packages/api/src/flow/manager.ts
+++ b/packages/api/src/flow/manager.ts
@ -53,6 +53,12 @@ export class FlowStateManager<T = unknown> {
    process.on('SIGHUP', cleanup);
  }

+  /**
+   * Flow keys are intentionally NOT tenant-scoped. OAuth callbacks arrive
+   * without tenant ALS context (the provider redirect doesn't carry
+   * X-Tenant-Id). Flow IDs are random UUIDs with no collision risk, and
+   * flow data is ephemeral (TTL-bounded, no sensitive user content).
+   */
  private getFlowKey(flowId: string, type: string): string {
    return `${type}:${flowId}`;
  }
@ -253,7 +259,9 @@ export class FlowStateManager<T = unknown> {

    if (!flowState) {
      logger.warn(
-        '[FlowStateManager] Flow state not found during completion — cannot recover metadata, skipping',
+        `[FlowStateManager] completeFlow: flow not found — key=${flowKey}. ` +
+          'Possible causes: flow TTL expired before callback arrived, flow was never created, or ' +
+          'the callback is routing to a different instance without shared Keyv storage.',
        { flowId, type },
      );
      return false;
--- a/packages/api/src/mcp/tests/MCPOAuthCSRFFallback.test.ts
+++ b/packages/api/src/mcp/tests/MCPOAuthCSRFFallback.test.ts
@ -34,6 +34,8 @@ jest.mock('@librechat/data-schemas', () => ({
    error: jest.fn(),
    debug: jest.fn(),
  },
+  getTenantId: jest.fn(),
+  SYSTEM_TENANT_ID: '__SYSTEM__',
  encryptV2: jest.fn(async (val: string) => `enc:${val}`),
  decryptV2: jest.fn(async (val: string) => val.replace(/^enc:/, '')),
 }));
--- a/packages/api/src/mcp/tests/MCPOAuthFlow.test.ts
+++ b/packages/api/src/mcp/tests/MCPOAuthFlow.test.ts
@ -20,6 +20,8 @@ jest.mock('@librechat/data-schemas', () => ({
    error: jest.fn(),
    debug: jest.fn(),
  },
+  getTenantId: jest.fn(),
+  SYSTEM_TENANT_ID: '__SYSTEM__',
  encryptV2: jest.fn(async (val: string) => `enc:${val}`),
  decryptV2: jest.fn(async (val: string) => val.replace(/^enc:/, '')),
 }));
--- a/packages/api/src/mcp/tests/MCPOAuthRaceCondition.test.ts
+++ b/packages/api/src/mcp/tests/MCPOAuthRaceCondition.test.ts
@ -23,6 +23,8 @@ jest.mock('@librechat/data-schemas', () => ({
    error: jest.fn(),
    debug: jest.fn(),
  },
+  getTenantId: jest.fn(),
+  SYSTEM_TENANT_ID: '__SYSTEM__',
  encryptV2: jest.fn(async (val: string) => `enc:${val}`),
  decryptV2: jest.fn(async (val: string) => val.replace(/^enc:/, '')),
 }));
@ -258,7 +260,7 @@ describe('MCP OAuth Race Condition Fixes', () => {
      expect(stateAfterComplete).toBeUndefined();

      expect(mockLogger.warn).toHaveBeenCalledWith(
-        expect.stringContaining('cannot recover metadata'),
+        expect.stringContaining('flow not found'),
        expect.any(Object),
      );
    });
--- a/packages/api/src/mcp/tests/MCPOAuthTokenExpiry.test.ts
+++ b/packages/api/src/mcp/tests/MCPOAuthTokenExpiry.test.ts
@ -26,6 +26,8 @@ jest.mock('@librechat/data-schemas', () => ({
    error: jest.fn(),
    debug: jest.fn(),
  },
+  getTenantId: jest.fn(),
+  SYSTEM_TENANT_ID: '__SYSTEM__',
  encryptV2: jest.fn(async (val: string) => `enc:${val}`),
  decryptV2: jest.fn(async (val: string) => val.replace(/^enc:/, '')),
 }));
--- a/packages/api/src/middleware/index.ts
+++ b/packages/api/src/middleware/index.ts
@ -6,5 +6,6 @@ export * from './balance';
 export * from './json';
 export * from './capabilities';
 export { tenantContextMiddleware } from './tenant';
+export { preAuthTenantMiddleware } from './preAuthTenant';
 export * from './concurrency';
 export * from './checkBalance';
--- a/packages/api/src/middleware/preAuthTenant.spec.ts
+++ b/packages/api/src/middleware/preAuthTenant.spec.ts
@ -0,0 +1,129 @@
+import { getTenantId, logger } from '@librechat/data-schemas';
+import { preAuthTenantMiddleware } from './preAuthTenant';
+import type { Request, Response, NextFunction } from 'express';
+
+jest.mock('@librechat/data-schemas', () => ({
+  ...jest.requireActual('@librechat/data-schemas'),
+  logger: {
+    warn: jest.fn(),
+    error: jest.fn(),
+    info: jest.fn(),
+    debug: jest.fn(),
+  },
+}));
+
+describe('preAuthTenantMiddleware', () => {
+  let req: Partial<Request>;
+  let res: Partial<Response>;
+
+  beforeEach(() => {
+    jest.clearAllMocks();
+    req = { headers: {} };
+    res = {};
+  });
+
+  it('calls next() without ALS context when no X-Tenant-Id header is present', () => {
+    let capturedTenantId: string | undefined = 'sentinel';
+    const capturedNext: NextFunction = () => {
+      capturedTenantId = getTenantId();
+    };
+
+    preAuthTenantMiddleware(req as Request, res as Response, capturedNext);
+    expect(capturedTenantId).toBeUndefined();
+  });
+
+  it('calls next() without ALS context when X-Tenant-Id header is empty', () => {
+    req.headers = { 'x-tenant-id': '' };
+    let capturedTenantId: string | undefined = 'sentinel';
+    const capturedNext: NextFunction = () => {
+      capturedTenantId = getTenantId();
+    };
+
+    preAuthTenantMiddleware(req as Request, res as Response, capturedNext);
+    expect(capturedTenantId).toBeUndefined();
+  });
+
+  it('wraps downstream in ALS context when X-Tenant-Id header is present', () => {
+    req.headers = { 'x-tenant-id': 'acme-corp' };
+    let capturedTenantId: string | undefined;
+    const capturedNext: NextFunction = () => {
+      capturedTenantId = getTenantId();
+    };
+
+    preAuthTenantMiddleware(req as Request, res as Response, capturedNext);
+    expect(capturedTenantId).toBe('acme-corp');
+  });
+
+  it('ignores __SYSTEM__ sentinel and logs warning', () => {
+    req.headers = { 'x-tenant-id': '__SYSTEM__' };
+    req.ip = '10.0.0.1';
+    req.path = '/api/config';
+    let capturedTenantId: string | undefined = 'should-be-overwritten';
+    const capturedNext: NextFunction = () => {
+      capturedTenantId = getTenantId();
+    };
+
+    preAuthTenantMiddleware(req as Request, res as Response, capturedNext);
+    expect(capturedTenantId).toBeUndefined();
+    expect(logger.warn).toHaveBeenCalledWith(
+      expect.stringContaining('__SYSTEM__'),
+      expect.objectContaining({ ip: '10.0.0.1', path: '/api/config' }),
+    );
+  });
+
+  it('ignores array-valued headers (Express can produce these)', () => {
+    req.headers = { 'x-tenant-id': ['a', 'b'] as unknown as string };
+    let capturedTenantId: string | undefined = 'sentinel';
+    const capturedNext: NextFunction = () => {
+      capturedTenantId = getTenantId();
+    };
+
+    preAuthTenantMiddleware(req as Request, res as Response, capturedNext);
+    expect(capturedTenantId).toBeUndefined();
+  });
+
+  it('ignores tenant IDs containing invalid characters and logs warning', () => {
+    req.headers = { 'x-tenant-id': 'tenant:injected' };
+    req.ip = '192.168.1.1';
+    req.path = '/api/auth/login';
+    let capturedTenantId: string | undefined = 'sentinel';
+    const capturedNext: NextFunction = () => {
+      capturedTenantId = getTenantId();
+    };
+
+    preAuthTenantMiddleware(req as Request, res as Response, capturedNext);
+    expect(capturedTenantId).toBeUndefined();
+    expect(logger.warn).toHaveBeenCalledWith(
+      expect.stringContaining('malformed'),
+      expect.objectContaining({ ip: '192.168.1.1', path: '/api/auth/login' }),
+    );
+  });
+
+  it('trims whitespace from tenant ID header', () => {
+    req.headers = { 'x-tenant-id': '  acme-corp  ' };
+    let capturedTenantId: string | undefined;
+    const capturedNext: NextFunction = () => {
+      capturedTenantId = getTenantId();
+    };
+
+    preAuthTenantMiddleware(req as Request, res as Response, capturedNext);
+    expect(capturedTenantId).toBe('acme-corp');
+  });
+
+  it('ignores tenant IDs exceeding max length and logs warning', () => {
+    req.headers = { 'x-tenant-id': 'a'.repeat(200) };
+    req.ip = '192.168.1.1';
+    req.path = '/api/share/abc';
+    let capturedTenantId: string | undefined = 'sentinel';
+    const capturedNext: NextFunction = () => {
+      capturedTenantId = getTenantId();
+    };
+
+    preAuthTenantMiddleware(req as Request, res as Response, capturedNext);
+    expect(capturedTenantId).toBeUndefined();
+    expect(logger.warn).toHaveBeenCalledWith(
+      expect.stringContaining('malformed'),
+      expect.objectContaining({ ip: '192.168.1.1', length: 200, path: '/api/share/abc' }),
+    );
+  });
+});
--- a/packages/api/src/middleware/preAuthTenant.ts
+++ b/packages/api/src/middleware/preAuthTenant.ts
@ -0,0 +1,72 @@
+import { tenantStorage, logger, SYSTEM_TENANT_ID } from '@librechat/data-schemas';
+import type { Request, Response, NextFunction } from 'express';
+
+/**
+ * Pre-authentication tenant context middleware for unauthenticated routes.
+ *
+ * Reads the tenant identifier from the `X-Tenant-Id` request header and wraps
+ * downstream handlers in `tenantStorage.run()` so that Mongoose queries and
+ * config resolution run within the correct tenant scope.
+ *
+ * **Where to use**: Mount on routes that must be tenant-aware before
+ * authentication has occurred:
+ * - `GET /api/config` — login page needs tenant-specific config (social logins, registration)
+ * - `/api/auth/*` — login, register, password reset
+ * - `/oauth/*` — OAuth callback flows
+ * - `GET /api/share/:shareId` — public shared conversation links
+ *
+ * **How the header gets set**: The deployment's reverse proxy, auth gateway,
+ * or OpenID strategy sets `X-Tenant-Id` based on subdomain, path, or OIDC claim.
+ * This middleware does NOT resolve tenants from subdomains or tokens — that is
+ * the responsibility of the deployment layer.
+ *
+ * **Design**: Intentionally minimal. No subdomain parsing, no OIDC claim
+ * extraction, no YAML-driven strategy. Multi-tenant deployments can:
+ * 1. Set the header in the reverse proxy / ingress (simplest),
+ * 2. Replace this middleware's resolver logic entirely, or
+ * 3. Layer additional resolution on top (e.g., OpenID `tenant` claim → header).
+ *
+ * If no header is present, downstream runs without tenant ALS context (same as
+ * single-tenant mode). This preserves backward compatibility.
+ */
+const MAX_TENANT_ID_LENGTH = 128;
+const VALID_TENANT_ID = /^[-a-zA-Z0-9_.]+$/;
+
+export function preAuthTenantMiddleware(req: Request, res: Response, next: NextFunction): void {
+  const raw = req.headers['x-tenant-id'];
+
+  if (!raw || typeof raw !== 'string') {
+    next();
+    return;
+  }
+
+  const tenantId = raw.trim();
+
+  if (!tenantId) {
+    next();
+    return;
+  }
+
+  if (tenantId === SYSTEM_TENANT_ID) {
+    logger.warn('[preAuthTenant] Rejected __SYSTEM__ sentinel in X-Tenant-Id header', {
+      ip: req.ip,
+      path: req.path,
+    });
+    next();
+    return;
+  }
+
+  if (tenantId.length > MAX_TENANT_ID_LENGTH || !VALID_TENANT_ID.test(tenantId)) {
+    logger.warn('[preAuthTenant] Rejected malformed X-Tenant-Id header', {
+      ip: req.ip,
+      length: tenantId.length,
+      path: req.path,
+    });
+    next();
+    return;
+  }
+
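+  // Synchronous callback: a throw from next() propagates to the caller instead of
+  // becoming an unhandled promise rejection.
+  tenantStorage.run({ tenantId }, () => {
+    next();
+  });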
+}
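The validation order in `preAuthTenantMiddleware` can be read as a pure function plus a thin ALS wrapper. The following is a minimal sketch, not the shipped code: `resolveTenantHeader` and `runWithTenant` are hypothetical names, the local `tenantStorage` is a stand-in for the real export, and the `'__SYSTEM__'` value is assumed from the log message above.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Stand-ins for the real exports; names and values here are assumptions.
const SYSTEM_TENANT_ID = '__SYSTEM__';
const MAX_TENANT_ID_LENGTH = 128;
const VALID_TENANT_ID = /^[-a-zA-Z0-9_.]+$/;
const tenantStorage = new AsyncLocalStorage<{ tenantId: string }>();

/** Returns a usable tenant ID, or undefined when the header should be ignored. */
function resolveTenantHeader(raw: string | string[] | undefined): string | undefined {
  // Repeated headers arrive as arrays; treat them like a missing header.
  if (typeof raw !== 'string') return undefined;
  const tenantId = raw.trim();
  // Empty values and the system sentinel never establish a tenant context.
  if (!tenantId || tenantId === SYSTEM_TENANT_ID) return undefined;
  if (tenantId.length > MAX_TENANT_ID_LENGTH || !VALID_TENANT_ID.test(tenantId)) return undefined;
  return tenantId;
}

/** Runs `next` inside the tenant context when the header is valid, otherwise without one. */
function runWithTenant<T>(raw: string | string[] | undefined, next: () => T): T {
  const tenantId = resolveTenantHeader(raw);
  return tenantId ? tenantStorage.run({ tenantId }, next) : next();
}

// Usage: the callback observes the trimmed tenant ID through the ALS store.
const seen = runWithTenant(' acme.prod ', () => tenantStorage.getStore()?.tenantId);
```

Keeping the resolver separate from the ALS call would make the rejection rules unit-testable without an Express request, which is one possible reason to structure it this way.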
--- a/packages/api/src/stream/GenerationJobManager.ts
+++ b/packages/api/src/stream/GenerationJobManager.ts
@@ -1,4 +1,4 @@
-import { logger } from '@librechat/data-schemas';
+import { logger, getTenantId, SYSTEM_TENANT_ID } from '@librechat/data-schemas';
 import type { StandardGraph } from '@librechat/agents';
 import { parseTextParts } from 'librechat-data-provider';
 import type { Agents, TMessageContentParts } from 'librechat-data-provider';
@@ -197,7 +197,9 @@ class GenerationJobManagerClass {
    userId: string,
    conversationId?: string,
  ): Promise<t.GenerationJob> {
-    const jobData = await this.jobStore.createJob(streamId, userId, conversationId);
+    const tenantId = getTenantId();
+    const safeTenantId = tenantId && tenantId !== SYSTEM_TENANT_ID ? tenantId : undefined;
+    const jobData = await this.jobStore.createJob(streamId, userId, conversationId, safeTenantId);

    /**
     * Create runtime state with readyPromise.
@@ -355,6 +357,7 @@ class GenerationJobManagerClass {
      error: jobData.error,
      metadata: {
        userId: jobData.userId,
+        tenantId: jobData.tenantId,
        conversationId: jobData.conversationId,
        userMessage: jobData.userMessage,
        responseMessageId: jobData.responseMessageId,
@@ -1255,8 +1258,8 @@ class GenerationJobManagerClass {
   * @param userId - The user ID to query
   * @returns Array of conversation IDs with active jobs
   */
-  async getActiveJobIdsForUser(userId: string): Promise<string[]> {
-    return this.jobStore.getActiveJobIdsByUser(userId);
+  async getActiveJobIdsForUser(userId: string, tenantId?: string): Promise<string[]> {
+    return this.jobStore.getActiveJobIdsByUser(userId, tenantId);
  }

  /**
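The `createJob` hunk above drops the system sentinel before persisting a tenant on the job, so system-context work is stored like single-tenant work. A minimal sketch of that normalization as a standalone helper; `toJobTenantId` is a hypothetical name and the sentinel value is an assumption:

```typescript
// Assumed value of the sentinel exported by @librechat/data-schemas.
const SYSTEM_TENANT_ID = '__SYSTEM__';

/** Maps an ALS tenant ID to the value stored on a job: real tenants pass through, everything else becomes undefined. */
function toJobTenantId(tenantId: string | undefined): string | undefined {
  return tenantId && tenantId !== SYSTEM_TENANT_ID ? tenantId : undefined;
}

// Usage: only a real tenant survives normalization.
const stored = [toJobTenantId('acme'), toJobTenantId(SYSTEM_TENANT_ID), toJobTenantId(undefined)];
```

Callers of `getActiveJobIdsForUser` would presumably need to pass the same normalized value that was used at creation time, since the stores key the user index on it.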
--- a/packages/api/src/stream/implementations/InMemoryJobStore.ts
+++ b/packages/api/src/stream/implementations/InMemoryJobStore.ts
@@ -70,6 +70,7 @@ export class InMemoryJobStore implements IJobStore {
    streamId: string,
    userId: string,
    conversationId?: string,
+    tenantId?: string,
  ): Promise<SerializableJobData> {
    if (this.jobs.size >= this.maxJobs) {
      await this.evictOldest();
@@ -78,6 +79,7 @@ export class InMemoryJobStore implements IJobStore {
    const job: SerializableJobData = {
      streamId,
      userId,
+      ...(tenantId && { tenantId }),
      status: 'running',
      createdAt: Date.now(),
      conversationId,
@@ -86,11 +88,12 @@ export class InMemoryJobStore implements IJobStore {

    this.jobs.set(streamId, job);

-    // Track job by userId for efficient user-scoped queries
-    let userJobs = this.userJobMap.get(userId);
+    // Track job by userId (tenant-qualified when available) for efficient user-scoped queries
+    const userKey = tenantId ? `${tenantId}:${userId}` : userId;
+    let userJobs = this.userJobMap.get(userKey);
    if (!userJobs) {
      userJobs = new Set();
-      this.userJobMap.set(userId, userJobs);
+      this.userJobMap.set(userKey, userJobs);
    }
    userJobs.add(streamId);

@@ -146,6 +149,17 @@ export class InMemoryJobStore implements IJobStore {
    }

    for (const id of toDelete) {
+      const job = this.jobs.get(id);
+      if (job) {
+        const userKey = job.tenantId ? `${job.tenantId}:${job.userId}` : job.userId;
+        const userJobs = this.userJobMap.get(userKey);
+        if (userJobs) {
+          userJobs.delete(id);
+          if (userJobs.size === 0) {
+            this.userJobMap.delete(userKey);
+          }
+        }
+      }
      await this.deleteJob(id);
    }

@@ -169,6 +183,17 @@ export class InMemoryJobStore implements IJobStore {

    if (oldestId) {
      logger.warn(`[InMemoryJobStore] Evicting oldest job: ${oldestId}`);
+      const job = this.jobs.get(oldestId);
+      if (job) {
+        const userKey = job.tenantId ? `${job.tenantId}:${job.userId}` : job.userId;
+        const userJobs = this.userJobMap.get(userKey);
+        if (userJobs) {
+          userJobs.delete(oldestId);
+          if (userJobs.size === 0) {
+            this.userJobMap.delete(userKey);
+          }
+        }
+      }
      await this.deleteJob(oldestId);
    }
  }
@@ -205,8 +230,9 @@ export class InMemoryJobStore implements IJobStore {
   * Returns conversation IDs of running jobs belonging to the user.
   * Also performs self-healing cleanup: removes stale entries for jobs that no longer exist.
   */
-  async getActiveJobIdsByUser(userId: string): Promise<string[]> {
-    const trackedIds = this.userJobMap.get(userId);
+  async getActiveJobIdsByUser(userId: string, tenantId?: string): Promise<string[]> {
+    const userKey = tenantId ? `${tenantId}:${userId}` : userId;
+    const trackedIds = this.userJobMap.get(userKey);
    if (!trackedIds || trackedIds.size === 0) {
      return [];
    }
@@ -226,7 +252,7 @@ export class InMemoryJobStore implements IJobStore {

    // Clean up empty set
    if (trackedIds.size === 0) {
-      this.userJobMap.delete(userId);
+      this.userJobMap.delete(userKey);
    }

    return activeIds;
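The `InMemoryJobStore` hunks above build the same `tenantId ? \`${tenantId}:${userId}\` : userId` key in four places (create, cleanup, eviction, lookup). A minimal sketch of how a single helper could centralize it; `userJobKey` and `UserJobIndex` are hypothetical names, and the class is a reduced stand-in for the store's `userJobMap`, not the real implementation:

```typescript
/** Hypothetical helper: builds the map key used to group a user's jobs. */
function userJobKey(userId: string, tenantId?: string): string {
  return tenantId ? `${tenantId}:${userId}` : userId;
}

/** Reduced stand-in for the userId -> streamIds index. */
class UserJobIndex {
  private readonly byUser = new Map<string, Set<string>>();

  add(streamId: string, userId: string, tenantId?: string): void {
    const key = userJobKey(userId, tenantId);
    let ids = this.byUser.get(key);
    if (!ids) {
      ids = new Set<string>();
      this.byUser.set(key, ids);
    }
    ids.add(streamId);
  }

  /** Removes one job and drops the set once it is empty, as the cleanup and eviction paths do. */
  remove(streamId: string, userId: string, tenantId?: string): void {
    const key = userJobKey(userId, tenantId);
    const ids = this.byUser.get(key);
    if (!ids) return;
    ids.delete(streamId);
    if (ids.size === 0) this.byUser.delete(key);
  }

  list(userId: string, tenantId?: string): string[] {
    const ids = this.byUser.get(userJobKey(userId, tenantId));
    return ids ? Array.from(ids) : [];
  }

  get keyCount(): number {
    return this.byUser.size;
  }
}

// Usage: the same userId under two tenants (and under no tenant) stays separate.
const index = new UserJobIndex();
index.add('stream-1', 'user-1', 'tenant-a');
index.add('stream-2', 'user-1', 'tenant-b');
index.add('stream-3', 'user-1');
```

Because `VALID_TENANT_ID` rejects `:` in header-derived tenant IDs, the `tenant:user` key is unambiguous for those tenants; whether tenant IDs from other sources are validated the same way is not shown in this diff.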
--- a/packages/api/src/stream/implementations/RedisJobStore.ts
+++ b/packages/api/src/stream/implementations/RedisJobStore.ts
@@ -29,8 +29,9 @@ const KEYS = {
  runSteps: (streamId: string) => `stream:{${streamId}}:runsteps`,
  /** Running jobs set for cleanup (global set - single slot) */
  runningJobs: 'stream:running',
-  /** User's active jobs set: stream:user:{userId}:jobs */
-  userJobs: (userId: string) => `stream:user:{${userId}}:jobs`,
+  /** User's active jobs set, tenant-qualified when tenantId is available */
+  userJobs: (userId: string, tenantId?: string) =>
+    tenantId ? `stream:user:{${tenantId}:${userId}}:jobs` : `stream:user:{${userId}}:jobs`,
 };

 /**
@@ -140,10 +141,12 @@ export class RedisJobStore implements IJobStore {
    streamId: string,
    userId: string,
    conversationId?: string,
+    tenantId?: string,
  ): Promise<SerializableJobData> {
    const job: SerializableJobData = {
      streamId,
      userId,
+      ...(tenantId && { tenantId }),
      status: 'running',
      createdAt: Date.now(),
      conversationId,
@@ -151,7 +154,7 @@ export class RedisJobStore implements IJobStore {
    };

    const key = KEYS.job(streamId);
-    const userJobsKey = KEYS.userJobs(userId);
+    const userJobsKey = KEYS.userJobs(userId, tenantId);

    // For cluster mode, we can't pipeline keys on different slots
    // The job key uses hash tag {streamId}, runningJobs and userJobs are on different slots
@@ -377,8 +380,8 @@ export class RedisJobStore implements IJobStore {
   * @param userId - The user ID to query
   * @returns Array of conversation IDs with active jobs
   */
-  async getActiveJobIdsByUser(userId: string): Promise<string[]> {
-    const userJobsKey = KEYS.userJobs(userId);
+  async getActiveJobIdsByUser(userId: string, tenantId?: string): Promise<string[]> {
+    const userJobsKey = KEYS.userJobs(userId, tenantId);
    const trackedIds = await this.redis.smembers(userJobsKey);

    if (trackedIds.length === 0) {
@@ -868,6 +871,7 @@ export class RedisJobStore implements IJobStore {
    return {
      streamId: data.streamId,
      userId: data.userId,
+      tenantId: data.tenantId || undefined,
      status: data.status as JobStatus,
      createdAt: parseInt(data.createdAt, 10),
      completedAt: data.completedAt ? parseInt(data.completedAt, 10) : undefined,
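In the `KEYS.userJobs` change above, the tenant ID moves inside the braces, so it becomes part of the Redis Cluster hash tag and influences slot placement. A minimal sketch that reproduces the key shape and extracts the hash tag using the standard cluster rule (text between the first `{` and the next `}`, when non-empty); `userJobsKey` and `hashTag` are illustrative names, not exports of `RedisJobStore`:

```typescript
/** Mirrors the key shape from the diff above. */
function userJobsKey(userId: string, tenantId?: string): string {
  return tenantId ? `stream:user:{${tenantId}:${userId}}:jobs` : `stream:user:{${userId}}:jobs`;
}

/** Returns the portion of the key that Redis Cluster hashes to pick a slot. */
function hashTag(key: string): string {
  const open = key.indexOf('{');
  if (open === -1) return key; // no tag: the whole key is hashed
  const close = key.indexOf('}', open + 1);
  // Unclosed or empty braces mean the whole key is hashed.
  if (close === -1 || close === open + 1) return key;
  return key.slice(open + 1, close);
}

// Usage: the same user maps to different tags with and without a tenant.
const qualified = hashTag(userJobsKey('user-1', 'tenant-a'));
const unqualified = hashTag(userJobsKey('user-1'));
```

One consequence worth noting: a job tracked under the unqualified key is not visible to a lookup that passes a `tenantId`, and vice versa, so both call paths need to agree on whether a tenant is supplied.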
--- a/packages/api/src/stream/interfaces/IJobStore.ts
+++ b/packages/api/src/stream/interfaces/IJobStore.ts
@@ -12,6 +12,7 @@ export type JobStatus = 'running' | 'complete' | 'error' | 'aborted';
 export interface SerializableJobData {
  streamId: string;
  userId: string;
+  tenantId?: string;
  status: JobStatus;
  createdAt: number;
  completedAt?: number;
@@ -149,6 +150,7 @@ export interface IJobStore {
    streamId: string,
    userId: string,
    conversationId?: string,
+    tenantId?: string,
  ): Promise<SerializableJobData>;

  /** Get a job by streamId (streamId === conversationId) */
@@ -186,7 +188,7 @@ export interface IJobStore {
   * @param userId - The user ID to query
   * @returns Array of conversation IDs with active jobs
   */
-  getActiveJobIdsByUser(userId: string): Promise<string[]>;
+  getActiveJobIdsByUser(userId: string, tenantId?: string): Promise<string[]>;

  // ===== Content State Methods =====
  // These methods manage volatile content state tied to each job.
--- a/packages/api/src/types/stream.ts
+++ b/packages/api/src/types/stream.ts
@@ -4,6 +4,7 @@ import type { ServerSentEvent } from '~/types';

 export interface GenerationJobMetadata {
  userId: string;
+  tenantId?: string;
  conversationId?: string;
  /** User message data for rebuilding submission on reconnect */
  userMessage?: Agents.UserMessageMeta;