Commit graph

2 commits

Author SHA1 Message Date
Danny Avila
fda72ac621
🏗️ refactor: Remove Redundant Caching, Migrate Config Services to TypeScript (#12466)
* ♻️ refactor: Remove redundant scopedCacheKey caching, support user-provided key model fetching

Remove redundant cache layers that used `scopedCacheKey()` (tenant-only scoping)
on top of `getAppConfig()` which already caches per-principal (role+user+tenant).
This caused config overrides for different principals within the same tenant to
be invisible due to stale cached data.

Changes:
- Add `requireJwtAuth` to `/api/endpoints` route for proper user context
- Remove ENDPOINT_CONFIG, STARTUP_CONFIG, PLUGINS, TOOLS, and MODELS_CONFIG
  cache layers — all derive from `getAppConfig()` with cheap computation
- Enhance MODEL_QUERIES cache: hash(baseURL+apiKey) keys, 2-minute TTL,
  caching centralized in `fetchModels()` base function
- Support fetching models with user-provided API keys in `loadConfigModels`
  via `getUserKeyValues` lookup (no caching for user keys)
- Update all affected tests

Closes #1028

* ♻️ refactor: Migrate config services to TypeScript in packages/api

Move core config logic from CJS /api wrappers to typed TypeScript in
packages/api using dependency injection factories:

- `createEndpointsConfigService` — endpoint config merging + checkCapability
- `createLoadConfigModels` — custom endpoint model loading with user key support
- `createMCPToolCacheService` — MCP tool cache operations (update, merge, cache)

/api files become thin wrappers that wire dependencies (getAppConfig,
loadDefaultEndpointsConfig, getUserKeyValues, getCachedTools, etc.)
into the typed factories.

Also moves existing `endpoints/config.ts` → `endpoints/config/providers.ts`
to accommodate the new `config/` directory structure.

* 🔄 fix: Invalidate models query when user API key is set or revoked

Without this, users had to refresh the page after entering their API key
to see the updated model list fetched with their credentials.

- Invalidate QueryKeys.models in useUpdateUserKeysMutation onSuccess
- Invalidate QueryKeys.models in useRevokeUserKeyMutation onSuccess
- Invalidate QueryKeys.models in useRevokeAllUserKeysMutation onSuccess

* 🗺️ fix: Remap YAML-level override keys to AppConfig equivalents in mergeConfigOverrides

Config overrides stored in the DB use YAML-level keys (TCustomConfig),
but they're merged into the already-processed AppConfig where some fields
have been renamed by AppService. This caused mcpServers overrides to land
on a nonexistent key instead of mcpConfig, so config-override MCP servers
never appeared in the UI.

- Add OVERRIDE_KEY_MAP to remap mcpServers→mcpConfig, interface→interfaceConfig
- Apply remapping before deep merge in mergeConfigOverrides
- Add test for YAML-level key remapping behavior
- Update existing tests to use AppConfig field names in assertions

* 🧪 test: Update service.spec to use AppConfig field names after override key remapping

* 🛡️ fix: Address code review findings — reliability, types, tests, and performance

- Pass tenant context (getTenantId) in importers.js getEndpointsConfig call
- Add 5 tests for user-provided API key model fetching (key found, no key,
  DB error, missing userId, apiKey-only with fixed baseURL)
- Distinguish NO_USER_KEY (debug) from infrastructure errors (warn) in catch
- Switch fetchPromisesMap from Promise.all to Promise.allSettled so one
  failing provider doesn't kill the entire model config
- Parallelize getUserKeyValues DB lookups via batched Promise.allSettled
  instead of sequential awaits in the loop
- Hoist standardCache instance in fetchModels to avoid double instantiation
- Replace Record<string, unknown> types with Partial<TConfig>-based types;
  remove as unknown as T double-cast in endpoints config
- Narrow Bedrock availableRegions to typed destructure
- Narrow version field from string|number|undefined to string|undefined
- Fix import ordering in mcp/tools.ts and config/models.ts per AGENTS.md
- Add JSDoc to getModelsConfig alias clarifying caching semantics

* fix: Guard against null getCachedTools in mergeAppTools

* 🔍 fix: Address follow-up review — deduplicate extractEnvVariable, fix error discrimination, add log-level tests

- Deduplicate extractEnvVariable calls: resolve apiKey/baseURL once, reuse
  for both the entry and isUserProvided checks (Finding A)
- Move ResolvedEndpoint interface from function closure to module scope (Finding B)
- Replace fragile msg.includes('NO_USER_KEY') with ErrorTypes.NO_USER_KEY
  enum check against actual error message format (Finding C). Also handle
  ErrorTypes.INVALID_USER_KEY as an expected "no key" case.
- Add test asserting logger.warn is called for infra errors (not debug)
- Add test asserting logger.debug is called for NO_USER_KEY errors (not warn)

* fix: Preserve numeric assistants version via String() coercion

* 🐛 fix: Address secondary review — Ollama cache bypass, cache tests, type safety

- Fix Ollama success path bypassing cache write in fetchModels (CRITICAL):
  store result before returning so Ollama models benefit from 2-minute TTL
- Add 4 fetchModels cache behavior tests: cache write with TTL, cache hit
  short-circuits HTTP, skipCache bypasses read+write, empty results not cached
- Type-safe OVERRIDE_KEY_MAP: Partial<Record<keyof TCustomConfig, keyof AppConfig>>
  so compiler catches future field rename mismatches
- Fix import ordering in config/models.ts (package types longest→shortest)
- Rename ToolCacheDeps → MCPToolCacheDeps for naming consistency
- Expand getModelsConfig JSDoc to explain caching granularity

* fix: Narrow OVERRIDE_KEY_MAP index to satisfy strict tsconfig

* 🧩 fix: Add allowedProviders to TConfig, remove Record<string, unknown> from PartialEndpointEntry

The agents endpoint config includes allowedProviders (used by the frontend
AgentPanel to filter available providers), but it was missing from TConfig.
This forced PartialEndpointEntry to use & Record<string, unknown> as an
escape hatch, violating AGENTS.md type policy.

- Add allowedProviders?: (string | EModelEndpoint)[] to TConfig
- Remove Record<string, unknown> from PartialEndpointEntry — now just Partial<TConfig>

* 🛡️ fix: Isolate Ollama cache write from fetch try-catch, add Ollama cache tests

- Separate Ollama fetch and cache write into distinct scopes so a cache
  failure (e.g., Redis down) doesn't misattribute the error as an Ollama
  API failure and fall through to the OpenAI-compatible path (Issue A)
- Add 2 Ollama-specific cache tests: models written with TTL on fetch,
  cached models returned without hitting server (Issue B)
- Replace hardcoded 120000 with Time.TWO_MINUTES constant in cache TTL
  test assertion (Issue C)
- Fix OVERRIDE_KEY_MAP JSDoc to accurately describe runtime vs compile-time
  type enforcement (Issue D)
- Add global beforeEach for cache mock reset to prevent cross-test leakage

* 🧪 fix: Address third review — DI consistency, cache key width, MCP tests

- Inject loadCustomEndpointsConfig via EndpointsConfigDeps with default
  fallback, matching loadDefaultEndpointsConfig DI pattern (Finding 3)
- Widen modelsCacheKey from 64-bit (.slice(0,16)) to 128-bit (.slice(0,32))
  for collision-sensitive cross-credential cache key (Finding 4)
- Add fetchModels.mockReset() in loadConfigModels.spec beforeEach to
  prevent mock implementation leaks across tests (Finding 5)
- Add 11 unit tests for createMCPToolCacheService covering all three
  functions: null/empty input, successful ops, error propagation,
  cold-cache merge (Finding 2)
- Simplify getModelsConfig JSDoc to @see reference (Finding 10)

* ♻️ refactor: Address remaining follow-ups from reviews

OVERRIDE_KEY_MAP completeness:
- Add missing turnstile→turnstileConfig mapping
- Add exhaustiveness test verifying all three renamed keys are remapped
  and original YAML keys don't leak through

Import role context:
- Pass userRole through importConversations job → importLibreChatConvo
  so role-based endpoint overrides are honored during conversation import
- Update convos.js route to include req.user.role in the job payload

createEndpointsConfigService unit tests:
- Add 8 tests covering: default+custom merge, Azure/AzureAssistants/
  Anthropic Vertex/Bedrock config enrichment, assistants version
  coercion, agents allowedProviders, req.config bypass

Plugins/tools efficiency:
- Use Set for includedTools/filteredTools lookups (O(1) vs O(n) per plugin)
- Combine auth check + filter into single pass (eliminates intermediate array)
- Pre-compute toolDefKeys Set for O(1) tool definition lookups

* fix: Scope model query cache by user when userIdQuery is enabled

* fix: Skip model cache for userIdQuery endpoints, fix endpoints test types

- When userIdQuery is true, skip caching entirely (like user_provided keys)
  to avoid cross-user model list leakage without duplicating cache data
- Fix AgentCapabilities type error in endpoints.spec.ts — use enum values
  and appConfig() helper for partial mock typing

* 🐛 fix: Restore filteredTools+includedTools composition, add checkCapability tests

- Fix filteredTools regression: whitelist and blacklist are now applied
  independently (two flat guards), matching original behavior where
  includedTools=['a','b'] + filteredTools=['b'] produces ['a'] (Finding A)
- Fix Set spread in toolkit loop: pre-compute toolDefKeysList array once
  alongside the Set, reuse for .some() without per-plugin allocation (Finding B)
- Add 2 filteredTools tests: blacklist-only path and combined
  whitelist+blacklist composition (Finding C)
- Add 3 checkCapability tests: capability present, capability absent,
  fallback to defaultAgentCapabilities for non-agents endpoints (Finding D)

* 🔑 fix: Include config-override MCP servers in filterAuthorizedTools

Config-override MCP servers (defined via admin config overrides for
roles/groups) were rejected by filterAuthorizedTools because it called
getAllServerConfigs(userId) without the configServers parameter. Only
YAML and DB-backed user servers were included in the access check.

- Add configServers parameter to filterAuthorizedTools
- Resolve config servers via resolveConfigServers(req) at all 4 callsites
  (create, update, duplicate, revert) using parallel Promise.all
- Pass configServers through to getAllServerConfigs(userId, configServers)
  so the registry merges config-source servers into the access check
- Update filterAuthorizedTools.spec.js mock for resolveConfigServers

* fix: Skip model cache for userIdQuery endpoints, fix endpoints test types

For user-provided key endpoints (userProvide: true), skip the full model
list re-fetch during message validation — the user already selected from
a list we served them, and re-fetching with skipCache:true on every
message send is both slow and fragile (5s provider timeout = rejected model).

Instead, validate the model string format only:
- Must be a string, max 256 chars
- Must match [a-zA-Z0-9][a-zA-Z0-9_.:\-/@+ ]* (covers all known provider
  model ID formats while rejecting injection attempts)

System-configured endpoints still get full model list validation as before.

* 🧪 test: Add regression tests for filterAuthorizedTools configServers and validateModel

filterAuthorizedTools:
- Add test verifying configServers is passed to getAllServerConfigs and
  config-override server tools are allowed through
- Guard resolveConfigServers in createAgentHandler to only run when
  MCP tools are present (skip for tool-free agent creates)

validateModel (12 new tests):
- Format validation: missing model, non-string, length overflow, leading
  special char, script injection, standard model ID acceptance
- userProvide early-return: next() called immediately, getModelsConfig
  not invoked (regression guard for the exact bug this fixes)
- System endpoint list validation: reject unknown model, accept known
  model, handle null/missing models config

Also fix unnecessary backslash escape in MODEL_PATTERN regex.

* 🧹 fix: Remove space from MODEL_PATTERN, trim input, clean up nits

- Remove space character from MODEL_PATTERN regex — no real model ID
  uses spaces; prevents spurious violation logs from whitespace artifacts
- Add model.trim() before validation to handle accidental whitespace
- Remove redundant filterUniquePlugins call on already-deduplicated output
- Add comment documenting intentional whitelist+blacklist composition
- Add getUserKeyValues.mockReset() in loadConfigModels.spec beforeEach
- Remove narrating JSDoc from getModelsConfig one-liner
- Add 2 tests: trim whitespace handling, reject spaces in model ID

* fix: Match startup tool loader semantics — includedTools takes precedence over filteredTools

The startup tool loader (loadAndFormatTools) explicitly ignores
filteredTools when includedTools is set, with a warning log. The
PluginController was applying both independently, creating inconsistent
behavior where the same config produced different results at startup
vs plugin listing time.

Restored mutually exclusive semantics: when includedTools is non-empty,
filteredTools is not evaluated.

* 🧹 chore: Simplify validateModel flow, note auth requirement on endpoints route

- Separate missing-model from invalid-model checks cleanly: type+presence
  guard first, then trim+format guard (reviewer NIT)
- Add route comment noting auth is required for role/tenant scoping

* fix: Write trimmed model back to req.body.model for downstream consumers
2026-03-30 16:49:48 -04:00
Danny Avila
9f6d8c6e93
🧵 feat: ALS Context Middleware, Tenant Threading, and Config Cache Invalidation (#12407)
* feat: add tenant context middleware for ALS-based isolation

Introduces tenantContextMiddleware that propagates req.user.tenantId
into AsyncLocalStorage, activating the Mongoose applyTenantIsolation
plugin for all downstream DB queries within a request.

- Strict mode (TENANT_ISOLATION_STRICT=true) returns 403 if no tenantId
- Non-strict mode passes through for backward compatibility
- No-op for unauthenticated requests
- Includes 6 unit tests covering all paths

* feat: register tenant middleware and wrap startup/auth in runAsSystem()

- Register tenantContextMiddleware in Express app after capability middleware
- Wrap server startup initialization in runAsSystem() for strict mode compat
- Wrap auth strategy getAppConfig() calls in runAsSystem() since they run
  before user context is established (LDAP, SAML, OpenID, social login, AuthService)

* feat: thread tenantId through all getAppConfig callers

Pass tenantId from req.user to getAppConfig() across all callers that
have request context, ensuring correct per-tenant cache key resolution.

Also fixes getBaseConfig admin endpoint to scope to requesting admin's
tenant instead of returning the unscoped base config.

Files updated:
- Controllers: UserController, PluginController
- Middleware: checkDomainAllowed, balance
- Routes: config
- Services: loadConfigModels, loadDefaultModels, getEndpointsConfig, MCP
- Audio services: TTSService, STTService, getVoices, getCustomConfigSpeech
- Admin: getBaseConfig endpoint

* feat: add config cache invalidation on admin mutations

- Add clearOverrideCache(tenantId?) to flush per-principal override caches
  by enumerating Keyv store keys matching _OVERRIDE_: prefix
- Add invalidateConfigCaches() helper that clears base config, override
  caches, tool caches, and endpoint config cache in one call
- Wire invalidation into all 5 admin config mutation handlers
  (upsert, patch, delete field, delete overrides, toggle active)
- Add strict mode warning when __default__ tenant fallback is used
- Add 3 new tests for clearOverrideCache (all/scoped/base-preserving)

* chore: update getUserPrincipals comment to reflect ALS-based tenant filtering

The TODO(#12091) about missing tenantId filtering is resolved by the
tenant context middleware + applyTenantIsolation Mongoose plugin.
Group queries are now automatically scoped by tenantId via ALS.

* fix: replace runAsSystem with baseOnly for pre-tenant code paths

App configs are tenant-owned — runAsSystem() would bypass tenant
isolation and return cross-tenant DB overrides. Instead, add
baseOnly option to getAppConfig() that returns YAML-derived config
only, with zero DB queries.

All startup code, auth strategies, and MCP initialization now use
getAppConfig({ baseOnly: true }) to get the YAML config without
touching the Config collection.

* fix: address PR review findings — middleware ordering, types, cache safety

- Chain tenantContextMiddleware inside requireJwtAuth after passport auth
  instead of global app.use() where req.user is always undefined (Finding 1)
- Remove global tenantContextMiddleware registration from index.js
- Update BalanceMiddlewareOptions to include tenantId, remove redundant cast (Finding 4)
- Add warning log when clearOverrideCache cannot enumerate keys on Redis (Finding 3)
- Use startsWith instead of includes for cache key filtering (Finding 12)
- Use generator loop instead of Array.from for key enumeration (Finding 3)
- Selective barrel export — exclude _resetTenantMiddlewareStrictCache (Finding 5)
- Move isMainThread check to module level, remove per-request check (Finding 9)
- Move mid-file require to top of app.js (Finding 8)
- Parallelize invalidateConfigCaches with Promise.all (Finding 10)
- Remove clearOverrideCache from public app.js exports (internal only)
- Strengthen getUserPrincipals comment re: ALS dependency (Finding 2)

* fix: restore runAsSystem for startup DB ops, consolidate require, clarify baseOnly

- Restore runAsSystem() around performStartupChecks, updateInterfacePermissions,
  initializeMCPs, and initializeOAuthReconnectManager — these make Mongoose
  queries that need system context in strict tenant mode (NEW-3)
- Consolidate duplicate require('@librechat/api') in requireJwtAuth.js (NEW-1)
- Document that baseOnly ignores role/userId/tenantId in JSDoc (NEW-2)

* test: add requireJwtAuth tenant chaining + invalidateConfigCaches tests

- requireJwtAuth: 5 tests verifying ALS tenant context is set after
  passport auth, isolated between concurrent requests, and not set
  when user has no tenantId (Finding 6)
- invalidateConfigCaches: 4 tests verifying all four caches are cleared,
  tenantId is threaded through, partial failure is handled gracefully,
  and operations run in parallel via Promise.all (Finding 11)

* fix: address Copilot review — passport errors, namespaced cache keys, /base scoping

- Forward passport errors in requireJwtAuth before entering tenant
  middleware — prevents silent auth failures from reaching handlers (P1)
- Account for Keyv namespace prefix in clearOverrideCache — stored keys
  are namespaced as "APP_CONFIG:_OVERRIDE_:..." not "_OVERRIDE_:...",
  so override caches were never actually matched/cleared (P2)
- Remove role from getBaseConfig — /base should return tenant-scoped
  base config, not role-merged config that drifts per admin role (P2)
- Return tenantStorage.run() for cleaner async semantics
- Update mock cache in service.spec.ts to simulate Keyv namespacing

* fix: address second review — cache safety, code quality, test reliability

- Decouple cache invalidation from mutation response: fire-and-forget
  with logging so DB mutation success is not masked by cache failures
- Extract clearEndpointConfigCache helper from inline IIFE
- Move isMainThread check to lazy once-per-process guard (no import
  side effect)
- Memoize process.env read in overrideCacheKey to avoid per-request
  env lookups and log flooding in strict mode
- Remove flaky timer-based parallelism assertion, use structural check
- Merge orphaned double JSDoc block on getUserPrincipals
- Fix stale [getAppConfig] log prefix → [ensureBaseConfig]
- Fix import order in tenant.spec.ts (package types before local values)
- Replace "Finding 1" reference with self-contained description
- Use real tenantStorage primitives in requireJwtAuth spec mock

* fix: move JSDoc to correct function after clearEndpointConfigCache extraction

* refactor: remove Redis SCAN from clearOverrideCache, rely on TTL expiry

Redis SCAN causes 60s+ stalls under concurrent load (see #12410).
APP_CONFIG defaults to FORCED_IN_MEMORY_CACHE_NAMESPACES, so the
in-memory store.keys() path handles the standard case. When APP_CONFIG
is Redis-backed, overrides expire naturally via overrideCacheTtl (60s
default) — an acceptable window for admin config mutations.

* fix: remove return from tenantStorage.run to satisfy void middleware signature

* fix: address second review — cache safety, code quality, test reliability

- Switch invalidateConfigCaches from Promise.all to Promise.allSettled
  so partial failures are logged individually instead of producing one
  undifferentiated error (Finding 3)
- Gate overrideCacheKey strict-mode warning behind a once-per-process
  flag to prevent log flooding under load (Finding 4)
- Add test for passport error forwarding in requireJwtAuth — the
  if (err) { return next(err) } branch now has coverage (Finding 5)
- Add test for real partial failure in invalidateConfigCaches where
  clearAppConfigCache rejects (not just the swallowed endpoint error)

* chore: reorder imports in index.js and app.js for consistency

- Moved logger and runAsSystem imports to maintain a consistent import order across files.
- Improved code readability by ensuring related imports are grouped together.
2026-03-26 17:35:00 -04:00