refactor: improve findMatchingPattern to use longest-match strategy

Replace insertion-order-dependent matching with longest-key-wins approach.
This ensures specific patterns like "kimi-k2.5" always take priority over
broader patterns like "kimi" regardless of key ordering. Update the
documentation comment to reflect the new behavior.
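
The difference between the two strategies can be sketched as follows. This is a minimal illustration, not the repository's code: `firstMatchReverse`, `longestMatch`, the `TokensMap` alias, and the token values are hypothetical stand-ins chosen for the example.

```typescript
// Illustrative stand-ins only; names and values are assumptions for this sketch.
type TokensMap = Record<string, number>;

// Previous behavior as the diff describes it: return the first key that
// matches when walking the keys in reverse insertion order.
function firstMatchReverse(modelName: string, tokensMap: TokensMap): string | null {
  const lower = modelName.toLowerCase();
  for (const key of Object.keys(tokensMap).reverse()) {
    if (lower.includes(key)) {
      return key;
    }
  }
  return null;
}

// New behavior: check every key and keep the longest one that matches.
function longestMatch(modelName: string, tokensMap: TokensMap): string | null {
  const lower = modelName.toLowerCase();
  let best: string | null = null;
  let bestLength = 0;
  for (const key of Object.keys(tokensMap).reverse()) {
    if (lower.includes(key) && key.length > bestLength) {
      best = key;
      bestLength = key.length;
    }
  }
  return best;
}

// Keys deliberately in the "wrong" order: the broad pattern is defined last.
const misordered: TokensMap = { 'kimi-k2.5': 1, 'kimi-k2': 2, kimi: 3 };

const oldResult = firstMatchReverse('kimi-k2.5-latest', misordered); // "kimi"
const newResult = longestMatch('kimi-k2.5-latest', misordered); // "kimi-k2.5"
console.log(oldResult, newResult);
```

With the broad key defined last, the reverse first-match scan stops at "kimi", while the longest-match scan still resolves to "kimi-k2.5".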
Marco Beretta 2026-02-08 17:37:22 +01:00
parent 023696c5d9
commit 94eaf47b62
GPG key ID: D918033D8E74CC11

@@ -14,29 +14,22 @@ export type EndpointTokenConfig = Record<string, TokenConfig>;
 /**
  * Model Token Configuration Maps
  *
- * IMPORTANT: Key Ordering for Pattern Matching
- * ============================================
- * The `findMatchingPattern` function iterates through object keys in REVERSE order
- * (last-defined keys are checked first) and uses `modelName.includes(key)` for matching.
+ * Pattern Matching Behavior
+ * ========================
+ * The `findMatchingPattern` function uses `modelName.includes(key)` for matching
+ * and selects the **longest matching key** to ensure the most specific pattern wins.
  *
- * This means:
- * 1. BASE PATTERNS must be defined FIRST (e.g., "kimi", "moonshot")
- * 2. SPECIFIC PATTERNS must be defined AFTER their base patterns (e.g., "kimi-k2", "kimi-k2.5")
+ * For example, given model name "kimi-k2.5-latest":
+ * - "kimi" matches (length 4)
+ * - "kimi-k2" matches (length 7)
+ * - "kimi-k2.5" matches (length 9) <-- selected as longest match
  *
- * Example ordering for Kimi models:
- * kimi: 262144, // Base pattern - checked last
- * 'kimi-k2': 262144, // More specific - checked before "kimi"
- * 'kimi-k2.5': 262144, // Most specific - checked first
+ * This means key insertion order does NOT affect specificity; the longest key always
+ * wins. However, for equal-length keys, insertion order still serves as a tiebreaker
+ * (last-defined key checked first via reverse iteration).
  *
- * Why this matters:
- * - Model name "kimi-k2.5" contains both "kimi" and "kimi-k2" as substrings
- * - If "kimi" were checked first, it would incorrectly match "kimi-k2.5"
- * - By defining specific patterns AFTER base patterns, they're checked first in reverse iteration
- *
- * When adding new model families:
- * 1. Define the base/generic pattern first
- * 2. Define increasingly specific patterns after
- * 3. Ensure no pattern is a substring of another that should match differently
+ * When adding new model families, ensure no pattern key is an unintended substring
+ * of another that maps to a different value.
  */
 const openAIModels = {
@@ -429,7 +422,12 @@ export const maxOutputTokensMap = {
 };
 /**
- * Finds the first matching pattern in the tokens map.
+ * Finds the most specific matching pattern in the tokens map.
+ *
+ * Uses `includes()` matching so provider-prefixed names (e.g. "openai/gpt-4")
+ * resolve correctly. When multiple keys match, the longest key wins to ensure
+ * specific patterns like "kimi-k2.5" take priority over broader ones like "kimi".
+ *
  * @param {string} modelName
  * @param {Record<string, number> | EndpointTokenConfig} tokensMap
  * @returns {string|null}
@@ -440,14 +438,18 @@ export function findMatchingPattern(
 ): string | null {
   const keys = Object.keys(tokensMap);
   const lowerModelName = modelName.toLowerCase();
+  let bestMatch: string | null = null;
+  let bestLength = 0;
   for (let i = keys.length - 1; i >= 0; i--) {
     const modelKey = keys[i];
-    if (lowerModelName.includes(modelKey)) {
-      return modelKey;
+    if (lowerModelName.includes(modelKey) && modelKey.length > bestLength) {
+      bestMatch = modelKey;
+      bestLength = modelKey.length;
     }
   }
-  return null;
+  return bestMatch;
 }
 /**
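
The edge cases of the new matcher (provider-prefixed names, equal-length ties, and key casing) can be checked with a small standalone sketch. It mirrors the loop in the hunk above under a hypothetical name, `longestMatch`; the maps and token values are invented for illustration and are not taken from the repository.

```typescript
// Standalone sketch; `longestMatch`, `TokensMap`, and all map contents are
// assumptions made for this example.
type TokensMap = Record<string, number>;

function longestMatch(modelName: string, tokensMap: TokensMap): string | null {
  const lower = modelName.toLowerCase();
  let best: string | null = null;
  let bestLength = 0;
  // Reverse insertion order, with a strict ">" comparison as in the diff.
  for (const key of Object.keys(tokensMap).reverse()) {
    if (lower.includes(key) && key.length > bestLength) {
      best = key;
      bestLength = key.length;
    }
  }
  return best;
}

// Provider prefixes and mixed case in the model name still resolve, because
// the name is lowercased and matched by substring.
const gpt: TokensMap = { 'gpt-4': 8192, 'gpt-4o': 128000 };
const prefixed = longestMatch('openai/gpt-4', gpt); // "gpt-4"
const mixedCase = longestMatch('OpenAI/GPT-4o-mini', gpt); // "gpt-4o"

// Equal-length keys: the strict ">" means the key visited first in reverse
// order (the last-defined one) is kept.
const tie: TokensMap = { alpha: 1, omega: 2 };
const tieResult = longestMatch('alpha-omega', tie); // "omega"

// Only the model name is lowercased, so a key containing uppercase letters
// never matches.
const upperKey: TokensMap = { 'GPT-4': 8192 };
const upperResult = longestMatch('gpt-4', upperKey); // null

console.log(prefixed, mixedCase, tieResult, upperResult);
```

The last case suggests that keys should be written in lowercase, since the function normalizes the model name but not the keys.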