mirror of
https://github.com/danny-avila/LibreChat.git
synced 2025-12-17 00:40:14 +01:00
⚡ refactor: Optimize & Standardize Tokenizer Usage (#10777)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
* refactor: Token Limit Processing with Enhanced Efficiency - Added a new test suite for `processTextWithTokenLimit`, ensuring comprehensive coverage of various scenarios including under, at, and exceeding token limits. - Refactored the `processTextWithTokenLimit` function to utilize a ratio-based estimation method, significantly reducing the number of token counting function calls compared to the previous binary search approach. - Improved handling of edge cases and variable token density, ensuring accurate truncation and performance across diverse text inputs. - Included direct comparisons with the old implementation to validate correctness and efficiency improvements. * refactor: Remove Tokenizer Route and Related References - Deleted the tokenizer route from the server and removed its references from the routes index and server files, streamlining the API structure. - This change simplifies the routing configuration by eliminating unused endpoints. * refactor: Migrate countTokens Utility to API Module - Removed the local countTokens utility and integrated it into the @librechat/api module for centralized access. - Updated various files to reference the new countTokens import from the API module, ensuring consistent usage across the application. - Cleaned up unused references and imports related to the previous countTokens implementation. * refactor: Centralize escapeRegExp Utility in API Module - Moved the escapeRegExp function from local utility files to the @librechat/api module for consistent usage across the application. - Updated imports in various files to reference the new centralized escapeRegExp function, ensuring cleaner code and reducing redundancy. - Removed duplicate implementations of escapeRegExp from multiple files, streamlining the codebase. * refactor: Enhance Token Counting Flexibility in Text Processing - Updated the `processTextWithTokenLimit` function to accept both synchronous and asynchronous token counting functions, improving its versatility. - Introduced a new `TokenCountFn` type to define the token counting function signature. - Added comprehensive tests to validate the behavior of `processTextWithTokenLimit` with both sync and async token counting functions, ensuring consistent results. - Implemented a wrapper to track call counts for the `countTokens` function, optimizing performance and reducing unnecessary calls. - Enhanced existing tests to compare the performance of the new implementation against the old one, demonstrating significant improvements in efficiency. * chore: documentation for Truncation Safety Buffer in Token Processing - Added a safety buffer multiplier to the character position estimates during text truncation to prevent overshooting token limits. - Updated the `processTextWithTokenLimit` function to utilize the new `TRUNCATION_SAFETY_BUFFER` constant, enhancing the accuracy of token limit processing. - Improved documentation to clarify the rationale behind the buffer and its impact on performance and efficiency in token counting.
This commit is contained in:
parent
b2387cc6fa
commit
8bdc808074
19 changed files with 925 additions and 107 deletions
|
|
@ -1,37 +0,0 @@
|
|||
const { Tiktoken } = require('tiktoken/lite');
|
||||
const { logger } = require('@librechat/data-schemas');
|
||||
const p50k_base = require('tiktoken/encoders/p50k_base.json');
|
||||
const cl100k_base = require('tiktoken/encoders/cl100k_base.json');
|
||||
|
||||
/**
|
||||
* Counts the number of tokens in a given text using a specified encoding model.
|
||||
*
|
||||
* This function utilizes the 'Tiktoken' library to encode text based on the selected model.
|
||||
* It supports two models, 'text-davinci-003' and 'gpt-3.5-turbo', each with its own encoding strategy.
|
||||
* For 'text-davinci-003', the 'p50k_base' encoder is used, whereas for other models, the 'cl100k_base' encoder is applied.
|
||||
* In case of an error during encoding, the error is logged, and the function returns 0.
|
||||
*
|
||||
* @async
|
||||
* @param {string} text - The text to be tokenized. Defaults to an empty string if not provided.
|
||||
* @param {string} modelName - The name of the model used for tokenizing. Defaults to 'gpt-3.5-turbo'.
|
||||
* @returns {Promise<number>} The number of tokens in the provided text. Returns 0 if an error occurs.
|
||||
* @throws Logs the error to a logger and rethrows if any error occurs during tokenization.
|
||||
*/
|
||||
const countTokens = async (text = '', modelName = 'gpt-3.5-turbo') => {
|
||||
let encoder = null;
|
||||
try {
|
||||
const model = modelName.includes('text-davinci-003') ? p50k_base : cl100k_base;
|
||||
encoder = new Tiktoken(model.bpe_ranks, model.special_tokens, model.pat_str);
|
||||
const tokens = encoder.encode(text);
|
||||
encoder.free();
|
||||
return tokens.length;
|
||||
} catch (e) {
|
||||
logger.error('[countTokens]', e);
|
||||
if (encoder) {
|
||||
encoder.free();
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
};
|
||||
|
||||
module.exports = countTokens;
|
||||
|
|
@ -10,14 +10,6 @@ const {
|
|||
const { sendEvent } = require('@librechat/api');
|
||||
const partialRight = require('lodash/partialRight');
|
||||
|
||||
/** Helper function to escape special characters in regex
|
||||
* @param {string} string - The string to escape.
|
||||
* @returns {string} The escaped string.
|
||||
*/
|
||||
function escapeRegExp(string) {
|
||||
return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
|
||||
}
|
||||
|
||||
const addSpaceIfNeeded = (text) => (text.length > 0 && !text.endsWith(' ') ? text + ' ' : text);
|
||||
|
||||
const base = { message: true, initial: true };
|
||||
|
|
@ -181,7 +173,6 @@ function generateConfig(key, baseURL, endpoint) {
|
|||
module.exports = {
|
||||
handleText,
|
||||
formatSteps,
|
||||
escapeRegExp,
|
||||
formatAction,
|
||||
isUserProvided,
|
||||
generateConfig,
|
||||
|
|
|
|||
|
|
@ -1,5 +1,4 @@
|
|||
const removePorts = require('./removePorts');
|
||||
const countTokens = require('./countTokens');
|
||||
const handleText = require('./handleText');
|
||||
const sendEmail = require('./sendEmail');
|
||||
const queue = require('./queue');
|
||||
|
|
@ -7,7 +6,6 @@ const files = require('./files');
|
|||
|
||||
module.exports = {
|
||||
...handleText,
|
||||
countTokens,
|
||||
removePorts,
|
||||
sendEmail,
|
||||
...files,
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue