mirror of https://github.com/danny-avila/LibreChat.git
🔧 fix: Catch deleteVectors Errors & Update RAG API docs (#2299)
* fix(deleteVectors): handle errors gracefully
* chore: update docs based on new alternate env vars prefixed with RAG to avoid conflicts with LibreChat keys
parent e3c236ba3b
commit e418edd3dc
3 changed files with 13 additions and 22 deletions
```diff
@@ -18,9 +18,12 @@ const { logger } = require('~/config');
  * file path is invalid or if there is an error in deletion.
  */
 const deleteVectors = async (req, file) => {
-  if (file.embedded && process.env.RAG_API_URL) {
+  if (!file.embedded || !process.env.RAG_API_URL) {
+    return;
+  }
+  try {
     const jwtToken = req.headers.authorization.split(' ')[1];
-    axios.delete(`${process.env.RAG_API_URL}/documents`, {
+    return await axios.delete(`${process.env.RAG_API_URL}/documents`, {
       headers: {
         Authorization: `Bearer ${jwtToken}`,
         'Content-Type': 'application/json',
@@ -28,6 +31,9 @@ const deleteVectors = async (req, file) => {
       },
       data: [file.file_id],
     });
+  } catch (error) {
+    logger.error('Error deleting vectors', error);
+    throw new Error(error.message || 'An error occurred during file deletion.');
   }
 };
```
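After this change, `deleteVectors` returns early when there is nothing to delete, and rethrows a readable error instead of letting the axios rejection go unhandled. A minimal sketch of how a caller might surface that error; the Express-style handler `handleFileDelete` and the response shape are hypothetical, not LibreChat's actual route code:

```js
// Hypothetical Express-style caller: shows how the error rethrown by
// deleteVectors (with a human-readable `message`) can be reported to
// the client rather than crashing the request.
const handleFileDelete = async (req, res) => {
  try {
    await deleteVectors(req, req.body.file);
    res.status(200).json({ message: 'File deleted' });
  } catch (error) {
    // `error.message` carries the text from `new Error(...)` in deleteVectors
    res.status(500).json({ message: error.message });
  }
};
```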
```diff
@@ -101,16 +101,6 @@ version: '3.4'
 # rag_api:
 #   image: ghcr.io/danny-avila/librechat-rag-api-dev:latest
 
-# # USE RAG API IMAGE WITH A DIFFERENT OPENAI API KEY FROM THE ENV FILE
-# rag_api:
-#   environment:
-#     - OPENAI_API_KEY=sk-your_openai_api_key
-
-# # OR, USE A CUSTOM ENVIRONMENT VARIABLE TO AVOID HARD-CODING IT
-# rag_api:
-#   environment:
-#     - OPENAI_API_KEY=${OPENAI_EMBEDDINGS_API_KEY}
-
 # # ADD OLLAMA
 # ollama:
 #   image: ollama/ollama:latest
```
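The deleted override blocks are superseded by the `RAG_`-prefixed variables: a dedicated embeddings key can now live alongside LibreChat's own keys in the shared `.env`, with no `OPENAI_API_KEY` conflict to work around. If you still prefer an override, a hedged sketch in the same commented style as the example file (the variable name `RAG_OPENAI_API_KEY` comes from the updated docs; the pass-through value is illustrative):

```yaml
# Sketch only: with the RAG_ prefix there is no conflict with LibreChat's
# OPENAI_API_KEY, so this override is optional rather than required.
# rag_api:
#   environment:
#     - RAG_OPENAI_API_KEY=${RAG_OPENAI_API_KEY}
```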
```diff
@@ -60,19 +60,14 @@ This contrasts Docker, where it is already set in the default `docker-compose.yml`
 
 ## Configuration
 
-The RAG API provides several configuration options that can be set using environment variables from an `.env` file accessible to the API. Most of them are optional, asides from the credentials/paths necessary for the provider you configured. In the default setup, only OPENAI_API_KEY is required.
+The RAG API provides several configuration options that can be set using environment variables from an `.env` file accessible to the API. Most of them are optional, aside from the credentials/paths necessary for the provider you configured. In the default setup, only `RAG_OPENAI_API_KEY` is required.
 
-> !!! **Important:** When using the default docker setup, the .env file is shared between LibreChat and the RAG API.
+> !!! **Important:** When using the default docker setup, the .env file is shared between LibreChat and the RAG API. For this reason, it's important to define the needed variables shown in the [RAG API readme.md](https://github.com/danny-avila/rag_api/blob/main/README.md).
 
-> You will need to utilize the [Docker Compose Override File](../install/configuration/docker_override.md) to set a unique OPENAI_API_KEY value for RAG API, that is different from the one in your `.env` file.
-
-> This may be necessary if you wish to use OpenAI for vector embeddings, but have set `OPENAI_API_KEY=user_provided`
-
-> There is an example for this in `docker-compose.override.yml.example`
-
 Here are some notable configurations:
 
-- `OPENAI_API_KEY`: The API key for OpenAI API Embeddings (if using default settings).
+- `RAG_OPENAI_API_KEY`: The API key for OpenAI API Embeddings (if using default settings).
+  - Note: `OPENAI_API_KEY` will also work, but `RAG_OPENAI_API_KEY` overrides it so the RAG API key does not conflict with the LibreChat credential.
 - `RAG_PORT`: The port number where the API server will run. Defaults to port 8000.
 - `RAG_HOST`: The hostname or IP address where the API server will run. Defaults to "0.0.0.0".
 - `COLLECTION_NAME`: The name of the collection in the vector store. Default is "testcollection".
```
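Taken together, the variables above reduce the default setup to a few lines in the shared `.env` file. A minimal sketch using the names and defaults listed in the docs; the key value is a placeholder, and the `RAG_API_URL` value is an assumption about the docker service address, not something shown in this diff:

```sh
# .env shared by LibreChat and the RAG API in the default docker setup
RAG_OPENAI_API_KEY=sk-your_openai_api_key  # embeddings key; overrides OPENAI_API_KEY for the RAG API only
RAG_PORT=8000                              # default port
RAG_HOST=0.0.0.0                           # default bind address
COLLECTION_NAME=testcollection             # default vector store collection
RAG_API_URL=http://rag_api:8000            # assumed service URL so LibreChat can reach the API
```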
```diff
@@ -136,7 +131,7 @@ RAG consists of two main phases: retrieval and content generation.
 
 ### Challenges and Ongoing Research
 
-While RAG is currently the best-known tool for grounding LLMs on the latest, verifiable information and lowering the costs of constant retraining and updating, it is not perfect. Some challenges include:
+While RAG is currently one of the best-known tools for grounding LLMs on the latest, verifiable information and lowering the costs of constant retraining and updating, it's not perfect. Some challenges include:
 
 1. **Recognizing unanswerable questions**: LLMs need to be explicitly trained to recognize questions they can't answer based on the available information. This may require fine-tuning on thousands of examples of answerable and unanswerable questions.
 2. **Improving retrieval and generation**: Ongoing research focuses on innovating at both ends of the RAG process: improving the retrieval of the most relevant information possible to feed the LLM, and optimizing the structure of that information to obtain the richest responses from the LLM.
```