Mirror of https://github.com/danny-avila/LibreChat.git, synced 2026-03-09 17:42:38 +01:00
📚 docs: Separate LiteLLM and Ollama Documentation (#1948)
* Separate LiteLLM and Ollama Documentation
* Clarify Ollama Setup
* Fix litellm config
This commit is contained in: parent b2ef75e009, commit 78f52859c4
4 changed files with 179 additions and 199 deletions
29 docs/install/configuration/ollama.md Normal file

@@ -0,0 +1,29 @@
---
title: 🚅 Ollama
description: Using LibreChat with Ollama
weight: -6
---

## Ollama

Use [Ollama](https://ollama.ai/) for:
* Running large language models on local hardware
* Hosting multiple models
* Dynamically loading the model upon request

### 1. Install Ollama
#### Mac, Linux, Windows Install

Ollama supports GPU acceleration on Nvidia, AMD, and Apple Metal. Follow the instructions at [Ollama Download](https://ollama.com/download).
#### Docker Install

Refer to `docker-compose.override.yml.example` for configuring Ollama in a Docker environment.

Run `docker exec -it ollama /bin/bash` to access the `ollama` command inside the container.
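As a minimal sketch, the override file referenced above might define an Ollama service like this (the image tag, port mapping, and volume path here are illustrative assumptions; adapt them to your setup):

```yaml
# Sketch of an Ollama service for docker-compose.override.yml.
# Service name, image tag, and volume path are illustrative assumptions.
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"          # Ollama's default API port
    volumes:
      - ./ollama:/root/.ollama # persist downloaded models across restarts
```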
### 2. Load Models in Ollama

1. Browse the available models in the [Ollama Library](https://ollama.ai/library).
2. On a model's page, copy the command from the Tags tab and paste it into your terminal; it should begin with `ollama run`.
3. Check the model size; models that fit entirely in GPU memory perform best.
4. Type `/bye` to exit the interactive session.
### 3. Configure LibreChat

Use the `librechat.yaml` [configuration file (guide here)](./ai_endpoints.md) to add Ollama as a separate endpoint.
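As a sketch, an Ollama entry in `librechat.yaml` could look like the following; the `baseURL` assumes a default local install on port 11434 and the model names are examples, so check the linked guide for the authoritative format:

```yaml
# Illustrative librechat.yaml fragment for an Ollama endpoint.
# baseURL assumes Ollama's default local port; model names are examples.
endpoints:
  custom:
    - name: "Ollama"
      apiKey: "ollama"                       # Ollama ignores the key, but a value is required
      baseURL: "http://localhost:11434/v1/"  # OpenAI-compatible API exposed by Ollama
      models:
        default: ["mistral"]
        fetch: true                          # fetch the locally available model list
      titleConvo: true
      titleModel: "current_model"
```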