Fix instructions for Ollama (#7468)

1. Use `host.docker.internal` as the base URL
2. Fix the step numbering
3. Make clear what is console input and what is output

### What problem does this PR solve?

This PR improves the Ollama deployment guide (`docs/guides/models/deploy_local_llm.mdx`): it corrects the step numbering, marks console input with `$` and output with `>` in the code blocks, and tells Docker users to use `host.docker.internal` as the Ollama base URL.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
tags/v0.19.0
Raffaele Mancuso, 6 months ago
Commit c4b3d3af95
1 file changed, 25 insertions and 25 deletions

docs/guides/models/deploy_local_llm.mdx (+25, -25)

### 1. Deploy Ollama using Docker

```bash
$ sudo docker run --name ollama -p 11434:11434 ollama/ollama
> time=2024-12-02T02:20:21.360Z level=INFO source=routes.go:1248 msg="Listening on [::]:11434 (version 0.4.6)"
> time=2024-12-02T02:20:21.360Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12]"
```
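
If the host has an NVIDIA GPU and the NVIDIA Container Toolkit installed, you may prefer to run the container detached, with GPU access and a named volume so that pulled models survive container re-creation. This is only a sketch of a common variant, not part of the guide; the volume name `ollama` is an arbitrary choice:

```bash
# Detached, GPU-enabled variant; models are stored on the named volume "ollama"
$ sudo docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
# Follow the container logs to confirm the server is listening on port 11434
$ sudo docker logs -f ollama
```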


Ensure Ollama is listening on all IP addresses:
```bash
$ sudo ss -tunlp | grep 11434
> tcp LISTEN 0 4096 0.0.0.0:11434 0.0.0.0:* users:(("docker-proxy",pid=794507,fd=4))
> tcp LISTEN 0 4096 [::]:11434 [::]:* users:(("docker-proxy",pid=794513,fd=4))
```
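
If the port does not show up, the server may be bound to the loopback interface only; this is more likely when Ollama runs directly on the host rather than in the Docker container above. One way to bind it to all interfaces is the `OLLAMA_HOST` environment variable; the foreground invocation below is a sketch, not the only way to set it:

```bash
# Run the Ollama server bound to all interfaces instead of 127.0.0.1 only
$ OLLAMA_HOST=0.0.0.0:11434 ollama serve
```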


Pull models as you need. We recommend that you start with `llama3.2` (a 3B chat model) and `bge-m3` (a 567M embedding model):
```bash
$ sudo docker exec ollama ollama pull llama3.2
> pulling dde5aa3fc5ff... 100% ▕████████████████▏ 2.0 GB
> success
```


```bash
$ sudo docker exec ollama ollama pull bge-m3
> pulling daec91ffb5dd... 100% ▕████████████████▏ 1.2 GB
> success
```
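
To double-check that both models are now available inside the container, you can list them and fire a one-off prompt at the chat model. This is an optional sanity check, not a step the guide requires:

```bash
# List the models Ollama has downloaded
$ sudo docker exec ollama ollama list
# Send a single prompt to the chat model to confirm it loads and responds
$ sudo docker exec -it ollama ollama run llama3.2 "Reply with one word: ready"
```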


### 2. Ensure Ollama is accessible

- If RAGFlow runs in Docker and Ollama runs on the same host machine, check if Ollama is accessible from inside the RAGFlow container:
```bash
$ sudo docker exec -it ragflow-server bash
$ curl http://host.docker.internal:11434/
> Ollama is running
```


- If RAGFlow is launched from source code and Ollama runs on the same host machine as RAGFlow, check if Ollama is accessible from RAGFlow's host machine:
```bash
$ curl http://localhost:11434/
> Ollama is running
```


- If RAGFlow and Ollama run on different machines, check if Ollama is accessible from RAGFlow's host machine:
```bash
$ curl http://${IP_OF_OLLAMA_MACHINE}:11434/
> Ollama is running
```
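
Beyond the root endpoint, you can exercise the chat API directly from wherever RAGFlow runs; this is closer to what RAGFlow will actually call than the plain status check. The sketch below assumes the Docker-on-the-same-host case, so substitute whichever base URL applied in your check above:

```bash
# Ask llama3.2 a single question through the Ollama chat API (non-streaming)
$ curl http://host.docker.internal:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [{"role": "user", "content": "Reply with one word: ready"}],
  "stream": false
}'
```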


### 3. Add Ollama

In RAGFlow, click on your logo on the top right of the page **>** **Model providers** and add Ollama to RAGFlow:

![add ollama](https://github.com/infiniflow/ragflow/assets/93570324/10635088-028b-4b3d-add9-5c5a6e626814)




### 4. Complete basic Ollama settings

In the popup window, complete basic settings for Ollama:

1. Ensure that your model name and type match those pulled at step 1 (Deploy Ollama using Docker), for example `llama3.2` with type `chat`, or `bge-m3` with type `embedding`. A quick check of the embedding model is sketched after this list.
2. In the Ollama base URL, enter the URL determined at step 2 (Ensure Ollama is accessible), replacing `localhost` with `host.docker.internal`.
3. OPTIONAL: Switch on the toggle under **Does it support Vision?** if your model includes an image-to-text model.
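
Since the dialog is typically filled in once for a chat model and once for an embedding model, it can be worth confirming that the embedding model also answers over the API before saving. The sketch below uses the Docker-on-the-same-host base URL and Ollama's `/api/embeddings` endpoint; the prompt text is arbitrary:

```bash
# Request an embedding vector for a short string from bge-m3
$ curl http://host.docker.internal:11434/api/embeddings -d '{
  "model": "bge-m3",
  "prompt": "hello from RAGFlow"
}'
```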






### 5. Update System Model Settings

Click on your logo **>** **Model providers** **>** **System Model Settings** to update your model:
- *You should now be able to find **llama3.2** from the dropdown list under **Chat model**, and **bge-m3** from the dropdown list under **Embedding model**.*
- _If your local model is an embedding model, you should find it under **Embedding model**._


### 6. Update Chat Configuration

Update your model(s) accordingly in **Chat Configuration**.


```bash
python jina_server.py --model_name gpt2
```
> The script only supports models downloaded from Hugging Face.
