
add using jina deploy local llm in deploy_local_llm.mdx (#1872)

### What problem does this PR solve?

add using jina deploy local llm in deploy_local_llm.mdx

### Type of change

- [x] Documentation Update

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
tags/v0.10.0
黄腾 committed 1 year ago · commit 44184d12a8
1 changed file with 34 additions and 0 deletions
docs/guides/deploy_local_llm.mdx (+34, −0)

This user guide does not intend to cover much of the installation or configuration details of Ollama or Xinference; its focus is on configurations inside RAGFlow. For the most current information, you may need to check out the official site of Ollama or Xinference.
:::


## Deploy a local model using Jina

[Jina](https://github.com/jina-ai/jina) lets you build AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production.

To deploy a local model, e.g., **gpt2**, using Jina:

### 1. Check firewall settings

Ensure that your host machine's firewall allows inbound connections on port 12345.

```bash
sudo ufw allow 12345/tcp
```
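After opening the port, you can verify that it is actually reachable from the client machine. Below is a minimal sketch using Python's standard `socket` module; the `port_open` helper is illustrative and not part of RAGFlow:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: port_open("localhost", 12345) should return True once the
# server from the deployment step below is up and the firewall allows it.
```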

### 2. Install the Jina package

```bash
pip install jina
```

### 3. Deploy the local model

Step 1: Navigate to the **rag/svr** directory.

```bash
cd rag/svr
```

Step 2: Run the jina_server.py script, passing in the model name or the local path of the model. Note that the script only supports loading models downloaded from Hugging Face.

```bash
python jina_server.py --model_name gpt2
```
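The server should now be listening on port 12345. The request and response schema is defined by `jina_server.py`, so the sketch below is only illustrative: it assumes a hypothetical HTTP endpoint at `/post` that accepts a JSON body containing the prompt text. Check the script for the actual route and field names before relying on it:

```python
# Hypothetical client sketch -- the real route and JSON fields are
# defined by rag/svr/jina_server.py; adjust them to match that script.
import json
from urllib import request

def build_body(prompt: str) -> dict:
    """Assumed request shape: a list of documents, each carrying a text field."""
    return {"data": [{"text": prompt}]}

def generate(prompt: str, host: str = "localhost", port: int = 12345):
    """POST the prompt to the local server and return the parsed JSON reply."""
    req = request.Request(
        f"http://{host}:{port}/post",
        data=json.dumps(build_body(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```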

## Deploy a local model using Ollama


[Ollama](https://github.com/ollama/ollama) enables you to run open-source large language models that you deployed locally. It bundles model weights, configurations, and data into a single package, defined by a Modelfile, and optimizes setup and configurations, including GPU usage.
