|
|
|
|
|
|
This user guide does not cover the installation or configuration details of Ollama or Xinference in depth; its focus is on configurations inside RAGFlow. For the most up-to-date information, refer to the official Ollama or Xinference documentation.
|
|
|
::: |
|
|
|
|
|
|
|
## Deploy a local model using Jina
|
|
|
|
|
|
|
[Jina](https://github.com/jina-ai/jina) lets you build AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production. |
|
|
|
|
|
|
|
To deploy a local model, e.g., **gpt2**, using Jina: |
|
|
|
|
|
|
|
### 1. Check firewall settings |
|
|
|
|
|
|
|
Ensure that your host machine's firewall allows inbound connections on port 12345. |
|
|
|
|
|
|
|
```bash
sudo ufw allow 12345/tcp
```
|
|
|
|
|
|
|
### 2. Install the Jina package
|
|
|
|
|
|
|
```bash
pip install jina
```
|
|
|
|
|
|
|
### 3. Deploy the local model
|
|
|
|
|
|
|
Step 1: Navigate to the rag/svr directory. |
|
|
|
|
|
|
|
```bash
cd rag/svr
```
|
|
|
|
|
|
|
Step 2: Run the jina_server.py script with Python, passing in the model name or the local path of the model (the script only supports loading models downloaded from Hugging Face):
|
|
|
|
|
|
|
```bash
python jina_server.py --model_name gpt2
```
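
For orientation, the sketch below shows roughly what a Jina deployment serving a Hugging Face model can look like. It is a simplified, hypothetical example rather than RAGFlow's actual jina_server.py: the class name, default model, and generation settings are assumptions, and it presumes a recent Jina release (with docarray v2) and the `transformers` package installed.

```python
# Hypothetical, simplified sketch of a Jina deployment serving a Hugging Face model.
# Not RAGFlow's actual jina_server.py; names and defaults are illustrative only.
from docarray import DocList
from docarray.documents import TextDoc
from jina import Deployment, Executor, requests
from transformers import pipeline


class HFGenerator(Executor):
    """Wraps a Hugging Face text-generation pipeline as a Jina Executor."""

    def __init__(self, model_name: str = "gpt2", **kwargs):
        super().__init__(**kwargs)
        # Load the model from the Hugging Face hub or from a local path.
        self.generator = pipeline("text-generation", model=model_name)

    @requests
    def generate(self, docs: DocList[TextDoc], **kwargs) -> DocList[TextDoc]:
        # Replace each document's text with the model's completion.
        for doc in docs:
            doc.text = self.generator(doc.text, max_new_tokens=64)[0]["generated_text"]
        return docs


if __name__ == "__main__":
    # Expose the Executor on port 12345, matching the firewall rule above.
    with Deployment(uses=HFGenerator, port=12345) as dep:
        dep.block()
```

Once such a deployment is running, a Jina `Client` pointed at the same port (gRPC by default) can send requests to it.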
|
|
|
|
|
|
|
## Deploy a local model using Ollama |
|
|
|
|
|
|
|
[Ollama](https://github.com/ollama/ollama) enables you to run open-source large language models locally. It bundles model weights, configurations, and data into a single package defined by a Modelfile, and it optimizes setup and configuration, including GPU usage.
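
For illustration only (this is not one of RAGFlow's configuration steps), once Ollama is serving a model on its default port 11434, it can be queried through its REST API; the model name below is just an example, so substitute whichever model you have pulled:

```python
# Illustrative only: query a locally running Ollama server via its REST API.
# Assumes Ollama is listening on its default port 11434 and the model has been pulled.
import json
import urllib.request

payload = {
    "model": "llama2",  # example model name
    "prompt": "Why is the sky blue?",
    "stream": False,    # return a single JSON response instead of a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```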