---
sidebar_position: 5
slug: /deploy_local_llm
---

# Deploy a local LLM

RAGFlow supports deploying LLMs locally using Ollama or Xinference.

## Ollama
[Ollama](https://github.com/ollama/ollama) enables one-click deployment of local LLMs.
### Install

- [Ollama on Linux](https://github.com/ollama/ollama/blob/main/docs/linux.md)
- [Ollama Windows Preview](https://github.com/ollama/ollama/blob/main/docs/windows.md)
- [Docker](https://hub.docker.com/r/ollama/ollama)
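
If you choose the Docker route, a minimal sketch for starting the Ollama container (CPU-only; the volume name and port mapping follow the image's documentation, so adjust them to your setup):

```bash
# Start Ollama in the background, persisting downloaded models in a named volume
$ docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```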

### Launch Ollama

Decide which LLM you want to deploy ([here is the list of supported models](https://ollama.com/library)), say, **mistral**:

```bash
$ ollama run mistral
```

Or, if Ollama is running in a Docker container:

```bash
$ docker exec -it ollama ollama run mistral
```
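
Before wiring it into RAGFlow, you can confirm the service and model are up. Two quick checks (11434 is Ollama's default port; adjust if you changed it):

```bash
# List locally available models via the CLI
$ ollama list
# Or query Ollama's HTTP API directly
$ curl http://localhost:11434/api/tags
```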

### Use Ollama in RAGFlow

- Go to 'Settings > Model Providers > Models to be added > Ollama'.

> Base URL: Enter the base URL where the Ollama service is accessible, e.g., `http://<your-ollama-endpoint-domain>:11434`.

- Use Ollama models.
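
If RAGFlow itself runs in Docker, `localhost` inside the RAGFlow container refers to the container rather than the host, so the base URL must be an address the container can reach (for example the host's IP, or `host.docker.internal` on Docker Desktop). A quick connectivity check, assuming the default `ragflow-server` container name and that `curl` is available inside it (both are assumptions; adjust to your deployment):

```bash
# From inside the RAGFlow container, verify the Ollama endpoint answers
$ docker exec -it ragflow-server curl http://host.docker.internal:11434
```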

## Xinference

Xorbits Inference ([Xinference](https://github.com/xorbitsai/inference)) empowers you to unleash the full potential of cutting-edge AI models.

### Install

- [pip install "xinference[all]"](https://inference.readthedocs.io/en/latest/getting_started/installation.html)
- [Docker](https://inference.readthedocs.io/en/latest/getting_started/using_docker_image.html)

To start a local instance of Xinference, run the following command:

```bash
$ xinference-local --host 0.0.0.0 --port 9997
```
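
If you installed via Docker instead, a rough equivalent (the `xprobe/xinference` image comes from the Docker guide linked above; check which tag you need and add `--gpus all` if you want GPU acceleration) looks like:

```bash
# Run Xinference in a container, exposing the default 9997 port on the host
$ docker run -d --name xinference -p 9997:9997 xprobe/xinference:latest xinference-local -H 0.0.0.0
```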

### Launch Xinference

Decide which LLM you want to deploy ([here is the list of built-in models](https://inference.readthedocs.io/en/latest/models/builtin/)), say, **mistral**.
Execute the following command to launch the model, replacing `${quantization}` with your chosen quantization method:

```bash
$ xinference launch -u mistral --model-name mistral-v0.1 --size-in-billions 7 --model-format pytorch --quantization ${quantization}
```
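
To confirm the model is actually running, you can ask the local Xinference instance for its active models:

```bash
# Show models currently running on this Xinference instance
$ xinference list
```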

### Use Xinference in RAGFlow

- Go to 'Settings > Model Providers > Models to be added > Xinference'.

> Base URL: Enter the base URL where the Xinference service is accessible, e.g., `http://<your-xinference-endpoint-domain>:9997/v1`.

- Use Xinference models.
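
Because Xinference serves an OpenAI-compatible API under `/v1`, you can sanity-check the base URL you enter here with a plain HTTP call (replace the placeholder host with your real endpoint):

```bash
# List the models Xinference exposes through its OpenAI-compatible API
$ curl http://<your-xinference-endpoint-domain>:9997/v1/models
```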