---
sidebar_position: 5
slug: /deploy_local_llm
---

# Deploy a local LLM

RAGFlow supports deploying LLMs locally using Ollama or Xinference.

## Ollama

[Ollama](https://github.com/ollama/ollama) enables one-click deployment of local LLMs.

### Install

- [Ollama on Linux](https://github.com/ollama/ollama/blob/main/docs/linux.md)
- [Ollama Windows Preview](https://github.com/ollama/ollama/blob/main/docs/windows.md)
- [Docker](https://hub.docker.com/r/ollama/ollama)

### Launch Ollama

Decide which LLM you want to deploy ([here is a list of supported LLMs](https://ollama.com/library)), say, **mistral**:

```bash
$ ollama run mistral
```

Or, if Ollama runs in Docker:

```bash
$ docker exec -it ollama ollama run mistral
```
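
To confirm that the Ollama server is reachable before wiring it into RAGFlow, you can query its HTTP API. This quick check assumes a local install listening on the default port 11434:

```bash
# Lists the models you have pulled (e.g. mistral) as JSON if the server is up.
$ curl http://localhost:11434/api/tags
```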

### Use Ollama in RAGFlow

- Go to 'Settings > Model Providers > Models to be added > Ollama'.

![](https://github.com/infiniflow/ragflow/assets/12318111/a9df198a-226d-4f30-b8d7-829f00256d46)

> Base URL: Enter the base URL where the Ollama service is accessible, e.g., `http://<your-ollama-endpoint-domain>:11434`.

- Use Ollama Models.

![](https://github.com/infiniflow/ragflow/assets/12318111/60ff384e-5013-41ff-a573-9a543d237fd3)

## Xinference

Xorbits Inference ([Xinference](https://github.com/xorbitsai/inference)) empowers you to unleash the full potential of cutting-edge AI models.

### Install

- [pip install "xinference[all]"](https://inference.readthedocs.io/en/latest/getting_started/installation.html)
- [Docker](https://inference.readthedocs.io/en/latest/getting_started/using_docker_image.html)

To start a local instance of Xinference, run the following command:

```bash
$ xinference-local --host 0.0.0.0 --port 9997
```
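
As a quick sanity check (assuming the host and port used above), you can ask the server for its OpenAI-compatible model list; right after startup the list will be empty:

```bash
# Returns an OpenAI-style JSON model list; empty until a model is launched.
$ curl http://127.0.0.1:9997/v1/models
```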

### Launch Xinference

Decide which LLM you want to deploy ([here is a list of supported LLMs](https://inference.readthedocs.io/en/latest/models/builtin/)), say, **mistral**.

Execute the following command to launch the model, remembering to replace `${quantization}` with your chosen quantization method:

```bash
$ xinference launch -u mistral --model-name mistral-v0.1 --size-in-billions 7 --model-format pytorch --quantization ${quantization}
```
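
Once the model is running, you can send it a test request through Xinference's OpenAI-compatible endpoint. This is a minimal sketch, assuming the local address used above and the model UID `mistral` set with `-u`:

```bash
# Sends one chat message to the launched model via the OpenAI-compatible API.
$ curl http://127.0.0.1:9997/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "mistral",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64
      }'
```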

### Use Xinference in RAGFlow

- Go to 'Settings > Model Providers > Models to be added > Xinference'.

![](https://github.com/infiniflow/ragflow/assets/12318111/bcbf4d7a-ade6-44c7-ad5f-0a92c8a73789)

> Base URL: Enter the base URL where the Xinference service is accessible, e.g., `http://<your-xinference-endpoint-domain>:9997/v1`.

- Use Xinference Models.

![](https://github.com/infiniflow/ragflow/assets/12318111/b01fcb6f-47c9-4777-82e0-f1e947ed615a)
![](https://github.com/infiniflow/ragflow/assets/12318111/1763dcd1-044f-438d-badd-9729f5b3a144)