
Feat: make document parsing and embedding batch sizes configurable via environment variables (#8266)

### Description

This PR introduces two new environment variables, `DOC_BULK_SIZE` and `EMBEDDING_BATCH_SIZE`, to allow flexible tuning of batch sizes for document parsing and embedding vectorization in RAGFlow. By making these parameters configurable, users can optimize performance and resource usage according to their hardware capabilities and workload requirements.

### What problem does this PR solve?

Previously, the batch sizes for document parsing and embedding were hardcoded, limiting the ability to adjust throughput and memory consumption. This PR enables users to set these values via environment variables (in `.env`, Helm chart, or directly in the deployment environment), improving flexibility and scalability for both small and large deployments.

- `DOC_BULK_SIZE`: Controls how many document chunks are processed in a single batch during document parsing (default: 4).
- `EMBEDDING_BATCH_SIZE`: Controls how many text chunks are processed in a single batch during embedding vectorization (default: 16).

This change updates the codebase, documentation, and configuration files to reflect the new options.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):

### Additional context

- Updated `.env`, `helm/values.yaml`, and documentation to describe the new variables.
- Modified relevant code paths to use the environment variables instead of hardcoded values.
- Users can now tune these parameters to achieve better throughput or reduce memory usage as needed.

Before: Default value:
<img width="643" alt="image" src="https://github.com/user-attachments/assets/086e1173-18f3-419d-a0f5-68394f63866a" />

After: 10x:
<img width="777" alt="image" src="https://github.com/user-attachments/assets/5722bbc0-0bcb-4536-b928-077031e550f1" />
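As a rough illustration of what "use the environment variables instead of hardcoded values" could look like, here is a minimal Python sketch. It is not the code from this PR: the helper names `env_int` and `batched` are hypothetical, and the fallback behaviour for invalid values is an assumption. Only the variable names and defaults (4 and 16) come from the description above.

```python
import os
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def env_int(name: str, default: int) -> int:
    """Read a positive integer from the environment (hypothetical helper).

    Falls back to `default` when the variable is unset, empty,
    not an integer, or not positive. That fallback policy is an assumption.
    """
    raw = os.environ.get(name, "").strip()
    if not raw:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def batched(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Defaults mirror the ones stated in the PR description.
DOC_BULK_SIZE = env_int("DOC_BULK_SIZE", 4)
EMBEDDING_BATCH_SIZE = env_int("EMBEDDING_BATCH_SIZE", 16)
```

With helpers like these, an embedding loop would iterate over `batched(chunks, EMBEDDING_BATCH_SIZE)`. Larger batches mean fewer calls but more memory per call, which is the trade-off the PR leaves to the operator.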
4 months ago
# Based on docker compose .env file
env:
  # The type of doc engine to use.
  # Available options:
  # - `elasticsearch` (default)
  # - `infinity` (https://github.com/infiniflow/infinity)
  # - `opensearch` (https://github.com/opensearch-project/OpenSearch)
  # DOC_ENGINE: elasticsearch
  DOC_ENGINE: infinity
  # DOC_ENGINE: opensearch
  # The version of Elasticsearch.
  STACK_VERSION: "8.11.3"
  # The password for Elasticsearch
  ELASTIC_PASSWORD: infini_rag_flow_helm
  # The password for OpenSearch.
  # At least one uppercase letter, one lowercase letter, one digit, and one special character
  OPENSEARCH_PASSWORD: infini_rag_flow_OS_01
  # The password for MySQL
  MYSQL_PASSWORD: infini_rag_flow_helm
  # The database of the MySQL service to use
  MYSQL_DBNAME: rag_flow
  # The username for MinIO.
  MINIO_ROOT_USER: rag_flow
  # The password for MinIO
  MINIO_PASSWORD: infini_rag_flow_helm
  # The password for Redis
  REDIS_PASSWORD: infini_rag_flow_helm
  # The RAGFlow Docker image to download.
  # Defaults to the v0.20.1-slim edition, which is the RAGFlow Docker image without embedding models.
  RAGFLOW_IMAGE: infiniflow/ragflow:v0.20.1-slim
  #
  # To download the RAGFlow Docker image with embedding models, uncomment the following line instead:
  # RAGFLOW_IMAGE: infiniflow/ragflow:v0.20.1
  #
  # The Docker image of the v0.20.1 edition includes:
  # - Built-in embedding models:
  #   - BAAI/bge-large-zh-v1.5
  #   - BAAI/bge-reranker-v2-m3
  #   - maidalun1020/bce-embedding-base_v1
  #   - maidalun1020/bce-reranker-base_v1
  # - Embedding models that will be downloaded once you select them in the RAGFlow UI:
  #   - BAAI/bge-base-en-v1.5
  #   - BAAI/bge-large-en-v1.5
  #   - BAAI/bge-small-en-v1.5
  #   - BAAI/bge-small-zh-v1.5
  #   - jinaai/jina-embeddings-v2-base-en
  #   - jinaai/jina-embeddings-v2-small-en
  #   - nomic-ai/nomic-embed-text-v1.5
  #   - sentence-transformers/all-MiniLM-L6-v2
  #
  #
  # The local time zone.
  TIMEZONE: "Asia/Shanghai"
  # Uncomment the following line if you have limited access to huggingface.co:
  # HF_ENDPOINT: https://hf-mirror.com
  # The maximum file size for each uploaded file, in bytes.
  # You can uncomment this line and update the value if you wish to change the 128M file size limit.
  # MAX_CONTENT_LENGTH: "134217728"
  # After making the change, ensure you update `client_max_body_size` in nginx/nginx.conf correspondingly.
  # The number of document chunks processed in a single batch during document parsing.
  DOC_BULK_SIZE: 4
  # The number of text chunks processed in a single batch during embedding vectorization.
  EMBEDDING_BATCH_SIZE: 16
ragflow:
  # Optional service configuration overrides
  # to be written to local.service_conf.yaml
  # inside the RAGFlow container
  # https://ragflow.io/docs/dev/configurations#service-configuration
  service_conf:
  # Optional yaml formatted override for the
  # llm_factories.json file inside the RAGFlow
  # container.
  llm_factories:
  #  factory_llm_infos:
  #  - name: OpenAI-API-Compatible
  #    logo: ""
  #    tags: "LLM,TEXT EMBEDDING,SPEECH2TEXT,MODERATION"
  #    status: "1"
  #    llm:
  #      - llm_name: my-custom-llm
  #        tags: "LLM,CHAT,"
  #        max_tokens: 100000
  #        model_type: chat
  #        is_tools: false
  # Kubernetes configuration
  deployment:
    strategy:
    resources:
  service:
    # Use LoadBalancer to expose the web interface externally
    type: ClusterIP
  api:
    service:
      enabled: true
      type: ClusterIP
infinity:
  image:
    repository: infiniflow/infinity
    tag: v0.6.0-dev5
  storage:
    className:
    capacity: 5Gi
  deployment:
    strategy:
    resources:
  service:
    type: ClusterIP
elasticsearch:
  storage:
    className:
    capacity: 20Gi
  deployment:
    strategy:
    resources:
      requests:
        cpu: "4"
        memory: "16Gi"
  service:
    type: ClusterIP
opensearch:
  image:
    repository: opensearchproject/opensearch
    tag: 2.19.1
  storage:
    className:
    capacity: 20Gi
  deployment:
    strategy:
    resources:
      requests:
        cpu: "4"
        memory: "16Gi"
  service:
    type: ClusterIP
minio:
  image:
    repository: quay.io/minio/minio
    tag: RELEASE.2023-12-20T01-00-02Z
  storage:
    className:
    capacity: 5Gi
  deployment:
    strategy:
    resources:
  service:
    type: ClusterIP
mysql:
  image:
    repository: mysql
    tag: 8.0.39
  storage:
    className:
    capacity: 5Gi
  deployment:
    strategy:
    resources:
  service:
    type: ClusterIP
redis:
  image:
    repository: valkey/valkey
    tag: 8
  storage:
    className:
    capacity: 5Gi
  persistence:
    enabled: true
    # Sets the retention policy for the persistent storage (only available in k8s 1.32 or later)
    # https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#persistentvolumeclaim-retention
    # retentionPolicy:
    #   whenDeleted: Delete
    #   whenScaled: Delete
  deployment:
    strategy:
    resources:
  service:
    type: ClusterIP
# This block is for setting up web service ingress. For more information, see:
# https://kubernetes.io/docs/concepts/services-networking/ingress/
ingress:
  enabled: false
  className: ""
  annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  hosts:
    - host: chart-example.local
      paths:
        - path: /
          pathType: ImplementationSpecific
  tls: []
  #  - secretName: chart-example-tls
  #    hosts:
  #      - chart-example.local