ragflow

Commit graph

Author	SHA1	Message	Date
cutiechi	8f9bcb1c74	Feat: make document parsing and embedding batch sizes configurable via environment variables (#8266) ### Description This PR introduces two new environment variables, `DOC_BULK_SIZE` and `EMBEDDING_BATCH_SIZE`, to allow flexible tuning of batch sizes for document parsing and embedding vectorization in RAGFlow. By making these parameters configurable, users can optimize performance and resource usage according to their hardware capabilities and workload requirements. ### What problem does this PR solve? Previously, the batch sizes for document parsing and embedding were hardcoded, limiting the ability to adjust throughput and memory consumption. This PR enables users to set these values via environment variables (in `.env`, Helm chart, or directly in the deployment environment), improving flexibility and scalability for both small and large deployments. - `DOC_BULK_SIZE`: Controls how many document chunks are processed in a single batch during document parsing (default: 4). - `EMBEDDING_BATCH_SIZE`: Controls how many text chunks are processed in a single batch during embedding vectorization (default: 16). This change updates the codebase, documentation, and configuration files to reflect the new options. ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update - [ ] Refactoring - [x] Performance Improvement - [ ] Other (please describe): ### Additional context - Updated `.env`, `helm/values.yaml`, and documentation to describe the new variables. - Modified relevant code paths to use the environment variables instead of hardcoded values. - Users can now tune these parameters to achieve better throughput or reduce memory usage as needed. Before: Default value: <img width="643" alt="image" src="https://github.com/user-attachments/assets/086e1173-18f3-419d-a0f5-68394f63866a" /> After: 10x: <img width="777" alt="image" src="https://github.com/user-attachments/assets/5722bbc0-0bcb-4536-b928-077031e550f1" />	4 months ago
Kevin Hu	b1117a8717	Fix: base url issue. (#8281) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
Yongteng Lei	0fa1a1469e	Fix: avoid mixing different embedding models in document parsing (#8260) ### What problem does this PR solve? Fix mixing different embedding models in document parsing. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	4 months ago
cutiechi	dabbc852c8	Fix: opendal storage health attribute not found & remove duplicate operator scheme initialization (#8265) ### What problem does this PR solve? This PR fixes two issues in the OpenDAL storage connector: 1. The `health` method was missing, which prevented health checks on the storage backend. 2. The initialization of the `opendal.Operator` object included a redundant scheme parameter, causing unnecessary duplication and potential confusion. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Background - The absence of a `health` method made it difficult to verify the availability and reliability of the storage service. - Initializing `opendal.Operator` with both `self._scheme` and unpacked `**self._kwargs` could lead to errors or unexpected behavior if the scheme was already included in the kwargs. ### What is changed and how it works? - Adds a `health` method that writes a test file to verify storage availability. - Removes the duplicate scheme parameter from the `opendal.Operator` initialization to ensure clarity and prevent conflicts. before: <img width="762" alt="企业微信截图_46be646f-2e99-4e5e-be67-b1483426e77c" src="https://github.com/user-attachments/assets/acecbb8c-4810-457f-8342-6355148551ba" /> <img width="767" alt="image" src="https://github.com/user-attachments/assets/147cd5a2-dde3-466b-a9c1-d1d4f0819e5d" /> after: <img width="1123" alt="企业微信截图_09d62997-8908-4985-b89f-7a78b5da55ac" src="https://github.com/user-attachments/assets/97dc88c9-0f4e-4d77-88b3-cd818e8da046" />	4 months ago
Stephen Hu	545ea229b6	Refa: Structure Ask Message (#8276) ### What problem does this PR solve? Refactoring codes for SDK ### Type of change - [x] Refactoring	4 months ago
writinwaters	df17294865	Docs: Sandbox quickstart (#8264) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	4 months ago
balibabu	b8e3852d3b	Feat: Reset the default values of large model parameters (#8262) ### What problem does this PR solve? Feat: Reset the default values of large model parameters ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
balibabu	0bde5397d0	Feat: Modify the style of the canvas operator node #3221 (#8261) ### What problem does this PR solve? Feat: Modify the style of the canvas operator node #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
Kevin Hu	f7074037ef	Feat: Let number of task ahead be visible. (#8259) ### What problem does this PR solve? ![image](https://github.com/user-attachments/assets/d4ef0526-343a-426f-a85a-b05eb8b559a1) ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
Liu An	1aa991d914	Refa: Translate test file content from Chinese to English in file_utils.py (#8258) ### What problem does this PR solve? Update all test file creation functions to use English text instead of Chinese for consistency with the project's language standards. This includes DOCX, Excel, PPT, PDF, TXT, MD, JSON, EML, and HTML test file generators. ### Type of change - [x] Update test case	4 months ago
Yongteng Lei	b2eed8fed1	Fix: incorrect progress updating (#8253) ### What problem does this PR solve? Progress is only updated if it's valid and not regressive. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
Liu An	0c0188b688	Fix: Update customer service template with query references to RewriteQuestion (#8252) ### What problem does this PR solve? - Add query references to "RewriteQuestion:AllNightsSniff" in multiple components - Set "selected" to false for retrieval node ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
balibabu	6b58b67d12	Feat: Add canvas node toolbar #3221 (#8249) ### What problem does this PR solve? Feat: Add canvas node toolbar #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
Liu An	64af09ce7b	Test: Add web API test suite for knowledge base operations (#8254) ### What problem does this PR solve? - Implement RAGFlowWebApiAuth class for web API authentication - Add comprehensive test cases for KB CRUD operations - Set up common fixtures and utilities in conftest.py - Add helper functions in common.py for web API requests The changes establish a complete testing framework for knowledge base management via web API endpoints. ### Type of change - [x] Add test case	4 months ago
Yongteng Lei	8f9e7a6f6f	Refa: revert to original task message collection logic (#8251) ### What problem does this PR solve? Get rid of 'RedisDB.get_unacked_iterator queue rag_flow_svr_queue_1 doesn't exist' ---- Edit: revert to original message collection logic. ### Type of change - [x] Refactoring --------- Co-authored-by: Zhichang Yu <yuzhichang@gmail.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	4 months ago
Kevin Hu	65d5268439	Feat: implement novitaAI embedding and reranking. (#8250) ### What problem does this PR solve? Close #8227 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
cutiechi	6aa0b0819d	Fix: unify opendal config key from `schema` to `scheme` (#8232) ### What problem does this PR solve? This PR resolves the inconsistency in the opendal configuration where both `schema` and `scheme` were used as keys. The code and configuration file now consistently use `scheme`, which helps prevent configuration errors and runtime issues. This change improves code clarity and maintainability. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Additional context - Updated both `conf/service_conf.yaml` and `rag/utils/opendal_conn.py` to use `scheme` instead of `schema` - No breaking changes to other configuration fields	4 months ago
Wesley	3d0b440e9f	fix(search.py):remove hard page_size (#8242) ### What problem does this PR solve? Fix the restriction of forcing similarity_threshold=0 and page_size=30 when doc_ids is not empty #8228 --------- Co-authored-by: shiqing.wusq <shiqing.wusq@dtzhejiang.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	4 months ago
Kenny	800e263f64	Fix: Update customer_service.json (#8238) ### What problem does this PR solve? Resolves the "Can't inference the where the component input is. Please identify whose output is this component's input" error that was reported when creating an Agent from the Customer service template. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
Stephen Hu	ce65ea1fc1	Fix: Change allocate_container_blocking Calculate Time by async time (#8206) ### What problem does this PR solve? Change allocate_container_blocking Calculate Time by async time ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	4 months ago
writinwaters	2341939376	Docs: Miscellaneous editorial updates (#8237) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	4 months ago
balibabu	a9d9215547	Feat: Connect conditional operators to other operators #3221 (#8231) ### What problem does this PR solve? Feat: Connect conditional operators to other operators #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
Liu An	99725444f1	Fix: desc parameter parsing (#8229) ### What problem does this PR solve? - Fix boolean parsing for 'desc' parameter in kb_app.py to properly handle string values ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
Stephen Hu	1ab0f52832	Fix: The OpenAI-Compatible Agent API returns an incorrect message (#8177) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8175 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
Yongteng Lei	24ca4cc6b7	Refa: GraphRAG and explaining GraphRAG stalling behavior on large files (#8223) ### What problem does this PR solve? This PR investigates the cause of #7957. TL;DR: Incorrect similarity calculations lead to too many candidates. Since candidate selection involves interaction with the LLM, this causes significant delays in the program. What this PR does: 1. Fix similarity calculation: When processing a 64-page government document, the corrected similarity calculation reduces the number of candidates from over 100,000 to around 16,000. With a default batch size of 100 pairs per LLM call, this fix reduces unnecessary LLM interactions from over 1,000 calls to around 160, a roughly 10x improvement. 2. Add concurrency and timeout limits: Up to 5 entity types are processed in "parallel", each with a 180-second timeout. These limits may be configurable in future updates. 3. Improve logging: The candidate resolution process now reports progress in real time. 4. Mitigate potential concurrency risks. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	4 months ago
Kevin Hu	d36c8d18b1	Refa: make exception more clear. (#8224) ### What problem does this PR solve? #8156 ### Type of change - [x] Refactoring	4 months ago
Liu An	86a1411b07	Refa: Test configs (#8220) ### What problem does this PR solve? - Move common constants (HOST_ADDRESS, INVALID_API_TOKEN, etc.) to configs.py - Update test imports to use centralized configs - Clean up duplicate constant definitions across test files This improves maintainability by centralizing configuration. ### Type of change - [x] Refactoring test case	4 months ago
Liu An	54a465f9e8	Test: fix chunk deletion test assertions (#8222) ### What problem does this PR solve? - Fix test assertions in test_delete_chunks.py to expect empty results after deletion Action 7619 ### Type of change - [x] Bug Fix test cases	4 months ago
balibabu	bf7f7c7027	Feat: Display the connection lines between multiple conditions of the conditional operator #3221 (#8218) ### What problem does this PR solve? Feat: Display the connection lines between multiple conditions of the conditional operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
Liu An	7fbbc9650d	Fix: Move pagerank field from create to update dataset API (#8217) ### What problem does this PR solve? - Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq - Add pagerank update logic in dataset update endpoint - Update API documentation to reflect changes - Modify related test cases and SDK references #8208 This change makes pagerank a mutable property that can only be set after dataset creation, and only when using elasticsearch as the doc engine. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
Liu An	d0c5ff04a6	Fix: Add pagerank validation for non-elasticsearch doc engines (#8215) ### What problem does this PR solve? Validate that pagerank updates are only allowed when using elasticsearch as the document engine. Return an error if pagerank is set while using a different doc engine, preventing potential inconsistencies in document scoring. #8208 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
Kevin Hu	d5236b71f4	Refa: ollama keep alive issue. (#8216) ### What problem does this PR solve? #8122 ### Type of change - [x] Refactoring	4 months ago
Stephen Hu	e7c85e569b	Fix: Improve TS Warning For http_api_reference.md (#8172) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8157 The current master code should work fine, but I have some warnings, so I added a declare to improve the warning ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
balibabu	84b4e32c34	Feat: The value selected in the Select component only displays the icon #3221 (#8209) ### What problem does this PR solve? Feat: The value selected in the Select component only displays the icon #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
Kevin Hu	56ee69e9d9	Refa: chat with tools. (#8210) ### What problem does this PR solve? ### Type of change - [x] Refactoring	4 months ago
africa-worker	44287fb05f	Oss support opendal(including mysql) (#8204) ### What problem does this PR solve? #8074 Oss support opendal(including mysql) ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	4 months ago
Liu An	cef587abc2	Fix: Add validation for dataset name in KB update API (#8194) ### What problem does this PR solve? Validate dataset name in knowledge base update endpoint to ensure: - Name is a non-empty string - Name length doesn't exceed DATASET_NAME_LIMIT - Whitespace is trimmed before processing Prevents invalid dataset names from being saved and provides clear error messages. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
Yongteng Lei	1a5f991d86	Fix: auto-keyword and auto-question fail with qwq model (#8190) ### What problem does this PR solve? Fix auto-keyword and auto-question fail with qwq model. #8189 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
balibabu	713b574c9d	Feat: Add SwitchForm component #3221 (#8200) ### What problem does this PR solve? Feat: Add SwitchForm component #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
Liu An	60c1bf5a19	Fix: duplicate knowledgebase name validation logic (#8199) ### What problem does this PR solve? Change the condition from checking for >1 to >=1 when validating duplicate knowledgebase names to properly catch all duplicates. This ensures no two knowledgebases can have the same name for a tenant. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
writinwaters	d331866a12	Docs: Miscellaneous (#8198) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	4 months ago
Kevin Hu	69e1fc496d	Refa: chat models (#8187) ### What problem does this PR solve? ### Type of change - [x] Refactoring	4 months ago
Liu An	e87ad8126c	Fix: Improve dataset name validation in KB app (#8188) ### What problem does this PR solve? - Trim whitespace before checking for empty dataset names - Change length check from >= to > DATASET_NAME_LIMIT for consistency ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
Yongteng Lei	5e30426916	Feat: add Qwen3-Embedding text-embedding-v4 (#8184) ### What problem does this PR solve? Add Qwen3-Embedding text-embedding-v4. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
Liu An	6aff3e052a	Test: Refactor test fixtures to use HttpApiAuth naming consistently (#8180) ### What problem does this PR solve? - Rename `api_key` fixture to `HttpApiAuth` across all test files - Update all dependent fixtures and test cases to use new naming - Maintain same functionality while improving naming clarity The rename better reflects the fixture's purpose as an HTTP API authentication helper rather than just an API key. ### Type of change - [x] Refactoring	4 months ago
Liu An	f29d9fa3f9	Test: fix test cases and improve document parsing validation (#8179) ### What problem does this PR solve? - Update chat assistant tests to use dataset.id directly in payloads - Enhance document parsing tests with better condition checking - Add explicit type hints and improve timeout handling Action_7556 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	4 months ago
balibabu	31003cd5f6	Feat: Display the agent node running timeline #3221 (#8185) ### What problem does this PR solve? Feat: Display the agent node running timeline #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
balibabu	f0a3d91171	Feat: Display agent operator call log #3221 (#8169) ### What problem does this PR solve? Feat: Display agent operator call log #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	4 months ago
cwr31	e6d36f3a3a	Improve image rotation logic for text recognition (#8167) ### What problem does this PR solve? Enhanced the image rotation handling by evaluating the original orientation, clockwise 90°, and counter-clockwise 90° rotations. The image with the highest text recognition score is now selected, improving accuracy for text detection in images with aspect ratios >= 1.5. #8166 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: wenrui.cao <wenrui.cao@univers.com>	4 months ago
writinwaters	c8269206d7	Docs: UI updates (#8170) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	4 months ago
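
Commit 8f9bcb1c74 describes two environment variables, `DOC_BULK_SIZE` (default 4) and `EMBEDDING_BATCH_SIZE` (default 16). The log does not show how RAGFlow reads them, so the following is only a minimal sketch of the general pattern. The helper names `read_batch_size` and `batched` are hypothetical, and the fallback to the default on invalid or non-positive values is an assumption.

```python
import os
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def read_batch_size(name: str, default: int) -> int:
    """Read a positive integer from the environment, else return `default`.

    Hypothetical helper: falling back on invalid input is an assumption,
    not behavior confirmed by the commit message.
    """
    raw = os.environ.get(name, "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def batched(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Variable names and defaults (4 and 16) are taken from the commit message.
DOC_BULK_SIZE = read_batch_size("DOC_BULK_SIZE", 4)
EMBEDDING_BATCH_SIZE = read_batch_size("EMBEDDING_BATCH_SIZE", 16)

# Illustrative data only: 40 placeholder chunks split into embedding batches.
chunks = [f"chunk-{i}" for i in range(40)]
embedding_batches = list(batched(chunks, EMBEDDING_BATCH_SIZE))
```

Larger values mean fewer, bigger batches, which is the throughput versus memory trade-off the commit message mentions.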

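
Commits cef587abc2 and e87ad8126c state three rules for dataset names: the name must be a non-empty string, whitespace is trimmed before checking, and the length must not exceed `DATASET_NAME_LIMIT` (a `>` comparison, so a name exactly at the limit passes). The sketch below illustrates those rules. The function name, the error messages, the value 128, and measuring length in characters are all assumptions; the log gives none of them.

```python
DATASET_NAME_LIMIT = 128  # hypothetical value; the real limit is not stated in the log


def validate_dataset_name(name: object) -> str:
    """Return the trimmed name, or raise ValueError describing the problem."""
    if not isinstance(name, str):
        raise ValueError("Dataset name must be a string.")
    # Trim first, so a whitespace-only name is rejected as empty.
    trimmed = name.strip()
    if not trimmed:
        raise ValueError("Dataset name can't be empty.")
    # Strictly greater than: a name exactly at the limit is accepted.
    if len(trimmed) > DATASET_NAME_LIMIT:
        raise ValueError(
            f"Dataset name length is {len(trimmed)}, "
            f"which exceeds {DATASET_NAME_LIMIT}."
        )
    return trimmed
```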

3204 commits (8f9bcb1c74ebcbdbb2afaa66a630155dd17ec7b4) All branches
