ragflow

Commit graph

Author	SHA1	Message	Date
Yongteng Lei	787e0c6786	Refa: OpenAI whisper-1 (#9552) ### What problem does this PR solve? Refactor OpenAI to enable audio parsing. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2 months ago
Stephen Hu	a0d630365c	Refactor:Improve VoyageRerank not texts handling (#9539) ### What problem does this PR solve? Improve VoyageRerank not texts handling ### Type of change - [x] Refactoring	2 months ago
Kevin Hu	b5b8032a56	Feat: Support metadata auto filer for Search. (#9524) ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2 months ago
Yongteng Lei	fe32952825	Fix: Gemini parameters error (#9520) ### What problem does this PR solve? Fix Gemini parameters error. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2 months ago
Kevin Hu	ca720bd811	Fix: save team's canvas issue. (#9518) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Yongteng Lei	ba11312766	Feat: embedded search (#9501) ### What problem does this PR solve? Add embedded search functionality. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2 months ago
Stephen Hu	fb77f9917b	Refactor: Use Input Length In DefaultRerank (#9516) ### What problem does this PR solve? 1. Use input length to prepare res 2. Adjust torch_empty_cache code location ### Type of change - [x] Refactoring - [x] Performance Improvement	2 months ago
Yongteng Lei	eef43fa25c	Fix: unexpected truncated Excel files (#9500) ### What problem does this PR solve? Handle unexpected truncated Excel files. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Kevin Hu	2114e966d8	Feat: add citation option to agent and enlarge the timeouts. (#9484) ### What problem does this PR solve? #9422 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2 months ago
RuyXu	762aa4b8c4	fix: preserve correct MIME & unify data URL handling for vision inputs (relates #9248) (#9474) fix: preserve correct MIME & unify data URL handling for vision inputs (relates #9248) - Updated image2base64() to return a full data URL (data:image/<fmt>;base64,...) with accurate MIME - Removed hardcoded image/jpeg in Base._image_prompt(); pass through data URLs and default raw base64 to image/png - Set AnthropicCV._image_prompt() raw base64 media_type default to image/png - Ensures MIME type matches actual image content, fixing “cannot process base64 image” errors on vLLM/OpenAI-compatible backends ### What problem does this PR solve? This PR fixes a compatibility issue where base64-encoded images sent to vision models (e.g., vLLM/OpenAI-compatible backends) were rejected due to mismatched MIME type or incorrect decoding. Previously, the backend: - Always converted raw base64 into data:image/jpeg;base64,... even if the actual content was PNG. - In some cases, base64 decoding was attempted on the full data URL string instead of the pure base64 part. This caused errors like: ``` cannot process base64 image failed to decode base64 string: illegal base64 data at input byte 0 ``` by strict validators such as vLLM. With this fix, the MIME type in the request now matches the actual image content, and data URLs are correctly handled or passed through, ensuring vision models can decode and process images reliably. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Stephen Hu	f2806a8332	Update cv_model.py (#9472) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9452 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Jay Xu	6d1078b538	fix 'KeyError: "There is no item named 'word/NULL' in the archive"' (#9455) ### What problem does this PR solve? Issue referring to: https://github.com/python-openxml/python-docx/issues/797 Fix referring to: https://github.com/python-openxml/python-docx/issues/1105#issuecomment-1298075246 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Kevin Hu	5e8cd693a5	Refa: split services about llm. (#9450) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2 months ago
Kevin Hu	4b1b68c5fc	Fix: no doc hits after meta data filter. (#9435) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Stephen Hu	da5cef0686	Refactor:Improve the float compare for LocalAIRerank (#9428) ### What problem does this PR solve? Improve the float compare for LocalAIRerank ### Type of change - [x] Refactoring	2 months ago
Yongteng Lei	a0c2da1219	Fix: Patch LiteLLM (#9416) ### What problem does this PR solve? Patch LiteLLM refactor. #9408 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Kevin Hu	153e430b00	Feat: add meta data filter. (#9405) ### What problem does this PR solve? #8531 #7417 #6761 #6573 #6477 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2 months ago
Yongteng Lei	83771e500c	Refa: migrate chat models to LiteLLM (#9394) ### What problem does this PR solve? All models pass the mock response tests, which means that if a model can return the correct response, everything should work as expected. However, not all models have been fully tested in a real environment, the real API_KEY. I suggest actively monitoring the refactored models over the coming period to ensure they work correctly and fixing them step by step, or waiting to merge until most have been tested in practical environment. ### Type of change - [x] Refactoring	2 months ago
HaiyangP	79399f7f25	Support the case of one cell split by multiple columns. (#9225) ### What problem does this PR solve? Support the case of one cell split by multiple columns. Besides, the codes are compatible with the common cell case. #8606 can be fixed. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) I provide a case of one cell split by multiple columns: [test.xlsx](https://github.com/user-attachments/files/21578693/test.xlsx) The chunk res: <img width="236" height="57" alt="2025-06-17 16-04-07 的屏幕截图" src="https://github.com/user-attachments/assets/b0a499ac-349d-4c3d-8c6e-0931c8fc26de" />	2 months ago
Jay Xu	7f08ba47d7	Fix "no `tc` element at grid_offset" (#9375) ### What problem does this PR solve? fix "no `tc` element at grid_offset", just log warning and ignore. stacktrace: ``` Traceback (most recent call last): File "/ragflow/rag/svr/task_executor.py", line 620, in handle_task await do_handle_task(task) File "/ragflow/rag/svr/task_executor.py", line 553, in do_handle_task chunks = await build_chunks(task, progress_callback) File "/ragflow/rag/svr/task_executor.py", line 257, in build_chunks cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"], File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync return msg_from_thread.unwrap() File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap raise captured_error File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result return result.unwrap() File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap raise captured_error File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn ret = context.run(sync_fn, *args) File "/ragflow/rag/svr/task_executor.py", line 257, in <lambda> cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"], File "/ragflow/rag/app/naive.py", line 384, in chunk sections, tables = Docx()(filename, binary) File "/ragflow/rag/app/naive.py", line 230, in __call__ while i < len(r.cells): File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 438, in cells return tuple(_iter_row_cells()) File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 436, in _iter_row_cells yield from iter_tc_cells(tc) File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 424, in iter_tc_cells yield from iter_tc_cells(tc._tc_above) # pyright: ignore[reportPrivateUsage] File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 741, in _tc_above return self._tr_above.tc_at_grid_offset(self.grid_offset) File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 98, in tc_at_grid_offset raise ValueError(f"no `tc` element at grid_offset={grid_offset}") ValueError: no `tc` element at grid_offset=10 ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Jay Xu	ce3dd019c3	Fix broken data stream when writing image file (#9354) ### What problem does this PR solve? fix "broken data stream when writing image file", just log warning and ignore Close #8379 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
TeslaZY	476c56868d	Agent plans tasks by referring to its own prompt. (#9315) ### What problem does this PR solve? Fixes the issue in the analyze_task execution flow where the Lead Agent was not utilizing its own sys_prompt during task analysis, resulting in incorrect or incomplete task planning. https://github.com/infiniflow/ragflow/issues/9294 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2 months ago
Stephen Hu	7713e14d6a	Update chat_model.py (#9318) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9317 base on https://discuss.ai.google.dev/t/valueerror-invalid-operation-the-response-text-quick-accessor-requires-the-response-to-contain-a-valid-part-but-none-were-returned/42866 should can be handled by retry ### Type of change - [x] Refactoring	2 months ago
Kevin Hu	a2e1f5618d	Fix: bytes style image issue. (#9304) ### What problem does this PR solve? #9302 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Stephen Hu	0a0bfc02a0	Refactor:naive_merge_with_images close useless images (#9296) ### What problem does this PR solve? naive_merge_with_images close useless images ### Type of change - [x] Refactoring	2 months ago
He Wang	4fc9e42e74	fix: add missing env vars and default values of service_conf.yaml (#9289) ### What problem does this PR solve? Add missing env var `MYSQL_MAX_PACKET` to service_conf.yaml.template, and add default values to opendal config to fix npe. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
so95	35539092d0	Add kwargs to model base class constructors (#9252) Updated constructors for base and derived classes in chat, embedding, rerank, sequence2txt, and tts models to accept kwargs. This change improves extensibility and allows passing additional parameters without breaking existing interfaces. - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: IT: Sop.Son <sop.son@feavn.local> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2 months ago
Kevin Hu	9ca86d801e	Refa: add provider info while adding model. (#9273) ### What problem does this PR solve? #9248 ### Type of change - [x] Refactoring	2 months ago
Stephen Hu	7efeaf6548	Fix:remove a img close which can not operate (#9267) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9149#issuecomment-3157129587 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
gooodboyAo	a7eba61067	FIX: If chunk["content_with_weight"] contains one or more unpaired surrogate characters (such as incomplete emoji or other special characters), then calling .encode("utf-8") directly will raise a UnicodeEncodeError. (#9246) FIX: If chunk["content_with_weight"] contains one or more unpaired surrogate characters (such as incomplete emoji or other special characters), then calling .encode("utf-8") directly will raise a UnicodeEncodeError. ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Kevin Hu	2124329e95	Fix: local variable issue. (#9255) ### What problem does this PR solve? #9227 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
yzz	550e65bb22	Fix: PlainParser using fix in presentation (#9239) ### What problem does this PR solve? tiny fix about the using of `deepdoc.pdf_parser.PlainParser` in `rag.app.presentation.chunk`, I referred to other ways of using this class. So tiny the fix is, a issue seems unnecessary. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Stephen Hu	0a303d9ae1	Refactor:Improve the chat stream logic for NvidiaCV (#9242) ### What problem does this PR solve? Improve the chat stream logic for NvidiaCV ### Type of change - [x] Refactoring	2 months ago
Stephen Hu	1deb0a2d42	Fix:local variable 'response' referenced before assignment (#9230) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9227 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2 months ago
Yongteng Lei	a249803961	Refa: ensure Redis stream queue could be created properly (#9223) ### What problem does this PR solve? Ensure Redis queue could be created properly. ### Type of change - [x] Refactoring	2 months ago
Kevin Hu	6ec3f18e22	Fix: self-deployed LLM error, (#9217) ### What problem does this PR solve? Close #9197 Close #9145 ### Type of change - [x] Refactoring - [x] Bug fixing.	2 months ago
Yongteng Lei	30ccc4a66c	Fix: correct single base64 image handling in image prompt (#9220) ### What problem does this PR solve? Correct single base64 image handling in image prompt. ![img_v3_02or_ec4757c2-a9d4-4774-9a76-f7c6be633ebg](https://github.com/user-attachments/assets/872a86bf-e2a8-48d1-9b71-2a0c7a35ba9e) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Jay Xu	cae11201ef	fix "out of memory" if slide.get_thumbnail() to a huge image (#9211) ### What problem does this PR solve? fix "out of memory" if slide.get_thumbnail() to a huge image ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Stephen Hu	667c5812d0	Fix:Repeated images when parsing markdown files with images (#9196) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9149 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Stephen Hu	e9cbf4611d	Fix:Error when parsing files using Gemini: ERROR: GENERIC_ERROR - Unknown field for GenerationConfig: max_tokens (#9195) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9177 The reason should be due to the gemin internal use a different parameter name ` max_output_tokens (int): Optional. The maximum number of tokens to include in a response candidate. Note: The default value varies by model, see the ``Model.output_token_limit`` attribute of the ``Model`` returned from the ``getModel`` function. This field is a member of `oneof`_ ``_max_output_tokens``. ` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2 months ago
Kevin Hu	a16cd4f110	Refa: add result to callback for agent tool use. (#9137) ### What problem does this PR solve? ### Type of change - [x] Refactoring	3 months ago
Stephen Hu	5ccdb95008	Refactor:Introduce Image Close For GeminiCV (#9147) ### What problem does this PR solve? Introduce Image Close For GeminiCV ### Type of change - [x] Refactoring - [x] Performance Improvement	3 months ago
JI4JUN	aeaeb169e4	Feat/support 302ai provider (#8742) ### What problem does this PR solve? Support 302.AI provider. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	3 months ago
Stephen Hu	20b4d88098	Refactor: Improve the try catch logic for XinferenceEmbed (#9128) ### What problem does this PR solve? Improve the try catch logic for XinferenceEmbed ### Type of change - [x] Refactoring	3 months ago
Kevin Hu	d9fe279dde	Feat: Redesign and refactor agent module (#9113) ### What problem does this PR solve? #9082 #6365 <u> WARNING: it's not compatible with the older version of `Agent` module, which means that `Agent` from older versions can not work anymore.</u> ### Type of change - [x] New Feature (non-breaking change which adds functionality)	3 months ago
謝富祥	021e8b57ae	Fix: fix error 429 api rate limit when building knowledge graph for all chat model and Mistral embedding model (#9106) ### What problem does this PR solve? fix error 429 api rate limit when building knowledge graph for all chat model and Mistral embedding model. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	3 months ago
Yongteng Lei	39ef2ffba9	Feat: parsing supports jsonl or ldjson format (#9087) ### What problem does this PR solve? Supports jsonl or ldjson format. Feature request from [discussion](https://github.com/orgs/infiniflow/discussions/8774). ### Type of change - [x] New Feature (non-breaking change which adds functionality)	3 months ago
Stephen Hu	ba563f8095	Update embedding_model.py (#9083) ### What problem does this PR solve? Reduce the logic scope for DefaultEmbedding ### Type of change - [x] Refactoring	3 months ago
Zhichang Yu	342a04ec8a	Added infinity rank_feature support (#9044) ### What problem does this PR solve? Added infinity rank_feature support ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	3 months ago
Stephen Hu	86b4da0844	Refactor: Remove Useless split for BedrockEmbed (#9067) ### What problem does this PR solve? Remove Useless split for BedrockEmbed ### Type of change - [x] Refactoring	3 months ago
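
Note on commit 762aa4b8c4 (#9474): its message describes passing existing data URLs through unchanged, defaulting raw base64 to image/png, and decoding only the base64 part of a data URL. The sketch below illustrates that behavior under stated assumptions. The helper names `to_data_url` and `split_data_url` are hypothetical and are not taken from the RAGFlow code base; the actual change lives in `image2base64()` and `_image_prompt()` and may differ.

```python
import base64

# Default MIME type for raw base64 input, as described in the commit message.
DEFAULT_MIME = "image/png"


def to_data_url(image: str, mime: str = DEFAULT_MIME) -> str:
    """Hypothetical helper: return a data URL for a base64 image string.

    Strings that are already data URLs are passed through unchanged;
    raw base64 gets a "data:<mime>;base64," prefix.
    """
    if image.startswith("data:"):
        return image
    return f"data:{mime};base64,{image}"


def split_data_url(data_url: str):
    """Hypothetical helper: split a base64 data URL into (mime, bytes).

    Only the part after the first comma is base64-decoded, never the
    full data URL string.
    """
    header, _, payload = data_url.partition(",")
    mime = header[len("data:"):].split(";")[0]
    return mime, base64.b64decode(payload)


raw = base64.b64encode(b"\x89PNG").decode("ascii")
url = to_data_url(raw)
mime, data = split_data_url(url)
```

Here `url` carries the `data:image/png;base64,` prefix, a second call to `to_data_url(url)` leaves it unchanged, and `split_data_url` recovers the original bytes.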

849 commits (v0.20.3)