### What problem does this PR solve?
Refactor OpenAI to enable audio parsing.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
Fix Gemini parameters error.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add embedded search functionality.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Refactor: Use Input Length In DefaultRerank (#9516)
### What problem does this PR solve?
1. Use input length to prepare res
2. Adjust torch_empty_cache code location
### Type of change
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Handle unexpected truncated Excel files.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
fix: preserve correct MIME & unify data URL handling for vision inputs (relates #9248) (#9474)
fix: preserve correct MIME & unify data URL handling for vision inputs
(relates #9248)
- Updated image2base64() to return a full data URL
(data:image/<fmt>;base64,...) with accurate MIME
- Removed hardcoded image/jpeg in Base._image_prompt(); pass through
data URLs and default raw base64 to image/png
- Set AnthropicCV._image_prompt() raw base64 media_type default to
image/png
- Ensures MIME type matches actual image content, fixing “cannot process
base64 image” errors on vLLM/OpenAI-compatible backends
### What problem does this PR solve?
This PR fixes a compatibility issue where base64-encoded images sent to
vision models (e.g., vLLM/OpenAI-compatible backends) were rejected due
to mismatched MIME type or incorrect decoding.
Previously, the backend:
- Always converted raw base64 into data:image/jpeg;base64,... even if
the actual content was PNG.
- In some cases, base64 decoding was attempted on the full data URL
string instead of the pure base64 part.
This caused errors like:
```
cannot process base64 image
failed to decode base64 string: illegal base64 data at input byte 0
```
by strict validators such as vLLM.
With this fix, the MIME type in the request now matches the actual image
content, and data URLs are correctly handled or passed through, ensuring
vision models can decode and process images reliably.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
All models pass the mock response tests, which means that if a model can
return the correct response, everything should work as expected.
However, not all models have been fully tested in a real environment,
the real API_KEY. I suggest actively monitoring the refactored models
over the coming period to ensure they work correctly and fixing them
step by step, or waiting to merge until most have been tested in
practical environment.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
fix "no `tc` element at grid_offset", just log warning and ignore.
stacktrace:
```
Traceback (most recent call last):
File "/ragflow/rag/svr/task_executor.py", line 620, in handle_task
await do_handle_task(task)
File "/ragflow/rag/svr/task_executor.py", line 553, in do_handle_task
chunks = await build_chunks(task, progress_callback)
File "/ragflow/rag/svr/task_executor.py", line 257, in build_chunks
cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"],
File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync
return msg_from_thread.unwrap()
File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
raise captured_error
File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result
return result.unwrap()
File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
raise captured_error
File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn
ret = context.run(sync_fn, *args)
File "/ragflow/rag/svr/task_executor.py", line 257, in <lambda>
cks = await trio.to_thread.run_sync(lambda: chunker.chunk(task["name"], binary=binary, from_page=task["from_page"],
File "/ragflow/rag/app/naive.py", line 384, in chunk
sections, tables = Docx()(filename, binary)
File "/ragflow/rag/app/naive.py", line 230, in __call__
while i < len(r.cells):
File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 438, in cells
return tuple(_iter_row_cells())
File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 436, in _iter_row_cells
yield from iter_tc_cells(tc)
File "/ragflow/.venv/lib/python3.10/site-packages/docx/table.py", line 424, in iter_tc_cells
yield from iter_tc_cells(tc._tc_above) # pyright: ignore[reportPrivateUsage]
File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 741, in _tc_above
return self._tr_above.tc_at_grid_offset(self.grid_offset)
File "/ragflow/.venv/lib/python3.10/site-packages/docx/oxml/table.py", line 98, in tc_at_grid_offset
raise ValueError(f"no `tc` element at grid_offset={grid_offset}")
ValueError: no `tc` element at grid_offset=10
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix broken data stream when writing image file (#9354)
### What problem does this PR solve?
fix "broken data stream when writing image file", just log warning and
ignore
Close #8379
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Agent plans tasks by referring to its own prompt. (#9315)
### What problem does this PR solve?
Fixes the issue in the analyze_task execution flow where the Lead Agent
was not utilizing its own sys_prompt during task analysis, resulting in
incorrect or incomplete task planning.
https://github.com/infiniflow/ragflow/issues/9294
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
fix: add missing env vars and default values of service_conf.yaml (#9289)
### What problem does this PR solve?
Add missing env var `MYSQL_MAX_PACKET` to service_conf.yaml.template,
and add default values to opendal config to fix npe.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Add **kwargs to model base class constructors (#9252)
Updated constructors for base and derived classes in chat, embedding,
rerank, sequence2txt, and tts models to accept **kwargs. This change
improves extensibility and allows passing additional parameters without
breaking existing interfaces.
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: IT: Sop.Son <sop.son@feavn.local>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
FIX: If chunk["content_with_weight"] contains one or more unpaired surrogate characters (such as incomplete emoji or other special characters), then calling .encode("utf-8") directly will raise a UnicodeEncodeError. (#9246)
FIX: If chunk["content_with_weight"] contains one or more unpaired
surrogate characters (such as incomplete emoji or other special
characters), then calling .encode("utf-8") directly will raise a
UnicodeEncodeError.
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: PlainParser using fix in presentation (#9239)
### What problem does this PR solve?
tiny fix about the using of `deepdoc.pdf_parser.PlainParser` in
`rag.app.presentation.chunk`, I referred to other ways of using this
class.
So tiny the fix is, a issue seems unnecessary.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix:local variable 'response' referenced before assignment (#9230)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9227
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
fix "out of memory" if slide.get_thumbnail() to a huge image (#9211)
### What problem does this PR solve?
fix "out of memory" if slide.get_thumbnail() to a huge image
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix:Error when parsing files using Gemini: **ERROR**: GENERIC_ERROR - Unknown field for GenerationConfig: max_tokens (#9195)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9177
The reason should be due to the gemin internal use a different parameter
name
`
max_output_tokens (int):
Optional. The maximum number of tokens to include in a
response candidate.
Note: The default value varies by model, see the
``Model.output_token_limit`` attribute of the ``Model``
returned from the ``getModel`` function.
This field is a member of `oneof`_ ``_max_output_tokens``.
`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
#9082#6365
<u> **WARNING: it's not compatible with the older version of `Agent`
module, which means that `Agent` from older versions can not work
anymore.**</u>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix: fix error 429 api rate limit when building knowledge graph for all chat model and Mistral embedding model (#9106)
### What problem does this PR solve?
fix error 429 api rate limit when building knowledge graph for all chat
model and Mistral embedding model.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Feat: parsing supports jsonl or ldjson format (#9087)
### What problem does this PR solve?
Supports jsonl or ldjson format. Feature request from
[discussion](https://github.com/orgs/infiniflow/discussions/8774).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)