Fix: fix mismatch of assitant message and its reference (#9233)
### What problem does this PR solve?
#9232
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1. When creating a new session, initialize an empty reference that
includes both the app api and sdk API.
2. Fix the logic when retrieving references for historical messages: the
number of dialogue messages and reference messages may differ, but it
should match the number of assistant messages.
Co-authored-by: Li Ye <liye@unittec.com>
Feat: list documents supports range filtering (#9214)
### What problem does this PR solve?
list_document supports range filtering.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
fix #8424 NPE in dify_retrieval.py, add log exception (#9212)
### What problem does this PR solve?
fix #8424 NPE in dify_retrieval.py, add log exception
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: migrate deprecated Langfuse API from v2 to v3 (#9204)
### What problem does this PR solve?
Fix:
```bash
'Langfuse' object has no attribute 'trace'
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: list tags api by using tenant id instead of user id (#9103)
### What problem does this PR solve?
The index name of the tag chunks is generated by the tenant id of the
knowledge base, so it should use the tenant id instead of the current
user id in the listing tags API.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Feat: Add industry-related search keyword generation function (#9156)
### What problem does this PR solve?
Add industry-related search keyword generation function
- When generating search keywords, support for specific industries has
been added
- If the "industry" parameter is provided, industry-specific
restrictions will be added to the prompt
- This change can help users generate more precise search keywords
within specific industries
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
```bash
Traceback (most recent call last):
File "/home/infiniflow/workspace/ragflow/api/db/services/document_service.py", line 635, in update_progress
info["progress_msg"] = "%d tasks are ahead in the queue..."%get_queue_length(priority)
File "/home/infiniflow/workspace/ragflow/api/db/services/document_service.py", line 686, in get_queue_length
return int(group_info.get("lag", 0))
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
```
This issue can happen very rare. When a `stream` is first created, the
`lag` value may be nil, which can cause this issue. However, once any
message is synced, the `lag` will become `0` afterwards.
```bash
> XINFO GROUPS rag_flow_svr_queue
1) 1) "name"
2) "rag_flow_svr_task_broker"
3) "consumers"
4) (integer) 0
5) "pending"
6) (integer) 0
7) "last-delivered-id"
8) "1753952489937-0"
9) "entries-read"
10) (nil)
11) "lag"
12) (nil)
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add Generator return type annotation for tts method
- Import typing.Generator for type hints
### Type of change
- [x] Refactoring
### What problem does this PR solve?
#9082#6365
<u> **WARNING: it's not compatible with the older version of `Agent`
module, which means that `Agent` from older versions can not work
anymore.**</u>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Refa: Update base64 test image with new sample data (#9115)
### What problem does this PR solve?
Replace the placeholder test image in base64_image.py with a new sample
image data string.
### Type of change
- [x] Refactoring
Fix : API /document/thumbnails using wrong method to get ''doc_ids' (#9097)
### What problem does this PR solve?
doc_ids is a list , should use request.args.getlist("doc_ids")
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Feat: parsing supports jsonl or ldjson format (#9087)
### What problem does this PR solve?
Supports jsonl or ldjson format. Feature request from
[discussion](https://github.com/orgs/infiniflow/discussions/8774).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix:When deleting a knowledge base that is currently performing a parsing task, the parsing queue will not be deleted! (#9018)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8995
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Refa: validation utils to use Pydantic v2 style models (#9037)
### What problem does this PR solve?
- Update BaseModel to use model_config instead of Config class
- Replace StrEnum with Literal types for method fields
- Convert Field declarations to Annotated style
### Type of change
- [x] Refactoring
fix: obfuscate additional server secrets values (#9014)
### What problem does this PR solve?
Obfuscates additional secrets values on ragflow_server startup to
prevent leakage:
* `secret` (azure)
* `client_secret` (oauth)
* `http_secret_key` (authentication)
* `sas_token` (azure)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Gifford R Nowland <gifford.r.nowland@aero.org>
### What problem does this PR solve?
OpenAI-compatible-API supports references.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix: Increase timeouts for document parsing and model checks (#8996)
### What problem does this PR solve?
- Extended embedding model timeout from 3 to 10 seconds in api_utils.py
- Added more time for large file batches and concurrent parsing
operations to prevent test flakiness
- Import from #8940
- https://github.com/infiniflow/ragflow/actions/runs/16422052652
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Refactor parser config handling and add GraphRAG defaults (#8778)
### What problem does this PR solve?
- Update `get_parser_config` to merge provided configs with defaults
- Add GraphRAG configuration defaults for all chunk methods
- Make raptor and graphrag fields non-nullable in ParserConfig schema
- Update related test cases to reflect config changes
- Ensure backward compatibility while adding new GraphRAG support
- #8396
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Correct cancel logic error
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
Feat: Add model editing functionality with improved UI labels (#8855)
### What problem does this PR solve?
Add edit button for local LLM models
<img width="1531" height="1428" alt="image"
src="https://github.com/user-attachments/assets/19d62255-59a6-4a7e-9772-8b8743101f78"
/>
<img width="1531" height="1428" alt="image"
src="https://github.com/user-attachments/assets/c3a0f77e-cc6b-4190-95a6-13835463428b"
/>
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Liu An <asiro@qq.com>
fix settings global docStoreConn for opensearch (#8885)
### What problem does this PR solve?
fix opensearch OSConnection init.
```
docStoreConn = rag.utils.opensearch_conn.OSConnection()
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
Change document status in bulk.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes a function name typo for the `/list` route in
`api/apps/conversation_app.py`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
fix: use tenant_id of kb to get index name in rm chunk func (#8760)
### What problem does this PR solve?
The rm function in chunk_app.py now takes the index name differently
than other functions, so there will be situations where users can create
and update a chunk but not delete it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Increase default `chunk_token_num` from 128 to 512 in parser config (#8753)
### What problem does this PR solve?
Updated the default `chunk_token_num` value in `api_utils.py` and
`validation_utils.py` to 512 to accommodate larger text chunks. Adjusted
corresponding test cases in HTTP and SDK API tests to reflect this
change.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Validate dialog name in `dialog_app.py` to ensure it is a non-empty
string and does not exceed 255 bytes in UTF-8 encoding.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)