fix settings global docStoreConn for opensearch (#8885)
### What problem does this PR solve?
fix opensearch OSConnection init.
```
docStoreConn = rag.utils.opensearch_conn.OSConnection()
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
### What problem does this PR solve?
Change document status in bulk.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fixes a function name typo for the `/list` route in
`api/apps/conversation_app.py`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
fix: use tenant_id of kb to get index name in rm chunk func (#8760)
### What problem does this PR solve?
The rm function in chunk_app.py now takes the index name differently
than other functions, so there will be situations where users can create
and update a chunk but not delete it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Increase default `chunk_token_num` from 128 to 512 in parser config (#8753)
### What problem does this PR solve?
Updated the default `chunk_token_num` value in `api_utils.py` and
`validation_utils.py` to 512 to accommodate larger text chunks. Adjusted
corresponding test cases in HTTP and SDK API tests to reflect this
change.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Validate dialog name in `dialog_app.py` to ensure it is a non-empty
string and does not exceed 255 bytes in UTF-8 encoding.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix the case where pages variable might be None
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Issue #8602
`parser_config.task_page_size` can be defaults to `None` when dataset is
created by API. This was not handled by the `task_executor.py` code thus
`page_size` could sometimes be `None` which will cause issue in line
351.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add comprehensive test suite for chunk operations including:
- Test files for create, list, retrieve, update, and delete chunks
- Authorization tests
- Batch operations tests
- Update test configurations and common utilities
- Validate `important_kwd` and `question_kwd` fields are lists in
chunk_app.py
- Reorganize imports and clean up duplicate code
### Type of change
- [x] Add test cases
### What problem does this PR solve?
Add file management HTTP_API for operating files
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Feat: allow users to choose which MCP tools are enabled (#8519)
### What problem does this PR solve?
Allow users to choose which MCP tools are enabled.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
stack:
```
2025-06-26 17:22:24,739 ERROR 1609 list index out of range
Traceback (most recent call last):
File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
rv = self.dispatch_request()
File "/ragflow/.venv/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]
File "/ragflow/api/utils/api_utils.py", line 298, in decorated_function
return func(*args, **kwargs)
File "/ragflow/api/apps/sdk/session.py", line 472, in list_session
print(conv["reference"][message_num])
IndexError: list index out of range
```

### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix chunk number error after re-parsing. #8503.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Add input validation to chunk creation endpoint (#8516)
### What problem does this PR solve?
- Include optional `tag_feas` field if present in request
- Add input validation for `important_kwd` and `question_kwd` to ensure
they are lists
- #8462
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Feat: add MCP dashboard functionalities list_tools and test_tool (#8505)
### What problem does this PR solve?
Add MCP dashboard functionalities list_tools and test_tool.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix: Enforce default embedding model in create_dataset / update_dataset (#8486)
### What problem does this PR solve?
Previous:
- Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI'
- Did not respect user-configured default embedding_model
Now:
- Correctly prioritizes user-configured default embedding_model
Other:
- Make embedding_model optional in CreateDatasetReq with proper None
handling
- Add default embedding model fallback in dataset update when empty
- Enhance validation utils to handle None values and string
normalization
- Update SDK default embedding model to None to match API changes
- Adjust related test cases to reflect new validation rules
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: some cases Task return but not set progress (#8469)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8466
I go through the codes, current logic:
When do_handle_task raises an exception, handle_task will set the
progress, but for some cases do_handle_task internal will just return
but not set the right progress, at this cases the redis stream will been
acked but the task is running.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add MCP server dashboard operations.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
before refactor
1. create file record
2. Add to blob
if have some execption at 2 the system db will have a file record but
not have related blob, which will introduce some bug.
after refactor
1. add to blob
2. create file record.
if 1 success but 2 failed just have a dirty blob in blob system, user
will not feel that
### Type of change
- [x] Refactoring
### What problem does this PR solve?
This is a cherry-pick from #7781 as requested.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Fix: Document parse via API will alot problen (#8407)
### What problem does this PR solve?
#8391#8404
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Fix: code debug may corrupt by history answer (#8385)
### What problem does this PR solve?
Fix code debug may corrupt by history answer.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Correct boolean parsing for 'desc' parameter in document_app.py to
properly handle string values
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix typo in dataset name length error message (#8351)
### What problem does this PR solve?
Fixes a minor grammar issue in a user-facing error message. The original
message said "large than" instead of the correct comparative form
"larger than". Just a quick fix I noticed while reading the code.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Refa: Implement centralized file name length limit using FILE_NAME_LEN_LIMIT constant (#8318)
### What problem does this PR solve?
- Replace hardcoded 255-byte file name length checks with
FILE_NAME_LEN_LIMIT constant
- Update error messages to show the actual limit value
- #8290
### Type of change
- [x] Refactoring
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Fix: Add validation for empty filenames in document_app.py (#8321)
### What problem does this PR solve?
- Add validation for empty filenames in document_app.py and trim
whitespace
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add filename length validation (<=255 bytes) for document
upload/rename in both HTTP and SDK APIs
- Update error messages for consistency
- Fix comparison operator in SDK from '>=' to '>' for filename length
check
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: avoid mixing different embedding models in document parsing (#8260)
### What problem does this PR solve?
Fix mixing different embedding models in document parsing.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Progress is only updated if it's valid and not regressive.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Fix boolean parsing for 'desc' parameter in kb_app.py to properly
handle string values
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)