Feat: add SMTP support for user invitation emails (#9479)
### What problem does this PR solve?
Add SMTP support for user invitation emails
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Refactor:Improve the logic so that it does not decode base 64 for the test image each time (#9264)
### What problem does this PR solve?
Improve the logic so that it does not decode base 64 for the test image
each time
### Type of change
- [x] Refactoring
- [x] Performance Improvement
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
#9082#6365
<u> **WARNING: it's not compatible with the older version of `Agent`
module, which means that `Agent` from older versions can not work
anymore.**</u>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Refa: Update base64 test image with new sample data (#9115)
### What problem does this PR solve?
Replace the placeholder test image in base64_image.py with a new sample
image data string.
### Type of change
- [x] Refactoring
Feat: parsing supports jsonl or ldjson format (#9087)
### What problem does this PR solve?
Supports jsonl or ldjson format. Feature request from
[discussion](https://github.com/orgs/infiniflow/discussions/8774).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Refa: validation utils to use Pydantic v2 style models (#9037)
### What problem does this PR solve?
- Update BaseModel to use model_config instead of Config class
- Replace StrEnum with Literal types for method fields
- Convert Field declarations to Annotated style
### Type of change
- [x] Refactoring
fix: obfuscate additional server secrets values (#9014)
### What problem does this PR solve?
Obfuscates additional secrets values on ragflow_server startup to
prevent leakage:
* `secret` (azure)
* `client_secret` (oauth)
* `http_secret_key` (authentication)
* `sas_token` (azure)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Gifford R Nowland <gifford.r.nowland@aero.org>
Fix: Increase timeouts for document parsing and model checks (#8996)
### What problem does this PR solve?
- Extended embedding model timeout from 3 to 10 seconds in api_utils.py
- Added more time for large file batches and concurrent parsing
operations to prevent test flakiness
- Import from #8940
- https://github.com/infiniflow/ragflow/actions/runs/16422052652
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Refactor parser config handling and add GraphRAG defaults (#8778)
### What problem does this PR solve?
- Update `get_parser_config` to merge provided configs with defaults
- Add GraphRAG configuration defaults for all chunk methods
- Make raptor and graphrag fields non-nullable in ParserConfig schema
- Update related test cases to reflect config changes
- Ensure backward compatibility while adding new GraphRAG support
- #8396
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Increase default `chunk_token_num` from 128 to 512 in parser config (#8753)
### What problem does this PR solve?
Updated the default `chunk_token_num` value in `api_utils.py` and
`validation_utils.py` to 512 to accommodate larger text chunks. Adjusted
corresponding test cases in HTTP and SDK API tests to reflect this
change.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Feat: add MCP dashboard functionalities list_tools and test_tool (#8505)
### What problem does this PR solve?
Add MCP dashboard functionalities list_tools and test_tool.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix: Enforce default embedding model in create_dataset / update_dataset (#8486)
### What problem does this PR solve?
Previous:
- Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI'
- Did not respect user-configured default embedding_model
Now:
- Correctly prioritizes user-configured default embedding_model
Other:
- Make embedding_model optional in CreateDatasetReq with proper None
handling
- Add default embedding model fallback in dataset update when empty
- Enhance validation utils to handle None values and string
normalization
- Update SDK default embedding model to None to match API changes
- Adjust related test cases to reflect new validation rules
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add MCP server dashboard operations.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix: Document parse via API will alot problen (#8407)
### What problem does this PR solve?
#8391#8404
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Fix: Move pagerank field from create to update dataset API (#8217)
### What problem does this PR solve?
- Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq
- Add pagerank update logic in dataset update endpoint
- Update API documentation to reflect changes
- Modify related test cases and SDK references
#8208
This change makes pagerank a mutable property that can only be set after
dataset creation, and only when using elasticsearch as the doc engine.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Refa: HTTP API list datasets / test cases / docs (#7720)
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the list datasets HTTP
API, improving code clarity and robustness. Key changes include:
Pydantic Validation
Error Handling
Test Updates
Documentation Updates
### Type of change
- [x] Documentation Update
- [x] Refactoring
Feat: repair corrupted PDF files on upload automatically (#7693)
### What problem does this PR solve?
Try the best to repair corrupted PDF files on upload automatically.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Refa: HTTP API delete dataset / test cases / docs (#7657)
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the delete dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation Updates
### Type of change
- [x] Documentation Update
- [x] Refactoring
Fix(api): correct default value handling in dataset parser config (#7589)
### What problem does this PR solve?
Fix HTTP API Create/Update dataset parser config default value error
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Refa: HTTP API update dataset / test cases / docs (#7564)
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the update dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation Updates
5. fix bug: #5915
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
Fix: change create dataset htto api delimiter default value to r'\n' (#7434)
### What problem does this PR solve?
change create dataset delimiter default value to r'\n'
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Remove unnecessary parameter restrictions in dataset creation API
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Refa: http API create dataset and test cases (#7393)
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the create dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
In the generate_confirmation_token method, a spelling error was found
with 'tenent_id'. The correct spelling should be 'tenant_id'.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: shengliang xiao <shengliangxiao2024@gmail.com>
Feat: add primitive support for function calls (#6840)
### What problem does this PR solve?
This PR introduces **primitive support for function calls**,
enabling the system to handle basic function call capabilities.
However, this feature is currently experimental and **not yet enabled
for general use**, as it is only supported by a subset of models,
namely, Qwen and OpenAI models.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
add openai agent
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Feat: Add Duplicate ID Check and Update Deletion Logic (#6376)
- Introduce the `check_duplicate_ids` function in `dataset.py` and
`doc.py` to check for and handle duplicate IDs.
- Update the deletion operation to ensure that when deleting datasets
and documents, error messages regarding duplicate IDs can be returned.
- Implement the `check_duplicate_ids` function in `api_utils.py` to
return unique IDs and error messages for duplicate IDs.
### What problem does this PR solve?
Close https://github.com/infiniflow/ragflow/issues/6234
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Fix: #5719 Added type check for parser_config (#6243)
### What problem does this PR solve?
Fix #5719
Add data type validation for parser_config
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)