Refa: validation utils to use Pydantic v2 style models (#9037)
### What problem does this PR solve?
- Update BaseModel to use model_config instead of Config class
- Replace StrEnum with Literal types for method fields
- Convert Field declarations to Annotated style
### Type of change
- [x] Refactoring
Fix: Increase timeouts for document parsing and model checks (#8996)
### What problem does this PR solve?
- Extended embedding model timeout from 3 to 10 seconds in api_utils.py
- Added more time for large file batches and concurrent parsing
operations to prevent test flakiness
- Import from #8940
- https://github.com/infiniflow/ragflow/actions/runs/16422052652
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Refactor parser config handling and add GraphRAG defaults (#8778)
### What problem does this PR solve?
- Update `get_parser_config` to merge provided configs with defaults
- Add GraphRAG configuration defaults for all chunk methods
- Make raptor and graphrag fields non-nullable in ParserConfig schema
- Update related test cases to reflect config changes
- Ensure backward compatibility while adding new GraphRAG support
- #8396
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Increase default `chunk_token_num` from 128 to 512 in parser config (#8753)
### What problem does this PR solve?
Updated the default `chunk_token_num` value in `api_utils.py` and
`validation_utils.py` to 512 to accommodate larger text chunks. Adjusted
corresponding test cases in HTTP and SDK API tests to reflect this
change.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add comprehensive test suite for chunk operations including:
- Test files for create, list, retrieve, update, and delete chunks
- Authorization tests
- Batch operations tests
- Update test configurations and common utilities
- Validate `important_kwd` and `question_kwd` fields are lists in
chunk_app.py
- Reorganize imports and clean up duplicate code
### Type of change
- [x] Add test cases
Fix: Enforce default embedding model in create_dataset / update_dataset (#8486)
### What problem does this PR solve?
Previous:
- Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI'
- Did not respect user-configured default embedding_model
Now:
- Correctly prioritizes user-configured default embedding_model
Other:
- Make embedding_model optional in CreateDatasetReq with proper None
handling
- Add default embedding model fallback in dataset update when empty
- Enhance validation utils to handle None values and string
normalization
- Update SDK default embedding model to None to match API changes
- Adjust related test cases to reflect new validation rules
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
- Add new test suite for document app with
create/list/parse/upload/remove tests
- Update API URLs to use version variable from config in HTTP and web
API tests
### Type of change
- [x] Add test cases
### What problem does this PR solve?
1. rename var
2. update if statement
### Type of change
- [x] Refactoring
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
- Add filename length validation (<=255 bytes) for document
upload/rename in both HTTP and SDK APIs
- Update error messages for consistency
- Fix comparison operator in SDK from '>=' to '>' for filename length
check
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Test: Add web API test suite for knowledge base operations (#8254)
### What problem does this PR solve?
- Implement RAGFlowWebApiAuth class for web API authentication
- Add comprehensive test cases for KB CRUD operations
- Set up common fixtures and utilities in conftest.py
- Add helper functions in common.py for web API requests
The changes establish a complete testing framework for knowledge base
management via web API endpoints.
### Type of change
- [x] Add test case
### What problem does this PR solve?
- Move common constants (HOST_ADDRESS, INVALID_API_TOKEN, etc.) to
configs.py
- Update test imports to use centralized configs
- Clean up duplicate constant definitions across test files
This improves maintainability by centralizing configuration.
### Type of change
- [x] Refactoring test case
### What problem does this PR solve?
- Fix test assertions in test_delete_chunks.py to expect empty results
after deletion
Action 7619
### Type of change
- [x] Bug Fix test cases
Fix: Move pagerank field from create to update dataset API (#8217)
### What problem does this PR solve?
- Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq
- Add pagerank update logic in dataset update endpoint
- Update API documentation to reflect changes
- Modify related test cases and SDK references
#8208
This change makes pagerank a mutable property that can only be set after
dataset creation, and only when using elasticsearch as the doc engine.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Test: Refactor test fixtures to use HttpApiAuth naming consistently (#8180)
### What problem does this PR solve?
- Rename `api_key` fixture to `HttpApiAuth` across all test files
- Update all dependent fixtures and test cases to use new naming
- Maintain same functionality while improving naming clarity
The rename better reflects the fixture's purpose as an HTTP API
authentication helper rather than just an API key.
### Type of change
- [x] Refactoring
Test: Refactor test fixtures and add SDK session management tests (#8141)
### What problem does this PR solve?
- Consolidate HTTP API test fixtures using batch operations
(batch_add_chunks, batch_create_chat_assistants)
- Fix fixture initialization order in clear_session_with_chat_assistants
- Add new SDK API test suite for session management
(create/delete/list/update)
### Type of change
- [x] Add test cases
- [x] Refactoring
Test: Add SDK API tests for chat assistant management and improve con… (#8131)
### What problem does this PR solve?
- Implement new SDK API test cases for chat assistant CRUD operations
- Enhance HTTP API concurrent tests to use as_completed for better
reliability
### Type of change
- [x] Add test cases
- [x] Refactoring
Test: Refactor test concurrency handling and add SDK chunk management tests (#8112)
### What problem does this PR solve?
- Improve concurrent test cases by using as_completed for better
reliability
- Rename variables for clarity (chunk_num -> count)
- Add new SDK API test suite for chunk management operations
- Update HTTP API tests with consistent concurrency patterns
### Type of change
- [x] Add test cases
- [x] Refactoring