### What problem does this PR solve?
1. Add sandbox options for max memory and timeout.
2. Malicious code detection for Python only.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Launch sandbox from docker-compose.
#4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Feat: Adds OpenSearch2.19.1 as the vector_database support (#7140)
### What problem does this PR solve?
This PR adds the support for latest OpenSearch2.19.1 as the store engine
& search engine option for RAGFlow.
### Main Benefit
1. OpenSearch2.19.1 is licensed under the [Apache v2.0 License] which is
much better than Elasticsearch
2. For search, OpenSearch2.19.1 supports full-text
search、vector_search、hybrid_search those are similar with Elasticsearch
on schema
3. For store, OpenSearch2.19.1 stores text、vector those are quite
simliar with Elasticsearch on schema
### Changes
- Support opensearch_python_connetor. I make a lot of adaptions since
the schema and api/method between ES and Opensearch differs in many
ways(especially the knn_search has a significant gap) :
rag/utils/opensearch_coon.py
- Support static config adaptions by changing:
conf/service_conf.yaml、api/settings.py、rag/settings.py
- Supprt some store&search schema changes between OpenSearch and ES:
conf/os_mapping.json
- Support OpenSearch python sdk : pyproject.toml
- Support docker config for OpenSearch2.19.1 :
docker/.env、docker/docker-compose-base.yml、docker/service_conf.yaml.template
### How to use
- I didn't change the priority that ES as the default doc/search engine.
Only if in docker/.env , we set DOC_ENGINE=${DOC_ENGINE:-opensearch}, it
will work.
### Others
Our team tested a lot of docs in our environment by using OpenSearch as
the vector database ,it works very well.
All the conifg for OpenSearch is necessary.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yongteng Lei <yongtengrey@outlook.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
This PR updates the MySQL container configuration by setting the
parameter --binlog_expire_logs_seconds to 604800 seconds (7 days). This
change ensures that MySQL automatically purges binary logs older than 7
days, helping to conserve disk space and maintain precise log
management.
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Bump infinity to v0.6.0-dev2. Close #4477
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Elasticsearch disk-based shard allocator use absolute byte values instead of ratio (#4069)
### What problem does this PR solve?
Elasticsearch disk-based shard allocator use absolute byte values
instead of ratio. Close #4018
### Type of change
- [ ] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
Added an infinity configuration file to easily customize the settings of Infinity (#3715)
### What problem does this PR solve?
Added an infinity configuration file to easily customize the settings of
Infinity
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Added doc for switching elasticsearch to infinity (#3370)
### What problem does this PR solve?
Added doc for switching elasticsearch to infinity
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Integration with Infinity
- Replaced ELASTICSEARCH with dataStoreConn
- Renamed deleteByQuery with delete
- Renamed bulk to upsertBulk
- getHighlight, getAggregation
- Fix KGSearch.search
- Moved Dealer.sql_retrieval to es_conn.py
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Replaced redis with Valkey. Close #3070
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Build multi-arch docker image `infiniflow/ragflow:poetry` on
`linux/amd64` and `linux/arm64`.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Redis post config is missing
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
feat: Support Password Access for ElasticSearch (#1072)
### What problem does this PR solve?
Using password authentication to access ElasticSearch is essential,
especially in a production environment.
This PR will enable password access support.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Optimize task broker and executor for reduce memory usage and deployment
complexity.
### Type of change
- [x] Performance Improvement
- [x] Refactoring
### Change Log
- Enhance redis utils for message queue(use stream)
- Modify task broker logic via message queue (1.get parse event from
message queue 2.use ThreadPoolExecutor async executor )
- Modify the table column name of document and task (process_duation ->
process_duration maybe just a spelling mistake)
- Reformat some code style(just what i see)
- Add requirement_dev.txt for developer
- Add redis container on docker compose
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
chore: disable Kibana volume storage in Docker Compose (#548)
### What problem does this PR solve?
Since Kibana service is not currently being used, the associated volume
'kibanadata' has been commented out in the Docker Compose file. This
change helps to prevent the allocation of unnecessary resources and
simplifies the configuration.
### Type of change
- [x] Refactoring
unused Kibana volume storage
### What problem does this PR solve?
The docker-compose file can't config minio related port by .env file. So
I just add env `MINIO_CONSOLE_PORT=9001
MINIO_PORT=9000` to .env file.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
Issue link:#[[Link the issue
here](https://github.com/infiniflow/ragflow/issues/226)]
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
* add field progress msg into docinfo; add file processing procedure
* go through upload, create kb, add doc to kb
* smoke test for all API
* smoke test for all API