Docs: Update version references to v0.20.4 in READMEs and docs (#9758)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.3 to v0.20.4
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
Docs: Update version references to v0.20.3 in READMEs and docs (#9581)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.2 to v0.20.3
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
Docs: Update version references to v0.20.2 in READMEs and docs (#9559)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.1 to v0.20.2
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
Docs: Update version references to v0.20.1 in READMEs and docs (#9335)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.20.0 to v0.20.1
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
fix: add missing env vars and default values of service_conf.yaml (#9289)
### What problem does this PR solve?
Add missing env var `MYSQL_MAX_PACKET` to service_conf.yaml.template,
and add default values to opendal config to fix npe.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Feat: Add the migration script and its doc, added `backup` as default… (#8245)
### What problem does this PR solve?
This PR adds a data backup and migration solution for RAGFlow Docker
Compose deployments. Currently, users lack a standardized way to backup
and restore RAGFlow data volumes (MySQL, MinIO, Redis, Elasticsearch),
which is essential for data safety and environment migration.
**Solution:**
- **Migration Script** (`docker/migration.sh`) - Automates
backup/restore operations for all RAGFlow data volumes
- **Documentation**
(`docs/guides/migration/migrate_from_docker_compose.md`) - Usage guide
and best practices
- **Safety Features** - Container conflict detection and user
confirmations to prevent data loss
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
Co-authored-by: treedy <treedy2022@icloud.com>
Docs: Update version references to v0.20.0 in READMEs and docs (#9164)
### What problem does this PR solve?
- Update version tags in README files (including translations) from
v0.19.1 to v0.20.0
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files
### Type of change
- [x] Documentation Update
Feat: Enable MCP streamable-http model via docker compose (#9092)
### What problem does this PR solve?
Enable MCP streamable-http model via docker compose
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
adding platform: linux/amd64 to the mac build (#9059)
### What problem does this PR solve?
Mac OS build fails on M4. Docker compose requires platform to be
specified to build correctly
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Charles Copley <ccopley@ancera.com>
### What problem does this PR solve?
Bump to infinity v0.6.0-dev4.
WARNNING: infinity v0.6.0-dev4 has very different meta data format with
older versions. You have to destroy infinity data volume are restart
infinity container if there's existing data.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
fix: modify the connection ports of minio and redis in
service_conf.yaml.template
### What problem does this PR solve?
If you modify the external ports of minio and redis in the .env file, it
will also affect the connection ports inside the container in the
service_conf.yaml.template file, which is unreasonable.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Suppress docker-compose warning like:
```bash
The "HF_ENDPOINT" variable is not set. Defaulting to a blank string.
The "MACOS" variable is not set. Defaulting to a blank string.
The "SANDBOX_EXECUTOR_MANAGER_IMAGE variable is not set. Defaulting to a blank string.
The "SANDBOX_EXECUTOR_MANAGER_PORT variable is not set. Defaulting to a blank string.
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
Refa: Update Minio image to specific release version in docker-compose-base.yml (#8693)
### What problem does this PR solve?
- Ensure consistent Minio deployment by pinning the image to a specific
release version (RELEASE.2025-06-13T11-33-47Z) for stability and
reproducibility.
- #8672
### Type of change
- [x] Refactoring
fix(docker-compose):The old base image lost the curl command, and the image has been updated to fix this issue. Add Health Check (#8672)
### What problem does this PR solve?
1.The old base image lost the curl command, and an updated image was
used to fix this issue (the service has been tested in the new version)
2.Add Health Check
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: when using external components, it is impossible to specify the
port, because the variables in the `docker/.env` variable were not
referenced by `docker/service_conf.yaml.template`.
382d2d0373/docker/.env (L85)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Update .env ,Defaults to the v0.19.1-slim edition (#8412)
### What problem does this PR solve?
Update .env ,Defaults to the v0.19.1-slim edition
### Type of change
- [x] Other (please describe): Update .env ,Defaults to the
v0.19.1-slim edition
Feat: Add HTTPS setup instructions and configuration for Nginx (#8401)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change: Documentation Update/Refactoring
#### Summary
Adds HTTPS/SSL configuration guide/example to enable secure RAGFlow
deployments with proper certificate management.
#### Changes
- New HTTPS Setup Section: Step-by-step guide for SSL certificate
configuration
- Let's Encrypt Integration: Complete Certbot setup instructions
- Docker Configuration: Volume mapping examples for certificates
#### Key Features
- Prerequisites checklist
- Docker Compose configuration examples
- Support for both Let's Encrypt and existing certificates
#### Files Modified
- `README.md`
- `ragflow.https.conf` (new file)
Docs: Update version references to v0.19.1 in READMEs and docs (#8366)
### What problem does this PR solve?
- Update Docker image version badges and references from v0.19.0 to
v0.19.1
- Modify version mentions in all localized README files (id, ja, ko,
pt_br, tzh, zh)
- Update version in docker/README.md and related documentation files
- Includes updates to Helm values and Python SDK dependencies
### Type of change
- [x] Documentation Update
Feat: make document parsing and embedding batch sizes configurable via environment variables (#8266)
### Description
This PR introduces two new environment variables, `DOC_BULK_SIZE` and
`EMBEDDING_BATCH_SIZE`, to allow flexible tuning of batch sizes for
document parsing and embedding vectorization in RAGFlow. By making these
parameters configurable, users can optimize performance and resource
usage according to their hardware capabilities and workload
requirements.
### What problem does this PR solve?
Previously, the batch sizes for document parsing and embedding were
hardcoded, limiting the ability to adjust throughput and memory
consumption. This PR enables users to set these values via environment
variables (in `.env`, Helm chart, or directly in the deployment
environment), improving flexibility and scalability for both small and
large deployments.
- `DOC_BULK_SIZE`: Controls how many document chunks are processed in a
single batch during document parsing (default: 4).
- `EMBEDDING_BATCH_SIZE`: Controls how many text chunks are processed
in a single batch during embedding vectorization (default: 16).
This change updates the codebase, documentation, and configuration files
to reflect the new options.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
### Additional context
- Updated `.env`, `helm/values.yaml`, and documentation to describe
the new variables.
- Modified relevant code paths to use the environment variables instead
of hardcoded values.
- Users can now tune these parameters to achieve better throughput or
reduce memory usage as needed.
Before:
Default value:
<img width="643" alt="image"
src="https://github.com/user-attachments/assets/086e1173-18f3-419d-a0f5-68394f63866a"
/>
After:
10x:
<img width="777" alt="image"
src="https://github.com/user-attachments/assets/5722bbc0-0bcb-4536-b928-077031e550f1"
/>
If the name field is not specified, Docker Compose will default to using
`docker` as the project name. This may cause conflicts with other
default projects, leading to unintended operations when executing
`docker compose` commands.
### What problem does this PR solve?
When executing Docker Compose commands, interference occurs between
multiple default projects, leading to operational chaos.
### Type of change
- [x] Other (please describe):
### What problem does this PR solve?
1. Add sandbox options for max memory and timeout.
2. Malicious code detection for Python only.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Docs: Improve oauth configuration documentation and examples (#7675)
### What problem does this PR solve?
Improve oauth configuration documentation and examples.
- Related pull requests:
- #7379
- #7553
- #7587
- Related issues:
- #3495
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Launch sandbox from docker-compose.
#4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Refa: Deprecate `/github_callback` in favor of `/oauth/callback/<channel>` for GitHub OAuth integration (#7587)
### What problem does this PR solve?
Deprecate `/github_callback` route in favor of
`/oauth/callback/<channel>` for GitHub OAuth integration:
- Added GitHub OAuth support in the authentication module
- Introduced `GithubOAuthClient` with methods to fetch and normalize
user info
- Updated `CLIENT_TYPES` to include GitHub OAuth client
- Deprecated `/github_callback` route and suggested using the generic
`/oauth/callback/<channel>` route
---
- Related pull requests:
- #7379
- #7553
### Usage
- [Create a GitHub OAuth
App](https://github.com/settings/applications/new) to obtain the
`client_id` and `client_secret`, configure the authorization callback
url: `https://your-app.com/v1/user/oauth/callback/github`
- Edit `service_conf.yaml.template`:
```yaml
# ...
oauth:
github:
type: "github"
icon: "github"
display_name: "Github"
client_id: "your_client_id"
client_secret: "your_client_secret"
redirect_uri: "https://your-app.com/v1/user/oauth/callback/github"
# ...
```
### Type of change
- [x] Documentation Update
- [x] Refactoring (non-breaking change)
Perf: Increase database connection pool size (#7559)
### What problem does this PR solve?
1. The MySQL instance is configured with max_connections=1000,
but our connection pool was limited to max_connections: 100.
This mismatch caused connection pool exhaustion during performance
testing.
2. Increase stale_timeout to resolve #6548
### Type of change
- [x] Performance Improvement
Feat: Add `/login/channels` route and improve auth logic for frontend third-party login integration (#7521)
### What problem does this PR solve?
Add `/login/channels` route and improve auth logic to support frontend
integration with third-party login providers:
- Add `/login/channels` route to provide authentication channel list
with `display_name` and `icon`
- Optimize user info parsing logic by prioritizing `avatar_url` and
falling back to `picture`
- Simplify OIDC token validation by removing unnecessary `kid` checks
- Ensure `client_id` is safely cast to string during `audience`
validation
- Fix typo
---
- Related pull request: #7379
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Fix https://github.com/infiniflow/ragflow/issues/7224 and
https://github.com/infiniflow/ragflow/issues/6793
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)a
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Feat: Add support for OAuth2 and OpenID Connect (OIDC) authentication (#7379)
### What problem does this PR solve?
Add support for OAuth2 and OpenID Connect (OIDC) authentication,
allowing OAuth/OIDC authentication using the specified routes:
- `/login/<channel>`: Initiates the OAuth flow for the specified channel
- `/oauth/callback/<channel>`: Handles the OAuth callback after
successful authentication
The callback URL should be configured in your OAuth provider as:
```
https://your-app.com/oauth/callback/<channel>
```
For detailed instructions on configuring **service_conf.yaml.template**,
see: `./api/apps/auth/README.md#usage`.
- Related issues
#3495
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
0.18.0 mcp server can not start with upgrade from 0.17.2 or new install
except rebuild all docker
Close #7321
mcp server can not start auto from docker :
2025-04-25 17:30:44,512 INFO 25 task_executor_2a9f3e2de99a_0 reported
heartbeat: {"name": "task_executor_2a9f3e2de99a_0", "now":
"2025-04-25T17:30:44.509+08:00", "boot_at":
"2025-04-25T16:43:33.038+08:00", "pending": 0, "lag": 0, "done": 0,
"failed": 0, "current": {}}
usage: server.py [-h] [--base_url BASE_URL] [--host HOST] [--port PORT]
[--mode MODE] [--api_key API_KEY]
server.py: error: unrecognized arguments:
problem:
server.py in docker start arguments not correct , so mcp server start
fail
reason:
```
1. docker-copose.yaml
example - --mcp-host-api-key="ragflow-12345678" is wrong. do not add "" to key or it says:"api-key wrong"
2.docker file entrypoint.sh can not translate config to exec command , we need mapping file from host to docker
- ./entrypoint.sh:/ragflow/entrypoint.sh
3.just add one code raw fix all probelm
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Documentation Update
- [x] Performance Improvement
---------
Co-authored-by: Yongteng Lei <yongtengrey@outlook.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Feat: Adds OpenSearch2.19.1 as the vector_database support (#7140)
### What problem does this PR solve?
This PR adds the support for latest OpenSearch2.19.1 as the store engine
& search engine option for RAGFlow.
### Main Benefit
1. OpenSearch2.19.1 is licensed under the [Apache v2.0 License] which is
much better than Elasticsearch
2. For search, OpenSearch2.19.1 supports full-text
search、vector_search、hybrid_search those are similar with Elasticsearch
on schema
3. For store, OpenSearch2.19.1 stores text、vector those are quite
simliar with Elasticsearch on schema
### Changes
- Support opensearch_python_connetor. I make a lot of adaptions since
the schema and api/method between ES and Opensearch differs in many
ways(especially the knn_search has a significant gap) :
rag/utils/opensearch_coon.py
- Support static config adaptions by changing:
conf/service_conf.yaml、api/settings.py、rag/settings.py
- Supprt some store&search schema changes between OpenSearch and ES:
conf/os_mapping.json
- Support OpenSearch python sdk : pyproject.toml
- Support docker config for OpenSearch2.19.1 :
docker/.env、docker/docker-compose-base.yml、docker/service_conf.yaml.template
### How to use
- I didn't change the priority that ES as the default doc/search engine.
Only if in docker/.env , we set DOC_ENGINE=${DOC_ENGINE:-opensearch}, it
will work.
### Others
Our team tested a lot of docs in our environment by using OpenSearch as
the vector database ,it works very well.
All the conifg for OpenSearch is necessary.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Yongteng Lei <yongtengrey@outlook.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Fix the entrypoint file from the docker container to solve #7249
Here is the important part from the logs:
```
docker logs -f ragflow-server
...
usage: server.py [-h] [--base_url BASE_URL] [--host HOST] [--port PORT]
[--mode MODE] [--api_key API_KEY]
server.py: error: unrecognized arguments:
...
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This PR fixes an issue with the MCP server configuration in RAGFlow's
Docker deployment where:
1. Incorrect parameter naming (`--mcp--host-api-key` with double
hyphens) caused server startup failures
2. Port binding conflicts occurred due to unexposed MCP ports in Docker
3. Inconsistent host addressing between `0.0.0.0` and `127.0.0.1` led to
connectivity issues
The changes ensure proper MCP server initialization and reliable
inter-service communication.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Key Changes
1. **Parameter Correction**:
- Fixed `--mcp--host-api-key` → `--mcp-host-api-key`
### What problem does this PR solve?
Add mcp self-host mode, a complement of #7084.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add MCP support with a client example.
Issue link: #4344
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Feat: extend S3 storage compatibility and add knowledge base ID prefix (#6355)
### What problem does this PR solve?
- Added support for S3-compatible protocols.
- Enabled the use of knowledge base ID as a file prefix when storing
files in S3.
- Updated docker/README.md to include detailed S3 and OSS configuration
instructions.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Consolidate entrypoint to support broader deployment scenarios (#6566)
### What problem does this PR solve?
This PR gives better control over how we distribute which service will
be loaded. With this approach, we can create containers to run only the
web server and others to run the task executor. It also introduces the
unique ID per task executor host, this will be important when scaling
task executors horizontally, considering unique task executor ids will
be required.
This new `entrypoint.sh` maintains the default behavior of starting the
web server and task executor in the same host.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [X] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
This PR updates the MySQL container configuration by setting the
parameter --binlog_expire_logs_seconds to 604800 seconds (7 days). This
change ensures that MySQL automatically purges binary logs older than 7
days, helping to conserve disk space and maintain precise log
management.
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):