A high-performance stress test suite for Dify workflow execution, built on Locust and optimized for measuring Server-Sent Events (SSE) streaming performance.
The stress test focuses on four critical SSE performance indicators: Time to First Event (TTFE, the delay between sending a request and receiving the first streamed event), connection rate, event throughput, and active/total connection counts.

It exercises a single endpoint, `POST /v1/workflows/run`, with comprehensive SSE metrics tracking.
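To make these metrics concrete, here is a minimal sketch of a single streaming request of the kind the suite issues. The `DIFY_API_KEY` environment variable and the `query` input name are illustrative assumptions; your workflow's input schema may differ.

```python
import os
import time

import requests
import sseclient  # sseclient-py, installed by the setup step

url = "http://localhost:5001/v1/workflows/run"
headers = {
    "Authorization": f"Bearer {os.environ['DIFY_API_KEY']}",  # assumed env var
    "Content-Type": "application/json",
}
payload = {
    "inputs": {"query": "What is Dify?"},  # input names are workflow-specific
    "response_mode": "streaming",          # request an SSE stream
    "user": "stress-test",
}

start = time.perf_counter()
response = requests.post(url, json=payload, headers=headers, stream=True)
response.raise_for_status()

first_event_at = None
events = 0
for event in sseclient.SSEClient(response).events():
    if first_event_at is None:
        first_event_at = time.perf_counter()  # TTFE: first SSE event observed
    events += 1

if first_event_at is not None:
    print(f"TTFE: {(first_event_at - start) * 1000:.0f} ms, events: {events}")
```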
Dependencies are automatically installed when running setup:
Complete Dify setup:
```bash
# Run the complete setup
python scripts/stress-test/setup_all.py
```
IMPORTANT: For accurate stress testing, run the API server with Gunicorn in production mode:
```bash
# Run from the api directory
cd api
uv run gunicorn \
  --bind 0.0.0.0:5001 \
  --workers 4 \
  --worker-class gevent \
  --timeout 120 \
  --keep-alive 5 \
  --log-level info \
  --access-logfile - \
  --error-logfile - \
  app:app
```
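Each SSE client holds its connection open for the entire stream, so a synchronous worker would be tied up serving a single client at a time. The gevent worker class multiplexes many open connections per process, which is what makes it suitable for SSE load.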
Configuration options explained:
- `--workers 4`: Number of worker processes (adjust based on CPU cores)
- `--worker-class gevent`: Async worker for handling concurrent connections
- `--timeout 120`: Worker timeout for long-running requests
- `--keep-alive 5`: Keep connections alive for SSE streaming

NOT RECOMMENDED for stress testing:
```bash
# Debug mode - DO NOT use for stress testing (slow performance)
./dev/start-api  # This runs Flask in debug mode with single-threaded execution
```
Also start the Mock OpenAI server:
```bash
python scripts/stress-test/setup/mock_openai_server.py
```
```bash
# Run with default configuration (headless mode)
./scripts/stress-test/run_locust_stress_test.sh

# Or run directly with uv
uv run --project api python -m locust -f scripts/stress-test/sse_benchmark.py --host http://localhost:5001

# Run with Web UI (access at http://localhost:8089)
uv run --project api python -m locust -f scripts/stress-test/sse_benchmark.py --host http://localhost:5001 --web-port 8089
```
The script will run the test headlessly and save all reports to the `reports/` directory.

The stress test configuration is in `locust.conf`:
```ini
users = 10        # Number of concurrent users
spawn-rate = 2    # Users spawned per second
run-time = 1m     # Test duration (30s, 5m, 1h)
headless = true   # Run without web UI
```
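Locust picks up `locust.conf` automatically from the directory it is launched in; any option passed on the command line (such as `--users` or `--run-time`) overrides the value in the file.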
Modify the `questions` list in `sse_benchmark.py`:

```python
self.questions = [
    "Your custom question 1",
    "Your custom question 2",
    # Add more questions...
]
```
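For context, here is a hypothetical sketch of how such a list typically plugs into a Locust user class; the actual `sse_benchmark.py` may structure its tasks and metric reporting differently, and the authorization header is omitted for brevity.

```python
import random

from locust import HttpUser, between, task


class WorkflowUser(HttpUser):
    wait_time = between(1, 3)  # think time between workflow runs

    def on_start(self):
        self.questions = [
            "Your custom question 1",
            "Your custom question 2",
        ]

    @task
    def run_workflow(self):
        payload = {
            "inputs": {"query": random.choice(self.questions)},
            "response_mode": "streaming",
            "user": "stress-test",
        }
        # stream=True keeps the SSE connection open while events arrive
        with self.client.post("/v1/workflows/run", json=payload,
                              stream=True, catch_response=True) as response:
            for _ in response.iter_lines():
                pass  # consume the stream to completion
```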
After running the stress test, you’ll find these files in the `reports/` directory:

- `locust_summary_YYYYMMDD_HHMMSS.txt` - Complete console output with metrics
- `locust_report_YYYYMMDD_HHMMSS.html` - Interactive HTML report with charts
- `locust_YYYYMMDD_HHMMSS_stats.csv` - CSV with detailed statistics
- `locust_YYYYMMDD_HHMMSS_stats_history.csv` - Time-series data

Key metrics to interpret:

- Requests Per Second (RPS)
- Response Time Percentiles
- Success Rate
```
============================================================
DIFY SSE STRESS TEST
============================================================
[2025-09-12 15:45:44,468] Starting test run with 10 users at 2 users/sec
============================================================
SSE Metrics | Active: 8 | Total Conn: 142 | Events: 2841
Rates: 2.4 conn/s | 47.3 events/s | TTFE: 43ms
============================================================
Type     Name                           # reqs   # fails  |    Avg     Min     Max     Med  |   req/s failures/s
---------|------------------------------|--------|--------|--------|--------|--------|--------|--------|-----------
POST     /v1/workflows/run                  142  0(0.00%) |     41      18     192      38  |    2.37       0.00
---------|------------------------------|--------|--------|--------|--------|--------|--------|--------|-----------
         Aggregated                         142  0(0.00%) |     41      18     192      38  |    2.37       0.00
============================================================
FINAL RESULTS
============================================================
Total Connections: 142
Total Events: 2841
Average TTFE: 43 ms
============================================================
```
Live SSE Metrics Box (Updates every 10 seconds):
```
SSE Metrics | Active: 8 | Total Conn: 142 | Events: 2841
Rates: 2.4 conn/s | 47.3 events/s | TTFE: 43ms
```
Standard Locust Table:
```
Type     Name                           # reqs   # fails  |    Avg     Min     Max     Med  |   req/s
POST     /v1/workflows/run                  142  0(0.00%) |     41      18     192      38  |    2.37
```
Performance Targets:
- ✅ Good Performance
- ⚠️ Warning Signs
```yaml
# Light load
concurrency: 10
iterations: 100

# Medium load
concurrency: 100
iterations: 1000

# Heavy load
concurrency: 500
iterations: 5000

# Extreme load
concurrency: 1000
iterations: 10000
```
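Each preset keeps `iterations` at ten times `concurrency`, i.e. roughly ten workflow runs per simulated user; if you define additional levels, scaling both values together keeps runs comparable.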
Gunicorn Tuning for Different Load Levels:
```bash
# Light load (10-50 concurrent users)
uv run gunicorn --bind 0.0.0.0:5001 --workers 2 --worker-class gevent app:app

# Medium load (50-200 concurrent users)
uv run gunicorn --bind 0.0.0.0:5001 --workers 4 --worker-class gevent --worker-connections 1000 app:app

# Heavy load (200-1000 concurrent users)
uv run gunicorn --bind 0.0.0.0:5001 --workers 8 --worker-class gevent --worker-connections 2000 --max-requests 1000 app:app
```
Worker calculation formula: the common Gunicorn rule of thumb is `workers = (2 × CPU cores) + 1`.
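As a quick sanity check, the formula can be computed directly (a small illustrative snippet, not part of the suite):

```python
import multiprocessing

# Gunicorn's common sizing rule: (2 x CPU cores) + 1
workers = (2 * multiprocessing.cpu_count()) + 1
print(f"--workers {workers}")
```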
PostgreSQL Connection Pool Tuning:
For high-concurrency stress testing, increase the PostgreSQL max connections in docker/middleware.env:
```bash
# Edit docker/middleware.env
POSTGRES_MAX_CONNECTIONS=200  # Default is 100

# Recommended values for different load levels:
# Light load (10-50 users): 100 (default)
# Medium load (50-200 users): 200
# Heavy load (200-1000 users): 500
```
After changing, restart the PostgreSQL container:
```bash
docker compose -f docker/docker-compose.middleware.yaml down db
docker compose -f docker/docker-compose.middleware.yaml up -d db
```
Note: Each connection uses ~10MB of RAM, so ensure your database server has sufficient memory (500 connections need roughly 5 GB). For high connection counts, also raise OS limits on the load generator and API host:
```bash
# Increase the open file descriptor limit
ulimit -n 65536

# Increase TCP buffer sizes (Linux)
sudo sysctl -w net.core.rmem_max=134217728
sudo sysctl -w net.core.wmem_max=134217728

# Enable TCP fast open (Linux)
sudo sysctl -w net.ipv4.tcp_fastopen=3

# Increase maximum connections (macOS)
sudo sysctl -w kern.ipc.somaxconn=2048
```
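Changes made with `sysctl -w` last only until reboot; on Linux, persist them in `/etc/sysctl.conf` or a file under `/etc/sysctl.d/`. Note that `kern.ipc.somaxconn` is the macOS name for the listen-queue limit; the Linux equivalent is `net.core.somaxconn`.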
```bash
# Dependencies are installed automatically, but if needed:
uv --project api add --dev locust sseclient-py

# Run setup
python scripts/stress-test/setup_all.py

# Start Dify API with Gunicorn (production mode)
cd api
uv run gunicorn --bind 0.0.0.0:5001 --workers 4 --worker-class gevent app:app

# Start Mock OpenAI server
python scripts/stress-test/setup/mock_openai_server.py
```
High error rate:
Permission denied running script:
```bash
chmod +x scripts/stress-test/run_locust_stress_test.sh
```
```bash
# Run stress test 3 times with 60-second intervals
for i in {1..3}; do
  echo "Run $i of 3"
  ./run_locust_stress_test.sh
  sleep 60
done
```
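The 60-second pause between runs gives lingering SSE connections time to close and lets the server settle back to an idle baseline, so consecutive runs do not skew each other's numbers.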
Run Locust directly with custom options:
```bash
# With specific user count and spawn rate
uv run --project api python -m locust -f scripts/stress-test/sse_benchmark.py \
  --host http://localhost:5001 --users 50 --spawn-rate 5

# Generate CSV reports
uv run --project api python -m locust -f scripts/stress-test/sse_benchmark.py \
  --host http://localhost:5001 --csv reports/results

# Run for specific duration
uv run --project api python -m locust -f scripts/stress-test/sse_benchmark.py \
  --host http://localhost:5001 --run-time 5m --headless
```
```bash
# Compare multiple stress test runs
ls -la reports/locust_summary_*.txt | tail -5
```
Possible causes:
Check for:
Investigate:
Locust was chosen over Drill for this stress test because:
To improve the stress test suite:
- `stress_test.yml` for configuration changes
- `run_locust_stress_test.sh` for workflow improvements