### What problem does this PR solve?
Add Claude Opus 4.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
### What problem does this PR solve?
Add a series of qwen3 latest SOTA models:
qwen3-coder-480b-a35b-instruct, qwen3-30b-a3b-instruct-2507,
qwen3-30b-a3b-thinking-2507, qwen3-235b-a22b-instruct-2507,
qwen3-235b-a22b-thinking-2507
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix kimi-latest is not authorized.
Add kimi-thinking-preview.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
add Kimi-K2-Instruct from Tongyi-Qianwen API (#9125)
### What problem does this PR solve?
add Kimi-K2-Instruct from Tongyi-Qianwen API
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add model provider DeepInfra. This model list comes from our community.
NOTE: most endpoints haven't been tested, but they should work as OpenAI
does.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix: Adds newest Gemini models to fit google's standard API rate limits (#8970)
### What problem does this PR solve?
Adds configurations for gemini-2.5-flash and Gemini 2.5-pro models,
including tags, maximum token limits, and model types.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add Kimi model series support.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR enhances the application's capabilities by adding support for
four new Voyage embedding models (voyage-3-large, voyage-3.5,
voyage-3.5-lite, and voyage-code-3) to the `llm_factories.json`
configuration file. These models expand the available options for text
embedding tasks, enabling improved processing of text data with a
maximum token limit of 32,000. This addition addresses the need for more
diverse and specialized embedding models to support various use cases
without altering existing functionality.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add xAI provider (experimental feature, requires user feedback).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add Qwen3-Embedding text-embedding-v4.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
change default models to buildin models
https://github.com/infiniflow/ragflow/issues/7774
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Feat: update llm factories for SILICONFLOW (#7620)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Other (please describe): llm factories update
Feat: add support for OpenAi gpt 4.1 series (#7540)
### What problem does this PR solve?
Adds support for the GPT-4.1 series from OpenAI.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
fix bug #7309 deepseek-ai/deepseek-vl2 model can not be select as a VL model to parse pdf image (#7312)
### What problem does this PR solve?
fix deepseek-ai/deepseek-vl2 model can not be select as a VL model to
parse pdf image . And add other vl models config from siliconflow
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: unknown <taoshi.ln@chinatelecom.cn>
### What problem does this PR solve?
Qwen3 and more LLMs.
Close #7296
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
With current config will get error "Fail to access model(gemma-7b-it)
using this api key"
Since the model has been removed, according to Groq official document:
https://console.groq.com/docs/models
### Type of change
- [ x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Chenzy <chenzy901@gmail.com>
### What problem does this PR solve?
AWS Bedrock has made deepseek-r1 available on its serverless inference.
This adds the R1 serverless model for use via the bedrock model
abilities.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Support downloading models from ModelScope Community. (#5073)
This PR supports downloading models from ModelScope. The main
modifications are as follows:
-New Feature (non-breaking change which adds functionality)
-Documentation Update
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
The current design is not well-suited for multimodal models, as each
model can only be configured for a single purpose—either chat or
Img2txt. To work around this limitation, we use model aliases such as
gpt-4o-mini and gpt-4o-mini-2024-07-18.
To fix this, this PR allows specifying the Img2txt model by tag instead
of model_type.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add a LLM provider: PPIO
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Add GPUStack as a new model provider.
[GPUStack](https://github.com/gpustack/gpustack) is an open-source GPU
cluster manager for running LLMs. Currently, locally deployed models in
GPUStack cannot integrate well with RAGFlow. GPUStack provides both
OpenAI compatible APIs (Models / Chat Completions / Embeddings /
Speech2Text / TTS) and other APIs like Rerank. We would like to use
GPUStack as a model provider in ragflow.
[GPUStack Docs](https://docs.gpustack.ai/latest/quickstart/)
Related issue: https://github.com/infiniflow/ragflow/issues/4064.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Testing Instructions
1. Install GPUStack and deploy the `llama-3.2-1b-instruct` llm, `bge-m3`
text embedding model, `bge-reranker-v2-m3` rerank model,
`faster-whisper-medium` Speech-to-Text model, `cosyvoice-300m-sft` in
GPUStack.
2. Add provider in ragflow settings.
3. Testing in ragflow.