Kaynağa Gözat

fix: retry embedding with Qwen family models when limits temporarily reached. (#8690)

fix: retry embedding with Qwen family models when limits temporarily
reached.

APIs of Qwen family models are limited by calling rates. When reached,
the "output" attribute of the "resp" will be None, and in turn cause
TypeError when trying to retrieve "embeddings". Since these limits are
almost temporary, I have added a simple retry mechanism to avoid it.
Besides, if retry_max reached, the error can be early raised, instead of
hidden behind "TypeError".

### What problem does this PR solve?

Sometimes Qwen blocks calling due to rate limits, but it will cause the
whole parsing procedure stops when creating knowledge base. In this
situation, resp["output"] will be None, and resp["output"]["embeddings"]
will cause TypeError. Since the limits are temporary, I apply a simple
retry mechanism to solve it.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
tags/v0.20.0
6607changchun 3 ay önce
ebeveyn
işleme
9580e99650
No account linked to committer's email address
1 değiştirilmiş dosya ile 9 ekleme ve 0 silme
  1. 9
    0
      rag/llm/embedding_model.py

+ 9
- 0
rag/llm/embedding_model.py Dosyayı Görüntüle

@@ -200,13 +200,22 @@ class QWenEmbed(Base):

def encode(self, texts: list):
import dashscope
import time

batch_size = 4
res = []
token_count = 0
texts = [truncate(t, 2048) for t in texts]
for i in range(0, len(texts), batch_size):
retry_max = 5
resp = dashscope.TextEmbedding.call(model=self.model_name, input=texts[i : i + batch_size], api_key=self.key, text_type="document")
while resp["output"] is None and retry_max > 0:
time.sleep(10)
resp = dashscope.TextEmbedding.call(model=self.model_name, input=texts[i : i + batch_size], api_key=self.key, text_type="document")
retry_max -= 1
if retry_max == 0 and resp["output"] is None:
log_exception(ValueError("Retry_max reached, calling embedding model failed"))
raise
try:
embds = [[] for _ in range(len(resp["output"]["embeddings"]))]
for e in resp["output"]["embeddings"]:

Loading…
İptal
Kaydet