Procházet zdrojové kódy

fix: retry embedding with Qwen family models when limits temporarily reached. (#8690)

fix: retry embedding with Qwen family models when limits temporarily
reached.

APIs of Qwen family models are limited by calling rates. When reached,
the "output" attribute of the "resp" will be None, and in turn cause
TypeError when trying to retrieve "embeddings". Since these limits are
almost temporary, I have added a simple retry mechanism to avoid it.
Besides, if retry_max reached, the error can be early raised, instead of
hidden behind "TypeError".

### What problem does this PR solve?

Sometimes Qwen blocks calling due to rate limits, but it will cause the
whole parsing procedure stops when creating knowledge base. In this
situation, resp["output"] will be None, and resp["output"]["embeddings"]
will cause TypeError. Since the limits are temporary, I apply a simple
retry mechanism to solve it.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
tags/v0.20.0
6607changchun před 3 měsíci
rodič
revize
9580e99650
Žádný účet není propojen s e-mailovou adresou tvůrce revize
1 změnil soubory, kde provedl 9 přidání a 0 odebrání
  1. 9
    0
      rag/llm/embedding_model.py

+ 9
- 0
rag/llm/embedding_model.py Zobrazit soubor

@@ -200,13 +200,22 @@ class QWenEmbed(Base):

def encode(self, texts: list):
import dashscope
import time

batch_size = 4
res = []
token_count = 0
texts = [truncate(t, 2048) for t in texts]
for i in range(0, len(texts), batch_size):
retry_max = 5
resp = dashscope.TextEmbedding.call(model=self.model_name, input=texts[i : i + batch_size], api_key=self.key, text_type="document")
while resp["output"] is None and retry_max > 0:
time.sleep(10)
resp = dashscope.TextEmbedding.call(model=self.model_name, input=texts[i : i + batch_size], api_key=self.key, text_type="document")
retry_max -= 1
if retry_max == 0 and resp["output"] is None:
log_exception(ValueError("Retry_max reached, calling embedding model failed"))
raise
try:
embds = [[] for _ in range(len(resp["output"]["embeddings"]))]
for e in resp["output"]["embeddings"]:

Načítá se…
Zrušit
Uložit