
perf: Optimize GraphRAG’s LOOP_PROMPT (#7356)

### What problem does this PR solve?

Currently, GraphRAG's LOOP_PROMPT causes the model to keep appending
entities and relationships after it has already answered "Y", which wastes time.
Following the latest code in
[graphrag](https://github.com/microsoft/graphrag/blob/main/graphrag/prompts/index/extract_graph.py),
I changed LOOP_PROMPT to ask for a single-letter Y or N answer and updated
the check in `graph_extractor.py` from "YES" to "Y" to match. In
verification, the updated prompt reliably outputs "Y" and stops.
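To make the control flow concrete, here is a minimal sketch of how a Y/N continuation check can gate the gleaning loop. The diff only shows a few lines of the real loop, so the loop structure is an assumption, and `glean`, `chat`, `make_stub`, and `max_gleanings` are hypothetical names used for illustration. The real code is asynchronous and calls `self._chat`. Only the two prompt strings and the `!= "Y"` comparison are taken from this commit.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

# Prompt strings as they appear in graph_prompt.py after this commit.
CONTINUE_PROMPT = "MANY entities were missed in the last extraction. Add them below using the same format:\n"
LOOP_PROMPT = (
    "It appears some entities may have still been missed. "
    "Answer Y if there are still entities that need to be added, or N if there are none. "
    "Please answer with a single letter Y or N.\n"
)


def glean(chat: Callable[[List[Message]], str], history: List[Message], max_gleanings: int) -> str:
    """Hypothetical, synchronous simplification of the gleaning loop."""
    results = ""
    for i in range(max_gleanings):
        # Ask the model to add entities it missed.
        history.append({"role": "user", "content": CONTINUE_PROMPT})
        response = chat(history)
        results += response
        if i >= max_gleanings - 1:
            break
        history.append({"role": "assistant", "content": response})
        # Ask whether another round is needed; only an exact "Y" continues.
        history.append({"role": "user", "content": LOOP_PROMPT})
        continuation = chat(history)
        if continuation != "Y":
            break
        history.append({"role": "assistant", "content": "Y"})
    return results


def make_stub(replies: List[str]) -> Callable[[List[Message]], str]:
    """Return a fake chat function that replays canned replies in order."""
    it = iter(replies)
    return lambda history: next(it)


if __name__ == "__main__":
    stub = make_stub(["(entity A)", "Y", "(entity B)", "N"])
    print(glean(stub, [], max_gleanings=5))
```

Because the comparison is an exact match, any reply other than a bare "Y" ends the loop, which is why the prompt now asks for a single letter.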

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
tags/v0.19.0
liuzhenghua 6 months ago
Parent
commit
af770c5ced
2 files changed with 3 additions and 2 deletions
  1. graphrag/general/graph_extractor.py (+2, -1)
  2. graphrag/general/graph_prompt.py (+1, -1)

graphrag/general/graph_extractor.py (+2, -1)

@@ -130,8 +130,9 @@ class GraphExtractor(Extractor):
             async with chat_limiter:
                 continuation = await trio.to_thread.run_sync(lambda: self._chat("", history, {"temperature": 0.8}))
             token_count += num_tokens_from_string("\n".join([m["content"] for m in history]) + response)
-            if continuation != "YES":
+            if continuation != "Y":
                 break
+            history.append({"role": "assistant", "content": "Y"})
 
         records = split_string_by_multi_markers(
             results,

graphrag/general/graph_prompt.py (+1, -1)

@@ -106,7 +106,7 @@ Text: {input_text}
 Output:"""
 
 CONTINUE_PROMPT = "MANY entities were missed in the last extraction. Add them below using the same format:\n"
-LOOP_PROMPT = "It appears some entities may have still been missed. Answer YES | NO if there are still entities that need to be added.\n"
+LOOP_PROMPT = "It appears some entities may have still been missed. Answer Y if there are still entities that need to be added, or N if there are none. Please answer with a single letter Y or N.\n"
 
 SUMMARIZE_DESCRIPTIONS_PROMPT = """
 You are a helpful assistant responsible for generating a comprehensive summary of the data provided below.
