fix bug [ERROR][Exception]: 8 vs. 9 (#6955)

### What problem does this PR solve?

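Sometimes the **s** in a chunk **(s, a)** is an empty string. Such chunks
are dropped by the filter **chunks = [(s, a) for s, a in chunks if s and
len(a) > 0]**, so the filtered list is shorter than the original. Because
**layers** and **start, end** were computed from **len(chunks)** before the
filter ran, **end** no longer matched the length of the filtered list. As a
result, the final assertion **assert len(chunks) - end == n_clusters,
"{} vs. {}".format(len(chunks) - end, n_clusters)** fails and raises a
confusing error such as 7 vs. 8. This PR computes **layers**, **start**, and
**end** after the filter instead.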
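
The mismatch can be reproduced in isolation. The sketch below is not the code in `rag/raptor.py`: the helper names, the sample data, and the stand-in for the summarization step are all hypothetical, and it only assumes that each layer appends `n_clusters` new summary chunks, as the assertion implies.

```python
def keep_valid(chunks):
    # Same filter as in the PR: drop chunks with empty text or empty embedding.
    return [(s, a) for s, a in chunks if s and len(a) > 0]


def appended_count(chunks, compute_end_first, n_clusters):
    """Return len(chunks) - end after one simulated layer (hypothetical helper)."""
    if compute_end_first:
        end = len(chunks)          # old order: measured before filtering
        chunks = keep_valid(chunks)
    else:
        chunks = keep_valid(chunks)
        end = len(chunks)          # new order: measured after filtering
    # Stand-in for summarization: one new chunk per cluster (assumption).
    chunks = chunks + [("summary", [0.0])] * n_clusters
    return len(chunks) - end


# One of the three chunks has an empty string, so the filter removes it.
sample = [("alpha", [0.1]), ("", [0.2]), ("beta", [0.3])]
n_clusters = 2

old_count = appended_count(sample, compute_end_first=True, n_clusters=n_clusters)
new_count = appended_count(sample, compute_end_first=False, n_clusters=n_clusters)
print("old order:", old_count, "vs.", n_clusters)
print("new order:", new_count, "vs.", n_clusters)
```

With the old ordering the count comes out one short for every chunk the filter removed, which is the kind of off-by-one reported in the error message.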

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
tags/v0.18.0
dylan 6 months ago
parent
commit
e54c0e39b5
1 changed file with 2 additions and 2 deletions
rag/raptor.py

@@ -77,11 +77,11 @@ class RecursiveAbstractiveProcessing4TreeOrganizedRetrieval:
         return optimal_clusters

     async def __call__(self, chunks, random_state, callback=None):
-        layers = [(0, len(chunks))]
-        start, end = 0, len(chunks)
         if len(chunks) <= 1:
             return []
         chunks = [(s, a) for s, a in chunks if s and len(a) > 0]
+        layers = [(0, len(chunks))]
+        start, end = 0, len(chunks)

         async def summarize(ck_idx: list[int]):
             nonlocal chunks
