### What problem does this PR solve?
```bash
Traceback (most recent call last):
File "/home/infiniflow/workspace/ragflow/api/db/services/document_service.py", line 635, in update_progress
info["progress_msg"] = "%d tasks are ahead in the queue..."%get_queue_length(priority)
File "/home/infiniflow/workspace/ragflow/api/db/services/document_service.py", line 686, in get_queue_length
return int(group_info.get("lag", 0))
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
```
This issue can happen very rare. When a `stream` is first created, the
`lag` value may be nil, which can cause this issue. However, once any
message is synced, the `lag` will become `0` afterwards.
```bash
> XINFO GROUPS rag_flow_svr_queue
1) 1) "name"
2) "rag_flow_svr_task_broker"
3) "consumers"
4) (integer) 0
5) "pending"
6) (integer) 0
7) "last-delivered-id"
8) "1753952489937-0"
9) "entries-read"
10) (nil)
11) "lag"
12) (nil)
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
#9082#6365
<u> **WARNING: it's not compatible with the older version of `Agent`
module, which means that `Agent` from older versions can not work
anymore.**</u>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix:When deleting a knowledge base that is currently performing a parsing task, the parsing queue will not be deleted! (#9018)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8995
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix chunk number error after re-parsing. #8503.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
fix: resolve residual image files issue after document deletion (#7964)
### What problem does this PR solve?
When deleting knowledge base documents in RAGFlow, the current process
only removes the block texts in Elasticsearch and the original files in
MinIO, but it leaves behind many binary images and thumbnails generated
during chunking. This pull request improves the deletion process by
querying the block information in Elasticsearch to ensure a more
thorough and complete cleanup.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Feat: KB detail supports document total size (#7546)
### What problem does this PR solve?
Kb detail supports return document total size now.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Feat: Improve 'user_canvan_version' delete and 'document' delete performance (#6553)
### What problem does this PR solve?
1. Add delete_by_ids method
2. Add get_doc_ids_by_doc_names
3. Improve user_canvan_version's logic (avoid O(n) db IO)
4. Improve document delete logic (avoid O(n) db IO)
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Fix `filed_map` was incorrectly persisted. #7412
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix: whole knowledge graph lost after removing any document in the knowledge base (#7151)
### What problem does this PR solve?
When you removed any document in a knowledge base using knowledge graph,
the graph's `removed_kwd` is set to "Y".
However, in the function `graphrag.utils.get_gaph`, `rebuild_graph`
method is passed and directly return `None` while `removed_kwd=Y`,
making residual part of the graph abandoned (but old entity data still
exist in db).
Besides, infinity instance actually pass deleting graph components'
`source_id` when removing document. It may cause wrong graph after
rebuild.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
[BREAKING CHANGE] GET to POST: enhance document list capability (#7349)
### What problem does this PR solve?
Enhance capability of `list_docs`.
Breaking change: change method from `GET` to `POST`.
### Type of change
- [x] Refactoring
- [x] Enhancement with breaking change
Fix knowledge_graph_kwd on infinity. Close #6476 and #6624 (#6651)
### What problem does this PR solve?
Fix knowledge_graph_kwd on infinity. Close #6476 and #6624
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Resolve document concurrent upload issue. #6039
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Refactored DocumentService.update_progress
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Fix:validate knowledge base association before document upload (#5373)
### What problem does this PR solve?
fix this bug: https://github.com/infiniflow/ragflow/issues/5368
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
Feat: Add support for document meta fields update through api (#5120)
### What problem does this PR solve?
add support for update document meta data through api
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Try to reuse existing chunks. Close #3793
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Fix logs. Use dict.pop instead of del. Close #3473 (#3484)
### What problem does this PR solve?
Fix logs. Use dict.pop instead of del.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Move settings initialization after module init phase (#3438)
### What problem does this PR solve?
1. Module init won't connect database any more.
2. Config in settings need to be used with settings.CONFIG_NAME
### Type of change
- [x] Refactoring
Signed-off-by: jinhai <haijin.chn@gmail.com>
Use consistent log file names, introduced initLogger (#3403)
### What problem does this PR solve?
Use consistent log file names, introduced initLogger
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Integration with Infinity
- Replaced ELASTICSEARCH with dataStoreConn
- Renamed deleteByQuery with delete
- Renamed bulk to upsertBulk
- getHighlight, getAggregation
- Fix KGSearch.search
- Moved Dealer.sql_retrieval to es_conn.py
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add test for API
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>