H
2290c2a2f0
fix pdf_parser char content confusion (#1462)
### What problem does this PR solve?
#1407
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1 year ago
H
dbb8f7b77b
fix pdf_parser content confusion (#1458)
### What problem does this PR solve?
#1407
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1 year ago
Zhedong Cen
45853505bb
Fix occasional errors in pdf table recognition (#1277)
### What problem does this PR solve?
Fix occasional errors in pdf table recognition
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1 year ago
KevinHuSh
4454ba7a1e
add self-rag (#1070)
### What problem does this PR solve?
#1069
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
1 year ago
Jin Hai
cdea1d0a85
Update readme and add license (#1018)
### What problem does this PR solve?
- Update readme
- Add license
### Type of change
- [x] Documentation Update
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
1 year ago
KevinHuSh
843720f958
fix bug in pdf parser (#986)
### What problem does this PR solve?
#963
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1 year ago
KevinHuSh
7eee193956
fix #917 #915 (#946)
### What problem does this PR solve?
#917
#915
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1 year ago
xinzhuang
3bbdf3b770
fix bug in computing the 'not concatenating' feature (#896)
### What problem does this PR solve?
When the PDF parser calls the `_naive_vertical_merge` method, it computes a
"not concatenating" feature from the difference between the `layoutno` values
of `b` and `b_`. The code actually compares `b` with `b`, which looks like a
bug, so this PR fixes it. Please check again.
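
A minimal sketch of the comparison described above, not the project's actual code. It assumes each text box is a dict with a `layoutno` key; the function names and sample boxes are hypothetical and exist only for illustration.

```python
def layout_differs_buggy(b, b_):
    # Bug: b is compared with itself, so this is always False
    # and the feature can never block a merge.
    return b.get("layoutno", 0) != b.get("layoutno", 0)


def layout_differs_fixed(b, b_):
    # Fix: compare the current box b with the following box b_.
    return b.get("layoutno", 0) != b_.get("layoutno", 0)


# Hypothetical boxes that belong to two different layout regions.
upper = {"text": "end of one block", "layoutno": 0}
lower = {"text": "start of another block", "layoutno": 1}

buggy_result = layout_differs_buggy(upper, lower)
fixed_result = layout_differs_fixed(upper, lower)
```

With the self-comparison, the two boxes are never reported as belonging to different layouts; comparing `b` against `b_` reports the difference.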
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1 year ago
KevinHuSh
99be226c7c
fix coordinate error (#686)
### What problem does this PR solve?
#683
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1 year ago
KevinHuSh
cab274f560
remove PyMuPDF (#618)
### What problem does this PR solve?
#613
### Type of change
- [x] Other (please describe):
1 year ago
KevinHuSh
8c07992b6c
refine code (#595)
### What problem does this PR solve?
### Type of change
- [x] Refactoring
1 year ago
KevinHuSh
d589b0f568
fix exception in pdf parser (#584)
### What problem does this PR solve?
#451
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1 year ago
KevinHuSh
9d60a84958
refactor code (#583)
### What problem does this PR solve?
### Type of change
- [x] Refactoring
1 year ago
KevinHuSh
66f8d35632
Refactor (#537)
### What problem does this PR solve?
### Type of change
- [x] Refactoring
1 year ago
KevinHuSh
0dfc8ddc0f
enlarge docker memory usage (#501)
### What problem does this PR solve?
### Type of change
- [x] Refactoring
1 year ago
KevinHuSh
962c66714e
fix divide by zero bug (#447)
### What problem does this PR solve?
#445
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1 year ago
加帆
39f1feaccb
Bug fix pdf parse index out of range (#440)
### What problem does this PR solve?
Fix a bug that occurs when parsing some PDF files (#436)
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1 year ago
KevinHuSh
0499a3f621
rm page number exception for pdf parser (#424)
### What problem does this PR solve?
#423
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
1 year ago
KevinHuSh
453c29170f
make sure the models will not be loaded twice (#422)
### What problem does this PR solve?
#381
### Type of change
- [x] Refactoring
1 year ago
KevinHuSh
a5384446e3
let's load model from local (#163)
1 year ago
KevinHuSh
fd7fcb5baf
apply PEP 8 formatting (#155)
1 year ago
KevinHuSh
979b3a5b4b
support snapshot download from local (#153)
* support snapshot download from local
* let snapshot download from local
1 year ago
KevinHuSh
da21320b88
fix plainPdf bugs (#152)
1 year ago
KevinHuSh
71fe314955
refine page ranges (#147)
1 year ago
KevinHuSh
f6aee7f230
add use layout or not option (#145)
* add use layout or not option
* trivial
1 year ago
KevinHuSh
6c6b144de2
refine manual parser (#140)
1 year ago
KevinHuSh
6999598101
refine for English corpus (#135)
1 year ago
KevinHuSh
9a843667b3
fix github account login issue (#132)
1 year ago
KevinHuSh
9da671b951
refine manual parser (#131)
1 year ago
KevinHuSh
675a9f8d9a
add dockerfile for cuda environment. Refine table search strategy (#123)
1 year ago
KevinHuSh
8f86ab9f7f
refine pdf parser, add time zone to userinfo (#112)
1 year ago
KevinHuSh
602038ac49
fix task canceling bug (#98)
1 year ago
KevinHuSh
8a57f2afd5
change callback strategy, add timezone to docker (#96)
1 year ago
KevinHuSh
7bfaf0df29
fix position extraction bug (#93)
* fix position extraction bug
* remove delimiter for naive parser
1 year ago
KevinHuSh
685b4d8a95
fix table desc bugs, add positions to chunks (#91)
1 year ago
KevinHuSh
8a726fb04b
solve task execution issues (#90)
1 year ago
KevinHuSh
3d4315c42a
resolve the issue of naive parser (#87)
1 year ago
KevinHuSh
0429107e80
fix user login issue (#85)
1 year ago
KevinHuSh
4568a4b2cb
refine admin initialization (#75)
1 year ago
KevinHuSh
d32322c081
rename vision, add layout and tsr recognizer (#70)
* rename vision, add layout and tsr recognizer
* trivial fixing
1 year ago
KevinHuSh
cacd36c5e1
use onnx models, new deepdoc (#68)
1 year ago
KevinHuSh
a8294f2168
Refine resume parts and fix bugs in retrieval using sql (#66)
1 year ago
KevinHuSh
407b2523b6
remove unused code, separate layout detection out as a new api. Add new rag method 'table' (#55)
1 year ago
KevinHuSh
51482f3e2a
Some document API refined. (#53)
Add naive chunking method to RAG
1 year ago
KevinHuSh
e6acaf6738
Add Q&A and Book, fix task running bugs (#50)
1 year ago
KevinHuSh
6224edcd1b
Add task module, and pipeline the task and every parser (#49)
1 year ago
KevinHuSh
96a1a44cb6
add paper & manual parser (#46)
1 year ago
KevinHuSh
072f9dd5bc
Add app to rag module: presentation & laws (#43)
1 year ago
KevinHuSh
484e5abc1f
llm configuration refine and trievalTest API refine (#40)
1 year ago
KevinHuSh
30791976d5
build python version rag-flow (#21)
* clean rust version project
* build python version rag-flow
1 year ago