156 Commits (v0.18.0)

Author SHA1 Message Date
  QuintinTao 1b4016317e
fix bug chunking: expected string or bytes-like object (#7116) 6 months ago
  Kevin Hu ed5f81b02e
Fix: abnormal cell merging. (#6991) 6 months ago
  dylan 5aae73c230
Make error messages during PPT processing clearer. (#6980) 6 months ago
  Kevin Hu 14a3efd756
Fix: docx image exceptions. (#6839) 6 months ago
  Kevin Hu ee5aa51d43
Fix: point in tag issue. (#6436) 7 months ago
  fansir 0e0ebaac5f
Feat: Adds hierarchical title path tracking for tables in DOCX documents to improve context association (#6374) 7 months ago
  Kevin Hu 95497b4aab
Fix: adapt to old configurations. (#6321) 7 months ago
  Yongteng Lei 9611185eb4
Feat: add VLM-boosted DocX parser (#6307) 7 months ago
  Yongteng Lei e4380843c4
Feat: add fallback for PDF figure parser (#6305) 7 months ago
  Yongteng Lei 1d6760dd84
Feat: add VLM-boosted PDF parser (#6278) 7 months ago
  Yongteng Lei 5cf610af40
Feat: add vision LLM PDF parser (#6173) 7 months ago
  Kevin Hu 1333d3c02a
Fix: float transfer exception. (#6197) 7 months ago
  Kevin Hu 3a99c2b5f4
Refa: PARALLEL_DEVICES is a static parameter. (#6168) 7 months ago
  Kevin Hu bfa8d342b3
Fix: retrieval debug mode issue. (#6150) 7 months ago
  Debug Doctor 3e19044dee
Feat: add OCR's multi-gpus and parallel processing support (#5972) 7 months ago
  Yongteng Lei 4ff609b6a8
Fix: optimize OCR garbage identification to reduce unnecessary filtering (#6027) 7 months ago
  Yongteng Lei 7cd37c37cd
Feat: add CSV file parsing support (#5989) 7 months ago
  hy89 b0c21b00d9
Refactor: Optimize error handling and support parsing of XLS(EXCEL97—2003) files. (#5633) 8 months ago
  Kevin Hu b418ce5643
Fix table parser issue. (#5482) 8 months ago
  Kevin Hu 4f40f685d9
Code refactor (#5371) 8 months ago
  Kevin Hu c28bc41a96
Fix docx table issue. (#5117) 8 months ago
  Kevin Hu c24137bd11
Fix too long integer for `Table`. (#4651) 9 months ago
  Kevin Hu 9d717f0b6e
Fix csv reader exception. (#4628) 9 months ago
  Kevin Hu 13f04b7cca
Fix pdf applying Q&A issue. (#4599) 9 months ago
  Kevin Hu dd0ebbea35
Light GraphRAG (#4585) 9 months ago
  Jin Hai 3894de895b
Update comments (#4569) 9 months ago
  Kevin Hu f556f0239c
Fix dify retrieval issue. (#4473) 9 months ago
  Kevin Hu e098fcf6ad
Fix csv for TAG. (#4454) 9 months ago
  Kevin Hu c5da3cdd97
Tagging (#4426) 9 months ago
  Yingfeng 50f209204e
Synchronize with enterprise version (#4325) 10 months ago
  Kevin Hu 8fb18f37f6
Code refactor. (#4291) 10 months ago
  TeslaZY dd13a5d05c
Fix some bugs in text2sql.(#4279)(#4281) (#4280) 10 months ago
  ly0303521 101b8ff813
fix chunk method "Table" losing content when the Excel file has multi… (#4123) 10 months ago
  liuhua 1d65299791
Fix rerank_model bug in chat and markdown bug (#4061) 10 months ago
  Zhichang Yu 03f00c9e6f
Rename page_num_list, top_list, position_list (#3940) 10 months ago
  Kevin Hu 927873bfa6
Fix syn error. (#3953) 10 months ago
  Zhichang Yu 0d68a6cd1b
Fix errors detected by Ruff (#3918) 10 months ago
  Jin Hai 821fdf02b4
Fix parsing JSON file error (#3829) 11 months ago
  Jin Hai 08c1a5e1e8
Refactor parse progress (#3781) 11 months ago
  Jin Hai e079656473
Update progress info and start welcome info (#3768) 11 months ago
  kuschzzp e678819f70
Fix RGBA error (#3707) 11 months ago
  Zhichang Yu bc701d7b4c
Edit chunk shall update instead of insert it (#3709) 11 months ago
  Kevin Hu 609236f5c1
Let 'One' applicable for tables in docx (#3619) 11 months ago
  Zhichang Yu 482c1b59c8
Check tika.parser return result (#3564) 11 months ago
  Michal Masrna c4f2464935
fix: laws.py added missing import logging (#3501) 11 months ago
  Zhichang Yu 30f6421760
Use consistent log file names, introduced initLogger (#3403) 11 months ago
  Kevin Hu 83c6b1f308
set DLA active for KG (#3386) 11 months ago
  Zhichang Yu a2a5631da4
Rework logging (#3358) 11 months ago
  Zhichang Yu f4c52371ab
Integration with Infinity (#2894) 11 months ago
  Kevin Hu f86826b7a0
refactor error message of qwen (#3074) 1 year ago