179 Commits (v0.20.1)

Author SHA1 Message Date
  yzz 550e65bb22
Fix: PlainParser using fix in presentation (#9239) 2 months ago
  Jay Xu cae11201ef
fix "out of memory" if slide.get_thumbnail() to a huge image (#9211) 2 months ago
  Kevin Hu d9fe279dde
Feat: Redesign and refactor agent module (#9113) 3 months ago
  Yongteng Lei 39ef2ffba9
Feat: parsing supports jsonl or ldjson format (#9087) 3 months ago
  Stephen Hu 92cfbcb382
Fix: when parse markdown support extract image at local (#8906) 3 months ago
  Yongteng Lei e9b14142a5
Fix: fixed invalid save() arguments for slide thumbnails (#8851) 3 months ago
  Yongteng Lei 51a8604dcb
Fix: fixed context loss caused by separating markdown tables from original text (#8844) 3 months ago
  Stephen Hu ce140f1393
Fix:Better Support Table Value Type (#8822) 3 months ago
  Stephen Hu 2b7adbd2d1
Fix: Improve Memory Usage For Presentation (#8792) 3 months ago
  wenxuan.zhang f586dd0a96
Fix: docx parse error. (#8600) 4 months ago
  Tuan Le 6b1221d2f6
Fix parser_config access for layout_recognize in presentation.py (#8492) 4 months ago
  liuzhenghua 5256980ffb
Fix: Solve the OOM issue when passing large PDF files while using QA chunking method. (#8464) 4 months ago
  HaiyangP d6a941ebf5
Fix the bug of long type value overflow (#8313) 4 months ago
  Jin Hai 4a2ff633e0
Fix typo in code (#8327) 4 months ago
  HaiyangP baf32ee461
Display only the duplicate column names and corresponding original source. (#8138) 4 months ago
  Kevin Hu 24625e0695
Fix: presentation of PDF using vlm. (#8133) 4 months ago
  Yongteng Lei bd4678bca6
Fix: Unnecessary truncation in markdown parser (#7972) 5 months ago
  Kevin Hu bfe97d896d
Fix: docx get image exception. (#7636) 5 months ago
  Kevin Hu 321a280031
Feat: add image preview to retrieval test. (#7610) 5 months ago
  alkscr baa108f5cc
Fix: markdown table conversion error (#7570) 5 months ago
  WhiteBear 5352bdf4da
Error storing tag in Redis (#7541) 5 months ago
  Stephen Hu 1a5608d0f8
Fix: Add title_tks for Pictures (#7365) 6 months ago
  Stephen Hu 1662c7eda3
Feat: Markdown add image (#7124) 6 months ago
  QuintinTao 1b4016317e
fix bug chunking:expected string or bytes-like object (#7116) 6 months ago
  Kevin Hu ed5f81b02e
Fix: abnormal cell mergeing. (#6991) 6 months ago
  dylan 5aae73c230
Make error messages during PPT processing clearer. (#6980) 6 months ago
  Kevin Hu 14a3efd756
Fix: docx image exceptions. (#6839) 6 months ago
  Kevin Hu ee5aa51d43
Fix: point in tag issue. (#6436) 7 months ago
  fansir 0e0ebaac5f
Feat: Adds hierarchical title path tracking for tables in DOCX documents to improve context association (#6374) 7 months ago
  Kevin Hu 95497b4aab
Fix: adapt to old configurations. (#6321) 7 months ago
  Yongteng Lei 9611185eb4
Feat: add VLM-boosted DocX parser (#6307) 7 months ago
  Yongteng Lei e4380843c4
Feat: add fallback for PDF figure parser (#6305) 7 months ago
  Yongteng Lei 1d6760dd84
Feat: add VLM-boosted PDF parser (#6278) 7 months ago
  Yongteng Lei 5cf610af40
Feat: add vision LLM PDF parser (#6173) 7 months ago
  Kevin Hu 1333d3c02a
Fix: float transfer exception. (#6197) 7 months ago
  Kevin Hu 3a99c2b5f4
Refa: PARALLEL_DEVICES is a static parameter. (#6168) 7 months ago
  Kevin Hu bfa8d342b3
Fix: retrieval debug mode issue. (#6150) 7 months ago
  Debug Doctor 3e19044dee
Feat: add OCR's muti-gpus and parallel processing support (#5972) 7 months ago
  Yongteng Lei 4ff609b6a8
Fix: optimize OCR garbage identification to reduce unnecessary filtering (#6027) 7 months ago
  Yongteng Lei 7cd37c37cd
Feat: add CSV file parsing support (#5989) 7 months ago
  hy89 b0c21b00d9
Refactor: Optimize error handling and support parsing of XLS(EXCEL97—2003) files. (#5633) 8 months ago
  Kevin Hu b418ce5643
Fix table parser issue. (#5482) 8 months ago
  Kevin Hu 4f40f685d9
Code refactor (#5371) 8 months ago
  Kevin Hu c28bc41a96
Fix docx table issue. (#5117) 8 months ago
  Kevin Hu c24137bd11
Fix too long integer for `Table`. (#4651) 9 months ago
  Kevin Hu 9d717f0b6e
Fix csv reader exception. (#4628) 9 months ago
  Kevin Hu 13f04b7cca
Fix pdf applying Q&A issue. (#4599) 9 months ago
  Kevin Hu dd0ebbea35
Light GraphRAG (#4585) 9 months ago
  Jin Hai 3894de895b
Update comments (#4569) 9 months ago
  Kevin Hu f556f0239c
Fix dify retrieval issue. (#4473) 9 months ago