
Fix: markdown table conversion error (#7570)

### What problem does this PR solve?

After `from markdown import markdown` was replaced with `import markdown`
in `rag/app/naive.py`, the existing code that converts Markdown tables
ended up calling the `markdown` module itself instead of the callable
`markdown()` function, which raises an error.
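
The failure mode described above can be reproduced with a minimal sketch. As an assumption made only so the snippet runs without third-party packages, the standard-library `copy` module stands in for `markdown`: both expose a function with the same name as the module. The helper `try_call` and the `copy_module` and `copy_func` aliases are hypothetical names used for illustration and are not part of `rag/app/naive.py`.

```python
import copy as copy_module          # analogous to `import markdown`: the name is a module
from copy import copy as copy_func  # analogous to `from markdown import markdown`: the name is a function


def try_call(target, value):
    """Call target(value) and report the outcome instead of raising."""
    try:
        return "ok", target(value)
    except TypeError as exc:
        # Calling a module object raises TypeError because modules are not callable.
        return "error", str(exc)


module_result = try_call(copy_module, [1, 2])
function_result = try_call(copy_func, [1, 2])

print(module_result)    # the "error" branch: a module cannot be called
print(function_result)  # the "ok" branch: the function returns a shallow copy
```

In the same way, a call such as `markdown(table, extensions=[...])` only works when the name `markdown` is bound to the function rather than to the module.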

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
tags/v0.19.0
alkscr, 5 months ago
Commit: baa108f5cc
1 file changed, 2 insertions, 4 deletions
rag/app/naive.py



 from docx import Document
 from docx.image.exceptions import InvalidImageStreamError, UnexpectedEndOfFileError, UnrecognizedImageError
-import markdown
+from markdown import markdown
 from PIL import Image
 from tika import parser

     return []
 from bs4 import BeautifulSoup
-md = markdown.Markdown()
-html_content = md.convert(text)
+html_content = markdown(text)
 soup = BeautifulSoup(html_content, 'html.parser')
 html_images = [img.get('src') for img in soup.find_all('img') if img.get('src')]
 return html_images

         sections.append((sec_ + "\n" + sec, ""))
     else:
         sections.append((sec, ""))

 for table in tables:
     tbls.append(((None, markdown(table, extensions=['markdown.extensions.tables'])), ""))
 return sections, tbls
