
Fix the issue of decoding a non-UTF-8 encoded file using UTF-8 encodi… (#378)

tags/0.3.4
Columbus, 2 years ago
parent
commit eeb2c28526
1 changed file with 3 additions and 1 deletion
api/controllers/console/datasets/file.py (+3, -1)

 import datetime
 import hashlib
 import tempfile
+import chardet
 import time
 import uuid
 from pathlib import Path

 # ['txt', 'markdown', 'md']
 with open(filepath, "rb") as fp:
     data = fp.read()
-    text = data.decode(encoding='utf-8').strip() if data else ''
+    encoding = chardet.detect(data)['encoding']
+    text = data.decode(encoding=encoding).strip() if data else ''

 text = text[0:PREVIEW_WORDS_LIMIT] if text else ''
 return {'content': text}
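
Review note: the added lines pass chardet's guess straight to `decode`. `chardet.detect` can report `None` as the encoding when it cannot classify the bytes, and `data.decode(encoding=None)` raises `TypeError`. A wrong guess can also raise `UnicodeDecodeError`. The sketch below is one possible hardening and is not part of this commit. The helper name `decode_bytes`, its `detect` parameter, and the UTF-8 fallback are assumptions made for illustration.

```python
from typing import Callable, Optional


def decode_bytes(
    data: bytes,
    detect: Optional[Callable[[bytes], dict]] = None,
    fallback: str = "utf-8",
) -> str:
    """Decode bytes with a detected encoding, falling back to `fallback`.

    Hypothetical helper, not taken from the commit.
    """
    if not data:
        return ''
    if detect is None:
        # Imported lazily so the helper can be exercised with a stub detector.
        import chardet
        detect = chardet.detect
    # The detector may report None when it cannot classify the input.
    encoding = detect(data).get('encoding') or fallback
    try:
        return data.decode(encoding).strip()
    except (UnicodeDecodeError, LookupError):
        # Wrong guess or unknown codec name: decode leniently instead of failing.
        return data.decode(fallback, errors='replace').strip()
```

In the preview path this would stand in for the two added lines, for example `text = decode_bytes(data)`. Whether a lenient fallback is preferable to surfacing an error to the caller is a design choice for the maintainers.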
