Fix the issue of decoding a non-UTF-8 encoded file using UTF-8 encodi… (#378)

tags/0.3.4
Columbus, 2 years ago
parent
commit eeb2c28526
1 file changed, 3 additions and 1 deletion
api/controllers/console/datasets/file.py (+3, -1)

@@ -1,6 +1,7 @@
 import datetime
 import hashlib
 import tempfile
+import chardet
 import time
 import uuid
 from pathlib import Path
@@ -141,7 +142,8 @@ class FilePreviewApi(Resource):
         # ['txt', 'markdown', 'md']
         with open(filepath, "rb") as fp:
             data = fp.read()
-            text = data.decode(encoding='utf-8').strip() if data else ''
+            encoding = chardet.detect(data)['encoding']
+            text = data.decode(encoding=encoding).strip() if data else ''
 
         text = text[0:PREVIEW_WORDS_LIMIT] if text else ''
         return {'content': text}
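
The change above swaps a hard-coded UTF-8 decode for an encoding guessed by `chardet.detect`. As a rough standalone sketch of the same idea (not the project's code), the helper below also covers two cases the diff does not handle explicitly: `chardet.detect` can report `None` as the encoding, and the guessed codec can still fail on the data. The function name `decode_preview`, its `limit` and `fallback` parameters, and the default values are assumptions made for illustration; `limit` merely stands in for `PREVIEW_WORDS_LIMIT`.

```python
from typing import Optional

try:
    import chardet  # third-party package; the commit adds this import
except ImportError:  # assumption: keep the sketch runnable without chardet
    chardet = None


def decode_preview(data: bytes, limit: int = 1000, fallback: str = "utf-8") -> str:
    """Decode raw file bytes for a text preview, guessing the encoding.

    Hypothetical helper: `limit` and `fallback` are illustrative parameters,
    and the default of 1000 characters is arbitrary.
    """
    if not data:
        return ""

    detected: Optional[str] = None
    if chardet is not None:
        # detect() returns a dict; its 'encoding' entry may be None
        detected = chardet.detect(data)["encoding"]

    try:
        text = data.decode(detected or fallback)
    except (UnicodeDecodeError, LookupError):
        # Wrong guess or unknown codec name: decode leniently instead of failing
        text = data.decode(fallback, errors="replace")

    # Same order as the diff: strip first, then truncate
    return text.strip()[:limit]
```

Calling `decode_preview(open(filepath, "rb").read())` would then return a preview string even when detection fails, at the cost of replacement characters in the lenient path.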
