### What problem does this PR solve?

- Update readme
- Add license

### Type of change

- [x] Documentation Update

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>

tags/v0.8.0
| > With default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**sans** port number) as the default HTTP serving port `80` can be omitted when using the default configurations. | > With default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**sans** port number) as the default HTTP serving port `80` can be omitted when using the default configurations. | ||||
| 6. In [service_conf.yaml](./docker/service_conf.yaml), select the desired LLM factory in `user_default_llm` and update the `API_KEY` field with the corresponding API key. | 6. In [service_conf.yaml](./docker/service_conf.yaml), select the desired LLM factory in `user_default_llm` and update the `API_KEY` field with the corresponding API key. | ||||
| > See [./docs/guides/llm_api_key_setup.md](./docs/guides/llm_api_key_setup.md) for more information. | |||||
| > See [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) for more information. | |||||
| _The show is now on!_ | _The show is now on!_ | ||||
| - [Discord](https://discord.gg/4XxujFgUN7) | - [Discord](https://discord.gg/4XxujFgUN7) | ||||
| - [Twitter](https://twitter.com/infiniflowai) | - [Twitter](https://twitter.com/infiniflowai) | ||||
| - [GitHub Discussions](https://github.com/orgs/infiniflow/discussions) | |||||
| ## 🙌 Contributing | ## 🙌 Contributing | ||||
| </a> | </a> | ||||
| </p> | </p> | ||||
| <h4 align="center"> | |||||
| <a href="https://ragflow.io/docs/dev/">Document</a> | | |||||
| <a href="https://github.com/infiniflow/ragflow/issues/162">Roadmap</a> | | |||||
| <a href="https://twitter.com/infiniflowai">Twitter</a> | | |||||
| <a href="https://discord.gg/jEfRUwEYEV">Discord</a> | | |||||
| <a href="https://demo.ragflow.io">Demo</a> | |||||
| </h4> | |||||
| ## 💡 RAGFlow とは? | ## 💡 RAGFlow とは? | ||||
| [RAGFlow](https://ragflow.io/) は、深い文書理解に基づいたオープンソースの RAG (Retrieval-Augmented Generation) エンジンである。LLM(大規模言語モデル)を組み合わせることで、様々な複雑なフォーマットのデータから根拠のある引用に裏打ちされた、信頼できる質問応答機能を実現し、あらゆる規模のビジネスに適した RAG ワークフローを提供します。 | [RAGFlow](https://ragflow.io/) は、深い文書理解に基づいたオープンソースの RAG (Retrieval-Augmented Generation) エンジンである。LLM(大規模言語モデル)を組み合わせることで、様々な複雑なフォーマットのデータから根拠のある引用に裏打ちされた、信頼できる質問応答機能を実現し、あらゆる規模のビジネスに適した RAG ワークフローを提供します。 | ||||
| - 2024-05-21 ストリーミング出力とテキストチャンク取得APIをサポート。 | - 2024-05-21 ストリーミング出力とテキストチャンク取得APIをサポート。 | ||||
| - 2024-05-15 OpenAI GPT-4oを統合しました。 | - 2024-05-15 OpenAI GPT-4oを統合しました。 | ||||
| - 2024-05-08 LLM DeepSeek-V2を統合しました。 | - 2024-05-08 LLM DeepSeek-V2を統合しました。 | ||||
| - 2024-04-26 「ファイル管理」機能を追加しました。 | |||||
| - 2024-04-19 会話 API をサポートします ([詳細](./docs/references/api.md))。 | |||||
| - 2024-04-16 [BCEmbedding](https://github.com/netease-youdao/BCEmbedding) から埋め込みモデル「bce-embedding-base_v1」を追加します。 | |||||
| - 2024-04-16 [FastEmbed](https://github.com/qdrant/fastembed) は、軽量かつ高速な埋め込み用に設計されています。 | |||||
| - 2024-04-11 ローカル LLM デプロイメント用に [Xinference](./docs/guides/deploy_local_llm.md) をサポートします。 | |||||
| - 2024-04-10 メソッド「Laws」に新しいレイアウト認識モデルを追加します。 | |||||
| - 2024-04-08 [Ollama](./docs/guides/deploy_local_llm.md) を使用した大規模モデルのローカライズされたデプロイメントをサポートします。 | |||||
| - 2024-04-07 中国語インターフェースをサポートします。 | |||||
| ## 🌟 主な特徴 | ## 🌟 主な特徴 | ||||
| > デフォルトの設定を使用する場合、デフォルトの HTTP サービングポート `80` は省略できるので、与えられたシナリオでは、`http://IP_OF_YOUR_MACHINE`(ポート番号は省略)だけを入力すればよい。 | > デフォルトの設定を使用する場合、デフォルトの HTTP サービングポート `80` は省略できるので、与えられたシナリオでは、`http://IP_OF_YOUR_MACHINE`(ポート番号は省略)だけを入力すればよい。 | ||||
| 6. [service_conf.yaml](./docker/service_conf.yaml) で、`user_default_llm` で希望の LLM ファクトリを選択し、`API_KEY` フィールドを対応する API キーで更新する。 | 6. [service_conf.yaml](./docker/service_conf.yaml) で、`user_default_llm` で希望の LLM ファクトリを選択し、`API_KEY` フィールドを対応する API キーで更新する。 | ||||
| > 詳しくは [./docs/guides/llm_api_key_setup.md](./docs/guides/llm_api_key_setup.md) を参照してください。 | |||||
| > 詳しくは [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup) を参照してください。 | |||||
| _これで初期設定完了!ショーの開幕です!_ | _これで初期設定完了!ショーの開幕です!_ | ||||
| ## 📚 ドキュメンテーション | ## 📚 ドキュメンテーション | ||||
| - [Quickstart](./docs/quickstart.md) | |||||
| - [FAQ](./docs/references/faq.md) | |||||
| - [Quickstart](https://ragflow.io/docs/dev/) | |||||
| - [User guide](https://ragflow.io/docs/dev/category/user-guides) | |||||
| - [Reference](https://ragflow.io/docs/dev/category/references) | |||||
| - [FAQ](https://ragflow.io/docs/dev/faq) | |||||
| ## 📜 ロードマップ | ## 📜 ロードマップ | ||||
| - [Discord](https://discord.gg/4XxujFgUN7) | - [Discord](https://discord.gg/4XxujFgUN7) | ||||
| - [Twitter](https://twitter.com/infiniflowai) | - [Twitter](https://twitter.com/infiniflowai) | ||||
| - [GitHub Discussions](https://github.com/orgs/infiniflow/discussions) | |||||
| ## 🙌 コントリビュート | ## 🙌 コントリビュート | ||||
| </a> | </a> | ||||
| </p> | </p> | ||||
| <h4 align="center"> | |||||
| <a href="https://ragflow.io/docs/dev/">Document</a> | | |||||
| <a href="https://github.com/infiniflow/ragflow/issues/162">Roadmap</a> | | |||||
| <a href="https://twitter.com/infiniflowai">Twitter</a> | | |||||
| <a href="https://discord.gg/jEfRUwEYEV">Discord</a> | | |||||
| <a href="https://demo.ragflow.io">Demo</a> | |||||
| </h4> | |||||
| ## 💡 RAGFlow 是什么? | ## 💡 RAGFlow 是什么? | ||||
| [RAGFlow](https://ragflow.io/) 是一款基于深度文档理解构建的开源 RAG(Retrieval-Augmented Generation)引擎。RAGFlow 可以为各种规模的企业及个人提供一套精简的 RAG 工作流程,结合大语言模型(LLM)针对用户各类不同的复杂格式数据提供可靠的问答以及有理有据的引用。 | [RAGFlow](https://ragflow.io/) 是一款基于深度文档理解构建的开源 RAG(Retrieval-Augmented Generation)引擎。RAGFlow 可以为各种规模的企业及个人提供一套精简的 RAG 工作流程,结合大语言模型(LLM)针对用户各类不同的复杂格式数据提供可靠的问答以及有理有据的引用。 | ||||
| - 2024-05-21 支持流式结果输出和文本块获取API。 | - 2024-05-21 支持流式结果输出和文本块获取API。 | ||||
| - 2024-05-15 集成大模型 OpenAI GPT-4o。 | - 2024-05-15 集成大模型 OpenAI GPT-4o。 | ||||
| - 2024-05-08 集成大模型 DeepSeek。 | - 2024-05-08 集成大模型 DeepSeek。 | ||||
| - 2024-04-26 增添了'文件管理'功能。 | |||||
| - 2024-04-19 支持对话 API ([更多](./docs/references/api.md))。 | |||||
| - 2024-04-16 集成嵌入模型 [BCEmbedding](https://github.com/netease-youdao/BCEmbedding) 和 专为轻型和高速嵌入而设计的 [FastEmbed](https://github.com/qdrant/fastembed)。 | |||||
| - 2024-04-11 支持用 [Xinference](./docs/guides/deploy_local_llm.md) 本地化部署大模型。 | |||||
| - 2024-04-10 为‘Laws’版面分析增加了底层模型。 | |||||
| - 2024-04-08 支持用 [Ollama](./docs/guides/deploy_local_llm.md) 本地化部署大模型。 | |||||
| - 2024-04-07 支持中文界面。 | |||||
| ## 🌟 主要功能 | ## 🌟 主要功能 | ||||
| > 上面这个例子中,您只需输入 http://IP_OF_YOUR_MACHINE 即可:未改动过配置则无需输入端口(默认的 HTTP 服务端口 80)。 | > 上面这个例子中,您只需输入 http://IP_OF_YOUR_MACHINE 即可:未改动过配置则无需输入端口(默认的 HTTP 服务端口 80)。 | ||||
| 6. 在 [service_conf.yaml](./docker/service_conf.yaml) 文件的 `user_default_llm` 栏配置 LLM factory,并在 `API_KEY` 栏填写和你选择的大模型相对应的 API key。 | 6. 在 [service_conf.yaml](./docker/service_conf.yaml) 文件的 `user_default_llm` 栏配置 LLM factory,并在 `API_KEY` 栏填写和你选择的大模型相对应的 API key。 | ||||
| > 详见 [./docs/guides/llm_api_key_setup.md](./docs/guides/llm_api_key_setup.md)。 | |||||
| > 详见 [llm_api_key_setup](https://ragflow.io/docs/dev/llm_api_key_setup)。 | |||||
| _好戏开始,接着奏乐接着舞!_ | _好戏开始,接着奏乐接着舞!_ | ||||
| ``` | ``` | ||||
| ## 📚 技术文档 | ## 📚 技术文档 | ||||
| - [Quickstart](./docs/quickstart.md) | |||||
| - [FAQ](./docs/references/faq.md) | |||||
| - [Quickstart](https://ragflow.io/docs/dev/) | |||||
| - [User guide](https://ragflow.io/docs/dev/category/user-guides) | |||||
| - [Reference](https://ragflow.io/docs/dev/category/references) | |||||
| - [FAQ](https://ragflow.io/docs/dev/faq) | |||||
| ## 📜 路线图 | ## 📜 路线图 | ||||
| - [Discord](https://discord.gg/4XxujFgUN7) | - [Discord](https://discord.gg/4XxujFgUN7) | ||||
| - [Twitter](https://twitter.com/infiniflowai) | - [Twitter](https://twitter.com/infiniflowai) | ||||
| - [GitHub Discussions](https://github.com/orgs/infiniflow/discussions) | |||||
| ## 🙌 贡献指南 | ## 🙌 贡献指南 | ||||
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| from .pdf_parser import RAGFlowPdfParser as PdfParser, PlainParser | from .pdf_parser import RAGFlowPdfParser as PdfParser, PlainParser | ||||
| from .docx_parser import RAGFlowDocxParser as DocxParser | from .docx_parser import RAGFlowDocxParser as DocxParser |
| # -*- coding: utf-8 -*- | |||||
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| from docx import Document | from docx import Document | ||||
| import re | import re | ||||
| import pandas as pd | import pandas as pd |
| # -*- coding: utf-8 -*- | |||||
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| from openpyxl import load_workbook | from openpyxl import load_workbook | ||||
| import sys | import sys | ||||
| from io import BytesIO | from io import BytesIO |
| # -*- coding: utf-8 -*- | |||||
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| import os | import os | ||||
| import random | import random | ||||
| # See the License for the specific language governing permissions and | # See the License for the specific language governing permissions and | ||||
| # limitations under the License. | # limitations under the License. | ||||
| # | # | ||||
| from io import BytesIO | from io import BytesIO | ||||
| from pptx import Presentation | from pptx import Presentation | ||||
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| import datetime | import datetime | ||||
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| import re,json,os | import re,json,os | ||||
| import pandas as pd | import pandas as pd | ||||
| from rag.nlp import rag_tokenizer | from rag.nlp import rag_tokenizer |
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| TBL = {"94":"EMBA", | TBL = {"94":"EMBA", | ||||
| "6":"MBA", | "6":"MBA", | ||||
| "95":"MPA", | "95":"MPA", |
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| TBL = {"1":{"name":"IT/通信/电子","parent":"0"}, | TBL = {"1":{"name":"IT/通信/电子","parent":"0"}, | ||||
| "2":{"name":"互联网","parent":"0"}, | "2":{"name":"互联网","parent":"0"}, |
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| TBL = { | TBL = { | ||||
| "2":{"name":"北京","parent":"1"}, | "2":{"name":"北京","parent":"1"}, | ||||
| "3":{"name":"天津","parent":"1"}, | "3":{"name":"天津","parent":"1"}, |
| # -*- coding: UTF-8 -*- | |||||
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| import os, json,re,copy | import os, json,re,copy | ||||
| import pandas as pd | import pandas as pd | ||||
| current_file_path = os.path.dirname(os.path.abspath(__file__)) | current_file_path = os.path.dirname(os.path.abspath(__file__)) |
| # -*- coding: utf-8 -*- | |||||
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| import json | import json | ||||
| from deepdoc.parser.resume.entities import degrees, regions, industries | from deepdoc.parser.resume.entities import degrees, regions, industries | ||||
| # -*- coding: utf-8 -*- | |||||
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| import re, copy, time, datetime, demjson3, \ | import re, copy, time, datetime, demjson3, \ | ||||
| traceback, signal | traceback, signal | ||||
| import numpy as np | import numpy as np |
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| import pdfplumber | import pdfplumber | ||||
| from .ocr import OCR | from .ocr import OCR |
| # Licensed under the Apache License, Version 2.0 (the "License"); | |||||
| # you may not use this file except in compliance with the License. | |||||
| # You may obtain a copy of the License at | |||||
| # | |||||
| # http://www.apache.org/licenses/LICENSE-2.0 | |||||
| # | |||||
| # Unless required by applicable law or agreed to in writing, software | |||||
| # distributed under the License is distributed on an "AS IS" BASIS, | |||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||||
| # See the License for the specific language governing permissions and | |||||
| # limitations under the License. | |||||
| # | |||||
| import copy | import copy | ||||
| import re | import re | ||||
| import numpy as np | import numpy as np |