
Docs: RAGFlow does not support batch metadata setting (#7795)

### What problem does this PR solve?

This PR updates the documentation to state that RAGFlow does not support setting metadata for multiple documents at once (batch metadata setting), and applies minor wording, link, and sidebar-ordering fixes to several related guides.

### Type of change


- [x] Documentation Update
tags/v0.19.x
writinwaters committed 5 months ago
commit 1fd92e6bee

docs/guides/chat/set_chat_variables.md (+1 -1)

`{knowledge}` is the system's reserved variable, representing the chunks retrieved from the knowledge base(s) specified by **Knowledge bases** under the **Assistant settings** tab. If your chat assistant is associated with certain knowledge bases, you can keep it as is.


:::info NOTE
Before: It does not currently make a difference whether you set `{knowledge}` to optional or mandatory, but note that this design will be updated at a later point.
After: It currently makes no difference whether `{knowledge}` is set as optional or mandatory, but please note this design will be updated in due course.
:::


From v0.17.0 onward, you can start an AI chat without specifying knowledge bases. In this case, we recommend removing the `{knowledge}` variable to prevent unnecessary reference and keeping the **Empty response** field empty to avoid errors.
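
For context, a system prompt that relies on retrieved chunks typically embeds the reserved `{knowledge}` variable directly in its text. The following is an illustrative sketch only; the wording is not taken from RAGFlow:

```
You are a helpful assistant. Answer the user's question using only the content below, which is retrieved from the configured knowledge bases:

{knowledge}

If the retrieved content does not cover the question, say that you don't know.
```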

docs/guides/dataset/best_practices/accelerate_doc_indexing.mdx (+1 -1)

- On the configuration page of your knowledge base, switch off **Use RAPTOR to enhance retrieval**.
- Extracting knowledge graph (GraphRAG) is time-consuming.
- Disable **Auto-keyword** and **Auto-question** on the configuration page of your knowledge base, as both depend on the LLM.

Before: - **v0.17.0+:** If your document is plain text PDF and does not require GPU-intensive processes like OCR (Optical Character Recognition), TSR (Table Structure Recognition), or DLA (Document Layout Analysis), you can choose **Naive** over **DeepDoc** or other time-consuming large model options in the **Document parser** dropdown. This will substantially reduce document parsing time.
After: - **v0.17.0+:** If all PDFs in your knowledge base are plain text and do not require GPU-intensive processes like OCR (Optical Character Recognition), TSR (Table Structure Recognition), or DLA (Document Layout Analysis), you can choose **Naive** over **DeepDoc** or other time-consuming large model options in the **Document parser** dropdown. This will substantially reduce document parsing time.

docs/guides/dataset/enable_raptor.md (+1 -1)



### Prompt


Before: The following prompt will be applied recursively for cluster summarization, with `{cluster_content}` serving as an internal parameter. We recommend that you keep it as-is for now. The design will be updated at a later point.
After: The following prompt will be applied recursively for cluster summarization, with `{cluster_content}` serving as an internal parameter. We recommend that you keep it as-is for now. The design will be updated in due course.


```
Please summarize the following paragraphs... Paragraphs as following:
```

docs/guides/dataset/select_pdf_parser.md (+5 -5)

---
Before: sidebar_position: 0
After: sidebar_position: 2
slug: /select_pdf_parser
---


- **Laws**
- **Presentation**
- **One**

Before: - To use a third-party visual model for parsing PDFs, ensure you have set a default image2txt model under **Set default models** on the **Model providers** page.
After: - To use a third-party visual model for parsing PDFs, ensure you have set a default img2txt model under **Set default models** on the **Model providers** page.


## Procedure




2. Select the option that works best with your scenario:


Before: - DeepDoc: (Default) The default visual model for OCR, TSR, and DLR tasks.
After: - DeepDoc: (Default) The default visual model for OCR, TSR, and DLR tasks, which is time-consuming.
- Naive: Skip OCR, TSR, and DLR tasks if *all* your PDFs are plain text.
- A third-party visual model provided by a specific model provider.


:::caution WARNING
Third-party visual models are marked **Experimental**, because we have not fully tested these models for the aforementioned data extraction tasks.

docs/guides/dataset/set_metadata.md (+8 -2)

---
Before: sidebar_position: 2
After: sidebar_position: 0
slug: /set_metada
---


Ensure that your metadata is in JSON format; otherwise, your updates will not be applied.
:::
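
For illustration, a valid metadata entry could look like the following. The field names are made up for this example and are not prescribed by RAGFlow:

```
{
  "author": "Jane Doe",
  "department": "Legal",
  "year": 2024,
  "tags": ["contract", "confidential"]
}
```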


![Image](https://github.com/user-attachments/assets/379cf2c5-4e37-4b79-8aeb-53bf8e01d326)

## Frequently asked questions

### Can I set metadata for multiple documents at once?

No, RAGFlow does not support batch metadata setting. If you still consider this feature essential, please [raise an issue](https://github.com/infiniflow/ragflow/issues) explaining your use case and its importance.

docs/guides/models/llm_api_key_setup.md (+1 -1)

5. Click **OK** to confirm your changes.


:::note
Before: To update an existing model API key at a later point:
After: To update an existing model API key:
![update api key](https://github.com/infiniflow/ragflow/assets/93570324/0bfba679-33f7-4f6b-9ed6-f0e6e4b228ad)
:::

docs/quickstart.mdx (+0 -2)



![add llm](https://github.com/infiniflow/ragflow/assets/93570324/10635088-028b-4b3d-add9-5c5a6e626814)


Removed:
> Each RAGFlow account is able to use **text-embedding-v2** for free, an embedding model of Tongyi-Qianwen. This is why you can see Tongyi-Qianwen in the **Added models** list. And you may need to update your Tongyi-Qianwen API key at a later point.

2. Click on the desired LLM and update the API key accordingly (DeepSeek-V2 in this case):


![update api key](https://github.com/infiniflow/ragflow/assets/93570324/4e5e13ef-a98d-42e6-bcb1-0c6045fc1666)

docs/release_notes.md (+1 -1)

- AI chat: Leverages Tavily-based web search to enhance contexts in agentic reasoning. To activate this, enter the correct Tavily API key under the **Assistant settings** tab of your chat assistant dialogue.
- AI chat: Supports starting a chat without specifying knowledge bases.
- AI chat: HTML files can also be previewed and referenced, in addition to PDF files.
Before: - Dataset: Adds a **PDF parser**, aka **Document parser**, dropdown menu to dataset configurations. This includes a DeepDoc model option, which is time-consuming, a much faster **naive** option (plain text), which skips DLA (Document Layout Analysis), OCR (Optical Character Recognition), and TSR (Table Structure Recognition) tasks, and several currently *experimental* large model options.
After: - Dataset: Adds a **PDF parser**, aka **Document parser**, dropdown menu to dataset configurations. This includes a DeepDoc model option, which is time-consuming, a much faster **naive** option (plain text), which skips DLA (Document Layout Analysis), OCR (Optical Character Recognition), and TSR (Table Structure Recognition) tasks, and several currently *experimental* large model options. See [here](./guides/dataset/select_pdf_parser.md).
- Agent component: **(x)** or a forward slash `/` can be used to insert available keys (variables) in the system prompt field of the **Generate** or **Template** component.
- Object storage: Supports using Aliyun OSS (Object Storage Service) as a file storage option.
- Models: Updates the supported model list for Tongyi-Qianwen (Qwen), adding DeepSeek-specific models; adds ModelScope as a model provider.
