### What problem does this PR solve? v0.20.0 documents. ### Type of change - [x] Documentation Updatetags/v0.20.0
| @@ -1,19 +1,22 @@ | |||
| --- | |||
| sidebar_position: 2 | |||
| slug: /generate_component | |||
| slug: /agent_component | |||
| --- | |||
| # Generate component | |||
| # Agent component | |||
| The component that prompts the LLM to respond appropriately. | |||
| The component equipped with reasoning, tool usage, and multi-agent collaboration capabilities. | |||
| --- | |||
| A **Generate** component fine-tunes the LLM and sets its prompt. | |||
| An **Agent** component fine-tunes the LLM and sets its prompt. From v0.20.0 onwards, an **Agent** component is able to work independently and with the following capabilities: | |||
| - Autonomous reasoning with reflection and adjustment based on environmental feedback. | |||
| - Use of tools or subagents to complete tasks. | |||
| ## Scenarios | |||
| A **Generate** component is essential when you need the LLM to assist with summarizing, translating, or controlling various tasks. | |||
| An **Agent** component is essential when you need the LLM to assist with summarizing, translating, or controlling various tasks. | |||
| ## Configurations | |||
| @@ -43,6 +46,7 @@ Click the dropdown menu of **Model** to show the model configuration window. | |||
| - **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text. | |||
| - A higher **frequency penalty** value results in the model being more conservative in its use of repeated tokens. | |||
| - Defaults to 0.7. | |||
| - **Max tokens**: | |||
| :::tip NOTE | |||
| - It is not necessary to stick with the same model for all components. If a specific model is not performing well for a particular task, consider using a different one. | |||
| @@ -54,37 +58,21 @@ Click the dropdown menu of **Model** to show the model configuration window. | |||
| Typically, you use the system prompt to describe the task for the LLM, specify how it should respond, and outline other miscellaneous requirements. We do not plan to elaborate on this topic, as it can be as extensive as prompt engineering. However, please be aware that the system prompt is often used in conjunction with keys (variables), which serve as various data inputs for the LLM. | |||
| :::danger IMPORTANT | |||
| A **Generate** component relies on keys (variables) to specify its data inputs. Its immediate upstream component is *not* necessarily its data input, and the arrows in the workflow indicate *only* the processing sequence. Keys in a **Generate** component are used in conjunction with the system prompt to specify data inputs for the LLM. Use a forward slash `/` or the **(x)** button to show the keys to use. | |||
| An **Agent** component relies on keys (variables) to specify its data inputs. Its immediate upstream component is *not* necessarily its data input, and the arrows in the workflow indicate *only* the processing sequence. Keys in a **Agent** component are used in conjunction with the system prompt to specify data inputs for the LLM. Use a forward slash `/` or the **(x)** button to show the keys to use. | |||
| ::: | |||
| Below is a prompt excerpt of a **Generate** component from the **Interpreter** template (component ID: **Reflect**): | |||
| ```text | |||
| Your task is to read a source text and a translation to {target_lang}, and give constructive suggestions to improve the translation. The source text and initial translation, delimited by XML tags <SOURCE_TEXT></SOURCE_TEXT> and <TRANSLATION></TRANSLATION>, are as follows: | |||
| <SOURCE_TEXT> | |||
| {source_text} | |||
| </SOURCE_TEXT> | |||
| ### User prompt | |||
| <TRANSLATION> | |||
| {translation_1} | |||
| </TRANSLATION> | |||
| The user-defined prompt. Defaults to `sys.query`, the user query. | |||
| When writing suggestions, pay attention to whether there are ways to improve the translation's fluency, by applying {target_lang} grammar, spelling and punctuation rules, and ensuring there are no unnecessary repetitions. | |||
| - Each suggestion should address one specific part of the translation. | |||
| - Output the suggestions only. | |||
| ``` | |||
| Where `{source_text}` and `{target_lang}` are global variables defined by the **Begin** component, while `{translation_1}` is the output of another **Generate** component with the component ID **Translate directly**. | |||
| ### Tools | |||
| ### Cite | |||
| You can use an **Agent** component as a collaborator that reasons and reflects with the aid of other tools; for instance, **Retrieval** can serve as one such tool for an **Agent**. | |||
| This toggle sets whether to cite the original text as reference. | |||
| ### Agent | |||
| :::tip NOTE | |||
| This feature applies *only* after the original documents have been uploaded to the corresponding knowledge base(s) and file parsing is complete. | |||
| ::: | |||
| You use an **Agent** component as a collaborator that reasons and reflects with the aid of subagents or other tools, forming a multi-agent system. | |||
| ### Message window size | |||
| @@ -94,15 +82,18 @@ An integer specifying the number of previous dialogue rounds to input into the L | |||
| This feature is used for multi-turn dialogue *only*. | |||
| ::: | |||
| ### Max retrieves | |||
| Defines the maximum number of attempts the agent will make to retry a failed task or operation before stopping or reporting failure. | |||
| ### Delay after error | |||
| The waiting period in seconds that the agent observes before retrying a failed task, helping to prevent immediate repeated attempts and allowing system conditions to improve. Defaults to 1 second. | |||
| ### Max rounds | |||
| ## Examples | |||
| Defines the maximum number reflection rounds of the selected chat model. Defaults to 5 rounds. | |||
| You can explore our three-step interpreter agent template, where a **Generate** component (component ID: **Reflect**) takes three global variables: | |||
| ### Output | |||
| 1. Click the **Agent** tab at the top center of the page to access the **Agent** page. | |||
| 2. Click **+ Create agent** on the top right of the page to open the **agent template** page. | |||
| 3. On the **agent template** page, hover over the **Interpreter** card and click **Use this template**. | |||
| 4. Name your new agent and click **OK** to enter the workflow editor. | |||
| 5. Click on component **Reflect**, to display its **Configuration** window, where: | |||
| - `{target_lang}` and `{source_text}` are defined in the **Begin** component and require user input. | |||
| - `{translation_1}` is the output from the upstream component **Translate directly**. | |||
| The global variable name for the output of the **Agent** component, which can be referenced by other components in the workflow. | |||
| @@ -0,0 +1,57 @@ | |||
| --- | |||
| sidebar_position: 5 | |||
| slug: /await_response | |||
| --- | |||
| # Await response component | |||
| A component that halts the workflow and awaits user input. | |||
| --- | |||
| An **Await response** component halts the workflow, initiating a conversation and collecting key information via predefined forms. | |||
| ## Scenarios | |||
| An **Await response** component is essential where you need to display the agent's responses or require user-computer interaction. | |||
| ## Configurations | |||
| ### Guiding question | |||
| Whether to show the message defined in the **Message** field. | |||
| ### Message | |||
| The static message to send out. | |||
| Click **+ Add message** to add message options. When multiple messages are supplied, the **Message** component randomly selects one to send. | |||
| ### Input | |||
| You can define global variables within the **Await response** component, which can be either mandatory or optional. Once set, users will need to provide values for these variables when engaging with the agent. Click **+** to add a global variable, each with the following attributes: | |||
| - **Name**: _Required_ | |||
| A descriptive name providing additional details about the variable. | |||
| - **Type**: _Required_ | |||
| The type of the variable: | |||
| - **Single-line text**: Accepts a single line of text without line breaks. | |||
| - **Paragraph text**: Accepts multiple lines of text, including line breaks. | |||
| - **Dropdown options**: Requires the user to select a value for this variable from a dropdown menu. And you are required to set _at least_ one option for the dropdown menu. | |||
| - **file upload**: Requires the user to upload one or multiple files. | |||
| - **Number**: Accepts a number as input. | |||
| - **Boolean**: Requires the user to toggle between on and off. | |||
| - **Key**: _Required_ | |||
| The unique variable name. | |||
| - **Optional**: A toggle indicating whether the variable is optional. | |||
| :::tip NOTE | |||
| To pass in parameters from a client, call: | |||
| - HTTP method [Converse with agent](../../../references/http_api_reference.md#converse-with-agent), or | |||
| - Python method [Converse with agent](../../../references/python_api_reference.md#converse-with-agent). | |||
| ::: | |||
| :::danger IMPORTANT | |||
| If you set the key type as **file**, ensure the token count of the uploaded file does not exceed your model provider's maximum token limit; otherwise, the plain text in your file will be truncated and incomplete. | |||
| ::: | |||
| @@ -19,31 +19,35 @@ A **Begin** component is essential in all cases. Every agent includes a **Begin* | |||
| Click the component to display its **Configuration** window. Here, you can set an opening greeting and the input parameters (global variables) for the agent. | |||
| ### ID | |||
| ### Mode | |||
| The ID is the unique identifier for the component within the workflow. Unlike the IDs of other components, the ID of the **Begin** component _cannot_ be changed. | |||
| Mode defines how the workflow is triggered. | |||
| - Conversational: The agent is triggered from a conversation. | |||
| - Task: The agent starts without a conversation. | |||
| ### Opening greeting | |||
| An opening greeting is the agent's first message to the user. It can be a welcoming remark or an instruction to guide the user forward. | |||
| **Conversational mode only.** | |||
| An agent in conversational mode begins with an opening greeting. It is the agent's first message to the user in conversational mode, which can be a welcoming remark or an instruction to guide the user forward. | |||
| ### Global variables | |||
| You can define global variables within the **Begin** component, which can be either mandatory or optional. Once set, users will need to provide values for these variables when engaging with the agent. Click **+ Add variable** to add a global variable, each with the following attributes: | |||
| - **Key**: _Required_ | |||
| The unique variable name. | |||
| - **Name**: _Required_ | |||
| A descriptive name providing additional details about the variable. | |||
| For example, if **Key** is set to `lang`, you can set its **Name** to `Target language`. | |||
| - **Type**: _Required_ | |||
| The type of the variable: | |||
| - **line**: Accepts a single line of text without line breaks. | |||
| - **paragraph**: Accepts multiple lines of text, including line breaks. | |||
| - **options**: Requires the user to select a value for this variable from a dropdown menu. And you are required to set _at least_ one option for the dropdown menu. | |||
| - **file**: Requires the user to upload one or multiple files. | |||
| - **integer**: Accepts an integer as input. | |||
| - **boolean**: Requires the user to toggle between on and off. | |||
| - **Single-line text**: Accepts a single line of text without line breaks. | |||
| - **Paragraph text**: Accepts multiple lines of text, including line breaks. | |||
| - **Dropdown options**: Requires the user to select a value for this variable from a dropdown menu. And you are required to set _at least_ one option for the dropdown menu. | |||
| - **file upload**: Requires the user to upload one or multiple files. | |||
| - **Number**: Accepts a number as input. | |||
| - **Boolean**: Requires the user to toggle between on and off. | |||
| - **Key**: _Required_ | |||
| The unique variable name. | |||
| - **Optional**: A toggle indicating whether the variable is optional. | |||
| :::tip NOTE | |||
| @@ -54,37 +58,19 @@ To pass in parameters from a client, call: | |||
| ::: | |||
| :::danger IMPORTANT | |||
| - If you set the key type as **file**, ensure the token count of the uploaded file does not exceed your model provider's maximum token limit; otherwise, the plain text in your file will be truncated and incomplete. | |||
| - If your agent's **Begin** component takes a variable, you _cannot_ embed it into a webpage. | |||
| ::: | |||
| If you set the key type as **file**, ensure the token count of the uploaded file does not exceed your model provider's maximum token limit; otherwise, the plain text in your file will be truncated and incomplete. | |||
| ::: | |||
| :::note | |||
| You can tune document parsing and embedding efficiency by setting the environment variables `DOC_BULK_SIZE` and `EMBEDDING_BATCH_SIZE`. | |||
| ::: | |||
| ## Examples | |||
| As mentioned earlier, the **Begin** component is indispensable for an agent. Still, you can take a look at our three-step interpreter agent template, where the **Begin** component takes two global variables: | |||
| 1. Click the **Agent** tab at the top center of the page to access the **Agent** page. | |||
| 2. Click **+ Create agent** on the top right of the page to open the **agent template** page. | |||
| 3. On the **agent template** page, hover over the **Interpreter** card and click **Use this template**. | |||
| 4. Name your new agent and click **OK** to enter the workflow editor. | |||
| 5. Click on the **Begin** component to display its **Configuration** window. | |||
| ## Frequently asked questions | |||
| ### Is the uploaded file in a knowledge base? | |||
| No. Files uploaded to an agent as input are not stored in a knowledge base and hence will not be processed using RAGFlow's built-in OCR, DLR or TSR models, or chunked using RAGFlow's built-in chunking methods. | |||
| ### How to upload a webpage or file from a URL? | |||
| If you set the type of a variable as **file**, your users will be able to upload a file either from their local device or from an accessible URL. For example: | |||
|  | |||
| ### File size limit for an uploaded file | |||
| There is no _specific_ file size limit for a file uploaded to an agent. However, note that model providers typically have a default or explicit maximum token setting, which can range from 8196 to 128k: The plain text part of the uploaded file will be passed in as the key value, but if the file's token count exceeds this limit, the string will be truncated and incomplete. | |||
| @@ -1,5 +1,5 @@ | |||
| --- | |||
| sidebar_position: 5 | |||
| sidebar_position: 8 | |||
| slug: /categorize_component | |||
| --- | |||
| @@ -17,6 +17,15 @@ A **Categorize** component is essential when you need the LLM to help you identi | |||
| ## Configurations | |||
| ### Query variables | |||
| *Mandatory* | |||
| Select the source for categorization. | |||
| The **Categorize** component relies on query variables to specify its data inputs (queries). All global variables defined before the **Categorize** component are available in the dropdown list. | |||
| ### Input | |||
| The **Categorize** component relies on input variables to specify its data inputs (queries). Click **+ Add variable** in the **Input** section to add the desired input variables. There are two types of input variables: **Reference** and **Text**. | |||
| @@ -90,19 +99,9 @@ Additional examples that may help the LLM determine which inputs belong in this | |||
| Examples are more helpful than the description if you want the LLM to classify particular cases into this category. | |||
| ::: | |||
| #### Next step | |||
| Specifies the downstream component of this category. | |||
| - Once you specify the ID of the downstream component, a link is established between this category and the corresponding component. | |||
| - If you manually link this category to a downstream component on the canvas, the ID of that component is auto-populated. | |||
| ## Examples | |||
| Once a new category is added, navigate to the **Categorize** component on the canvas, find the **+** button next to the case, and click it to specify the downstream component(s). | |||
| You can explore our customer service agent template, where a **Categorize** component (component ID: **Question Categorize**) has four defined categories and takes data inputs from an **Interact** component (component ID: **Interface**): | |||
| 1. Click the **Agent** tab at the top center of the page to access the **Agent** page. | |||
| 2. Click **+ Create agent** on the top right of the page to open the **agent template** page. | |||
| 3. On the **agent template** page, hover over the **Interpreter** card and click **Use this template**. | |||
| 4. Name your new agent and click **OK** to enter the workflow editor. | |||
| #### Output | |||
| The global variable name for the output of the component, which can be referenced by other components in the workflow. Defaults to `category_name`. | |||
| @@ -13,19 +13,17 @@ A component that enables users to integrate Python or JavaScript codes into thei | |||
| A **Code** component is essential when you need to integrate complex code logic (Python or JavaScript) into your Agent for dynamic data processing. | |||
| ## Input variables | |||
| ## Configurations | |||
| You can specify multiple input sources for the **Code** component. Click **+ Add variable** in the **Input variables** section to include the desired input variables. | |||
| ### Input | |||
| After defining an input variable, you are required to select from the dropdown menu: | |||
| - A component ID under **Component Output**, or | |||
| - A global variable under **Begin input**, which is defined in the **Begin** component. | |||
| You can specify multiple input sources for the **Code** component. Click **+ Add variable** in the **Input variables** section to include the desired input variables. | |||
| ## Coding field | |||
| ### Code | |||
| This field allows you to enter and edit your source code. | |||
| ### A Python code example | |||
| #### A Python code example | |||
| ```Python | |||
| def main(arg1: str, arg2: str) -> dict: | |||
| @@ -34,7 +32,7 @@ This field allows you to enter and edit your source code. | |||
| } | |||
| ``` | |||
| ### A JavaScript code example | |||
| #### A JavaScript code example | |||
| ```JavaScript | |||
| @@ -49,4 +47,12 @@ This field allows you to enter and edit your source code. | |||
| } | |||
| ``` | |||
| ### Return values | |||
| You define the output variable(s) of the **Code** component here. | |||
| ### Output | |||
| The defined output variable(s) will be auto-populated here. | |||
| @@ -1,26 +0,0 @@ | |||
| --- | |||
| sidebar_position: 10 | |||
| slug: /concentrator_component | |||
| --- | |||
| # Concentrator component | |||
| A component that directs execution flow to multiple downstream components. | |||
| --- | |||
| The **Concentrator** component acts as a "repeater" of execution flow, transmitting a flow to multiple downstream components. | |||
| ## Scenarios | |||
| A **Concentrator** component enhances the current UX design. For a component originally designed to support only one downstream component, you can append a **Concentrator**, enabling it to have multiple downstream components. | |||
| ## Examples | |||
| Explore our general-purpose chatbot agent template, featuring a **Concentrator** component (component ID: **medical**) that relays an execution flow from category 2 of the **Categorize** component to two translator components: | |||
| 1. Click the **Agent** tab at the top center of the page to access the **Agent** page. | |||
| 2. Click **+ Create agent** on the top right of the page to open the **agent template** page. | |||
| 3. On the **agent template** page, hover over the **General-purpose chatbot** card and click **Use this template**. | |||
| 4. Name your new agent and click **OK** to enter the workflow editor. | |||
| @@ -1,21 +0,0 @@ | |||
| --- | |||
| sidebar_position: 3 | |||
| slug: /interact_component | |||
| --- | |||
| # Interact component | |||
| A component that accepts user inputs and displays responses. | |||
| --- | |||
| An **Interact** component serves as the interface between human and bot, receiving user inputs and displaying the agent's responses. | |||
| ## Scenarios | |||
| An **Interact** component is essential where you need to display the agent's responses or require user-computer interaction. | |||
| ## Examples | |||
| You can explore our three-step interpreter agent template, where the **Interact** component is used to display the final translation, or our customer service agent template, where the **Interact** component is the immediate downstream of **Begin** and is used to display multi-turn dialogue between the user and the agent. | |||
| @@ -1,5 +1,5 @@ | |||
| --- | |||
| sidebar_position: 12 | |||
| sidebar_position: 7 | |||
| slug: /iteration_component | |||
| --- | |||
| @@ -29,8 +29,6 @@ Each **Iteration** component includes an internal **IterationItem** component. T | |||
| The **IterationItem** component is visible *only* to the components encapsulated by the current **Iteration** components. | |||
| ::: | |||
|  | |||
| ### Build an internal workflow | |||
| You are allowed to pull other components into the **Iteration** component to build an internal workflow, and these "added internal components" are no longer visible to components outside of the current **Iteration** component. | |||
| @@ -64,14 +62,4 @@ The delimiter to use to split the text input into segments: | |||
| - Underline | |||
| - Forward slash | |||
| - Dash | |||
| - Semicolon | |||
| ## Examples | |||
| Explore our research report generator agent template, where the **Iteration** component (component ID: **Sections**) takes subtitles from the **Subtitles** component and generates sections for them: | |||
| 1. Click the **Agent** tab at the top center of the page to access the **Agent** page. | |||
| 2. Click **+ Create agent** on the top right of the page to open the **agent template** page. | |||
| 3. On the **agent template** page, hover over the **Customer service** card and click **Use this template**. | |||
| 4. Name your new agent and click **OK** to enter the workflow editor. | |||
| 5. Click on the **Iteration** component to display its **Configuration** window. | |||
| - Semicolon | |||
| @@ -1,76 +0,0 @@ | |||
| --- | |||
| sidebar_position: 6 | |||
| slug: /keyword_component | |||
| --- | |||
| # Keyword component | |||
| A component that extracts keywords from a user query. | |||
| --- | |||
| A **Keyword** component uses the specified LLM to extract keywords from a user query. | |||
| ## Scenarios | |||
| A **Keyword** component is essential where you need to prepare keywords for a potential keyword search. | |||
| ## Configurations | |||
| ### Input | |||
| The **Keyword** component relies on input variables to specify its data inputs (queries). Click **+ Add variable** in the **Input** section to add the desired input variables. There are two types of input variables: **Reference** and **Text**. | |||
| - **Reference**: Uses a component's output or a user input as the data source. You are required to select from the dropdown menu: | |||
| - A component ID under **Component Output**, or | |||
| - A global variable under **Begin input**, which is defined in the **Begin** component. | |||
| - **Text**: Uses fixed text as the query. You are required to enter static text. | |||
| ### Model | |||
| Click the dropdown menu of **Model** to show the model configuration window. | |||
| - **Model**: The chat model to use. | |||
| - Ensure you set the chat model correctly on the **Model providers** page. | |||
| - You can use different models for different components to increase flexibility or improve overall performance. | |||
| - **Freedom**: A shortcut to **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**. | |||
| This parameter has three options: | |||
| - **Improvise**: Produces more creative responses. | |||
| - **Precise**: (Default) Produces more conservative responses. | |||
| - **Balance**: A middle ground between **Improvise** and **Precise**. | |||
| - **Temperature**: The randomness level of the model's output. | |||
| Defaults to 0.1. | |||
| - Lower values lead to more deterministic and predictable outputs. | |||
| - Higher values lead to more creative and varied outputs. | |||
| - A temperature of zero results in the same output for the same prompt. | |||
| - **Top P**: Nucleus sampling. | |||
| - Reduces the likelihood of generating repetitive or unnatural text by setting a threshold *P* and restricting the sampling to tokens with a cumulative probability exceeding *P*. | |||
| - Defaults to 0.3. | |||
| - **Presence penalty**: Encourages the model to include a more diverse range of tokens in the response. | |||
| - A higher **presence penalty** value results in the model being more likely to generate tokens not yet been included in the generated text. | |||
| - Defaults to 0.4. | |||
| - **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text. | |||
| - A higher **frequency penalty** value results in the model being more conservative in its use of repeated tokens. | |||
| - Defaults to 0.7. | |||
| :::tip NOTE | |||
| - It is not necessary to stick with the same model for all components. If a specific model is not performing well for a particular task, consider using a different one. | |||
| - If you are uncertain about the mechanism behind **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**, simply choose one of the three options of **Preset**. | |||
| ::: | |||
| ### Number of keywords | |||
| An integer specifying the number of keywords to extract from the user query. Defaults to 3. Please note that the number of extracted keywords depends on the LLM's capabilities and the token count in the user query, and may *not* match the integer you set. | |||
| ## Examples | |||
| Explore our general-purpose chatbot agent template, where the **Keyword** component (component ID: **keywords**) is used to extract keywords from financial inputs for a potential stock search in the **akshare** component: | |||
| 1. Click the **Agent** tab at the top center of the page to access the **Agent** page. | |||
| 2. Click **+ Create agent** on the top right of the page to open the **agent template** page. | |||
| 3. On the **agent template** page, hover over the **General-purpose chatbot** card and click **Use this template**. | |||
| 4. Name your new agent and click **OK** to enter the workflow editor. | |||
| 5. Click on the **Keyword** component to display its **Configuration** window. | |||
| @@ -1,30 +1,21 @@ | |||
| --- | |||
| sidebar_position: 7 | |||
| sidebar_position: 4 | |||
| slug: /message_component | |||
| --- | |||
| # Message component | |||
| A component that sends out a static message. | |||
| A component that sends out a static or dynamic message. | |||
| --- | |||
| A **Message** component sends out a static message. If multiple messages are supplied, it randomly selects one to send. | |||
| As the final component of the workflow, a Message component returns the workflow’s ultimate data output accompanied by predefined message content. The system selects one message at random if multiple messages are provided. | |||
| ## Configurations | |||
| ### Messages | |||
| The message to send out. | |||
| The message to send out. Click `(x)` or type `/` to quickly insert variables. | |||
| Click **+ Add message** to add message options. When multiple messages are supplied, the **Message** component randomly selects one to send. | |||
| ## Examples | |||
| Explore our customer service agent template, where the **Message** component (component ID: **What else?**) randomly sends out a message to the user interface if the user inputs is related to personal contact information: | |||
| 1. Click the **Agent** tab at the top center of the page to access the **Agent** page. | |||
| 2. Click **+ Create agent** on the top right of the page to open the **agent template** page. | |||
| 3. On the **agent template** page, hover over the **Customer service** card and click **Use this template**. | |||
| 4. Name your new agent and click **OK** to enter the workflow editor. | |||
| 5. Click on the **Message** component to display its **Configuration** window. | |||
| @@ -1,22 +0,0 @@ | |||
| --- | |||
| sidebar_position: 18 | |||
| slug: /note_component | |||
| --- | |||
| # Note component | |||
| The component that keeps design notes. | |||
| --- | |||
| A **note** component allows you to keep design notes, including details about an agent, the output of specific components, the rationale of a particular design, or any information that may assist you, your users, or your fellow developers understand the agent. | |||
| ## Examples | |||
| Explore our customer service agent template, which has five **Note** components: | |||
| 1. Click the **Agent** tab at the top center of the page to access the **Agent** page. | |||
| 2. Click **+ Create agent** on the top right of the page to open the **agent template** page. | |||
| 3. On the **agent template** page, hover over the **Customer service** card and click **Use this template**. | |||
| 4. Name your new agent and click **OK** to enter the workflow editor. | |||
| 5. Click on the **note** component to add or update notes. | |||
| @@ -1,5 +1,5 @@ | |||
| --- | |||
| sidebar_position: 4 | |||
| sidebar_position: 3 | |||
| slug: /retrieval_component | |||
| --- | |||
| @@ -9,20 +9,26 @@ A component that retrieves information from specified datasets. | |||
| ## Scenarios | |||
| A **Retrieval** component is essential in most RAG scenarios, where information is extracted from designated knowledge bases before being sent to the LLM for content generation. | |||
| A **Retrieval** component is essential in most RAG scenarios, where information is extracted from designated knowledge bases before being sent to the LLM for content generation. As of v0.20.0, a **Retrieval** component can operate either as a workflow component or as a tool of an **Agent**, enabling the Agent to control its invocation and search queries. | |||
| ## Configurations | |||
| Click on a **Retrieval** component to open its configuration window. | |||
| ### Input | |||
| ### Query variables | |||
| The **Retrieval** component relies on input variables to specify its data inputs (queries). Click **+ Add variable** in the **Input** section to add the desired input variables. There are two types of input variables: **Reference** and **Text**. | |||
| *Mandatory* | |||
| - **Reference**: Uses a component's output or a user input as the data source. You are required to select from the dropdown menu: | |||
| - A component ID under **Component Output**, or | |||
| - A global variable under **Begin input**, which is defined in the **Begin** component. | |||
| - **Text**: Uses fixed text as the query. You are required to enter static text. | |||
| Select the query source for retrieval. | |||
| The **Retrieval** component relies on query variables to specify its data inputs (queries). All global variables defined before the **Retrieval** component are available in the dropdown list. | |||
| ### Knowledge bases | |||
| Select the knowledge base(s) to retrieve data from. | |||
| - If no knowledge base is selected, meaning conversations with the agent will not be based on any knowledge base, ensure that the **Empty response** field is left blank to avoid an error. | |||
| - If you select multiple knowledge bases, you must ensure that the knowledge bases (datasets) you select use the same embedding model; otherwise, an error message would occur. | |||
| ### Similarity threshold | |||
| @@ -51,25 +57,6 @@ If a rerank model is selected, a combination of weighted keyword similarity and | |||
| Using a rerank model will *significantly* increase the system's response time. | |||
| ::: | |||
| ### Tavily API key | |||
| *Optional* | |||
| Enter your Tavily API key here to enable Tavily web search during retrieval. See [here](https://app.tavily.com/home) for instructions on getting a Tavily API key. | |||
| ### Use knowledge graph | |||
| Whether to use knowledge graph(s) in the specified knowledge base(s) during retrieval for multi-hop question answering. When enabled, this would involve iterative searches across entity, relationship, and community report chunks, greatly increasing retrieval time. | |||
| ### Knowledge bases | |||
| *Optional* | |||
| Select the knowledge base(s) to retrieve data from. | |||
| - If no knowledge base is selected, meaning conversations with the agent will not be based on any knowledge base, ensure that the **Empty response** field is left blank to avoid an error. | |||
| - If you select multiple knowledge bases, you must ensure that the knowledge bases (datasets) you select use the same embedding model; otherwise, an error message would occur. | |||
| ### Empty response | |||
| - Set this as a response if no results are retrieved from the knowledge base(s) for your query, or | |||
| @@ -79,12 +66,14 @@ Select the knowledge base(s) to retrieve data from. | |||
| If you do not specify a knowledge base, you must leave this field blank; otherwise, an error would occur. | |||
| ::: | |||
| ## Examples | |||
| ### Cross-language search | |||
| Select one or more languages for cross‑language search. If no language is selected, the system searches with the original query. | |||
| ### Use knowledge graph | |||
| Whether to use knowledge graph(s) in the specified knowledge base(s) during retrieval for multi-hop question answering. When enabled, this would involve iterative searches across entity, relationship, and community report chunks, greatly increasing retrieval time. | |||
| Explore our customer service agent template, where the **Retrieval** component (component ID: **Search product info**) is used to search the dataset and send the Top N results to the LLM: | |||
| ### Output | |||
| 1. Click the **Agent** tab at the top center of the page to access the **Agent** page. | |||
| 2. Click **+ Create agent** on the top right of the page to open the **agent template** page. | |||
| 3. On the **agent template** page, hover over the **Customer service** card and click **Use this template**. | |||
| 4. Name your new agent and click **OK** to enter the workflow editor. | |||
| 5. Click on the **Retrieval** component to display its **Configuration** window. | |||
| The global variable name for the output of the **Retrieval** component, which can be referenced by other components in the workflow. | |||
| @@ -1,79 +0,0 @@ | |||
| --- | |||
| sidebar_position: 8 | |||
| slug: /rewrite_component | |||
| --- | |||
| # Rewrite component | |||
| A component that rewrites a user query. | |||
| --- | |||
| A **Rewrite** component uses a specified LLM to rewrite a user query from the **Interact** component, based on the context of previous dialogues. | |||
| ## Scenarios | |||
| A **Rewrite** component is essential when you need to optimize a user query based on the context of previous conversations. It is usually the upstream component of a **Retrieval** component. | |||
| :::tip NOTE | |||
| See also the [Keyword](./keyword.mdx) component, a similar component used for multi-turn optimization. | |||
| ::: | |||
| ## Configurations | |||
| :::tip NOTE | |||
| The **Rewrite** component uses the user-agent interaction from the **Interact** component as its data input. Therefore, there is no need to specify its data inputs in the Configurations. | |||
| ::: | |||
| ### Model | |||
| Click the dropdown menu of **Model** to show the model configuration window. | |||
| - **Model**: The chat model to use. | |||
| - Ensure you set the chat model correctly on the **Model providers** page. | |||
| - You can use different models for different components to increase flexibility or improve overall performance. | |||
| - **Freedom**: A shortcut to **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**. | |||
| This parameter has three options: | |||
| - **Improvise**: Produces more creative responses. | |||
| - **Precise**: (Default) Produces more conservative responses. | |||
| - **Balance**: A middle ground between **Improvise** and **Precise**. | |||
| - **Temperature**: The randomness level of the model's output. | |||
| Defaults to 0.1. | |||
| - Lower values lead to more deterministic and predictable outputs. | |||
| - Higher values lead to more creative and varied outputs. | |||
| - A temperature of zero results in the same output for the same prompt. | |||
| - **Top P**: Nucleus sampling. | |||
| - Reduces the likelihood of generating repetitive or unnatural text by setting a threshold *P* and restricting the sampling to tokens with a cumulative probability exceeding *P*. | |||
| - Defaults to 0.3. | |||
| - **Presence penalty**: Encourages the model to include a more diverse range of tokens in the response. | |||
| - A higher **presence penalty** value results in the model being more likely to generate tokens not yet been included in the generated text. | |||
| - Defaults to 0.4. | |||
| - **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text. | |||
| - A higher **frequency penalty** value results in the model being more conservative in its use of repeated tokens. | |||
| - Defaults to 0.7. | |||
| :::tip NOTE | |||
| - It is not necessary to stick with the same model for all components. If a specific model is not performing well for a particular task, consider using a different one. | |||
| - If you are uncertain about the mechanism behind **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**, simply choose one of the three options of **Preset configurations**. | |||
| ::: | |||
| ### Message window size | |||
| An integer specifying the number of previous dialogue rounds to input into the LLM. For example, if it is set to 12, the tokens from the last 12 dialogue rounds will be fed to the LLM. This feature consumes additional tokens. | |||
| Defaults to 1. | |||
| :::tip IMPORTANT | |||
| This feature is used for multi-turn dialogue *only*. If your **Categorize** component is not part of a multi-turn dialogue (i.e., it is not in a loop), leave this field as-is. | |||
| ::: | |||
| ## Examples | |||
| Explore our customer service agent template, where the **Rewrite** component (component ID: **Refine Question**) is used to optimize a product-specific user query based on context of previous dialogues before passing it on to the **Retrieval** component. | |||
| 1. Click the **Agent** tab at the top center of the page to access the **Agent** page. | |||
| 2. Click **+ Create agent** on the top right of the page to open the **agent template** page. | |||
| 3. On the **agent template** page, hover over the **Customer service** card and click **Use this template**. | |||
| 4. Name your new agent and click **OK** to enter the workflow editor. | |||
| 5. Click on the **Rewrite** component to display its **Configuration** window. | |||
| @@ -1,5 +1,5 @@ | |||
| --- | |||
| sidebar_position: 9 | |||
| sidebar_position: 6 | |||
| slug: /switch_component | |||
| --- | |||
| @@ -19,27 +19,21 @@ A **Switch** component is essential for condition-based direction of execution f | |||
| ### Case n | |||
| A **Switch** component must have at least one case, each with multiple specified conditions and *only one* downstream component. When multiple conditions are specified for a case, you must set the logical relationship between them to either AND or OR. | |||
| A **Switch** component must have at least one case, each with multiple specified conditions. When multiple conditions are specified for a case, you must set the logical relationship between them to either AND or OR. | |||
| #### Next step | |||
| Once a new case is added, navigate to the **Switch** component on the canvas, find the **+** button next to the case, and click it to specify the downstream component(s). | |||
| Specifies the downstream component of this case. | |||
| - *Once you specify the ID of the downstream component, a link is established between this case and the corresponding component.* | |||
| - *If you manually link this case to a downstream component on the canvas, the ID of that component is auto-populated.* | |||
| #### Condition | |||
| Evaluates whether the output of specific components meets certain conditions, with **Component ID**, **Operator**, and **Value** together forming a conditional expression. | |||
| Evaluates whether the output of specific components meets certain conditions | |||
| :::danger IMPORTANT | |||
| When you have added multiple conditions for a specific case, a **Logical operator** field appears, requiring you to set the logical relationship between these conditions as either AND or OR. | |||
|  | |||
| ::: | |||
| - **Component ID**: The ID of the corresponding component. | |||
| - **Operator**: The operator required to form a conditional expression. | |||
| - Equals | |||
| - Equals (default) | |||
| - Not equal | |||
| - Greater than | |||
| - Greater equal | |||
| @@ -53,10 +47,4 @@ When you have added multiple conditions for a specific case, a **Logical operato | |||
| - Not empty | |||
| - **Value**: A single value, which can be an integer, float, or string. | |||
| - Delimiters, multiple values, or expressions are *not* supported. | |||
| - Strings need not be wrapped in `""` or `''`. | |||
| ### ELSE | |||
| **Required**. Specifies the downstream component if none of the conditions defined above are met. | |||
| *Once you specify the ID of the downstream component, a link is established between ELSE and the corresponding component.* | |||
| @@ -1,49 +0,0 @@ | |||
| --- | |||
| sidebar_position: 11 | |||
| slug: /template_component | |||
| --- | |||
| # Template component | |||
| A component that formats user inputs or the outputs of other components. | |||
| --- | |||
| A **Template** component acts as a content formatter. It is usually the upstream component of an **Interact** component. | |||
| ## Scenarios | |||
| A **Template** component is useful for organizing various sources of data or information into specific formats. | |||
| ## Configurations | |||
| ### Content | |||
| Used together with Keys to organize various data or information sources into desired formats. Example: | |||
| ```text | |||
| <h2>{subtitle}</h2> | |||
| <div>{content}</div> | |||
| ``` | |||
| Where `{subtitle}` and `{content}` are defined keys. | |||
| ### Key | |||
| A **Template** component relies on keys (variables) to specify its data or information sources. Its immediate upstream component is *not* necessarily its input, and the arrows in the workflow indicate *only* the processing sequence. | |||
| Values of keys are categorized into two groups: | |||
| - **Component Output**: The value of the key should be a component ID. | |||
| - **Begin Input**: The value of the key should be the name of a global variable defined in the **Begin** component. | |||
| ## Examples | |||
| Explore our research report generator agent template, where the **Template** component (component ID: **Article**) organizes user input and the outputs of the **Sections** component into HTML format: | |||
| 1. Click the **Agent** tab at the top center of the page to access the **Agent** page. | |||
| 2. Click **+ Create agent** on the top right of the page to open the **agent template** page. | |||
| 3. On the **agent template** page, hover over the **Research report generator** card and click **Use this template**. | |||
| 4. Name your new agent and click **OK** to enter the workflow editor. | |||
| 5. Click on the **Template** component to display its **Configuration** window | |||
| @@ -0,0 +1,38 @@ | |||
| --- | |||
| sidebar_position: 15 | |||
| slug: /text_processing | |||
| --- | |||
| # Text processing component | |||
| A component that merges or splits texts. | |||
| --- | |||
| A **Text processing** component merges or splits texts. | |||
| ## Configurations | |||
| ### Method | |||
| - Split: Split the text | |||
| - Merge: Merge the text | |||
| ### Split_ref | |||
| Appears only when you select **Split** as method. | |||
| The variable to be split. Type `/` to quickly insert variables. | |||
| ### Script | |||
| Template for the merge. Appears only when you select **Merge** as method. Type `/` to quickly insert variables. | |||
| ### Delimiters | |||
| The delimiter(s) used to split or merge the text. | |||
| ### Output | |||
| The global variable name for the output of the component, which can be referenced by other components in the workflow. | |||
| @@ -11,6 +11,10 @@ Key concepts, basic operations, a quick view of the agent editor. | |||
| ## Key concepts | |||
| :::danger DEPRECATED! | |||
| A new version is coming soon. | |||
| ::: | |||
| Agents and RAG are complementary techniques, each enhancing the other’s capabilities in business applications. RAGFlow v0.8.0 introduces an agent mechanism, featuring a no-code workflow editor on the front end and a comprehensive graph-based task orchestration framework on the back end. This mechanism is built on top of RAGFlow's existing RAG solutions and aims to orchestrate search technologies such as query intent classification, conversation leading, and query rewriting to: | |||
| - Provide higher retrievals and, | |||
| @@ -7,14 +7,7 @@ slug: /embed_agent_into_webpage | |||
| You can use iframe to embed an agent into a third-party webpage. | |||
| :::caution WARNING | |||
| If your agent's **Begin** component takes a variable, you *cannot* embed it into a webpage. | |||
| ::: | |||
| 1. Before proceeding, you must [acquire an API key](../models/llm_api_key_setup.md); otherwise, an error message would appear. | |||
| 2. On the **Agent** page, click an intended agent **>** **Edit** to access its editing page. | |||
| 3. Click **Embed into webpage** on the top right corner of the canvas to show the **iframe** window: | |||
|  | |||
| 2. On the **Agent** page, click an intended agent to access its editing page. | |||
| 3. Click **Management > Embed into webpage** on the top right corner of the canvas to show the **iframe** window: | |||
| 4. Copy the iframe and embed it into a specific location on your webpage. | |||
| @@ -9,6 +9,10 @@ Create a general-purpose chatbot. | |||
| --- | |||
| :::danger DEPRECATED! | |||
| A new version is coming soon. | |||
| ::: | |||
| Chatbot is one of the most common AI scenarios. However, effectively understanding user queries and responding appropriately remains a challenge. RAGFlow's general-purpose chatbot agent is our attempt to tackle this longstanding issue. | |||
| This chatbot closely resembles the chatbot introduced in [Start an AI chat](../chat/start_chat.md), but with a key difference - it introduces a reflective mechanism that allows it to improve the retrieval from the target knowledge bases by rewriting the user's query. | |||
| @@ -1,420 +0,0 @@ | |||
| --- | |||
| sidebar_position: 10 | |||
| slug: /text2sql_agent | |||
| --- | |||
| # Create a Text2SQL agent | |||
| Build a Text2SQL agent leveraging RAGFlow's RAG capabilities. | |||
| :::info KUDOS | |||
| This document is contributed by our community contributor [TeslaZY](https://github.com/TeslaZY). 👏 | |||
| ::: | |||
| ## Scenario | |||
| The Text2SQL agent bridges the gap between Natural Language Processing (NLP) and Structured Query Language (SQL). Its key advantages are as follows: | |||
| - **Assisting non-technical users with SQL**: Not all users have a background in SQL or understand the structure of the tables involved in queries. With a Text2SQL agent, users can pose questions or request data in natural language without needing an in-depth knowledge of the database structure or SQL syntax. | |||
| - **Enhancing SQL development efficiency**: For those familiar with SQL, the Text2SQL agent streamlines the process by enabling users to construct complex queries quickly, without the need to code each part manually. | |||
| - **Minimizing errors**: Manually writing SQL queries can be error-prone, particularly for complex queries or for users not well-versed in the database structure. The Text2SQL agent can interpret natural language instructions and generate accurate SQL queries, thereby reducing potential syntax and logic errors. | |||
| - **Boosting data analysis capabilities**: In business intelligence and data analysis, swiftly gaining insights from data is critical. The Text2SQL agent facilitates extracting valuable information from databases more directly and conveniently, thus aiding in accelerating decision-making. | |||
| - **Automation and integration**: The Text2SQL agent can be integrated into larger systems to support automated workflows, such as automatic report generation and data monitoring. It can also integrate seamlessly with other services and technologies, offering richer application possibilities. | |||
| - **Support for multiple languages and varied expressions**: People can express the same idea in numerous ways. An effective Text2SQL system should be capable of understanding diverse expressions and accurately converting them into SQL queries. | |||
| In summary, the Text2SQL agent seeks to make database queries more intuitive and user-friendly while ensuring efficiency and accuracy. It caters to a broad spectrum of users, from completely non-technical individuals to seasoned data analysts and developers. | |||
| However, traditional Text2SQL solutions often require model fine-tuning, which can substantially escalate deployment and maintenance costs when implemented in enterprise environments alongside RAG or Agent components. RAGFlow’s RAG-based Text2SQL utilizes an existing (connected) large language model (LLM), allowing for seamless integration with other RAG/Agent components without the necessity for additional fine-tuned models. | |||
| ## Recipe | |||
| A list of components required: | |||
| - [Begin](./agent_component_reference/begin.mdx) | |||
| - [Interact](./agent_component_reference/interact.mdx) | |||
| - [Retrieval](./agent_component_reference/retrieval.mdx) | |||
| - [Generate](./agent_component_reference/generate.mdx) | |||
| - ExeSQL | |||
| ## Procedure | |||
| ### Preparation of Data | |||
| #### Database Environment | |||
| Mysql-8.0.39 | |||
| #### Database Table Creation Statements | |||
| ```sql | |||
| SET NAMES utf8mb4; | |||
| DROP TABLE IF EXISTS `Customers`; | |||
| CREATE TABLE `Customers` ( | |||
| `CustomerID` int NOT NULL AUTO_INCREMENT, | |||
| `UserName` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL, | |||
| `Email` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL, | |||
| `PhoneNumber` varchar(20) COLLATE utf8mb4_unicode_ci DEFAULT NULL, | |||
| PRIMARY KEY (`CustomerID`) | |||
| ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; | |||
| DROP TABLE IF EXISTS `Products`; | |||
| CREATE TABLE `Products` ( | |||
| `ProductID` int NOT NULL AUTO_INCREMENT, | |||
| `ProductName` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL, | |||
| `Description` text COLLATE utf8mb4_unicode_ci, | |||
| `Price` decimal(10,2) DEFAULT NULL, | |||
| `StockQuantity` int DEFAULT NULL, | |||
| PRIMARY KEY (`ProductID`) | |||
| ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; | |||
| DROP TABLE IF EXISTS `Orders`; | |||
| CREATE TABLE `Orders` ( | |||
| `OrderID` int NOT NULL AUTO_INCREMENT, | |||
| `CustomerID` int DEFAULT NULL, | |||
| `OrderDate` date DEFAULT NULL, | |||
| `TotalPrice` decimal(10,2) DEFAULT NULL, | |||
| PRIMARY KEY (`OrderID`), | |||
| KEY `CustomerID` (`CustomerID`) | |||
| ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; | |||
| DROP TABLE IF EXISTS `OrderDetails`; | |||
| CREATE TABLE `OrderDetails` ( | |||
| `OrderDetailID` int NOT NULL AUTO_INCREMENT, | |||
| `OrderID` int DEFAULT NULL, | |||
| `ProductID` int DEFAULT NULL, | |||
| `UnitPrice` decimal(10,2) DEFAULT NULL, | |||
| `Quantity` int DEFAULT NULL, | |||
| `TotalPrice` decimal(10,2) DEFAULT NULL, | |||
| PRIMARY KEY (`OrderDetailID`), | |||
| KEY `OrderID` (`OrderID`), | |||
| KEY `ProductID` (`ProductID`) | |||
| ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; | |||
| ``` | |||
| #### Generate Test Data | |||
| ```sql | |||
| START TRANSACTION; | |||
| INSERT INTO Customers (UserName, Email, PhoneNumber) VALUES | |||
| ('Alice', 'alice@example.com', '123456789'), | |||
| ('Bob', 'bob@example.com', '987654321'), | |||
| ('Charlie', 'charlie@example.com', '112233445'), | |||
| ('Diana', 'diana@example.com', '555666777'), | |||
| ('Eve', 'eve@example.com', '999888777'), | |||
| ('Frank', 'frank@example.com', '123123123'), | |||
| ('Grace', 'grace@example.com', '456456456'), | |||
| ('Hugo', 'hugo@example.com', '789789789'), | |||
| ('Ivy', 'ivy@example.com', '321321321'), | |||
| ('Jack', 'jack@example.com', '654654654'); | |||
| INSERT INTO Products (ProductName, Description, Price, StockQuantity) VALUES | |||
| ('Laptop', 'High performance laptop', 1200.00, 50), | |||
| ('Smartphone', 'Latest model smartphone', 800.00, 100), | |||
| ('Tablet', 'Portable tablet device', 300.00, 75), | |||
| ('Headphones', 'Noise-cancelling headphones', 150.00, 200), | |||
| ('Camera', 'Professional camera', 600.00, 30), | |||
| ('Monitor', '24-inch Full HD monitor', 200.00, 45), | |||
| ('Keyboard', 'Mechanical keyboard', 100.00, 150), | |||
| ('Mouse', 'Ergonomic gaming mouse', 50.00, 250), | |||
| ('Speaker', 'Wireless Bluetooth speaker', 80.00, 120), | |||
| ('Router', 'Wi-Fi router with high speed', 120.00, 90); | |||
| INSERT INTO Orders (CustomerID, OrderDate, TotalPrice) VALUES | |||
| (1, '2024-01-15', 0), | |||
| (2, '2024-02-01', 0), | |||
| (3, '2024-03-05', 0), | |||
| (4, '2024-04-10', 0), | |||
| (5, '2024-05-15', 0), | |||
| (6, '2024-06-20', 0), | |||
| (7, '2024-07-25', 0), | |||
| (8, '2024-08-30', 0), | |||
| (9, '2024-09-05', 0), | |||
| (10, '2024-10-10', 0), | |||
| (1, '2024-11-15', 0), | |||
| (2, '2024-12-01', 0), | |||
| (3, '2024-01-05', 0), | |||
| (4, '2024-02-10', 0), | |||
| (5, '2024-03-15', 0), | |||
| (6, '2024-04-20', 0), | |||
| (7, '2024-05-25', 0), | |||
| (8, '2024-06-30', 0), | |||
| (9, '2024-07-05', 0), | |||
| (10, '2024-08-10', 0); | |||
| INSERT INTO OrderDetails (OrderID, ProductID, UnitPrice, Quantity, TotalPrice) VALUES | |||
| (1, 1, (SELECT Price FROM Products WHERE ProductID = 1), 2, (SELECT Price * 2 FROM Products WHERE ProductID = 1)), | |||
| (1, 2, (SELECT Price FROM Products WHERE ProductID = 2), 1, (SELECT Price FROM Products WHERE ProductID = 2)), | |||
| (2, 3, (SELECT Price FROM Products WHERE ProductID = 3), 3, (SELECT Price * 3 FROM Products WHERE ProductID = 3)), | |||
| (2, 4, (SELECT Price FROM Products WHERE ProductID = 4), 1, (SELECT Price FROM Products WHERE ProductID = 4)), | |||
| (3, 5, (SELECT Price FROM Products WHERE ProductID = 5), 1, (SELECT Price FROM Products WHERE ProductID = 5)), | |||
| (3, 6, (SELECT Price FROM Products WHERE ProductID = 6), 2, (SELECT Price * 2 FROM Products WHERE ProductID = 6)), | |||
| (4, 7, (SELECT Price FROM Products WHERE ProductID = 7), 5, (SELECT Price * 5 FROM Products WHERE ProductID = 7)), | |||
| (5, 8, (SELECT Price FROM Products WHERE ProductID = 8), 3, (SELECT Price * 3 FROM Products WHERE ProductID = 8)), | |||
| (5, 9, (SELECT Price FROM Products WHERE ProductID = 9), 2, (SELECT Price * 2 FROM Products WHERE ProductID = 9)), | |||
| (6, 10, (SELECT Price FROM Products WHERE ProductID = 10), 4, (SELECT Price * 4 FROM Products WHERE ProductID = 10)), | |||
| (7, 2, (SELECT Price FROM Products WHERE ProductID = 2), 4, (SELECT Price * 4 FROM Products WHERE ProductID = 2)), | |||
| (7, 8, (SELECT Price FROM Products WHERE ProductID = 8), 3, (SELECT Price * 3 FROM Products WHERE ProductID = 8)), | |||
| (8, 1, (SELECT Price FROM Products WHERE ProductID = 1), 1, (SELECT Price FROM Products WHERE ProductID = 1)), | |||
| (8, 9, (SELECT Price FROM Products WHERE ProductID = 9), 2, (SELECT Price * 2 FROM Products WHERE ProductID = 9)), | |||
| (8, 10, (SELECT Price FROM Products WHERE ProductID = 10), 5, (SELECT Price * 5 FROM Products WHERE ProductID = 10)), | |||
| (9, 3, (SELECT Price FROM Products WHERE ProductID = 3), 5, (SELECT Price * 5 FROM Products WHERE ProductID = 3)), | |||
| (9, 6, (SELECT Price FROM Products WHERE ProductID = 6), 1, (SELECT Price FROM Products WHERE ProductID = 6)), | |||
| (10, 4, (SELECT Price FROM Products WHERE ProductID = 4), 2, (SELECT Price * 2 FROM Products WHERE ProductID = 4)), | |||
| (10, 7, (SELECT Price FROM Products WHERE ProductID = 7), 3, (SELECT Price * 3 FROM Products WHERE ProductID = 7)), | |||
| (11, 5, (SELECT Price FROM Products WHERE ProductID = 5), 1, (SELECT Price FROM Products WHERE ProductID = 5)), | |||
| (11, 10, (SELECT Price FROM Products WHERE ProductID = 10), 4, (SELECT Price * 4 FROM Products WHERE ProductID = 10)), | |||
| (12, 1, (SELECT Price FROM Products WHERE ProductID = 1), 3, (SELECT Price * 3 FROM Products WHERE ProductID = 1)), | |||
| (12, 8, (SELECT Price FROM Products WHERE ProductID = 8), 2, (SELECT Price * 2 FROM Products WHERE ProductID = 8)), | |||
| (13, 2, (SELECT Price FROM Products WHERE ProductID = 2), 1, (SELECT Price FROM Products WHERE ProductID = 2)), | |||
| (13, 9, (SELECT Price FROM Products WHERE ProductID = 9), 3, (SELECT Price * 3 FROM Products WHERE ProductID = 9)), | |||
| (14, 3, (SELECT Price FROM Products WHERE ProductID = 3), 4, (SELECT Price * 4 FROM Products WHERE ProductID = 3)), | |||
| (14, 6, (SELECT Price FROM Products WHERE ProductID = 6), 2, (SELECT Price * 2 FROM Products WHERE ProductID = 6)), | |||
| (15, 4, (SELECT Price FROM Products WHERE ProductID = 4), 5, (SELECT Price * 5 FROM Products WHERE ProductID = 4)), | |||
| (15, 7, (SELECT Price FROM Products WHERE ProductID = 7), 1, (SELECT Price FROM Products WHERE ProductID = 7)), | |||
| (16, 5, (SELECT Price FROM Products WHERE ProductID = 5), 2, (SELECT Price * 2 FROM Products WHERE ProductID = 5)), | |||
| (16, 10, (SELECT Price FROM Products WHERE ProductID = 10), 3, (SELECT Price * 3 FROM Products WHERE ProductID = 10)), | |||
| (17, 1, (SELECT Price FROM Products WHERE ProductID = 1), 4, (SELECT Price * 4 FROM Products WHERE ProductID = 1)), | |||
| (17, 8, (SELECT Price FROM Products WHERE ProductID = 8), 1, (SELECT Price FROM Products WHERE ProductID = 8)), | |||
| (18, 2, (SELECT Price FROM Products WHERE ProductID = 2), 5, (SELECT Price * 5 FROM Products WHERE ProductID = 2)), | |||
| (18, 9, (SELECT Price FROM Products WHERE ProductID = 9), 2, (SELECT Price * 2 FROM Products WHERE ProductID = 9)), | |||
| (19, 3, (SELECT Price FROM Products WHERE ProductID = 3), 3, (SELECT Price * 3 FROM Products WHERE ProductID = 3)), | |||
| (19, 6, (SELECT Price FROM Products WHERE ProductID = 6), 4, (SELECT Price * 4 FROM Products WHERE ProductID = 6)), | |||
| (20, 4, (SELECT Price FROM Products WHERE ProductID = 4), 1, (SELECT Price FROM Products WHERE ProductID = 4)), | |||
| (20, 7, (SELECT Price FROM Products WHERE ProductID = 7), 5, (SELECT Price * 5 FROM Products WHERE ProductID = 7)); | |||
| UPDATE Orders o | |||
| JOIN ( | |||
| SELECT OrderID, SUM(TotalPrice) as order_total | |||
| FROM OrderDetails | |||
| GROUP BY OrderID | |||
| ) od ON o.OrderID = od.OrderID | |||
| SET o.TotalPrice = od.order_total; | |||
| COMMIT; | |||
| ``` | |||
| ### Configure Knowledge Base | |||
| For RAGFlow’s RAG-based Text2SQL, the following knowledge bases are typically required: | |||
| - **DDL**: Database table creation statements. | |||
| - **DB_Description**: Detailed descriptions of tables and columns. | |||
| - **Q->SQL**: Natural language query descriptions along with corresponding SQL query examples (Question-Answer pairs). | |||
| However, in specialized query scenarios, user queries might include abbreviations or synonyms for domain-specific terms. If a user references a synonym for a domain-specific term, the system may fail to generate the correct SQL query. Therefore, it is advisable to incorporate a thesaurus for synonyms to assist the Agent in generating more accurate SQL queries. | |||
| - **TextSQL_Thesaurus**: A thesaurus covering domain-specific terms and their synonyms. | |||
| #### Configure DDL Knowledge Base | |||
| 1. The content of the DDL text is as follows: | |||
| ```sql | |||
| CREATE TABLE Customers ( | |||
| CustomerID int NOT NULL AUTO_INCREMENT, | |||
| UserName varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL, | |||
| Email varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL, | |||
| PhoneNumber varchar(20) COLLATE utf8mb4_unicode_ci DEFAULT NULL, | |||
| PRIMARY KEY (CustomerID) | |||
| ) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; | |||
| CREATE TABLE Products ( | |||
| ProductID int NOT NULL AUTO_INCREMENT, | |||
| ProductName varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL, | |||
| Description text COLLATE utf8mb4_unicode_ci, | |||
| Price decimal(10,2) DEFAULT NULL, | |||
| StockQuantity int DEFAULT NULL, | |||
| PRIMARY KEY (ProductID) | |||
| ) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; | |||
| CREATE TABLE Orders ( | |||
| OrderID int NOT NULL AUTO_INCREMENT, | |||
| CustomerID int DEFAULT NULL, | |||
| OrderDate date DEFAULT NULL, | |||
| TotalPrice decimal(10,2) DEFAULT NULL, | |||
| PRIMARY KEY (OrderID), | |||
| KEY CustomerID (CustomerID) | |||
| ) ENGINE=InnoDB AUTO_INCREMENT=21 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; | |||
| CREATE TABLE OrderDetails ( | |||
| OrderDetailID int NOT NULL AUTO_INCREMENT, | |||
| OrderID int DEFAULT NULL, | |||
| ProductID int DEFAULT NULL, | |||
| UnitPrice decimal(10,2) DEFAULT NULL, | |||
| Quantity int DEFAULT NULL, | |||
| TotalPrice decimal(10,2) DEFAULT NULL, | |||
| PRIMARY KEY (OrderDetailID), | |||
| KEY OrderID (OrderID), | |||
| KEY ProductID (ProductID) | |||
| ) ENGINE=InnoDB AUTO_INCREMENT=40 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci; | |||
| ``` | |||
| 2. Set the chunk data for the DLL knowledge base | |||
|  | |||
| #### Configure DB_Description Knowledge Base | |||
| 1. the content of the DB_Description text is as follows: | |||
| 2. | |||
| ```markdown | |||
| ### Customers (Customer Information Table) | |||
| The Customers table records detailed information about different customers in the online store. Here is the meaning of each field within this table: | |||
| - CustomerID: A unique identifier for a customer, auto-incremented. | |||
| - UserName: The name used by the customer for logging into the online store or displayed on the site. | |||
| - Email: The email address of the customer, which can be used for account verification, password recovery, and order updates. | |||
| - PhoneNumber: The phone number of the customer, useful for contact purposes such as delivery notifications or customer service. | |||
| ### Products (Product Information Table) | |||
| The Products table contains information about the products offered by the online store. Each field within this table represents: | |||
| - ProductID: A unique identifier for a product, auto-incremented. | |||
| - ProductName: The name of the product, such as laptop, smartphone, nounch, etc. | |||
| - Description: Detailed information about the product. | |||
| - Price: The selling price of the product, stored as a decimal value to accommodate currency formatting. | |||
| - StockQuantity: The quantity of the product available in stock. | |||
| ### Orders (Order Information Table) | |||
| The Orders table tracks orders placed by customers. This table includes fields that denote: | |||
| - OrderID: A unique identifier for an order, auto-incremented. | |||
| - CustomerID: A foreign key that references the CustomerID in the Customers table, indicating which customer placed the order. | |||
| - OrderDate: The date when the order was placed. | |||
| - TotalPrice: The total price of all items in the order, calculated at the time of purchase. | |||
| ### OrderDetails (Order Details Table) | |||
| The OrderDetails table provides detailed information about each item in an order. Fields within this table include: | |||
| - OrderDetailID: A unique identifier for each line item in an order, auto-incremented. | |||
| - OrderID: A foreign key that references the OrderID in the Orders table, linking the detail to a specific order. | |||
| - ProductID: A foreign key that references the ProductID in the Products table, specifying which product was ordered. | |||
| - UnitPrice: The price per unit of the product at the time of order. | |||
| - Quantity: The number of units of the product ordered. | |||
| - TotalPrice: The total price for this particular item in the order, calculated as UnitPrice * Quantity. | |||
| ``` | |||
| 2. set the chunk data for the DB_Description knowledge base | |||
|  | |||
| #### Configure Q->SQL Knowledge Base | |||
| 1. Q->SQL Excel Document | |||
| [QA.xlsx](https://github.com/user-attachments/files/18258416/QA.xlsx) | |||
| 2. Upload the Q->SQL Excel document to the Q->SQL knowledge base and set the chunk data as follows via parsing: | |||
|  | |||
| #### Configure TextSQL_Thesaurus Knowledge Base | |||
| 1. the content of the TextSQL_Thesaurus text is as follows: | |||
| ```txt | |||
| ### | |||
| Standard noun: StockQuantity | |||
| Synonyms: stock,stockpile,inventory | |||
| ### | |||
| Standard noun: UserName | |||
| Synonyms: user name, user's name | |||
| ### | |||
| Standard noun: Quantity | |||
| Synonyms: amount,number | |||
| ### | |||
| Standard noun: Smartphone | |||
| Synonyms: phone, mobile phone, smart phone, mobilephone | |||
| ### | |||
| Standard noun: ProductName | |||
| Synonyms: product name, product's name | |||
| ### | |||
| Standard noun: tablet | |||
| Synonyms: pad,Pad | |||
| ### | |||
| Standard noun: laptop | |||
| Synonyms: laptop computer,laptop pc | |||
| ``` | |||
| 2. set the chunk data for the TextSQL_Thesaurus knowledge base | |||
|  | |||
| ### Build the Agent | |||
| 1. Create an Agent using the Text2SQL Agent template. | |||
| 2. Enter the configuration page of the Agent to start the setup process. | |||
| 3. Create a Retrieval node and name it Thesaurus; create an ExeSQL node. | |||
| 4. Configure the Q->SQL, DDL, DB_Description, and TextSQL_Thesaurus knowledge bases. Please refer to the following: | |||
|  | |||
| 5. Configure the Generate node, named LLM's prompt: | |||
| - Add this content to the prompt provided by the template to provide the thesaurus content to the LLM: | |||
| ```plaintext | |||
| ## You may use the following Thesaurus statements. For example, what I ask is from Synonyms, you must use Standard noun to generate SQL. Use responses to past questions also to guide you: {sql_thesaurus}. | |||
| ``` | |||
| - Ensure the mapping between keys and component IDs is configured correctly. | |||
| - The configuration result should look like this: | |||
|  | |||
| 6. Configure the ExecSQL node, filling in the configuration information for the MySQL database. | |||
|  | |||
| 7. Set an opener in the Begin component like: | |||
| ```plaintext | |||
| Hi! I'm your electronic products online store business data analysis assistant. What can I do for you? | |||
| ### Run and Test the Agent | |||
| 1. click the Run button to start the agent. | |||
| 2. input the question: | |||
| ``` | |||
| Help me summarize stock quantities for each product | |||
| ``` | |||
| 3. click the send button to send the question to the agent. | |||
| 4. The agent will respond with the following: | |||
|  | |||
| ### Debug the Agent | |||
| Since version 0.15.0, ragflow has introduced step-by-step execution for Agent components/tools, providing a robust mechanism for debugging and testing. Let's explore how to perform a step run. | |||
| 1. To enter Test Run mode, you can either click the triangle icon located above the component or access the component's detail page by clicking on the component itself. Once there, select the Test Run button in the upper right corner of the component details. | |||
|  | |||
|  | |||
| 2. Enter a question that does not exist in the Q->SQL knowledge base but is similar in nature. | |||
| Click the Run button to receive the component's output. | |||
| ``` | |||
| Find all customers who has bought a mobile phone | |||
| ``` | |||
|  | |||
| 3. As the image shows, no matching information was retrieved from the Q->SQL knowledge base, yet a similar question exists within the database. Adjust the Rerank model, "Similarity threshold," or "Keyword similarity weight" accordingly to return relevant content. | |||
|  | |||
|  | |||
| 4. Observe the inputs and outputs of the LLM node and ExeSQL node. | |||
|  | |||
| 5. The agent now produces the correct SQL query result. | |||
| 6. For a query about "mobile phone," the agent successfully generates the appropriate SQL query using "Smartphone." This showcases how the thesaurus guides the LLM in generating accurate SQL queries. | |||
| With this, you maybe appreciate the capabilities of Step Run. It undoubtedly assists in constructing more effective agents. | |||
| ## Troubleshooting | |||
| ### Total: 0 No record in the database! | |||
| 1. Confirm if the sql is correct. If so, check the connection information of the database. | |||
| 2. If the connection information is correct, maybe there is actually no data matching your query in the database. | |||
| ## Considerations | |||
| In real production scenarios within vertical domains, several considerations are essential for effective Text2SQL implementation: | |||
| 1. **Handling DDL and DB_Description**: Dealing with Data Definition Language (DDL) statements and database descriptions requires substantial debugging experience. It is crucial to discern which information is vital and which may be redundant, depending on the true business context. This includes determining the relevance of table attributes such as primary keys, foreign keys, indexes, and so forth. | |||
| 2. **Maintaining Quality QA Data**: Ensuring a high standard for question-and-answer data significantly aids the LLM in generating more accurate SQL queries. | |||
| 3. **Managing Domain-Specific Synonyms**: Professional domain synonyms can greatly impact the generation of SQL query conditions. Therefore, maintaining an extensive and up-to-date synonym library is critical to mitigate this challenge. | |||
| 4. **Facilitating User Feedback**: Implementing a feedback mechanism within the Agent allows users to provide correct SQL queries. Administrators can then use this feedback to automatically generate corresponding QA data, reducing the need for manual maintenance. | |||
| In summary, achieving high-quality output from Text2SQL remains contingent upon high-quality input. Constructing robust question-and-answer datasets is at the core of optimizing RAGFlow's Text2SQL capabilities. | |||
| @@ -13,10 +13,6 @@ Retrieval accuracy is the touchstone for a production-ready RAG framework. In ad | |||
| To use this feature, ensure you have at least one properly configured tag set, specify the tag set(s) on the **Configuration** page of your knowledge base (dataset), and then re-parse your documents to initiate the auto-tagging process. During this process, each chunk in your dataset is compared with every entry in the specified tag set(s), and tags are automatically applied based on similarity. | |||
| :::caution NOTE | |||
| The auto-tagging feature is *unavailable* on the [Infinity](https://github.com/infiniflow/infinity) document engine. | |||
| ::: | |||
| ## Scenarios | |||
| Auto-tagging applies in situations where chunks are so similar to each other that the intended chunks cannot be distinguished from the rest. For example, when you have a few chunks about iPhone and a majority about iPhone case or iPhone accessaries, it becomes difficult to retrieve those chunks about iPhone without additional information. | |||