OpenSource
/
ragflow


			
							123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107
							---
sidebar_position: 8
slug: /categorize_component
---

# Categorize component

A component that classifies user inputs and applies strategies accordingly. 

---

A **Categorize** component is usually the downstream of the **Interact** component.

## Scenarios

A **Categorize** component is essential when you need the LLM to help you identify user intentions and apply appropriate processing strategies.

## Configurations

### Query variables

*Mandatory*

Select the source for categorization.

The **Categorize** component relies on query variables to specify its data inputs (queries). All global variables defined before the **Categorize** component are available in the dropdown list.  


### Input

The **Categorize** component relies on input variables to specify its data inputs (queries). Click **+ Add variable** in the **Input** section to add the desired input variables. There are two types of input variables: **Reference** and **Text**.

- **Reference**: Uses a component's output or a user input as the data source. You are required to select from the dropdown menu:
  - A component ID under **Component Output**, or 
  - A global variable under **Begin input**, which is defined in the **Begin** component.
- **Text**: Uses fixed text as the query. You are required to enter static text.

### Model

Click the dropdown menu of **Model** to show the model configuration window.

- **Model**: The chat model to use.  
  - Ensure you set the chat model correctly on the **Model providers** page.
  - You can use different models for different components to increase flexibility or improve overall performance.
- **Freedom**: A shortcut to **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty** settings, indicating the freedom level of the model. From **Improvise**, **Precise**, to **Balance**, each preset configuration corresponds to a unique combination of **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**.  
  This parameter has three options:  
  - **Improvise**: Produces more creative responses.
  - **Precise**: (Default) Produces more conservative responses.
  - **Balance**: A middle ground between **Improvise** and **Precise**.
- **Temperature**: The randomness level of the model's output.  
  Defaults to 0.1.  
  - Lower values lead to more deterministic and predictable outputs.
  - Higher values lead to more creative and varied outputs.
  - A temperature of zero results in the same output for the same prompt.
- **Top P**: Nucleus sampling.  
  - Reduces the likelihood of generating repetitive or unnatural text by setting a threshold *P* and restricting the sampling to tokens with a cumulative probability exceeding *P*.
  - Defaults to 0.3.
- **Presence penalty**: Encourages the model to include a more diverse range of tokens in the response.  
  - A higher **presence penalty** value results in the model being more likely to generate tokens not yet been included in the generated text.
  - Defaults to 0.4.
- **Frequency penalty**: Discourages the model from repeating the same words or phrases too frequently in the generated text.  
  - A higher **frequency penalty** value results in the model being more conservative in its use of repeated tokens.
  - Defaults to 0.7.

:::tip NOTE
- It is not necessary to stick with the same model for all components. If a specific model is not performing well for a particular task, consider using a different one.
- If you are uncertain about the mechanism behind **Temperature**, **Top P**, **Presence penalty**, and **Frequency penalty**, simply choose one of the three options of **Preset configurations**.
:::

### Message window size

An integer specifying the number of previous dialogue rounds to input into the LLM. For example, if it is set to 12, the tokens from the last 12 dialogue rounds will be fed to the LLM. This feature consumes additional tokens.

Defaults to 1.

:::tip IMPORTANT
This feature is used for multi-turn dialogue *only*. If your **Categorize** component is not part of a multi-turn dialogue (i.e., it is not in a loop), leave this field as-is.
:::

### Category name

A **Categorize** component must have at least two categories. This field sets the name of the category. Click **+ Add Item** to include the intended categories. 

:::tip NOTE
You will notice that the category name is auto-populated. No worries. Each category is assigned a random name upon creation. Feel free to change it to a name that is understandable to the LLM.
:::

#### Description

Description of this category.  

You can input criteria, situation, or information that may help the LLM determine which inputs belong in this category.

#### Examples

Additional examples that may help the LLM determine which inputs belong in this category.

:::danger IMPORTANT
Examples are more helpful than the description if you want the LLM to classify particular cases into this category.
:::

Once a new category is added, navigate to the **Categorize** component on the canvas, find the **+** button next to the case, and click it to specify the downstream component(s).


#### Output

The global variable name for the output of the component, which can be referenced by other components in the workflow. Defaults to `category_name`.