{/**
 * @typedef Props
 * @property {string} apiBaseUrl
 */}

import { CodeGroup } from '@/app/components/develop/code.tsx'
import { Row, Col, Properties, Property, Heading, SubProperty, PropertyInstruction, Paragraph } from '@/app/components/develop/md.tsx'

# Knowledge API

### Authentication

The Service API authenticates with an `API-Key`.

Store the `API-Key` on the backend rather than sharing or storing it on the client side. A leaked `API-Key` can be misused and may lead to property loss.

All API requests should include your `API-Key` in the **`Authorization`** HTTP header, as shown below:

```javascript
  Authorization: Bearer {API_KEY}
```
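One way to keep the key on the backend is to read it from server-side configuration and attach it when the request is built. The sketch below is only an illustration: the environment variable name `KNOWLEDGE_API_KEY`, the base URL `https://api.example.com/v1` and the helper `build_request` are assumptions, not part of this API.

```python
import os
import urllib.request


def build_request(base_url, path, api_key, method="GET"):
    """Build (but do not send) a request that carries the Bearer API key."""
    if not api_key:
        raise ValueError("API key is empty; load it from server-side configuration")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        method=method,
        headers={"Authorization": f"Bearer {api_key}"},
    )


# KNOWLEDGE_API_KEY is a hypothetical variable name; the fallback is a placeholder.
api_key = os.environ.get("KNOWLEDGE_API_KEY", "placeholder-key")
request = build_request("https://api.example.com/v1", "/datasets", api_key)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops before that so it runs without network access.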

Creates a new document from text in an existing knowledge base.

### Path
- `dataset_id` Knowledge ID

### Request Body
- `name` Document name
- `text` Document content
- `indexing_technique` Index mode
  - `high_quality` High quality: embeds content with an embedding model and builds a vector database index
  - `economy` Economy: builds an inverted index from a keyword table
- `doc_form` Format of indexed content
  - `text_model` Text documents are embedded directly; `economy` mode defaults to this form
  - `hierarchical_model` Parent-child mode
  - `qa_model` Q&A mode: generates Q&A pairs for segmented documents and then embeds the questions
- `doc_language` In Q&A mode, the language of the document, for example: English, Chinese
- `process_rule` Processing rules
  - `mode` (string) Cleaning and segmentation mode: automatic / custom / hierarchical
  - `rules` (object) Custom rules (in automatic mode, this field is empty)
    - `pre_processing_rules` (array[object]) Preprocessing rules
      - `id` (string) Unique identifier for the preprocessing rule. Allowed values:
        - `remove_extra_spaces` Replace consecutive spaces, newlines, tabs
        - `remove_urls_emails` Delete URLs and email addresses
      - `enabled` (bool) Whether to select this rule or not. If no document ID is passed in, it represents the default value.
    - `segmentation` (object) Segmentation rules
      - `separator` Custom segment identifier; currently only one delimiter can be set. Default is `\n`
      - `max_tokens` Maximum length (tokens), defaults to 1000
    - `parent_mode` Retrieval mode of parent chunks: `full-doc` full-text retrieval / `paragraph` paragraph retrieval
    - `subchunk_segmentation` (object) Child chunk rules
      - `separator` Segmentation identifier; currently only one delimiter is allowed. The default is `***`
      - `max_tokens` Maximum length (tokens); must be validated to be shorter than the length of the parent chunk
      - `chunk_overlap` Overlap between adjacent chunks (optional)

If the knowledge base has no parameters set yet, provide the following parameters with the first upload; any that are omitted fall back to their defaults.

- `retrieval_model` Retrieval model
  - `search_method` (string) Search method
    - `hybrid_search` Hybrid search
    - `semantic_search` Semantic search
    - `full_text_search` Full-text search
  - `reranking_enable` (bool) Whether to enable reranking
  - `reranking_model` (object) Rerank model configuration
    - `reranking_provider_name` (string) Rerank model provider
    - `reranking_model_name` (string) Rerank model name
  - `top_k` (int) Number of results to return
  - `score_threshold_enabled` (bool) Whether to enable score threshold
  - `score_threshold` (float) Score threshold
- `embedding_model` Embedding model name
- `embedding_model_provider` Embedding model provider

```bash {{ title: 'cURL' }}
curl --location --request POST '${props.apiBaseUrl}/datasets/{dataset_id}/document/create-by-text' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
  "name": "text",
  "text": "text",
  "indexing_technique": "high_quality",
  "process_rule": {
    "mode": "automatic"
  }
}'
```

```json {{ title: 'Response' }}
{
  "document": {
    "id": "",
    "position": 1,
    "data_source_type": "upload_file",
    "data_source_info": {
      "upload_file_id": ""
    },
    "dataset_process_rule_id": "",
    "name": "text.txt",
    "created_from": "api",
    "created_by": "",
    "created_at": 1695690280,
    "tokens": 0,
    "indexing_status": "waiting",
    "error": null,
    "enabled": true,
    "disabled_at": null,
    "disabled_by": null,
    "archived": false,
    "display_status": "queuing",
    "word_count": 0,
    "hit_count": 0,
    "doc_form": "text_model"
  },
  "batch": ""
}
```
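The request body for creating a document from text can be assembled and checked before it is sent. The following is a minimal sketch, not an official client: the helper name `create_by_text_body` is hypothetical, the field names `doc_form` and `doc_language` are assumed to match those listed for the file-based endpoint, and the default `{"mode": "automatic"}` simply mirrors the cURL example.

```python
import json

INDEXING_TECHNIQUES = {"high_quality", "economy"}
DOC_FORMS = {"text_model", "hierarchical_model", "qa_model"}


def create_by_text_body(name, text, indexing_technique="high_quality",
                        doc_form="text_model", doc_language=None,
                        process_rule=None):
    """Return the JSON body for create-by-text as a Python dict."""
    if indexing_technique not in INDEXING_TECHNIQUES:
        raise ValueError(f"unknown indexing_technique: {indexing_technique!r}")
    if doc_form not in DOC_FORMS:
        raise ValueError(f"unknown doc_form: {doc_form!r}")
    body = {
        "name": name,
        "text": text,
        "indexing_technique": indexing_technique,
        "doc_form": doc_form,
        # Same default as the cURL example: automatic cleaning and segmentation.
        "process_rule": process_rule or {"mode": "automatic"},
    }
    if doc_language is not None:
        # Only meaningful in Q&A mode, per the parameter description.
        body["doc_language"] = doc_language
    return body


body = create_by_text_body("faq", "How do I reset my password?",
                           doc_form="qa_model", doc_language="English")
print(json.dumps(body, indent=2))
```

Validating the enumerated values locally gives a clearer error than a rejected request, at the cost of having to update the two sets if the API adds new modes.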

Creates a new document from a file in an existing knowledge base.

### Path
- `dataset_id` Knowledge ID

### Request Body
- `data` JSON string sent as a multipart form field, with the following keys:
  - `original_document_id` Source document ID (optional)
    - Used to re-upload the document or to modify its cleaning and segmentation configuration. Missing information is copied from the source document.
    - The source document cannot be an archived document.
    - When `original_document_id` is passed in, the document is updated. `process_rule` is optional; if omitted, the segmentation method of the source document is used by default.
    - When `original_document_id` is not passed in, a new document is created, and `process_rule` is required.
  - `indexing_technique` Index mode
    - `high_quality` High quality: embeds content with an embedding model and builds a vector database index
    - `economy` Economy: builds an inverted index from a keyword table
  - `doc_form` Format of indexed content
    - `text_model` Text documents are embedded directly; `economy` mode defaults to this form
    - `hierarchical_model` Parent-child mode
    - `qa_model` Q&A mode: generates Q&A pairs for segmented documents and then embeds the questions
  - `doc_language` In Q&A mode, the language of the document, for example: English, Chinese
  - `process_rule` Processing rules
    - `mode` (string) Cleaning and segmentation mode: automatic / custom / hierarchical
    - `rules` (object) Custom rules (in automatic mode, this field is empty)
      - `pre_processing_rules` (array[object]) Preprocessing rules
        - `id` (string) Unique identifier for the preprocessing rule. Allowed values:
          - `remove_extra_spaces` Replace consecutive spaces, newlines, tabs
          - `remove_urls_emails` Delete URLs and email addresses
        - `enabled` (bool) Whether to select this rule or not. If no document ID is passed in, it represents the default value.
      - `segmentation` (object) Segmentation rules
        - `separator` Custom segment identifier; currently only one delimiter can be set. Default is `\n`
        - `max_tokens` Maximum length (tokens), defaults to 1000
      - `parent_mode` Retrieval mode of parent chunks: `full-doc` full-text retrieval / `paragraph` paragraph retrieval
      - `subchunk_segmentation` (object) Child chunk rules
        - `separator` Segmentation identifier; currently only one delimiter is allowed. The default is `***`
        - `max_tokens` Maximum length (tokens); must be validated to be shorter than the length of the parent chunk
        - `chunk_overlap` Overlap between adjacent chunks (optional)
- `file` File to upload.

If the knowledge base has no parameters set yet, provide the following parameters with the first upload; any that are omitted fall back to their defaults.

- `retrieval_model` Retrieval model
  - `search_method` (string) Search method
    - `hybrid_search` Hybrid search
    - `semantic_search` Semantic search
    - `full_text_search` Full-text search
  - `reranking_enable` (bool) Whether to enable reranking
  - `reranking_model` (object) Rerank model configuration
    - `reranking_provider_name` (string) Rerank model provider
    - `reranking_model_name` (string) Rerank model name
  - `top_k` (int) Number of results to return
  - `score_threshold_enabled` (bool) Whether to enable score threshold
  - `score_threshold` (float) Score threshold
- `embedding_model` Embedding model name
- `embedding_model_provider` Embedding model provider

```bash {{ title: 'cURL' }}
curl --location --request POST '${props.apiBaseUrl}/datasets/{dataset_id}/document/create-by-file' \
--header 'Authorization: Bearer {api_key}' \
--form 'data="{\"name\":\"Dify\",\"indexing_technique\":\"high_quality\",\"process_rule\":{\"rules\":{\"pre_processing_rules\":[{\"id\":\"remove_extra_spaces\",\"enabled\":true},{\"id\":\"remove_urls_emails\",\"enabled\":true}],\"segmentation\":{\"separator\":\"###\",\"max_tokens\":500}},\"mode\":\"custom\"}}";type=text/plain' \
--form 'file=@"/path/to/file"'
```

```json {{ title: 'Response' }}
{
  "document": {
    "id": "",
    "position": 1,
    "data_source_type": "upload_file",
    "data_source_info": {
      "upload_file_id": ""
    },
    "dataset_process_rule_id": "",
    "name": "Dify.txt",
    "created_from": "api",
    "created_by": "",
    "created_at": 1695308667,
    "tokens": 0,
    "indexing_status": "waiting",
    "error": null,
    "enabled": true,
    "disabled_at": null,
    "disabled_by": null,
    "archived": false,
    "display_status": "queuing",
    "word_count": 0,
    "hit_count": 0,
    "doc_form": "text_model"
  },
  "batch": ""
}
```

### Request Body
- `name` Knowledge name
- `description` Knowledge description (optional)
- `indexing_technique` Index technique (optional). If this is not set, `embedding_model`, `embedding_model_provider` and `retrieval_model` will be set to null
  - `high_quality` High quality
  - `economy` Economy
- `permission` Permission
  - `only_me` Only me
  - `all_team_members` All team members
  - `partial_members` Partial members
- `provider` Provider (optional, default: vendor)
  - `vendor` Vendor
  - `external` External knowledge
- `external_knowledge_api_id` External knowledge API ID (optional)
- `external_knowledge_id` External knowledge ID (optional)
- `embedding_model` Embedding model name (optional)
- `embedding_model_provider` Embedding model provider name (optional)
- `retrieval_model` Retrieval model (optional)
  - `search_method` (string) Search method
    - `hybrid_search` Hybrid search
    - `semantic_search` Semantic search
    - `full_text_search` Full-text search
  - `reranking_enable` (bool) Whether to enable reranking
  - `reranking_model` (object) Rerank model configuration
    - `reranking_provider_name` (string) Rerank model provider
    - `reranking_model_name` (string) Rerank model name
  - `top_k` (int) Number of results to return
  - `score_threshold_enabled` (bool) Whether to enable score threshold
  - `score_threshold` (float) Score threshold

```bash {{ title: 'cURL' }}
curl --location --request POST '${props.apiBaseUrl}/datasets' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
  "name": "name",
  "permission": "only_me"
}'
```

```json {{ title: 'Response' }}
{
  "id": "",
  "name": "name",
  "description": null,
  "provider": "vendor",
  "permission": "only_me",
  "data_source_type": null,
  "indexing_technique": null,
  "app_count": 0,
  "document_count": 0,
  "word_count": 0,
  "created_by": "",
  "created_at": 1695636173,
  "updated_by": "",
  "updated_at": 1695636173,
  "embedding_model": null,
  "embedding_model_provider": null,
  "embedding_available": null
}
```

### Query
- Search keyword, optional
- Tag ID list, optional
- `page` Page number, optional, default 1
- `limit` Number of items returned, optional, default 20, range 1-100
- Whether to include all datasets (only effective for owners), optional, defaults to false

```bash {{ title: 'cURL' }}
curl --location --request GET '${props.apiBaseUrl}/datasets?page=1&limit=20' \
--header 'Authorization: Bearer {api_key}'
```

```json {{ title: 'Response' }}
{
  "data": [
    {
      "id": "",
      "name": "name",
      "description": "desc",
      "permission": "only_me",
      "data_source_type": "upload_file",
      "indexing_technique": "",
      "app_count": 2,
      "document_count": 10,
      "word_count": 1200,
      "created_by": "",
      "created_at": "",
      "updated_by": "",
      "updated_at": ""
    },
    ...
  ],
  "has_more": true,
  "limit": 20,
  "total": 50,
  "page": 1
}
```
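The `page`, `limit` and `has_more` fields support a simple paging loop. The sketch below shows one way to walk all pages; `iter_datasets` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real HTTP call so the example runs offline.

```python
from typing import Callable, Iterator


def iter_datasets(fetch_page: Callable[[int, int], dict], limit: int = 20) -> Iterator[dict]:
    """Yield every item across pages until the response reports has_more == false."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be in the documented range 1-100")
    page = 1
    while True:
        body = fetch_page(page, limit)
        yield from body["data"]
        if not body.get("has_more"):
            break
        page += 1


def fake_fetch(page: int, limit: int) -> dict:
    """Stand-in for GET /datasets?page=..&limit=.. that serves 45 fake items."""
    items = [{"id": str(i), "name": f"kb-{i}"} for i in range(45)]
    start = (page - 1) * limit
    return {
        "data": items[start:start + limit],
        "has_more": start + limit < len(items),
        "limit": limit,
        "total": len(items),
        "page": page,
    }


names = [item["name"] for item in iter_datasets(fake_fetch, limit=20)]
print(len(names))  # 45
```

Injecting the fetch function keeps the paging logic independent of the HTTP library and makes it easy to exercise without a server.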

### Path
- `dataset_id` Knowledge Base ID

```bash {{ title: 'cURL' }}
curl --location --request GET '${props.apiBaseUrl}/datasets/{dataset_id}' \
--header 'Authorization: Bearer {api_key}'
```

```json {{ title: 'Response' }}
{
  "id": "eaedb485-95ac-4ffd-ab1e-18da6d676a2f",
  "name": "Test Knowledge Base",
  "description": "",
  "provider": "vendor",
  "permission": "only_me",
  "data_source_type": null,
  "indexing_technique": null,
  "app_count": 0,
  "document_count": 0,
  "word_count": 0,
  "created_by": "e99a1635-f725-4951-a99a-1daaaa76cfc6",
  "created_at": 1735620612,
  "updated_by": "e99a1635-f725-4951-a99a-1daaaa76cfc6",
  "updated_at": 1735620612,
  "embedding_model": null,
  "embedding_model_provider": null,
  "embedding_available": true,
  "retrieval_model_dict": {
    "search_method": "semantic_search",
    "reranking_enable": false,
    "reranking_mode": null,
    "reranking_model": {
      "reranking_provider_name": "",
      "reranking_model_name": ""
    },
    "weights": null,
    "top_k": 2,
    "score_threshold_enabled": false,
    "score_threshold": null
  },
  "tags": [],
  "doc_form": null,
  "external_knowledge_info": {
    "external_knowledge_id": null,
    "external_knowledge_api_id": null,
    "external_knowledge_api_name": null,
    "external_knowledge_api_endpoint": null
  },
  "external_retrieval_model": {
    "top_k": 2,
    "score_threshold": 0.0,
    "score_threshold_enabled": null
  }
}
```

### Path
- `dataset_id` Knowledge Base ID

### Request Body
- `indexing_technique` Index technique (optional)
  - `high_quality` High quality
  - `economy` Economy
- `permission` Permission
  - `only_me` Only me
  - `all_team_members` All team members
  - `partial_members` Partial members
- `embedding_model_provider` Specified embedding model provider; must be set up in the system first. Corresponds to the provider field (optional)
- `embedding_model` Specified embedding model; corresponds to the model field (optional)
- `retrieval_model` Retrieval model (optional; if omitted, the default retrieval method is used)
  - `search_method` (text) Search method: one of the following four keywords is required
    - `keyword_search` Keyword search
    - `semantic_search` Semantic search
    - `full_text_search` Full-text search
    - `hybrid_search` Hybrid search
  - `reranking_enable` (bool) Whether to enable reranking, required if the search mode is semantic_search or hybrid_search (optional)
  - `reranking_model` (object) Rerank model configuration, required if reranking is enabled
    - `reranking_provider_name` (string) Rerank model provider
    - `reranking_model_name` (string) Rerank model name
  - `weights` (float) Semantic search weight setting in hybrid search mode
  - `top_k` (integer) Number of results to return (optional)
  - `score_threshold_enabled` (bool) Whether to enable score threshold
  - `score_threshold` (float) Score threshold
- `partial_member_list` Partial member list (optional)

```bash {{ title: 'cURL' }}
curl --location --request PATCH '${props.apiBaseUrl}/datasets/{dataset_id}' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
  "name": "Test Knowledge Base",
  "indexing_technique": "high_quality",
  "permission": "only_me",
  "embedding_model_provider": "zhipuai",
  "embedding_model": "embedding-3",
  "retrieval_model": {
    "search_method": "keyword_search",
    "reranking_enable": false,
    "reranking_mode": null,
    "reranking_model": {
      "reranking_provider_name": "",
      "reranking_model_name": ""
    },
    "weights": null,
    "top_k": 1,
    "score_threshold_enabled": false,
    "score_threshold": null
  },
  "partial_member_list": []
}'
```

```json {{ title: 'Response' }}
{
  "id": "eaedb485-95ac-4ffd-ab1e-18da6d676a2f",
  "name": "Test Knowledge Base",
  "description": "",
  "provider": "vendor",
  "permission": "only_me",
  "data_source_type": null,
  "indexing_technique": "high_quality",
  "app_count": 0,
  "document_count": 0,
  "word_count": 0,
  "created_by": "e99a1635-f725-4951-a99a-1daaaa76cfc6",
  "created_at": 1735620612,
  "updated_by": "e99a1635-f725-4951-a99a-1daaaa76cfc6",
  "updated_at": 1735622679,
  "embedding_model": "embedding-3",
  "embedding_model_provider": "zhipuai",
  "embedding_available": null,
  "retrieval_model_dict": {
    "search_method": "semantic_search",
    "reranking_enable": false,
    "reranking_mode": null,
    "reranking_model": {
      "reranking_provider_name": "",
      "reranking_model_name": ""
    },
    "weights": null,
    "top_k": 2,
    "score_threshold_enabled": false,
    "score_threshold": null
  },
  "tags": [],
  "doc_form": null,
  "external_knowledge_info": {
    "external_knowledge_id": null,
    "external_knowledge_api_id": null,
    "external_knowledge_api_name": null,
    "external_knowledge_api_endpoint": null
  },
  "external_retrieval_model": {
    "top_k": 2,
    "score_threshold": 0.0,
    "score_threshold_enabled": null
  },
  "partial_member_list": []
}
```
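The retrieval settings carry a dependency that is easy to miss: a rerank model configuration is required once reranking is enabled. A minimal sketch of a builder that enforces this locally is shown below. The function name `build_retrieval_model` is hypothetical, and deriving `reranking_enable` and `score_threshold_enabled` from whether a value was supplied is a design choice of this sketch, not documented API behaviour.

```python
SEARCH_METHODS = {"keyword_search", "semantic_search", "full_text_search", "hybrid_search"}
RERANK_KEYS = {"reranking_provider_name", "reranking_model_name"}


def build_retrieval_model(search_method, top_k=2, reranking_model=None,
                          score_threshold=None, weights=None):
    """Return a retrieval_model dict, checking the documented constraints."""
    if search_method not in SEARCH_METHODS:
        raise ValueError(f"unknown search_method: {search_method!r}")
    model = {
        "search_method": search_method,
        # Reranking is switched on only when a rerank model is supplied.
        "reranking_enable": reranking_model is not None,
        "weights": weights,
        "top_k": top_k,
        "score_threshold_enabled": score_threshold is not None,
        "score_threshold": score_threshold,
    }
    if reranking_model is not None:
        missing = RERANK_KEYS - set(reranking_model)
        if missing:
            raise ValueError(f"reranking_model is missing keys: {sorted(missing)}")
        model["reranking_model"] = dict(reranking_model)
    return model


# The provider and model names here are placeholders, not real identifiers.
print(build_retrieval_model(
    "hybrid_search",
    reranking_model={"reranking_provider_name": "example-provider",
                     "reranking_model_name": "example-rerank"},
    score_threshold=0.5,
))
```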

### Path
- `dataset_id` Knowledge ID

```bash {{ title: 'cURL' }}
curl --location --request DELETE '${props.apiBaseUrl}/datasets/{dataset_id}' \
--header 'Authorization: Bearer {api_key}'
```

```text {{ title: 'Response' }}
204 No Content
```

Updates a document from text in an existing knowledge base.

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID

### Request Body
- `name` Document name (optional)
- `text` Document content (optional)
- `process_rule` Processing rules
  - `mode` (string) Cleaning and segmentation mode: automatic / custom / hierarchical
  - `rules` (object) Custom rules (in automatic mode, this field is empty)
    - `pre_processing_rules` (array[object]) Preprocessing rules
      - `id` (string) Unique identifier for the preprocessing rule. Allowed values:
        - `remove_extra_spaces` Replace consecutive spaces, newlines, tabs
        - `remove_urls_emails` Delete URLs and email addresses
      - `enabled` (bool) Whether to select this rule or not. If no document ID is passed in, it represents the default value.
    - `segmentation` (object) Segmentation rules
      - `separator` Custom segment identifier; currently only one delimiter can be set. Default is `\n`
      - `max_tokens` Maximum length (tokens), defaults to 1000
    - `parent_mode` Retrieval mode of parent chunks: `full-doc` full-text retrieval / `paragraph` paragraph retrieval
    - `subchunk_segmentation` (object) Child chunk rules
      - `separator` Segmentation identifier; currently only one delimiter is allowed. The default is `***`
      - `max_tokens` Maximum length (tokens); must be validated to be shorter than the length of the parent chunk
      - `chunk_overlap` Overlap between adjacent chunks (optional)

```bash {{ title: 'cURL' }}
curl --location --request POST '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/update-by-text' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
  "name": "name",
  "text": "text"
}'
```

```json {{ title: 'Response' }}
{
  "document": {
    "id": "",
    "position": 1,
    "data_source_type": "upload_file",
    "data_source_info": {
      "upload_file_id": ""
    },
    "dataset_process_rule_id": "",
    "name": "name.txt",
    "created_from": "api",
    "created_by": "",
    "created_at": 1695308667,
    "tokens": 0,
    "indexing_status": "waiting",
    "error": null,
    "enabled": true,
    "disabled_at": null,
    "disabled_by": null,
    "archived": false,
    "display_status": "queuing",
    "word_count": 0,
    "hit_count": 0,
    "doc_form": "text_model"
  },
  "batch": ""
}
```

Updates a document from a file in an existing knowledge base.

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID

### Request Body
- `name` Document name (optional)
- `file` File to upload
- `process_rule` Processing rules
  - `mode` (string) Cleaning and segmentation mode: automatic / custom / hierarchical
  - `rules` (object) Custom rules (in automatic mode, this field is empty)
    - `pre_processing_rules` (array[object]) Preprocessing rules
      - `id` (string) Unique identifier for the preprocessing rule. Allowed values:
        - `remove_extra_spaces` Replace consecutive spaces, newlines, tabs
        - `remove_urls_emails` Delete URLs and email addresses
      - `enabled` (bool) Whether to select this rule or not. If no document ID is passed in, it represents the default value.
    - `segmentation` (object) Segmentation rules
      - `separator` Custom segment identifier; currently only one delimiter can be set. Default is `\n`
      - `max_tokens` Maximum length (tokens), defaults to 1000
    - `parent_mode` Retrieval mode of parent chunks: `full-doc` full-text retrieval / `paragraph` paragraph retrieval
    - `subchunk_segmentation` (object) Child chunk rules
      - `separator` Segmentation identifier; currently only one delimiter is allowed. The default is `***`
      - `max_tokens` Maximum length (tokens); must be validated to be shorter than the length of the parent chunk
      - `chunk_overlap` Overlap between adjacent chunks (optional)

```bash {{ title: 'cURL' }}
curl --location --request POST '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/update-by-file' \
--header 'Authorization: Bearer {api_key}' \
--form 'data="{\"name\":\"Dify\",\"indexing_technique\":\"high_quality\",\"process_rule\":{\"rules\":{\"pre_processing_rules\":[{\"id\":\"remove_extra_spaces\",\"enabled\":true},{\"id\":\"remove_urls_emails\",\"enabled\":true}],\"segmentation\":{\"separator\":\"###\",\"max_tokens\":500}},\"mode\":\"custom\"}}";type=text/plain' \
--form 'file=@"/path/to/file"'
```

```json {{ title: 'Response' }}
{
  "document": {
    "id": "",
    "position": 1,
    "data_source_type": "upload_file",
    "data_source_info": {
      "upload_file_id": ""
    },
    "dataset_process_rule_id": "",
    "name": "Dify.txt",
    "created_from": "api",
    "created_by": "",
    "created_at": 1695308667,
    "tokens": 0,
    "indexing_status": "waiting",
    "error": null,
    "enabled": true,
    "disabled_at": null,
    "disabled_by": null,
    "archived": false,
    "display_status": "queuing",
    "word_count": 0,
    "hit_count": 0,
    "doc_form": "text_model"
  },
  "batch": "20230921150427533684"
}
```
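The file-based endpoints take their options as a JSON string in the `data` form field, which is awkward to escape by hand as in the cURL example. The sketch below builds that string programmatically. The helper names are hypothetical; the `separator` and `max_tokens` defaults follow the parameter descriptions, while the defaults chosen for the two preprocessing rules are assumptions of this sketch.

```python
import json


def custom_process_rule(separator="\n", max_tokens=1000,
                        remove_extra_spaces=True, remove_urls_emails=False):
    """Build a process_rule object in custom mode."""
    return {
        "mode": "custom",
        "rules": {
            "pre_processing_rules": [
                {"id": "remove_extra_spaces", "enabled": remove_extra_spaces},
                {"id": "remove_urls_emails", "enabled": remove_urls_emails},
            ],
            "segmentation": {"separator": separator, "max_tokens": max_tokens},
        },
    }


def data_form_field(name, process_rule, indexing_technique="high_quality"):
    """Serialize the options into the JSON string sent as the `data` form field."""
    payload = {
        "name": name,
        "indexing_technique": indexing_technique,
        "process_rule": process_rule,
    }
    return json.dumps(payload, separators=(",", ":"))


# Same settings as the cURL example: "###" separator, 500-token segments.
field = data_form_field(
    "Dify",
    custom_process_rule(separator="###", max_tokens=500, remove_urls_emails=True),
)
print(field)
```

The resulting string can then be passed as the `data` part of a multipart request alongside the `file` part by whichever HTTP client is in use.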

### Path
- `dataset_id` Knowledge ID
- `batch` Batch number of uploaded documents

```bash {{ title: 'cURL' }}
curl --location --request GET '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{batch}/indexing-status' \
--header 'Authorization: Bearer {api_key}'
```

```json {{ title: 'Response' }}
{
  "data": [{
    "id": "",
    "indexing_status": "indexing",
    "processing_started_at": 1681623462.0,
    "parsing_completed_at": 1681623462.0,
    "cleaning_completed_at": 1681623462.0,
    "splitting_completed_at": 1681623462.0,
    "completed_at": null,
    "paused_at": null,
    "error": null,
    "stopped_at": null,
    "completed_segments": 24,
    "total_segments": 100
  }]
}
```
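The `completed_segments` and `total_segments` fields are enough to report progress while a batch is being embedded. The following sketch summarizes a status response; `summarize_indexing` is a hypothetical helper, and treating a non-null `completed_at` as "finished" and a non-null `error` as "failed" is an assumption based on the response shape, not a documented rule.

```python
def summarize_indexing(status_body):
    """Return one progress summary per document in an indexing-status response."""
    summaries = []
    for doc in status_body.get("data", []):
        total = doc.get("total_segments") or 0
        done = doc.get("completed_segments") or 0
        summaries.append({
            "id": doc.get("id"),
            "indexing_status": doc.get("indexing_status"),
            # Guard against division by zero before segmentation has run.
            "percent": round(100.0 * done / total, 1) if total else 0.0,
            "finished": doc.get("completed_at") is not None,
            "failed": doc.get("error") is not None,
        })
    return summaries


# Values taken from the example response above.
sample = {
    "data": [{
        "id": "",
        "indexing_status": "indexing",
        "completed_at": None,
        "error": None,
        "completed_segments": 24,
        "total_segments": 100,
    }]
}
print(summarize_indexing(sample))
```

A caller could poll the endpoint at an interval and stop once every summary is either finished or failed.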

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID

```bash {{ title: 'cURL' }}
curl --location --request DELETE '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}' \
--header 'Authorization: Bearer {api_key}'
```

```text {{ title: 'Response' }}
204 No Content
```

### Path
- `dataset_id` Knowledge ID

### Query
- Search keywords; currently only document names are searched (optional)
- Page number (optional)
- Number of items returned, default 20, range 1-100 (optional)

```bash {{ title: 'cURL' }}
curl --location --request GET '${props.apiBaseUrl}/datasets/{dataset_id}/documents' \
--header 'Authorization: Bearer {api_key}'
```

```json {{ title: 'Response' }}
{
  "data": [
    {
      "id": "",
      "position": 1,
      "data_source_type": "file_upload",
      "data_source_info": null,
      "dataset_process_rule_id": null,
      "name": "dify",
      "created_from": "",
      "created_by": "",
      "created_at": 1681623639,
      "tokens": 0,
      "indexing_status": "waiting",
      "error": null,
      "enabled": true,
      "disabled_at": null,
      "disabled_by": null,
      "archived": false
    }
  ],
  "has_more": false,
  "limit": 20,
  "total": 9,
  "page": 1
}
```

Get a document's detail.

### Path
- `dataset_id` (string) Dataset ID
- `document_id` (string) Document ID

### Query
- `metadata` (string) Metadata filter, can be `all`, `only`, or `without`. Default is `all`.

### Response
Returns the document's detail.

### Request Example

```bash {{ title: 'cURL' }}
curl -X GET '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}' \
-H 'Authorization: Bearer {api_key}'
```

### Response Example

```json {{ title: 'Response' }}
{
  "id": "f46ae30c-5c11-471b-96d0-464f5f32a7b2",
  "position": 1,
  "data_source_type": "upload_file",
  "data_source_info": {
    "upload_file": {
      ...
    }
  },
  "dataset_process_rule_id": "24b99906-845e-499f-9e3c-d5565dd6962c",
  "dataset_process_rule": {
    "mode": "hierarchical",
    "rules": {
      "pre_processing_rules": [
        { "id": "remove_extra_spaces", "enabled": true },
        { "id": "remove_urls_emails", "enabled": false }
      ],
      "segmentation": {
        "separator": "**********page_ending**********",
        "max_tokens": 1024,
        "chunk_overlap": 0
      },
      "parent_mode": "paragraph",
      "subchunk_segmentation": {
        "separator": "\n",
        "max_tokens": 512,
        "chunk_overlap": 0
      }
    }
  },
  "document_process_rule": {
    "id": "24b99906-845e-499f-9e3c-d5565dd6962c",
    "dataset_id": "48a0db76-d1a9-46c1-ae35-2baaa919a8a9",
    "mode": "hierarchical",
    "rules": {
      "pre_processing_rules": [
        { "id": "remove_extra_spaces", "enabled": true },
        { "id": "remove_urls_emails", "enabled": false }
      ],
      "segmentation": {
        "separator": "**********page_ending**********",
        "max_tokens": 1024,
        "chunk_overlap": 0
      },
      "parent_mode": "paragraph",
      "subchunk_segmentation": {
        "separator": "\n",
        "max_tokens": 512,
        "chunk_overlap": 0
      }
    }
  },
  "name": "xxxx",
  "created_from": "web",
  "created_by": "17f71940-a7b5-4c77-b60f-2bd645c1ffa0",
  "created_at": 1750464191,
  "tokens": null,
  "indexing_status": "waiting",
  "completed_at": null,
  "updated_at": 1750464191,
  "indexing_latency": null,
  "error": null,
  "enabled": true,
  "disabled_at": null,
  "disabled_by": null,
  "archived": false,
  "segment_count": 0,
  "average_segment_length": 0,
  "hit_count": null,
  "display_status": "queuing",
  "doc_form": "hierarchical_model",
  "doc_language": "Chinese Simplified"
}
```

___

### Path
- `dataset_id` Knowledge ID
- `action` Action to apply to the listed documents
  - `enable` - Enable document
  - `disable` - Disable document
  - `archive` - Archive document
  - `un_archive` - Unarchive document

### Request Body
- `document_ids` List of document IDs

```bash {{ title: 'cURL' }}
curl --location --request PATCH '${props.apiBaseUrl}/datasets/{dataset_id}/documents/status/{action}' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
  "document_ids": ["doc-id-1", "doc-id-2"]
}'
```

```json {{ title: 'Response' }}
{
  "result": "success"
}
```
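Because the action travels in the URL path rather than the body, a typo produces a wrong route instead of a validation message. A small sketch that checks the action before building the request is shown below; `status_update` and the base URL are hypothetical.

```python
import json
from urllib.parse import quote

ACTIONS = ("enable", "disable", "archive", "un_archive")


def status_update(base_url, dataset_id, action, document_ids):
    """Return (url, json_body) for a batch document status update."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {ACTIONS}, got {action!r}")
    ids = list(document_ids)
    if not ids:
        raise ValueError("document_ids must not be empty")
    url = (f"{base_url.rstrip('/')}/datasets/{quote(dataset_id, safe='')}"
           f"/documents/status/{action}")
    return url, json.dumps({"document_ids": ids})


url, body = status_update("https://api.example.com/v1", "ds-1", "archive",
                          ["doc-id-1", "doc-id-2"])
print(url)
print(body)
```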

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID

### Request Body
- `segments` List of segments to add
  - `content` (text) Text content / question content, required
  - `answer` (text) Answer content; pass this value if the knowledge is in Q&A mode (optional)
  - `keywords` (list) Keywords (optional)

```bash {{ title: 'cURL' }}
curl --location --request POST '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/segments' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
  "segments": [
    {
      "content": "1",
      "answer": "1",
      "keywords": ["a"]
    }
  ]
}'
```

```json {{ title: 'Response' }}
{
  "data": [{
    "id": "",
    "position": 1,
    "document_id": "",
    "content": "1",
    "answer": "1",
    "word_count": 25,
    "tokens": 0,
    "keywords": [
      "a"
    ],
    "index_node_id": "",
    "index_node_hash": "",
    "hit_count": 0,
    "enabled": true,
    "disabled_at": null,
    "disabled_by": null,
    "status": "completed",
    "created_by": "",
    "created_at": 1695312007,
    "indexing_at": 1695312007,
    "completed_at": 1695312007,
    "error": null,
    "stopped_at": null
  }],
  "doc_form": "text_model"
}
```
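When segments come from application data, it helps to normalize them into the `segments` array before posting. The sketch below is one possible shape for that step. `build_segments` is a hypothetical helper, and rejecting Q&A segments without an `answer` is a stricter rule than the parameter list states (it marks `answer` as optional), chosen here only to catch incomplete pairs early.

```python
def build_segments(items, qa_mode=False):
    """Return the request body for adding segments to a document."""
    segments = []
    for item in items:
        content = item.get("content")
        if not content:
            raise ValueError("each segment needs non-empty content")
        segment = {"content": content}
        if qa_mode:
            # Assumption of this sketch: Q&A segments must carry an answer.
            if not item.get("answer"):
                raise ValueError("Q&A mode segments need an answer")
            segment["answer"] = item["answer"]
        if item.get("keywords"):
            segment["keywords"] = list(item["keywords"])
        segments.append(segment)
    return {"segments": segments}


body = build_segments(
    [{"content": "What is a child chunk?",
      "answer": "A smaller unit inside a parent segment.",
      "keywords": ["chunk"]}],
    qa_mode=True,
)
print(body)
```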

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID

### Query
- Keyword (optional)
- Search status, completed
- Page number (optional)
- Number of items returned, default 20, range 1-100 (optional)

```bash {{ title: 'cURL' }}
curl --location --request GET '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/segments' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json'
```

```json {{ title: 'Response' }}
{
  "data": [{
    "id": "",
    "position": 1,
    "document_id": "",
    "content": "1",
    "answer": "1",
    "word_count": 25,
    "tokens": 0,
    "keywords": [
      "a"
    ],
    "index_node_id": "",
    "index_node_hash": "",
    "hit_count": 0,
    "enabled": true,
    "disabled_at": null,
    "disabled_by": null,
    "status": "completed",
    "created_by": "",
    "created_at": 1695312007,
    "indexing_at": 1695312007,
    "completed_at": 1695312007,
    "error": null,
    "stopped_at": null
  }],
  "doc_form": "text_model",
  "has_more": false,
  "limit": 20,
  "total": 9,
  "page": 1
}
```

Get the details of a specific document segment in the specified knowledge base.

### Path
- `dataset_id` Knowledge Base ID
- `document_id` Document ID
- `segment_id` Segment ID

```bash {{ title: 'cURL' }}
curl --location --request GET '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/segments/{segment_id}' \
--header 'Authorization: Bearer {api_key}'
```

```json {{ title: 'Response' }}
{
  "data": {
    "id": "chunk_id",
    "position": 2,
    "document_id": "document_id",
    "content": "Segment content text",
    "sign_content": "Signature content text",
    "answer": "Answer content (if in Q&A mode)",
    "word_count": 470,
    "tokens": 382,
    "keywords": ["keyword1", "keyword2"],
    "index_node_id": "index_node_id",
    "index_node_hash": "index_node_hash",
    "hit_count": 0,
    "enabled": true,
    "status": "completed",
    "created_by": "creator_id",
    "created_at": creation_timestamp,
    "updated_at": update_timestamp,
    "indexing_at": indexing_timestamp,
    "completed_at": completion_timestamp,
    "error": null,
    "child_chunks": []
  },
  "doc_form": "text_model"
}
```

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID
- `segment_id` Document Segment ID

```bash {{ title: 'cURL' }}
curl --location --request DELETE '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/segments/{segment_id}' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json'
```

```text {{ title: 'Response' }}
204 No Content
```

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID
- `segment_id` Document Segment ID

### Request Body
- `segment` Segment fields to update
  - `content` (text) Text content / question content, required
  - `answer` (text) Answer content; pass this value if the knowledge is in Q&A mode (optional)
  - `keywords` (list) Keywords (optional)
  - `enabled` (bool) false / true (optional)
  - `regenerate_child_chunks` (bool) Whether to regenerate child chunks (optional)

```bash {{ title: 'cURL' }}
curl --location --request POST '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/segments/{segment_id}' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
  "segment": {
    "content": "1",
    "answer": "1",
    "keywords": ["a"],
    "enabled": false
  }
}'
```

```json {{ title: 'Response' }}
{
  "data": {
    "id": "",
    "position": 1,
    "document_id": "",
    "content": "1",
    "answer": "1",
    "word_count": 25,
    "tokens": 0,
    "keywords": [
      "a"
    ],
    "index_node_id": "",
    "index_node_hash": "",
    "hit_count": 0,
    "enabled": true,
    "disabled_at": null,
    "disabled_by": null,
    "status": "completed",
    "created_by": "",
    "created_at": 1695312007,
    "indexing_at": 1695312007,
    "completed_at": 1695312007,
    "error": null,
    "stopped_at": null
  },
  "doc_form": "text_model"
}
```

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID
- `segment_id` Segment ID

### Request Body
- `content` Child chunk content

```bash {{ title: 'cURL' }}
curl --location --request POST '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/segments/{segment_id}/child_chunks' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
  "content": "Child chunk content"
}'
```

```json {{ title: 'Response' }}
{
  "data": {
    "id": "",
    "segment_id": "",
    "content": "Child chunk content",
    "word_count": 25,
    "tokens": 0,
    "index_node_id": "",
    "index_node_hash": "",
    "status": "completed",
    "created_by": "",
    "created_at": 1695312007,
    "indexing_at": 1695312007,
    "completed_at": 1695312007,
    "error": null,
    "stopped_at": null
  }
}
```

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID
- `segment_id` Segment ID

### Query
- Search keyword (optional)
- `page` Page number (optional, default: 1)
- `limit` Items per page (optional, default: 20, max: 100)

```bash {{ title: 'cURL' }}
curl --location --request GET '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/segments/{segment_id}/child_chunks?page=1&limit=20' \
--header 'Authorization: Bearer {api_key}'
```

```json {{ title: 'Response' }}
{
  "data": [{
    "id": "",
    "segment_id": "",
    "content": "Child chunk content",
    "word_count": 25,
    "tokens": 0,
    "index_node_id": "",
    "index_node_hash": "",
    "status": "completed",
    "created_by": "",
    "created_at": 1695312007,
    "indexing_at": 1695312007,
    "completed_at": 1695312007,
    "error": null,
    "stopped_at": null
  }],
  "total": 1,
  "total_pages": 1,
  "page": 1,
  "limit": 20
}
```

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID
- `segment_id` Segment ID
- `child_chunk_id` Child Chunk ID

```bash {{ title: 'cURL' }}
curl --location --request DELETE '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/segments/{segment_id}/child_chunks/{child_chunk_id}' \
--header 'Authorization: Bearer {api_key}'
```

```text {{ title: 'Response' }}
204 No Content
```

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID
- `segment_id` Segment ID
- `child_chunk_id` Child Chunk ID

### Request Body
- `content` Child chunk content

```bash {{ title: 'cURL' }}
curl --location --request PATCH '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/segments/{segment_id}/child_chunks/{child_chunk_id}' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
  "content": "Updated child chunk content"
}'
```

```json {{ title: 'Response' }}
{
  "data": {
    "id": "",
    "segment_id": "",
    "content": "Updated child chunk content",
    "word_count": 25,
    "tokens": 0,
    "index_node_id": "",
    "index_node_hash": "",
    "status": "completed",
    "created_by": "",
    "created_at": 1695312007,
    "indexing_at": 1695312007,
    "completed_at": 1695312007,
    "error": null,
    "stopped_at": null
  }
}
```

### Path
- `dataset_id` Knowledge ID
- `document_id` Document ID

```bash {{ title: 'cURL' }}
curl --location --request GET '${props.apiBaseUrl}/datasets/{dataset_id}/documents/{document_id}/upload-file' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json'
```

```json {{ title: 'Response' }}
{
  "id": "file_id",
  "name": "file_name",
  "size": 1024,
  "extension": "txt",
  "url": "preview_url",
  "download_url": "download_url",
  "mime_type": "text/plain",
  "created_by": "user_id",
  "created_at": 1728734540
}
```

### Path Knowledge ID ### Request Body Query keyword Retrieval parameters (optional, if not filled, it will be recalled according to the default method) - search_method (text) Search method: One of the following four keywords is required - keyword_search Keyword search - semantic_search Semantic search - full_text_search Full-text search - hybrid_search Hybrid search - reranking_enable (bool) Whether to enable reranking, required if the search mode is semantic_search or hybrid_search (optional) - reranking_mode (object) Rerank model configuration, required if reranking is enabled - reranking_provider_name (string) Rerank model provider - reranking_model_name (string) Rerank model name - weights (float) Semantic search weight setting in hybrid search mode - top_k (integer) Number of results to return (optional) - score_threshold_enabled (bool) Whether to enable score threshold - score_threshold (float) Score threshold - metadata_filtering_conditions (object) Metadata filtering conditions - logical_operator (string) Logical operator: and | or - conditions (array[object]) Conditions list - name (string) Metadata field name - comparison_operator (string) Comparison operator, allowed values: - String comparison: - contains: Contains - not contains: Does not contain - start with: Starts with - end with: Ends with - is: Equals - is not: Does not equal - empty: Is empty - not empty: Is not empty - Numeric comparison: - =: Equals - ≠: Does not equal - >: Greater than - < : Less than - ≥: Greater than or equal - ≤: Less than or equal - Time comparison: - before: Before - after: After - value (string|number|null) Comparison value Unused field ```bash {{ title: 'cURL' }} curl --location --request POST '${props.apiBaseUrl}/datasets/{dataset_id}/retrieve' \ --header 'Authorization: Bearer {api_key}' \ --header 'Content-Type: application/json' \ --data-raw '{ "query": "test", "retrieval_model": { "search_method": "keyword_search", "reranking_enable": false, "reranking_mode": 
null, "reranking_model": { "reranking_provider_name": "", "reranking_model_name": "" }, "weights": null, "top_k": 2, "score_threshold_enabled": false, "score_threshold": null } }' ``` ```json {{ title: 'Response' }} { "query": { "content": "test" }, "records": [ { "segment": { "id": "7fa6f24f-8679-48b3-bc9d-bdf28d73f218", "position": 1, "document_id": "a8c6c36f-9f5d-4d7a-8472-f5d7b75d71d2", "content": "Operation guide", "answer": null, "word_count": 847, "tokens": 280, "keywords": [ "install", "java", "base", "scripts", "jdk", "manual", "internal", "opens", "add", "vmoptions" ], "index_node_id": "39dd8443-d960-45a8-bb46-7275ad7fbc8e", "index_node_hash": "0189157697b3c6a418ccf8264a09699f25858975578f3467c76d6bfc94df1d73", "hit_count": 0, "enabled": true, "disabled_at": null, "disabled_by": null, "status": "completed", "created_by": "dbcb1ab5-90c8-41a7-8b78-73b235eb6f6f", "created_at": 1728734540, "indexing_at": 1728734552, "completed_at": 1728734584, "error": null, "stopped_at": null, "document": { "id": "a8c6c36f-9f5d-4d7a-8472-f5d7b75d71d2", "data_source_type": "upload_file", "name": "readme.txt" } }, "score": 3.730463140527718e-05, "tsne_position": null } ] } ```
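The retrieval parameters above can be assembled programmatically. The Python sketch below builds a request body with an optional metadata filter and checks `search_method` and `logical_operator` against the documented values. The function name `build_retrieve_payload` is hypothetical, and nesting `metadata_filtering_conditions` inside `retrieval_model` is an assumption based on how the parameter list reads; verify it against your deployment before relying on it.

```python
# The four documented search methods.
SEARCH_METHODS = {
    "keyword_search",
    "semantic_search",
    "full_text_search",
    "hybrid_search",
}


def build_retrieve_payload(query, search_method="keyword_search", top_k=2,
                           score_threshold=None, conditions=None,
                           logical_operator="and"):
    """Build the JSON body for the retrieve endpoint (hypothetical helper)."""
    if search_method not in SEARCH_METHODS:
        raise ValueError(f"unknown search_method: {search_method!r}")
    if logical_operator not in ("and", "or"):
        raise ValueError("logical_operator must be 'and' or 'or'")

    retrieval_model = {
        "search_method": search_method,
        "reranking_enable": False,
        "top_k": top_k,
        # The threshold is enabled only when a value is supplied.
        "score_threshold_enabled": score_threshold is not None,
        "score_threshold": score_threshold,
    }
    if conditions:
        # Assumed placement: inside retrieval_model.
        retrieval_model["metadata_filtering_conditions"] = {
            "logical_operator": logical_operator,
            "conditions": list(conditions),
        }
    return {"query": query, "retrieval_model": retrieval_model}
```

A filter on a hypothetical `author` metadata field might then look like `build_retrieve_payload("test", conditions=[{"name": "author", "comparison_operator": "is", "value": "alice"}])`.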

### Path Knowledge ID ### Request Body - type (string) Metadata type, required - name (string) Metadata name, required ```bash {{ title: 'cURL' }} ``` ```json {{ title: 'Response' }} { "id": "abc", "type": "string", "name": "test" } ```

### Path Knowledge ID Metadata ID ### Request Body - name (string) Metadata name, required ```bash {{ title: 'cURL' }} ``` ```json {{ title: 'Response' }} { "id": "abc", "type": "string", "name": "test" } ```

### Path Knowledge ID Metadata ID ```bash {{ title: 'cURL' }} ```

### Path Knowledge ID disable/enable ```bash {{ title: 'cURL' }} ```

### Path Knowledge ID ### Request Body - document_id (string) Document ID - metadata_list (list) Metadata list - id (string) Metadata ID - value (string) Metadata value - name (string) Metadata name ```bash {{ title: 'cURL' }} ```

### Params Knowledge ID ```bash {{ title: 'cURL' }} ``` ```json {{ title: 'Response' }} { "doc_metadata": [ { "id": "", "name": "name", "type": "string", "use_count": 0 }, ... ], "built_in_field_enabled": true } ```

### Query ```bash {{ title: 'cURL' }} curl --location --request GET '${props.apiBaseUrl}/workspaces/current/models/model-types/text-embedding' \ --header 'Authorization: Bearer {api_key}' \ --header 'Content-Type: application/json' ``` ```json {{ title: 'Response' }} { "data": [ { "provider": "zhipuai", "label": { "zh_Hans": "智谱 AI", "en_US": "ZHIPU AI" }, "icon_small": { "zh_Hans": "http://127.0.0.1:5001/console/api/workspaces/current/model-providers/zhipuai/icon_small/zh_Hans", "en_US": "http://127.0.0.1:5001/console/api/workspaces/current/model-providers/zhipuai/icon_small/en_US" }, "icon_large": { "zh_Hans": "http://127.0.0.1:5001/console/api/workspaces/current/model-providers/zhipuai/icon_large/zh_Hans", "en_US": "http://127.0.0.1:5001/console/api/workspaces/current/model-providers/zhipuai/icon_large/en_US" }, "status": "active", "models": [ { "model": "embedding-3", "label": { "zh_Hans": "embedding-3", "en_US": "embedding-3" }, "model_type": "text-embedding", "features": null, "fetch_from": "predefined-model", "model_properties": { "context_size": 8192 }, "deprecated": false, "status": "active", "load_balancing_enabled": false }, { "model": "embedding-2", "label": { "zh_Hans": "embedding-2", "en_US": "embedding-2" }, "model_type": "text-embedding", "features": null, "fetch_from": "predefined-model", "model_properties": { "context_size": 8192 }, "deprecated": false, "status": "active", "load_balancing_enabled": false }, { "model": "text_embedding", "label": { "zh_Hans": "text_embedding", "en_US": "text_embedding" }, "model_type": "text-embedding", "features": null, "fetch_from": "predefined-model", "model_properties": { "context_size": 512 }, "deprecated": false, "status": "active", "load_balancing_enabled": false } ] } ] } ```

### Request Body (text) New tag name, required, maximum length 50 ```bash {{ title: 'cURL' }} curl --location --request POST '${props.apiBaseUrl}/datasets/tags' \ --header 'Authorization: Bearer {api_key}' \ --header 'Content-Type: application/json' \ --data-raw '{"name": "testtag1"}' ``` ```json {{ title: 'Response' }} { "id": "eddb66c2-04a1-4e3a-8cb2-75abd01e12a6", "name": "testtag1", "type": "knowledge", "binding_count": 0 } ```
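Because the tag name is required and limited to 50 characters, it can be worth checking it before sending the request. The following Python sketch does that and serializes the body; `tag_create_body` is a hypothetical helper, and stripping surrounding whitespace is a choice made for this example rather than documented server behavior.

```python
import json

MAX_TAG_NAME_LENGTH = 50  # documented maximum length of a tag name


def tag_create_body(name):
    """Validate a tag name and return the JSON request body as a string."""
    name = name.strip()  # assumption: surrounding whitespace is not meaningful
    if not name:
        raise ValueError("tag name is required")
    if len(name) > MAX_TAG_NAME_LENGTH:
        raise ValueError(
            f"tag name exceeds {MAX_TAG_NAME_LENGTH} characters")
    return json.dumps({"name": name})
```

The returned string can be passed as the `--data-raw` value in the cURL call above, or as the body of any HTTP client request.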

### Request Body ```bash {{ title: 'cURL' }} curl --location --request GET '${props.apiBaseUrl}/datasets/tags' \ --header 'Authorization: Bearer {api_key}' \ --header 'Content-Type: application/json' ``` ```json {{ title: 'Response' }} [ { "id": "39d6934c-ed36-463d-b4a7-377fa1503dc0", "name": "testtag1", "type": "knowledge", "binding_count": "0" }, ... ] ```

### Request Body (text) Modified tag name, required, maximum length 50 (text) Tag ID, required ```bash {{ title: 'cURL' }} curl --location --request PATCH '${props.apiBaseUrl}/datasets/tags' \ --header 'Authorization: Bearer {api_key}' \ --header 'Content-Type: application/json' \ --data-raw '{"name": "testtag2", "tag_id": "e1a0a3db-ee34-4e04-842a-81555d5316fd"}' ``` ```json {{ title: 'Response' }} { "id": "eddb66c2-04a1-4e3a-8cb2-75abd01e12a6", "name": "tag-renamed", "type": "knowledge", "binding_count": 0 } ```

### Request Body (text) Tag ID, required ```bash {{ title: 'cURL' }} curl --location --request DELETE '${props.apiBaseUrl}/datasets/tags' \ --header 'Authorization: Bearer {api_key}' \ --header 'Content-Type: application/json' \ --data-raw '{"tag_id": "e1a0a3db-ee34-4e04-842a-81555d5316fd"}' ``` ```json {{ title: 'Response' }} {"result": "success"} ```

### Request Body (list) List of Tag IDs, required (text) Dataset ID, required ```bash {{ title: 'cURL' }} curl --location --request POST '${props.apiBaseUrl}/datasets/tags/binding' \ --header 'Authorization: Bearer {api_key}' \ --header 'Content-Type: application/json' \ --data-raw '{"tag_ids": ["65cc29be-d072-4e26-adf4-2f727644da29","1e5348f3-d3ff-42b8-a1b7-0a86d518001a"], "target_id": "a932ea9f-fae1-4b2c-9b65-71c56e2cacd6"}' ``` ```json {{ title: 'Response' }} {"result": "success"} ```

### Request Body (text) Tag ID, required (text) Dataset ID, required ```bash {{ title: 'cURL' }} curl --location --request POST '${props.apiBaseUrl}/datasets/tags/unbinding' \ --header 'Authorization: Bearer {api_key}' \ --header 'Content-Type: application/json' \ --data-raw '{"tag_id": "1e5348f3-d3ff-42b8-a1b7-0a86d518001a", "target_id": "a932ea9f-fae1-4b2c-9b65-71c56e2cacd6"}' ``` ```json {{ title: 'Response' }} {"result": "success"} ```
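Binding and unbinding use slightly different body shapes: binding takes a list under `tag_ids`, while unbinding takes a single `tag_id`. The Python sketch below makes that asymmetry explicit. Both function names are hypothetical; the field names come from the cURL examples above.

```python
def bind_tags_body(tag_ids, target_id):
    """Body for POST /datasets/tags/binding: several tags, one dataset."""
    tag_ids = list(tag_ids)
    if not tag_ids:
        raise ValueError("at least one tag ID is required")
    if not target_id:
        raise ValueError("target_id (dataset ID) is required")
    return {"tag_ids": tag_ids, "target_id": target_id}


def unbind_tag_body(tag_id, target_id):
    """Body for POST /datasets/tags/unbinding: one tag, one dataset."""
    if not tag_id:
        raise ValueError("tag_id is required")
    if not target_id:
        raise ValueError("target_id (dataset ID) is required")
    return {"tag_id": tag_id, "target_id": target_id}
```

Removing several tags from a dataset would therefore take one unbinding request per tag, assuming the endpoint accepts only the single-ID form shown in this section.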

### Path (text) Dataset ID ```bash {{ title: 'cURL' }} curl --location --request POST '${props.apiBaseUrl}/datasets//tags' \ --header 'Authorization: Bearer {api_key}' \ --header 'Content-Type: application/json' ``` ```json {{ title: 'Response' }} { "data": [ {"id": "4a601f4f-f8a2-4166-ae7c-58c3b252a524", "name": "123" }, ... ], "total": 3 } ```

### Error message Error code Error status Error message ```json {{ title: 'Response' }} { "code": "no_file_uploaded", "message": "Please upload your file.", "status": 400 } ```

| code | status | message |
| --- | --- | --- |
| no_file_uploaded | 400 | Please upload your file. |
| too_many_files | 400 | Only one file is allowed. |
| file_too_large | 413 | File size exceeded. |
| unsupported_file_type | 415 | File type not allowed. |
| high_quality_dataset_only | 400 | Current operation only supports 'high-quality' datasets. |
| dataset_not_initialized | 400 | The dataset is still being initialized or indexing. Please wait a moment. |
| archived_document_immutable | 403 | The archived document is not editable. |
| dataset_name_duplicate | 409 | The dataset name already exists. Please modify your dataset name. |
| invalid_action | 400 | Invalid action. |
| document_already_finished | 400 | The document has been processed. Please refresh the page or go to the document details. |
| document_indexing | 400 | The document is being processed and cannot be edited. |
| invalid_metadata | 400 | The metadata content is incorrect. Please check and verify. |
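A client can turn the error body into a typed exception and branch on `code` rather than on the message text. The Python sketch below shows one way to do that. `KnowledgeAPIError`, `parse_error`, and `is_retryable` are hypothetical names, and treating `dataset_not_initialized` and `document_indexing` as retryable is an assumption drawn from the wording of their messages, not documented behavior.

```python
import json


class KnowledgeAPIError(Exception):
    """Error built from the `code`, `status`, and `message` fields."""

    def __init__(self, code, status, message):
        super().__init__(f"{status} {code}: {message}")
        self.code = code
        self.status = status
        self.message = message


# Assumption: these codes describe temporary states worth retrying later.
RETRYABLE_CODES = {"dataset_not_initialized", "document_indexing"}


def parse_error(body_text):
    """Parse a JSON error body into a KnowledgeAPIError instance."""
    data = json.loads(body_text)
    return KnowledgeAPIError(
        data.get("code", "unknown"),
        data.get("status"),
        data.get("message", ""),
    )


def is_retryable(error):
    """Return True when the error code suggests trying again later."""
    return error.code in RETRYABLE_CODES
```

Matching on `code` keeps client logic stable if the human-readable `message` is reworded or localized.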