POST /api/v1/dataset
Creates a dataset.
http://{address}/api/v1/dataset
Content-Type: application/json
"id": string
"name": string
"avatar": string
"tenant_id": string
"description": string
"language": string
"embedding_model": string
"permission": string
"document_count": integer
"chunk_count": integer
"parse_method": string
"parser_config": Dataset.ParserConfig
# "id": id must not be provided.
# "name": name is required and can't be duplicated.
# "tenant_id": tenant_id must not be provided.
# "embedding_model": embedding_model must not be provided.
# "naive" means general.
curl --request POST \
--url http://{address}/api/v1/dataset \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data '{
"name": "test",
"chunk_count": 0,
"document_count": 0,
"parse_method": "naive"
}'
"id": (Body parameter)
The ID of the created dataset, used to uniquely identify it.
When creating a dataset, id must not be provided.
"name": (Body parameter)
The name of the dataset, which must adhere to the following requirements:
When updating a dataset, name must still be unique.
"avatar": (Body parameter)
Base64 encoding of the avatar.
"tenant_id": (Body parameter)
The ID of the tenant associated with the dataset, used to link it with specific users.
When creating a dataset, tenant_id must not be provided.
When updating a dataset, tenant_id cannot be changed.
"description": (Body parameter)
The description of the dataset.
"language": (Body parameter)
The language setting for the dataset.
"embedding_model": (Body parameter)
Embedding model used in the dataset to generate vector embeddings.
When creating a dataset, embedding_model must not be provided.
When updating a dataset, embedding_model cannot be changed.
"permission": (Body parameter)
Specifies who can manipulate the dataset.
"document_count": (Body parameter)
Document count of the dataset.
When updating a dataset, document_count cannot be changed.
"chunk_count": (Body parameter)
Chunk count of the dataset.
When updating a dataset, chunk_count cannot be changed.
"parse_method": (Body parameter)
Parsing method of the dataset.
parse_method, chunk_count must be greater than 0.
"parser_config": (Body parameter)
The configuration settings for the dataset parser.
The successful response includes a JSON object like the following:
{
"code": 0,
"data": {
"avatar": null,
"chunk_count": 0,
"create_date": "Thu, 10 Oct 2024 05:57:37 GMT",
"create_time": 1728539857641,
"created_by": "69736c5e723611efb51b0242ac120007",
"description": null,
"document_count": 0,
"embedding_model": "BAAI/bge-large-zh-v1.5",
"id": "8d73076886cc11ef8c270242ac120006",
"language": "English",
"name": "test_1",
"parse_method": "naive",
"parser_config": {
"pages": [
[
1,
1000000
]
]
},
"permission": "me",
"similarity_threshold": 0.2,
"status": "1",
"tenant_id": "69736c5e723611efb51b0242ac120007",
"token_num": 0,
"update_date": "Thu, 10 Oct 2024 05:57:37 GMT",
"update_time": 1728539857641,
"vector_similarity_weight": 0.3
}
}
"error_code": integer
0: The operation succeeds.
The error response includes a JSON object like the following:
{
"code": 102,
"message": "Duplicated knowledgebase name in creating dataset."
}
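The creation rules above can be checked on the client before the request is sent. The following is a minimal sketch, assuming the constraints are exactly those listed in the request notes; the helper name `build_create_payload` is hypothetical and not part of the API.

```python
import json

# Fields the request notes say must not be sent when creating a dataset.
FORBIDDEN_ON_CREATE = ("id", "tenant_id", "embedding_model")


def build_create_payload(name, **optional):
    """Return the JSON body for POST /api/v1/dataset as a string."""
    if not name:
        raise ValueError("name is required")
    bad = [key for key in FORBIDDEN_ON_CREATE if key in optional]
    if bad:
        raise ValueError("must not be provided: " + ", ".join(bad))
    # sort_keys gives a stable, easy-to-compare body.
    return json.dumps({"name": name, **optional}, sort_keys=True)


print(build_create_payload("test", parse_method="naive"))
# prints {"name": "test", "parse_method": "naive"}
```

The resulting string can be passed to curl's `--data` option or to any HTTP client. Uniqueness of `name` can only be verified by the server.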
DELETE /api/v1/dataset
Deletes datasets by ids or names.
http://{address}/api/v1/dataset
Content-Type: application/json
"names": List[string]
"ids": List[string]
# Either ids or names must be provided, but not both.
curl --request DELETE \
--url http://{address}/api/v1/dataset \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data '{
"names": ["test_1", "test_2"]
}'
"names": (Body parameter)
Dataset names to delete.
"ids": (Body parameter)
Dataset IDs to delete.
"names" and "ids" are mutually exclusive.
The successful response includes a JSON object like the following:
{
"code": 0
}
"error_code": integer
0: The operation succeeds.
The error response includes a JSON object like the following:
{
"code": 102,
"message": "You don't own the dataset."
}
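Because "names" and "ids" are exclusive, a client can reject an ambiguous request before sending it. This is a small sketch under that assumption; `build_delete_payload` is a hypothetical helper, not something the API provides.

```python
import json


def build_delete_payload(names=None, ids=None):
    """Return the JSON body for DELETE /api/v1/dataset as a string."""
    # Exactly one of the two lists must be non-empty.
    if bool(names) == bool(ids):
        raise ValueError("provide either names or ids, but not both")
    key, values = ("names", names) if names else ("ids", ids)
    return json.dumps({key: list(values)})


print(build_delete_payload(names=["test_1", "test_2"]))
# prints {"names": ["test_1", "test_2"]}
```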
PUT /api/v1/dataset/{dataset_id}
Updates a dataset by its id.
http://{address}/api/v1/dataset/{dataset_id}
Content-Type: application/json
# "id": id is required.
# "name": If you update name, it can't be duplicated.
# "tenant_id": If you update tenant_id, it can't be changed.
# "embedding_model": If you update embedding_model, it can't be changed.
# "chunk_count": If you update chunk_count, it can't be changed.
# "document_count": If you update document_count, it can't be changed.
# "parse_method": If you update parse_method, chunk_count must be 0.
# "naive" means general.
curl --request PUT \
--url http://{address}/api/v1/dataset/{dataset_id} \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data '{
"name": "test",
"tenant_id": "4fb0cd625f9311efba4a0242ac120006",
"embedding_model": "BAAI/bge-zh-v1.5",
"chunk_count": 0,
"document_count": 0,
"parse_method": "naive"
}'
(Refer to “Create Dataset” for the complete structure of the request parameters.)
The successful response includes a JSON object like the following:
{
"code": 0
}
"error_code": integer
0: The operation succeeds.
The error response includes a JSON object like the following:
{
"code": 102,
"message": "Can't change tenant_id."
}
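An error such as "Can't change tenant_id." can be anticipated by comparing the update body with the dataset's current values. The sketch below assumes the unchangeable fields are the four named in the request notes; the function name and the sample values are illustrative only.

```python
# Fields the update notes describe as unchangeable.
IMMUTABLE_FIELDS = ("tenant_id", "embedding_model", "chunk_count", "document_count")


def changed_immutable_fields(current, update):
    """List immutable fields whose value in `update` differs from `current`."""
    return [
        field
        for field in IMMUTABLE_FIELDS
        if field in update and update[field] != current.get(field)
    ]


current = {"tenant_id": "69736c5e723611efb51b0242ac120007", "chunk_count": 0}
update = {
    "name": "test",
    "tenant_id": "4fb0cd625f9311efba4a0242ac120006",
    "chunk_count": 0,
}
print(changed_immutable_fields(current, update))
# prints ['tenant_id']
```

An empty list means the body repeats the stored values for these fields or omits them, which is the case the notes allow.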
GET /api/v1/dataset?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}
Lists all datasets.
http://{address}/api/v1/dataset?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}
# If no page parameter is passed, the default is 1.
# If no page_size parameter is passed, the default is 1024.
# If no orderby parameter is passed, the default is "create_time".
# If no desc parameter is passed, the default is True.
curl --request GET \
--url 'http://{address}/api/v1/dataset?page={page}&page_size={page_size}&orderby={orderby}&desc={desc}&name={dataset_name}&id={dataset_id}' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
page: (Filter parameter)
The current page number to retrieve from the paginated data. This parameter determines which set of records will be fetched.
page_size: (Filter parameter)
The number of records to retrieve per page. This controls how many records will be included in each page.
orderby: (Filter parameter)
The field by which the records should be sorted. This specifies the attribute or column used to order the results.
desc: (Filter parameter)
A boolean flag indicating whether the sorting should be in descending order.
name: (Filter parameter)
Dataset name.
id: (Filter parameter)
Dataset ID.
The successful response includes a JSON object like the following:
{
"code": 0,
"data": [
{
"avatar": "",
"chunk_count": 59,
"create_date": "Sat, 14 Sep 2024 01:12:37 GMT",
"create_time": 1726276357324,
"created_by": "69736c5e723611efb51b0242ac120007",
"description": null,
"document_count": 1,
"embedding_model": "BAAI/bge-large-zh-v1.5",
"id": "6e211ee0723611efa10a0242ac120007",
"language": "English",
"name": "mysql",
"parse_method": "knowledge_graph",
"parser_config": {
"chunk_token_num": 8192,
"delimiter": "\\n!?;。;!?",
"entity_types": [
"organization",
"person",
"location",
"event",
"time"
]
},
"permission": "me",
"similarity_threshold": 0.2,
"status": "1",
"tenant_id": "69736c5e723611efb51b0242ac120007",
"token_num": 12744,
"update_date": "Thu, 10 Oct 2024 04:07:23 GMT",
"update_time": 1728533243536,
"vector_similarity_weight": 0.3
}
]
}
The error response includes a JSON object like the following:
{
"code": 102,
"message": "The dataset doesn't exist"
}
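The listing URL can be assembled from the documented defaults rather than typed by hand. This is a sketch only: the helper name and the address `127.0.0.1:9380` are placeholders, and how the server parses the `desc` value (for example `True` versus `true`) is an assumption to verify against a running instance.

```python
from urllib.parse import urlencode


def list_datasets_url(address, page=1, page_size=1024, orderby="create_time",
                      desc=True, name=None, dataset_id=None):
    """Build the GET /api/v1/dataset URL using the documented defaults."""
    params = {"page": page, "page_size": page_size, "orderby": orderby, "desc": desc}
    # The optional filters are only added when a value is given.
    if name is not None:
        params["name"] = name
    if dataset_id is not None:
        params["id"] = dataset_id
    return f"http://{address}/api/v1/dataset?{urlencode(params)}"


print(list_datasets_url("127.0.0.1:9380", name="mysql"))
```

Using `urlencode` also escapes characters such as spaces or `&` in a dataset name, which a hand-written query string would not.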
POST /api/v1/dataset/{dataset_id}/document
Uploads files to a dataset.
/api/v1/dataset/{dataset_id}/document
curl --request POST \
--url http://{address}/api/v1/dataset/{dataset_id}/document \
--header 'Content-Type: multipart/form-data' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--form 'file=@test.txt'
"dataset_id": (Path parameter)
The dataset ID.
"file": (Body parameter)
The file to upload.
The successful response includes a JSON object like the following:
{
"code": 0
}
"error_code": integer
0: The operation succeeds.
The error response includes a JSON object like the following:
{
"code": 3016,
"message": "Can't connect database"
}
GET /api/v1/dataset/{dataset_id}/document/{document_id}
Downloads files from a dataset.
/api/v1/dataset/{dataset_id}/document/{document_id}
Content-Type: application/json
Output:
'{FILE_NAME}'
curl --request GET \
--url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id} \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--output '{FILE_NAME}'
"dataset_id": (Path parameter)
The dataset ID.
"document_id": (Path parameter)
The document ID.
The successful response includes a JSON object like the following:
{
"code": 0
}
"error_code": integer
0: The operation succeeds.
The error response includes a JSON object like the following:
{
"code": 3016,
"message": "Can't connect database"
}
GET /api/v1/dataset/{dataset_id}/info?keywords={keyword}&page={page}&page_size={limit}&orderby={orderby}&desc={desc}&name={name}
Lists files in a dataset.
/api/v1/dataset/{dataset_id}/info?keywords={keyword}&page={page}&page_size={limit}&orderby={orderby}&desc={desc}&name={name}
Content-Type: application/json
curl --request GET \
--url 'http://{address}/api/v1/dataset/{dataset_id}/info?keywords=rag&page=0&page_size=10&orderby=create_time&desc=yes' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
"dataset_id": (Path parameter)
The dataset ID.
keywords: (Filter parameter)
The keywords to search for.
page: (Filter parameter)
The current page number to retrieve from the paginated data. This parameter determines which set of records will be fetched.
page_size: (Filter parameter)
The number of records to retrieve per page. This controls how many records will be included in each page.
orderby: (Filter parameter)
The field by which the records should be sorted. This specifies the attribute or column used to order the results.
desc: (Filter parameter)
A boolean flag indicating whether the sorting should be in descending order.
name: (Filter parameter)
File name.
The successful response includes a JSON object like the following:
{
"code": 0,
"data": {
"docs": [
{
"chunk_count": 0,
"create_date": "Wed, 18 Sep 2024 08:20:49 GMT",
"create_time": 1726647649379,
"created_by": "134408906b6811efbcd20242ac120005",
"id": "e970a94a759611efae5b0242ac120004",
"knowledgebase_id": "e95f574e759611efbc850242ac120004",
"location": "Test Document222.txt",
"name": "Test Document222.txt",
"parser_config": {
"chunk_token_count": 128,
"delimiter": "\n!?。;!?",
"layout_recognize": true,
"task_page_size": 12
},
"parser_method": "naive",
"process_begin_at": null,
"process_duation": 0.0,
"progress": 0.0,
"progress_msg": "",
"run": "0",
"size": 46,
"source_type": "local",
"status": "1",
"thumbnail": null,
"token_count": 0,
"type": "doc",
"update_date": "Wed, 18 Sep 2024 08:20:49 GMT",
"update_time": 1726647649379
},
{
"chunk_count": 0,
"create_date": "Wed, 18 Sep 2024 08:20:49 GMT",
"create_time": 1726647649340,
"created_by": "134408906b6811efbcd20242ac120005",
"id": "e96aad9c759611ef9ab60242ac120004",
"knowledgebase_id": "e95f574e759611efbc850242ac120004",
"location": "Test Document111.txt",
"name": "Test Document111.txt",
"parser_config": {
"chunk_token_count": 128,
"delimiter": "\n!?。;!?",
"layout_recognize": true,
"task_page_size": 12
},
"parser_method": "naive",
"process_begin_at": null,
"process_duation": 0.0,
"progress": 0.0,
"progress_msg": "",
"run": "0",
"size": 46,
"source_type": "local",
"status": "1",
"thumbnail": null,
"token_count": 0,
"type": "doc",
"update_date": "Wed, 18 Sep 2024 08:20:49 GMT",
"update_time": 1726647649340
}
],
"total": 2
}
}
"error_code": integer
0: The operation succeeds.
The error response includes a JSON object like the following:
{
"code": 3016,
"message": "Can't connect database"
}
PUT /api/v1/dataset/{dataset_id}/info/{document_id}
Updates a file in a dataset.
/api/v1/dataset/{dataset_id}/info/{document_id}
Content-Type: application/json
curl --request PUT \
--url http://{address}/api/v1/dataset/{dataset_id}/info/{document_id} \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-raw '{
"document_id": "f6b170ac758811efa0660242ac120004",
"document_name": "manual.txt",
"thumbnail": null,
"knowledgebase_id": "779333c0758611ef910f0242ac120004",
"parser_method": "manual",
"parser_config": {"chunk_token_count": 128, "delimiter": "\n!?。;!?", "layout_recognize": true, "task_page_size": 12},
"source_type": "local", "type": "doc",
"created_by": "134408906b6811efbcd20242ac120005",
"size": 0, "token_count": 0, "chunk_count": 0,
"progress": 0.0,
"progress_msg": "",
"process_begin_at": null,
"process_duration": 0.0
}'
"document_id": (Body parameter)
"document_name": (Body parameter)
The successful response includes a JSON object like the following:
{
"code": 0
}
The error response includes a JSON object like the following:
{
"code": 3016,
"message": "Can't connect database"
}
POST /api/v1/dataset/{dataset_id}/chunk
Parses files into chunks in a dataset.
/api/v1/dataset/{dataset_id}/chunk
Content-Type: application/json
curl --request POST \
--url http://{address}/api/v1/dataset/{dataset_id}/chunk \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-raw '{
"documents": ["f6b170ac758811efa0660242ac120004", "97ad64b6759811ef9fc30242ac120004"]
}'
"dataset_id": (Path parameter)
"documents": (Body parameter)
The successful response includes a JSON object like the following:
{
"code": 0
}
The error response includes a JSON object like the following:
{
"code": 3016,
"message": "Can't connect database"
}
DELETE /api/v1/dataset/{dataset_id}/chunk
Stops file parsing.
/api/v1/dataset/{dataset_id}/chunk
Content-Type: application/json
curl --request DELETE \
--url http://{address}/api/v1/dataset/{dataset_id}/chunk \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-raw '{
"documents": ["f6b170ac758811efa0660242ac120004", "97ad64b6759811ef9fc30242ac120004"]
}'
"dataset_id": (Path parameter)
"documents": (Body parameter)
The successful response includes a JSON object like the following:
{
"code": 0
}
The error response includes a JSON object like the following:
{
"code": 3016,
"message": "Can't connect database"
}
GET /api/v1/dataset/{dataset_id}/document/{document_id}/chunk
Gets the chunk list of a document.
/api/v1/dataset/{dataset_id}/document/{document_id}/chunk
Content-Type: application/json
curl --request GET \
--url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
"dataset_id": (Path parameter)
"document_id": (Path parameter)
The successful response includes a JSON object like the following:
{
"code": 0,
"data": {
"chunks": [
{
"available_int": 1,
"content": "<em>advantag</em>of ragflow increas accuraci and relev:by incorpor retriev inform , ragflow can gener respons that are more accur",
"document_keyword": "ragflow_test.txt",
"document_id": "77df9ef4759a11ef8bdd0242ac120004",
"id": "4ab8c77cfac1a829c8d5ed022a0808c0",
"image_id": "",
"important_keywords": [],
"positions": [
""
]
}
],
"doc": {
"chunk_count": 5,
"create_date": "Wed, 18 Sep 2024 08:46:16 GMT",
"create_time": 1726649176833,
"created_by": "134408906b6811efbcd20242ac120005",
"id": "77df9ef4759a11ef8bdd0242ac120004",
"knowledgebase_id": "77d9d24e759a11ef880c0242ac120004",
"location": "ragflow_test.txt",
"name": "ragflow_test.txt",
"parser_config": {
"chunk_token_count": 128,
"delimiter": "\n!?。;!?",
"layout_recognize": true,
"task_page_size": 12
},
"parser_method": "naive",
"process_begin_at": "Wed, 18 Sep 2024 08:46:16 GMT",
"process_duation": 7.3213,
"progress": 1.0,
"progress_msg": "\nTask has been received.\nStart to parse.\nFinish parsing.\nFinished slicing files(5). Start to embedding the content.\nFinished embedding(6.16)! Start to build index!\nDone!",
"run": "3",
"size": 4209,
"source_type": "local",
"status": "1",
"thumbnail": null,
"token_count": 746,
"type": "doc",
"update_date": "Wed, 18 Sep 2024 08:46:23 GMT",
"update_time": 1726649183321
},
"total": 1
}
}
The error response includes a JSON object like the following:
{
"code": 3016,
"message": "Can't connect database"
}
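Every response in this reference carries a "code" field, with 0 meaning success, so a client can branch on it before reading "data". The sketch below assumes the response shape shown in the samples above; `chunk_summaries` is a hypothetical helper name.

```python
import json


def chunk_summaries(response_text):
    """Return (chunk id, source file name) pairs from a chunk-list response."""
    payload = json.loads(response_text)
    # A non-zero code signals an error; the message field explains it.
    if payload.get("code") != 0:
        raise RuntimeError(payload.get("message", "request failed"))
    return [
        (chunk["id"], chunk["document_keyword"])
        for chunk in payload["data"]["chunks"]
    ]


sample = json.dumps({
    "code": 0,
    "data": {
        "chunks": [
            {"id": "4ab8c77cfac1a829c8d5ed022a0808c0",
             "document_keyword": "ragflow_test.txt"}
        ],
        "total": 1,
    },
})
print(chunk_summaries(sample))
```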
DELETE /api/v1/dataset/{dataset_id}/document/{document_id}/chunk
Deletes document chunks.
/api/v1/dataset/{dataset_id}/document/{document_id}/chunk
Content-Type: application/json
curl --request DELETE \
--url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-raw '{
"chunks": ["f6b170ac758811efa0660242ac120004", "97ad64b6759811ef9fc30242ac120004"]
}'
PUT /api/v1/dataset/{dataset_id}/document/{document_id}/chunk
Updates a document chunk.
/api/v1/dataset/{dataset_id}/document/{document_id}/chunk
Content-Type: application/json
curl --request PUT \
--url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-raw '{
"chunk_id": "d87fb0b7212c15c18d0831677552d7de",
"knowledgebase_id": null,
"name": "",
"content": "ragflow123",
"important_keywords": [],
"document_id": "e6bbba92759511efaa900242ac120004",
"status": "1"
}'
POST /api/v1/dataset/{dataset_id}/document/{document_id}/chunk
Inserts document chunks.
/api/v1/dataset/{dataset_id}/document/{document_id}/chunk
Content-Type: application/json
curl --request POST \
--url http://{address}/api/v1/dataset/{dataset_id}/document/{document_id}/chunk \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-raw '{
"document_id": "97ad64b6759811ef9fc30242ac120004",
"content": ["ragflow content", "ragflow content"]
}'
GET /api/v1/dataset/{dataset_id}/retrieval
Runs a retrieval test on a dataset.
/api/v1/dataset/{dataset_id}/retrieval
Content-Type: application/json
curl --request GET \
--url http://{address}/api/v1/dataset/{dataset_id}/retrieval \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-raw '{
"query_text": "This is a cat."
}'
POST /api/v1/chat
Creates a chat.
/api/v1/chat
Content-Type: application/json
curl --request POST \
--url http://{address}/api/v1/chat \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-binary '{
"avatar": "path",
"create_date": "Wed, 04 Sep 2024 10:08:01 GMT",
"create_time": 1725444481128,
"description": "A helpful Assistant",
"do_refer": "",
"knowledgebases": [
{
"avatar": null,
"chunk_count": 0,
"description": null,
"document_count": 0,
"embedding_model": "",
"id": "d6d0e8e868cd11ef92250242ac120006",
"language": "English",
"name": "Test_assistant",
"parse_method": "naive",
"parser_config": {
"pages": [
[
1,
1000000
]
]
},
"permission": "me",
"tenant_id": "4fb0cd625f9311efba4a0242ac120006"
}
],
"language": "English",
"llm": {
"frequency_penalty": 0.7,
"max_tokens": 512,
"model_name": "deepseek-chat",
"presence_penalty": 0.4,
"temperature": 0.1,
"top_p": 0.3
},
"name": "Miss R",
"prompt": {
"empty_response": "Sorry! Can't find the context!",
"keywords_similarity_weight": 0.7,
"opener": "Hi! I am your assistant, what can I do for you?",
"prompt": "You are an intelligent assistant. Please summarize the content of the knowledge base to answer the question. Please list the data in the knowledge base and answer in detail. When all knowledge base content is irrelevant to the question, your answer must include the sentence 'The answer you are looking for is not found in the knowledge base!' Answers need to consider chat history.\nHere is the knowledge base:\n{knowledge}\nThe above is the knowledge base.",
"rerank_model": "",
"show_quote": true,
"similarity_threshold": 0.2,
"top_n": 8,
"variables": [
{
"key": "knowledge",
"optional": true
}
]
},
"prompt_type": "simple",
"status": "1",
"top_k": 1024,
"update_date": "Wed, 04 Sep 2024 10:08:01 GMT",
"update_time": 1725444481128
}'
PUT /api/v1/chat
Updates a chat.
/api/v1/chat
Content-Type: application/json
curl --request PUT \
--url http://{address}/api/v1/chat \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-binary '{
"id":"554e96746aaa11efb06b0242ac120005",
"name":"Test"
}'
DELETE /api/v1/chat/{chat_id}
Deletes a chat.
/api/v1/chat/{chat_id}
Content-Type: application/json
curl --request DELETE \
--url http://{address}/api/v1/chat/554e96746aaa11efb06b0242ac120005 \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
GET /api/v1/chat
Lists all chat assistants.
/api/v1/chat
Content-Type: application/json
curl --request GET \
--url http://{address}/api/v1/chat \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
POST /api/v1/chat/{chat_id}/session
Creates a chat session.
/api/v1/chat/{chat_id}/session
Content-Type: application/json
curl --request POST \
--url http://{address}/api/v1/chat/{chat_id}/session \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-binary '{
"name": "new session"
}'
GET /api/v1/chat/{chat_id}/session
Lists all the sessions of a chat.
/api/v1/chat/{chat_id}/session
Content-Type: application/json
curl --request GET \
--url http://{address}/api/v1/chat/554e96746aaa11efb06b0242ac120005/session \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
DELETE /api/v1/chat/{chat_id}/session/{session_id}
Deletes a chat session.
/api/v1/chat/{chat_id}/session/{session_id}
Content-Type: application/json
curl --request DELETE \
--url http://{address}/api/v1/chat/554e96746aaa11efb06b0242ac120005/session/791aed9670ea11efbb7e0242ac120007 \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}'
PUT /api/v1/chat/{chat_id}/session/{session_id}
Updates a chat session.
/api/v1/chat/{chat_id}/session/{session_id}
Content-Type: application/json
curl --request PUT \
--url http://{address}/api/v1/chat/554e96746aaa11efb06b0242ac120005/session/791aed9670ea11efbb7e0242ac120007 \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-binary '{
"name": "Updated session"
}'
POST /api/v1/chat/{chat_id}/session/{session_id}/completion
Chats within a chat session.
/api/v1/chat/{chat_id}/session/{session_id}/completion
Content-Type: application/json
curl --request POST \
--url http://{address}/api/v1/chat/554e96746aaa11efb06b0242ac120005/session/791aed9670ea11efbb7e0242ac120007/completion \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {YOUR_ACCESS_TOKEN}' \
--data-binary '{
"question": "Hello!",
"stream": true
}'
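The same completion request can be prepared with Python's standard library. This sketch only builds the request object and does not send it; the function name, the address `127.0.0.1:9380`, and the token value are placeholders, and the format of the streamed response is not covered by this reference.

```python
import json
from urllib.request import Request


def completion_request(address, chat_id, session_id, token, question, stream=True):
    """Prepare (but do not send) a POST to the session completion endpoint."""
    url = f"http://{address}/api/v1/chat/{chat_id}/session/{session_id}/completion"
    # json.dumps never emits a trailing comma, so the body is always valid JSON.
    body = json.dumps({"question": question, "stream": stream}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }
    return Request(url, data=body, headers=headers, method="POST")


req = completion_request(
    "127.0.0.1:9380",
    "554e96746aaa11efb06b0242ac120005",
    "791aed9670ea11efbb7e0242ac120007",
    "YOUR_ACCESS_TOKEN",
    "Hello!",
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it to a running server.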