…erence but failed. ### What problem does this PR solve? ### Type of change - [x] Documentation Update

1 year ago · 5aec1e3e17
--- a/api/http_api.md
+++ b/api/http_api.md
--- a/api/python_api_reference.md
+++ b/api/python_api_reference.md
 #### keywords: `str`
 The keywords to match document titles. Defaults to `None`.
 The keywords used to match document titles. Defaults to `None`.
 #### offset: `int`
 - `created_by`: `str` The creator of the document. Defaults to `""`.
 - `size`: `int` The document size in bytes. Defaults to `0`.
 - `token_count`: `int` The number of tokens in the document. Defaults to `0`.
 - `chunk_count`: `int` The number of chunks that the document is split into. Defaults to `0`.
 - `chunk_count`: `int` The number of chunks in the document. Defaults to `0`.
 - `progress`: `float` The current processing progress as a percentage. Defaults to `0.0`.
 - `progress_msg`: `str` A message indicating the current progress status. Defaults to `""`.
 - `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`.
 ```python
 from ragflow import RAGFlow
 rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 dataset = rag.create_dataset(name="kb_1")
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 dataset = rag_object.create_dataset(name="kb_1")
 filename1 = "~/ragflow.txt"
 blob = open(filename1 , "rb").read()
 DataSet.async_parse_documents(document_ids:list[str]) -> None
 ```
 Parses documents in the current dataset.
 ### Parameters
 #### document_ids: `list[str]`, *Required*
 ### Returns
 - Success: No value is returned.
 - Success: No value is returned.????????????????????
 - Failure: `Exception`
 ### Examples
 DataSet.async_cancel_parse_documents(document_ids:list[str])-> None
 ```
 Stops parsing specified documents.
 ### Parameters
 #### document_ids: `list[str]`, *Required*
 ---
 ## List chunks
 ## Add chunk
 ```python
 Document.list_chunks(keywords: str = None, offset: int = 0, limit: int = -1, id : str = None) -> list[Chunk]
 Document.add_chunk(content:str) -> Chunk ?????????????????????
 ```
 Retrieves a list of document chunks.
 Adds a chunk to the current document.
 ### Parameters
 #### keywords: `str`  
 List chunks whose name has the given keywords. Defaults to `None`
 #### offset: `int`
 The starting index for the chunks to retrieve. Defaults to `1`
 #### content: `str`, *Required*
 #### limit  
 The text content of the chunk.
 The maximum number of chunks to retrieve.  Default: `30`
 #### important_keywords: `list[str]`  ??????????????????????
 #### id
 The ID of the chunk to retrieve. Default: `None`
 The key terms or phrases to tag with the chunk.
 ### Returns
 - Success: A list of `Chunk` objects.
 - Success: A `Chunk` object.
 - Failure: `Exception`.
 A `Chunk` object contains the following attributes:
 - `id`: `str` 
 - `content`: `str` Content of the chunk.
 - `important_keywords`: `list[str]` A list of key terms or phrases to tag with the chunk.
 - `create_time`: `str` The time when the chunk was created (added to the document).
 - `create_timestamp`: `float` The timestamp representing the creation time of the chunk, expressed in seconds since January 1, 1970.
 - `knowledgebase_id`: `str` The ID of the associated dataset.
 - `document_name`: `str` The name of the associated document.
 - `document_id`: `str` The ID of the associated document.
 - `available`: `int`???? The chunk's availability status in the dataset. Value options:
  - `0`: Unavailable
  - `1`: Available
 ### Examples
 ```python
 from ragflow import RAGFlow
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 dataset = rag_object.list_datasets("123")
 dataset = dataset[0]
 dataset.async_parse_documents(["wdfxb5t547d"])
 for chunk in doc.list_chunks(keywords="rag", offset=0, limit=12):
    print(chunk)
 dataset = rag_object.list_datasets(id="123")
 dataset = dataset[0]
 doc = dataset.list_documents(id="wdfxb5t547d")
 doc = doc[0]
 chunk = doc.add_chunk(content="xxxxxxx")
 ```
 ## Add chunk
 ---
 ## List chunks
 ```python
 Document.add_chunk(content:str) -> Chunk
 Document.list_chunks(keywords: str = None, offset: int = 0, limit: int = -1, id : str = None) -> list[Chunk]
 ```
 Retrieves a list of chunks from the current document.
 ### Parameters
 #### content: *Required*
 #### keywords: `str`  
 The keywords used to match chunk content. Defaults to `None`
 The text content of the chunk.
 #### offset: `int`
 The starting index for the chunks to retrieve. Defaults to `1`??????
 #### limit  
 #### important_keywords :`list[str]`
 The maximum number of chunks to retrieve.  Default: `30`?????????
 List the key terms or phrases that are significant or central to the chunk's content.
 #### id
 The ID of the chunk to retrieve. Default: `None`
 ### Returns
 chunk
 - Success: A list of `Chunk` objects.
 - Failure: `Exception`.
 ### Examples
 ```python
 from ragflow import RAGFlow
 rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 dataset = rag.list_datasets(id="123")
 dataset = dataset[0]
 doc = dataset.list_documents(id="wdfxb5t547d")
 doc = doc[0]
 chunk = doc.add_chunk(content="xxxxxxx")
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 dataset = rag_object.list_datasets("123")
 dataset = dataset[0]
 dataset.async_parse_documents(["wdfxb5t547d"])
 for chunk in doc.list_chunks(keywords="rag", offset=0, limit=12):
    print(chunk)
 ```
 ---
 ## Delete chunk
 ## Delete chunks
 ```python
 Document.delete_chunks(chunk_ids: list[str])
 ```python
 from ragflow import RAGFlow
 rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 ds = rag.list_datasets(id="123")
 ds = ds[0]
 doc = ds.list_documents(id="wdfxb5t547d")
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 dataset = rag_object.list_datasets(id="123")
 dataset = dataset[0]
 doc = dataset.list_documents(id="wdfxb5t547d")
 doc = doc[0]
 chunk = doc.add_chunk(content="xxxxxxx")
 doc.delete_chunks(["id_1","id_2"])
 A dictionary representing the attributes to update, with the following keys:
 - `"content"`: `str` Content of the chunk.
 - `"important_keywords"`: `list[str]` A list of key terms to attach to the chunk.
 - `"important_keywords"`: `list[str]` A list of key terms or phrases to tag with the chunk.
 - `"available"`: `int` The chunk's availability status in the dataset. Value options:
  - `0`: Unavailable
  - `1`: Available
 RAGFlow.retrieve(question:str="", datasets:list[str]=None, documents:list[str]=None, offset:int=1, limit:int=30, similarity_threshold:float=0.2, vector_similarity_weight:float=0.3, top_k:int=1024, rerank_id:str=None, keyword:bool=False, highlight:bool=False) -> list[Chunk]
 ```
 ???????
 ### Parameters
 #### question: `str` *Required*
 The user query or query keywords. Defaults to `""`.
 #### datasets: `list[str]`, *Required*
 #### datasets: `list[str]`, *Required*?????
 The datasets to search from.
 #### limit: `int`
 The maximum number of chunks to retrieve. Defaults to `6`.
 The maximum number of chunks to retrieve. Defaults to `6`.???????????????
 #### Similarity_threshold: `float`
 The number of chunks engaged in vector cosine computation. Defaults to `1024`.
 #### rerank_id
 #### rerank_id: `str`
 The ID of the rerank model.  Defaults to `None`.
 The ID of the rerank model. Defaults to `None`.
 #### keyword
 #### keyword: `bool`
 Indicates whether keyword-based matching is enabled:
 - `True`: Enabled.
 - `False`: Disabled.
 - `False`: Disabled (default).
 #### highlight:`bool`
 #### highlight: `bool`
 Specifies whether to enable highlighting of matched terms in the results (`True`) or not (`False`).
 from ragflow import RAGFlow
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 ds = rag_object.list_datasets(name="ragflow")
 ds = ds[0]
 dataset = rag_object.list_datasets(name="ragflow")
 dataset = dataset[0]
 name = 'ragflow_test.txt'
 path = './test_data/ragflow_test.txt'
 rag_object.create_document(ds, name=name, blob=open(path, "rb").read())
 doc = ds.list_documents(name=name)
 rag_object.create_document(dataset, name=name, blob=open(path, "rb").read())
 doc = dataset.list_documents(name=name)
 doc = doc[0]
 ds.async_parse_documents([doc.id])
 dataset.async_parse_documents([doc.id])
 for c in rag_object.retrieve(question="What's ragflow?", 
             datasets=[ds.id], documents=[doc.id], 
             datasets=[dataset.id], documents=[doc.id], 
             offset=1, limit=30, similarity_threshold=0.2, 
             vector_similarity_weight=0.3,
             top_k=1024
 The following shows the attributes of a `Chat` object:
 #### name: *Required*
 #### name: `str`, *Required*????????
 The name of the chat assistant. Defaults to `"assistant"`.
 #### avatar
 #### avatar: `str`
 Base64 encoding of the avatar. Defaults to `""`.
 The IDs of the associated datasets. Defaults to `[""]`.
 #### llm
 #### llm: `Chat.LLM`
 The llm of the created chat. Defaults to `None`. When the value is `None`, a dictionary with the following values will be generated as the default.
 - `max_token`, `int`  
  This sets the maximum length of the model’s output, measured in the number of tokens (words or pieces of words). Defaults to `512`.
 #### Prompt
 #### prompt: `Chat.Prompt`
 Instructions for the LLM to follow.  A `Prompt` object contains the following attributes:
 ```python
 from ragflow import RAGFlow
 rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 kbs = rag.list_datasets(name="kb_1")
 list_kb=[]
 for kb in kbs:
    list_kb.append(kb.id)
 assi = rag.create_chat("Miss R", knowledgebases=list_kb)
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 datasets = rag_object.list_datasets(name="kb_1")
 dataset_ids = []
 for dataset in datasets:
    dataset_ids.append(dataset.id)
 assistant = rag_object.create_chat("Miss R", knowledgebases=dataset_ids)
 ```
 ---
 ## Update chat
 ## Update chat assistant
 ```python
 Chat.update(update_message: dict)
 ```python
 from ragflow import RAGFlow
 rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 knowledge_base = rag.list_datasets(name="kb_1")
 assistant = rag.create_chat("Miss R", knowledgebases=knowledge_base)
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 datasets = rag_object.list_datasets(name="kb_1")
 assistant = rag_object.create_chat("Miss R", knowledgebases=datasets)
 assistant.update({"name": "Stefan", "llm": {"temperature": 0.8}, "prompt": {"top_n": 8}})
 ```
 ---
 ## Delete chats
 ## Delete chat assistants
 ```python
 RAGFlow.delete_chats(ids: list[str] = None)
 ```python
 from ragflow import RAGFlow
 rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 rag.delete_chats(ids=["id_1","id_2"])
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 rag_object.delete_chats(ids=["id_1","id_2"])
 ```
 ---
 ## List chats
 ## List chat assistants
 ```python
 RAGFlow.list_chats(
 ### Parameters
 #### page
 #### page: `int`
 Specifies the page on which the chat assistants will be displayed. Defaults to `1`.
 #### page_size
 #### page_size: `int`
 The number of chat assistants on each page. Defaults to `1024`.
 #### order_by
 #### orderby: `str`
 The attribute by which the results are sorted. Defaults to `"create_time"`.
 The attribute by which the results are sorted. Available options:
 #### desc
 - `"create_time"` (default)
 - `"update_time"`
 #### desc: `bool`
 Indicates whether the retrieved chat assistants should be sorted in descending order. Defaults to `True`.
 #### id: `string`  
 #### id: `str`  
 The ID of the chat to retrieve. Defaults to `None`.
 The ID of the chat assistant to retrieve. Defaults to `None`.
 #### name: `string`  
 #### name: `str`  
 The name of the chat to retrieve. Defaults to `None`.
 The name of the chat assistant to retrieve. Defaults to `None`.
 ### Returns
 ---
 :::tip API GROUPING
 Chat-session APIs
 Chat Session APIs
 :::
 ---
 ### Parameters
 #### name
 #### name: `str`
 The name of the chat session to create.
 ```python
 from ragflow import RAGFlow
 rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 assistant = rag.list_chats(name="Miss R")
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 assistant = rag_object.list_chats(name="Miss R")
 assistant = assistant[0]
 session = assistant.create_session()
 ```
 ---
 ## Update session
 ```python
 ```python
 from ragflow import RAGFlow
 rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 assistant = rag.list_chats(name="Miss R")
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 assistant = rag_object.list_chats(name="Miss R")
 assistant = assistant[0]
 session = assistant.create_session("session_name")
 session.update({"name": "updated_name"})
 ---
 ## Chat
 ```python
 Session.ask(question: str, stream: bool = False) -> Optional[Message, iter[Message]]
 ```
 Asks a question to start a conversation.
 ### Parameters
 #### question *Required*
 The question to start an AI chat. Defaults to `None`.
 #### stream
 Indicates whether to output responses in a streaming way:
 - `True`: Enable streaming.
 - `False`: (Default) Disable streaming.
 ### Returns
 - A `Message` object containing the response to the question if `stream` is set to `False`.
 - An iterator containing multiple `Message` objects (`iter[Message]`) if `stream` is set to `True`.
 The following shows the attributes of a `Message` object:
 #### id: `str`
 The auto-generated message ID.
 #### content: `str`
 The content of the message. Defaults to `"Hi! I am your assistant, can I help you?"`.
 #### reference: `list[Chunk]`
 A list of `Chunk` objects representing references to the message, each containing the following attributes:
 - `id` `str`  
  The chunk ID.
 - `content` `str`  
  The content of the chunk.
 - `image_id` `str`  
  The ID of the snapshot of the chunk.
 - `document_id` `str`  
  The ID of the referenced document.
 - `document_name` `str`  
  The name of the referenced document.
 - `position` `list[str]`  
  The location information of the chunk within the referenced document.
 - `knowledgebase_id` `str`  
  The ID of the dataset to which the referenced document belongs.
 - `similarity` `float`
  A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity.
 - `vector_similarity` `float`  
  A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
 - `term_similarity` `float`  
  A keyword similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between keywords.
 ### Examples
 ```python
 from ragflow import RAGFlow
 rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 assistant = rag.list_chats(name="Miss R")
 assistant = assistant[0]
 session = assistant.create_session()    
 print("\n==================== Miss R =====================\n")
 print(assistant.get_prologue())
 while True:
    question = input("\n==================== User =====================\n> ")
    print("\n==================== Miss R =====================\n")
    cont = ""
    for ans in session.ask(question, stream=True):
        print(ans.content[len(cont):], end='', flush=True)
        cont = ans.content
 ```
 ---
 ## List sessions
 ```python
 ### Parameters
 #### page
 #### page: `int`
 Specifies the page on which the sessions will be displayed. Defaults to `1`.
 #### page_size
 #### page_size: `int`
 The number of sessions on each page. Defaults to `1024`.
 #### orderby
 #### orderby: `str`
 The field by which sessions should be sorted. Available options:
 - `"create_time"` (default)
 - `"update_time"`
 #### desc
 #### desc: `bool`
 Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `True`.
 #### id
 #### id: `str`
 The ID of the chat session to retrieve. Defaults to `None`.
 #### name
 #### name: `str`
 The name of the chat to retrieve. Defaults to `None`.
 The name of the chat session to retrieve. Defaults to `None`.
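
The paging and sorting parameters above can be illustrated with a small client-side sketch. It does not call RAGFlow; `select_page` and the sample records are hypothetical, and the sketch assumes that `page` is 1-based and that sorting is applied before paging.

```python
from typing import Any


def select_page(records: list[dict[str, Any]],
                page: int = 1,
                page_size: int = 1024,
                orderby: str = "create_time",
                desc: bool = True) -> list[dict[str, Any]]:
    """Sort records by `orderby`, then return the requested 1-based page."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # Assumption: sorting happens first, paging second.
    ordered = sorted(records, key=lambda record: record[orderby], reverse=desc)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]


# Hypothetical session records; only the sort keys and a name are modeled.
sessions = [
    {"name": "s1", "create_time": 100, "update_time": 400},
    {"name": "s2", "create_time": 200, "update_time": 300},
    {"name": "s3", "create_time": 300, "update_time": 500},
]

# Newest two sessions by creation time.
print([s["name"] for s in select_page(sessions, page=1, page_size=2)])
```

Under these assumptions, the defaults (`page=1`, `page_size=1024`, `desc=True`) return up to 1024 of the most recently created items.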
 ### Returns
 ```python
 from ragflow import RAGFlow
 rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 assistant = rag.list_chats(name="Miss R")
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 assistant = rag_object.list_chats(name="Miss R")
 assistant = assistant[0]
 assistant.delete_sessions(ids=["id_1","id_2"])
 ```
 ---
 ## Chat
 ```python
 Session.ask(question: str, stream: bool = False) -> Optional[Message, iter[Message]]
 ```
 Asks a question to start a conversation.
 ### Parameters
 #### question: `str` *Required*
 The question to start an AI chat.
 #### stream: `bool`
 Indicates whether to output responses in a streaming way:
 - `True`: Enable streaming.
 - `False`: (Default) Disable streaming.
 ### Returns
 - A `Message` object containing the response to the question if `stream` is set to `False`.
 - An iterator containing multiple `Message` objects (`iter[Message]`) if `stream` is set to `True`.
 The following shows the attributes of a `Message` object:
 #### id: `str`
 The auto-generated message ID.
 #### content: `str`
 The content of the message. Defaults to `"Hi! I am your assistant, can I help you?"`.
 #### reference: `list[Chunk]`
 A list of `Chunk` objects representing references to the message, each containing the following attributes:
 - `id` `str`  
  The chunk ID.
 - `content` `str`  
  The content of the chunk.
 - `image_id` `str`  
  The ID of the snapshot of the chunk.
 - `document_id` `str`  
  The ID of the referenced document.
 - `document_name` `str`  
  The name of the referenced document.
 - `position` `list[str]`  
  The location information of the chunk within the referenced document.
 - `knowledgebase_id` `str`  
  The ID of the dataset to which the referenced document belongs.
 - `similarity` `float`
  A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity.
 - `vector_similarity` `float`  
  A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
 - `term_similarity` `float`  
  A keyword similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between keywords.
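
To make the relationship between the three scores concrete, here is a minimal sketch of one way a composite score could be formed. This section does not state the formula, so the weighted sum below is an assumption: the vector score is weighted by `vector_similarity_weight` (default `0.3`, as in the `retrieve` signature) and the keyword score by the remainder. `composite_similarity` is a hypothetical helper, not part of the SDK.

```python
def composite_similarity(vector_similarity: float,
                         term_similarity: float,
                         vector_similarity_weight: float = 0.3) -> float:
    """Blend vector and keyword scores with a weighted sum (assumed formula)."""
    inputs = {
        "vector_similarity": vector_similarity,
        "term_similarity": term_similarity,
        "vector_similarity_weight": vector_similarity_weight,
    }
    for name, value in inputs.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    # Assumption: the keyword score receives the remaining weight.
    return (vector_similarity_weight * vector_similarity
            + (1.0 - vector_similarity_weight) * term_similarity)


score = composite_similarity(vector_similarity=0.8, term_similarity=0.5)
print(round(score, 2))
```

With both inputs in `[0, 1]`, the result also stays in `[0, 1]`, which matches the documented range of `similarity`.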
 ### Examples
 ```python
 from ragflow import RAGFlow
 rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
 assistant = rag_object.list_chats(name="Miss R")
 assistant = assistant[0]
 session = assistant.create_session()    
 print("\n==================== Miss R =====================\n")
 print(assistant.get_prologue())
 while True:
    question = input("\n==================== User =====================\n> ")
    print("\n==================== Miss R =====================\n")
    cont = ""
    for ans in session.ask(question, stream=True):
        print(ans.content[len(cont):], end='', flush=True)
        cont = ans.content
 ```
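
The streaming example prints only the newly arrived text by slicing off what has already been shown, which implies that each streamed message carries the full answer so far. The sketch below isolates that pattern without a RAGFlow server. `FakeMessage`, `fake_stream`, and `iter_deltas` are hypothetical stand-ins, and the cumulative-content behavior is an assumption inferred from the example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a streamed message; only `content` is modeled."""
    content: str


def fake_stream(text: str, step: int = 4) -> Iterator[FakeMessage]:
    """Yield messages whose content grows cumulatively, `step` characters at a time."""
    for end in range(step, len(text) + step, step):
        yield FakeMessage(content=text[:end])


def iter_deltas(messages: Iterable[FakeMessage]) -> Iterator[str]:
    """Yield only the text that is new since the previous message."""
    seen = ""
    for message in messages:
        yield message.content[len(seen):]
        seen = message.content


if __name__ == "__main__":
    for delta in iter_deltas(fake_stream("RAGFlow is a RAG engine.")):
        print(delta, end="", flush=True)
    print()
```

Keeping the slicing in a helper such as `iter_deltas` makes it reusable when the output goes somewhere other than `print`, for example to a UI callback.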