
DRAFT: Miscellaneous updates to HTTP API. Tried to finish off Python API reference but failed. (#2909)

### What problem does this PR solve?



### Type of change


- [x] Documentation Update
tags/v0.13.0
writinwaters 1 year ago
commit 5aec1e3e17
2 changed files with 546 additions and 398 deletions
  1. api/http_api.md (+328, -211)
  2. api/python_api_reference.md (+218, -187)

api/http_api.md (+328, -211)
File diff suppressed because it is too large.


api/python_api_reference.md (+218, -187)



#### keywords: `str`


The keywords to match document titles. Defaults to `None`.
The keywords used to match document titles. Defaults to `None`.


#### offset: `int`


- `created_by`: `str` The creator of the document. Defaults to `""`.
- `size`: `int` The document size in bytes. Defaults to `0`.
- `token_count`: `int` The number of tokens in the document. Defaults to `0`.
- `chunk_count`: `int` The number of chunks that the document is split into. Defaults to `0`.
- `chunk_count`: `int` The number of chunks in the document. Defaults to `0`.
- `progress`: `float` The current processing progress as a percentage. Defaults to `0.0`.
- `progress_msg`: `str` A message indicating the current progress status. Defaults to `""`.
- `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`.
```python
from ragflow import RAGFlow


rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag.create_dataset(name="kb_1")
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.create_dataset(name="kb_1")


filename1 = "~/ragflow.txt"
blob = open(filename1 , "rb").read()
```

```python
DataSet.async_parse_documents(document_ids:list[str]) -> None
```


Parses documents in the current dataset.

### Parameters


#### document_ids: `list[str]`, *Required*


### Returns


- Success: No value is returned.
- Success: No value is returned.????????????????????
- Failure: `Exception`


### Examples

```python
DataSet.async_cancel_parse_documents(document_ids:list[str])-> None
```


Stops parsing specified documents.

### Parameters


#### document_ids: `list[str]`, *Required*


---


## List chunks
## Add chunk


```python
Document.list_chunks(keywords: str = None, offset: int = 0, limit: int = -1, id : str = None) -> list[Chunk]
Document.add_chunk(content:str) -> Chunk ?????????????????????
```


Retrieves a list of document chunks.
Adds a chunk to the current document.


### Parameters


#### keywords: `str`
List chunks whose name has the given keywords. Defaults to `None`

#### offset: `int`

The starting index for the chunks to retrieve. Defaults to `1`
#### content: `str`, *Required*


#### limit
The text content of the chunk.


The maximum number of chunks to retrieve. Default: `30`
#### important_keywords: `list[str]` ??????????????????????


#### id

The ID of the chunk to retrieve. Default: `None`
The key terms or phrases to tag with the chunk.


### Returns


- Success: A list of `Chunk` objects.
- Success: A `Chunk` object.
- Failure: `Exception`.


A `Chunk` object contains the following attributes:

- `id`: `str`
- `content`: `str` Content of the chunk.
- `important_keywords`: `list[str]` A list of key terms or phrases to tag with the chunk.
- `create_time`: `str` The time when the chunk was created (added to the document).
- `create_timestamp`: `float` The timestamp representing the creation time of the chunk, expressed in seconds since January 1, 1970.
- `knowledgebase_id`: `str` The ID of the associated dataset.
- `document_name`: `str` The name of the associated document.
- `document_id`: `str` The ID of the associated document.
- `available`: `int`???? The chunk's availability status in the dataset. Value options:
  - `0`: Unavailable
  - `1`: Available
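
Because `create_timestamp` is expressed in seconds since January 1, 1970, it can be turned into a timezone-aware `datetime` with the standard library alone. The snippet below is a minimal sketch: the sample value is made up for illustration and does not come from a real chunk.

```python
from datetime import datetime, timezone

# Hypothetical value standing in for `chunk.create_timestamp`
# (seconds since January 1, 1970).
create_timestamp = 1730000000.0

# Interpret the value as UTC so the result does not depend on the local time zone.
created_at = datetime.fromtimestamp(create_timestamp, tz=timezone.utc)
print(created_at.isoformat())
```

Using an explicit `tz` argument keeps the result unambiguous when the script runs on machines in different time zones.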


### Examples


```python
from ragflow import RAGFlow


rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets("123")
dataset = dataset[0]
dataset.async_parse_documents(["wdfxb5t547d"])
for chunk in doc.list_chunks(keywords="rag", offset=0, limit=12):
    print(chunk)
dataset = rag_object.list_datasets(id="123")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx")
```


## Add chunk
---

## List chunks


```python
Document.add_chunk(content:str) -> Chunk
Document.list_chunks(keywords: str = None, offset: int = 0, limit: int = -1, id : str = None) -> list[Chunk]
```


Retrieves a list of chunks from the current document.

### Parameters


#### content: *Required*
#### keywords: `str`
The keywords used to match chunk content. Defaults to `None`


The text content of the chunk.
#### offset: `int`

The starting index for the chunks to retrieve. Defaults to `1`??????

#### limit


#### important_keywords :`list[str]`
The maximum number of chunks to retrieve. Default: `30`?????????


List the key terms or phrases that are significant or central to the chunk's content.
#### id

The ID of the chunk to retrieve. Default: `None`


### Returns


chunk
- Success: A list of `Chunk` objects.
- Failure: `Exception`.


### Examples


```python
from ragflow import RAGFlow


rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag.list_datasets(id="123")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx")
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets("123")
dataset = dataset[0]
dataset.async_parse_documents(["wdfxb5t547d"])
for chunk in doc.list_chunks(keywords="rag", offset=0, limit=12):
    print(chunk)
```


---


## Delete chunk
## Delete chunks


```python
Document.delete_chunks(chunk_ids: list[str])
```

```python
from ragflow import RAGFlow


rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
ds = rag.list_datasets(id="123")
ds = ds[0]
doc = ds.list_documents(id="wdfxb5t547d")
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets(id="123")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx")
doc.delete_chunks(["id_1","id_2"])
```
A dictionary representing the attributes to update, with the following keys:


- `"content"`: `str` Content of the chunk.
- `"important_keywords"`: `list[str]` A list of key terms to attach to the chunk.
- `"important_keywords"`: `list[str]` A list of key terms or phrases to tag with the chunk.
- `"available"`: `int` The chunk's availability status in the dataset. Value options:
  - `0`: Unavailable
  - `1`: Available

```python
RAGFlow.retrieve(question:str="", datasets:list[str]=None, documents:list[str]=None, offset:int=1, limit:int=30, similarity_threshold:float=0.2, vector_similarity_weight:float=0.3, top_k:int=1024, rerank_id:str=None, keyword:bool=False, highlight:bool=False) -> list[Chunk]
```


???????

### Parameters


#### question: `str` *Required*


The user query or query keywords. Defaults to `""`.


#### datasets: `list[str]`, *Required*
#### datasets: `list[str]`, *Required*?????


The datasets to search from.




#### limit: `int`


The maximum number of chunks to retrieve. Defaults to `6`.
The maximum number of chunks to retrieve. Defaults to `6`.???????????????


#### similarity_threshold: `float`




The number of chunks engaged in vector cosine computation. Defaults to `1024`.


#### rerank_id
#### rerank_id: `str`


The ID of the rerank model. Defaults to `None`.


#### keyword
#### keyword: `bool`


Indicates whether keyword-based matching is enabled:


- `True`: Enabled.
- `False`: Disabled.
- `False`: Disabled (default).


#### highlight:`bool`
#### highlight: `bool`


Specifies whether to enable highlighting of matched terms in the results (`True`) or not (`False`).


from ragflow import RAGFlow


rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
ds = rag_object.list_datasets(name="ragflow")
ds = ds[0]
dataset = rag_object.list_datasets(name="ragflow")
dataset = dataset[0]
name = 'ragflow_test.txt'
path = './test_data/ragflow_test.txt'
rag_object.create_document(ds, name=name, blob=open(path, "rb").read())
doc = ds.list_documents(name=name)
rag_object.create_document(dataset, name=name, blob=open(path, "rb").read())
doc = dataset.list_documents(name=name)
doc = doc[0]
ds.async_parse_documents([doc.id])
dataset.async_parse_documents([doc.id])
for c in rag_object.retrieve(question="What's ragflow?",
             datasets=[ds.id], documents=[doc.id],
             datasets=[dataset.id], documents=[doc.id],
             offset=1, limit=30, similarity_threshold=0.2,
             vector_similarity_weight=0.3,
             top_k=1024


The following shows the attributes of a `Chat` object:


#### name: *Required*
#### name: `str`, *Required*????????


The name of the chat assistant. Defaults to `"assistant"`.


#### avatar
#### avatar: `str`


Base64 encoding of the avatar. Defaults to `""`.




The IDs of the associated datasets. Defaults to `[""]`.


#### llm
#### llm: `Chat.LLM`


The LLM settings of the created chat assistant. Defaults to `None`. When the value is `None`, a dictionary with the following values will be generated as the default.


- `max_token`, `int`
  This sets the maximum length of the model’s output, measured in the number of tokens (words or pieces of words). Defaults to `512`.


#### Prompt
#### prompt: `Chat.Prompt`


Instructions for the LLM to follow. A `Prompt` object contains the following attributes:


```python
from ragflow import RAGFlow


rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
kbs = rag.list_datasets(name="kb_1")
list_kb=[]
for kb in kbs:
    list_kb.append(kb.id)
assi = rag.create_chat("Miss R", knowledgebases=list_kb)
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
datasets = rag_object.list_datasets(name="kb_1")
dataset_ids = []
for dataset in datasets:
    dataset_ids.append(dataset.id)
assistant = rag_object.create_chat("Miss R", knowledgebases=dataset_ids)
```


---


## Update chat
## Update chat assistant


```python
Chat.update(update_message: dict)
```

```python
from ragflow import RAGFlow


rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
knowledge_base = rag.list_datasets(name="kb_1")
assistant = rag.create_chat("Miss R", knowledgebases=knowledge_base)
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
datasets = rag_object.list_datasets(name="kb_1")
assistant = rag_object.create_chat("Miss R", knowledgebases=datasets)
assistant.update({"name": "Stefan", "llm": {"temperature": 0.8}, "prompt": {"top_n": 8}})
```


---


## Delete chats
## Delete chat assistants


```python
RAGFlow.delete_chats(ids: list[str] = None)
```

```python
from ragflow import RAGFlow


rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
rag.delete_chats(ids=["id_1","id_2"])
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
rag_object.delete_chats(ids=["id_1","id_2"])
```


---


## List chats
## List chat assistants


```python
RAGFlow.list_chats(
```


### Parameters


#### page
#### page: `int`


Specifies the page on which the chat assistants will be displayed. Defaults to `1`.


#### page_size
#### page_size: `int`


The number of chat assistants on each page. Defaults to `1024`.


#### order_by
#### orderby: `str`


The attribute by which the results are sorted. Defaults to `"create_time"`.
The attribute by which the results are sorted. Available options:


#### desc
- `"create_time"` (default)
- `"update_time"`

#### desc: `bool`


Indicates whether the retrieved chat assistants should be sorted in descending order. Defaults to `True`.


#### id: `string`
#### id: `str`


The ID of the chat to retrieve. Defaults to `None`.
The ID of the chat assistant to retrieve. Defaults to `None`.


#### name: `string`
#### name: `str`


The name of the chat to retrieve. Defaults to `None`.
The name of the chat assistant to retrieve. Defaults to `None`.
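
How `page`, `page_size`, `orderby`, and `desc` interact can be illustrated without a running server. The sketch below is not RAGFlow's implementation: `paginate` is a hypothetical helper operating on plain dictionaries, and it assumes that pages are numbered from `1` and that sorting happens before the page is sliced out.

```python
def paginate(items, page=1, page_size=1024, orderby="create_time", desc=True):
    """Hypothetical stand-in that mimics the listing parameters described above."""
    # Sort first, then slice out the requested page (pages start at 1).
    ordered = sorted(items, key=lambda item: item[orderby], reverse=desc)
    start = (page - 1) * page_size
    return ordered[start:start + page_size]


# Made-up records; real chat assistants carry more attributes.
assistants = [
    {"name": "a", "create_time": 1, "update_time": 9},
    {"name": "b", "create_time": 2, "update_time": 8},
    {"name": "c", "create_time": 3, "update_time": 7},
]

first_page = paginate(assistants, page=1, page_size=2)
second_page = paginate(assistants, page=2, page_size=2)
print([a["name"] for a in first_page])
print([a["name"] for a in second_page])
```

With the defaults (`orderby="create_time"`, `desc=True`), the most recently created records land on the first page.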


### Returns


---


:::tip API GROUPING
Chat-session APIs
Chat Session APIs
:::


---


### Parameters


#### name
#### name: `str`


The name of the chat session to create.


```python
from ragflow import RAGFlow


rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag.list_chats(name="Miss R")
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
session = assistant.create_session()
```


---

## Update session


```python
from ragflow import RAGFlow


rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag.list_chats(name="Miss R")
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
session = assistant.create_session("session_name")
session.update({"name": "updated_name"})
```


---


## Chat

```python
Session.ask(question: str, stream: bool = False) -> Optional[Message, iter[Message]]
```

Asks a question to start a conversation.

### Parameters

#### question *Required*

The question to start an AI chat. Defaults to `None`.

#### stream

Indicates whether to output responses in a streaming way:

- `True`: Enable streaming.
- `False`: (Default) Disable streaming.

### Returns

- A `Message` object containing the response to the question if `stream` is set to `False`.
- An iterator containing multiple `Message` objects (`iter[Message]`) if `stream` is set to `True`.

The following shows the attributes of a `Message` object:

#### id: `str`

The auto-generated message ID.

#### content: `str`

The content of the message. Defaults to `"Hi! I am your assistant, can I help you?"`.

#### reference: `list[Chunk]`

A list of `Chunk` objects representing references to the message, each containing the following attributes:

- `id` `str`
The chunk ID.
- `content` `str`
The content of the chunk.
- `image_id` `str`
The ID of the snapshot of the chunk.
- `document_id` `str`
The ID of the referenced document.
- `document_name` `str`
The name of the referenced document.
- `position` `list[str]`
The location information of the chunk within the referenced document.
- `knowledgebase_id` `str`
The ID of the dataset to which the referenced document belongs.
- `similarity` `float`
A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity.
- `vector_similarity` `float`
A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
- `term_similarity` `float`
A keyword similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between keywords.


### Examples

```python
from ragflow import RAGFlow

rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag.list_chats(name="Miss R")
assistant = assistant[0]
session = assistant.create_session()

print("\n==================== Miss R =====================\n")
print(assistant.get_prologue())

while True:
    question = input("\n==================== User =====================\n> ")
    print("\n==================== Miss R =====================\n")
    cont = ""
    for ans in session.ask(question, stream=True):
        print(ans.content[len(cont):], end='', flush=True)
        cont = ans.content
```

---

## List sessions


### Parameters


#### page
#### page: `int`


Specifies the page on which the sessions will be displayed. Defaults to `1`.


#### page_size
#### page_size: `int`


The number of sessions on each page. Defaults to `1024`.


#### orderby
#### orderby: `str`


The field by which sessions should be sorted. Available options:


- `"create_time"` (default)
- `"update_time"`


#### desc
#### desc: `bool`


Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `True`.


#### id
#### id: `str`


The ID of the chat session to retrieve. Defaults to `None`.


#### name
#### name: `str`


The name of the chat to retrieve. Defaults to `None`.
The name of the chat session to retrieve. Defaults to `None`.


### Returns


```python
from ragflow import RAGFlow


rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag.list_chats(name="Miss R")
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
assistant.delete_sessions(ids=["id_1","id_2"])
```

---

## Chat

```python
Session.ask(question: str, stream: bool = False) -> Optional[Message, iter[Message]]
```

Asks a question to start a conversation.

### Parameters

#### question: `str` *Required*

The question to start an AI chat.

#### stream: `bool`

Indicates whether to output responses in a streaming way:

- `True`: Enable streaming.
- `False`: (Default) Disable streaming.

### Returns

- A `Message` object containing the response to the question if `stream` is set to `False`.
- An iterator containing multiple `Message` objects (`iter[Message]`) if `stream` is set to `True`.

The following shows the attributes of a `Message` object:

#### id: `str`

The auto-generated message ID.

#### content: `str`

The content of the message. Defaults to `"Hi! I am your assistant, can I help you?"`.

#### reference: `list[Chunk]`

A list of `Chunk` objects representing references to the message, each containing the following attributes:

- `id` `str`
The chunk ID.
- `content` `str`
The content of the chunk.
- `image_id` `str`
The ID of the snapshot of the chunk.
- `document_id` `str`
The ID of the referenced document.
- `document_name` `str`
The name of the referenced document.
- `position` `list[str]`
The location information of the chunk within the referenced document.
- `knowledgebase_id` `str`
The ID of the dataset to which the referenced document belongs.
- `similarity` `float`
A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity.
- `vector_similarity` `float`
A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
- `term_similarity` `float`
A keyword similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between keywords.
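
Since each reference carries similarity scores between `0` and `1`, a caller may want to keep only the strongest references and show them best first. The sketch below uses a hypothetical `Ref` dataclass in place of the SDK's `Chunk` objects, and the `0.5` cutoff is an arbitrary illustration rather than a recommended value.

```python
from dataclasses import dataclass


@dataclass
class Ref:
    """Hypothetical stand-in exposing a few of the attributes listed above."""
    id: str
    document_name: str
    similarity: float


def strongest(refs, min_similarity=0.5):
    # Keep references at or above the cutoff, highest score first.
    kept = [r for r in refs if r.similarity >= min_similarity]
    return sorted(kept, key=lambda r: r.similarity, reverse=True)


# Made-up references for illustration only.
refs = [
    Ref("c1", "intro.txt", 0.42),
    Ref("c2", "guide.txt", 0.91),
    Ref("c3", "faq.txt", 0.67),
]
for ref in strongest(refs):
    print(ref.id, ref.document_name, ref.similarity)
```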


### Examples

```python
from ragflow import RAGFlow

rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
session = assistant.create_session()

print("\n==================== Miss R =====================\n")
print(assistant.get_prologue())

while True:
    question = input("\n==================== User =====================\n> ")
    print("\n==================== Miss R =====================\n")
    cont = ""
    for ans in session.ask(question, stream=True):
        print(ans.content[len(cont):], end='', flush=True)
        cont = ans.content
```
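
The loop above prints only the newly arrived part of each streamed answer, which presumes that every streamed `Message` carries the full content accumulated so far. That behaviour is inferred from the example rather than stated in this reference. Under that assumption, the slicing logic can be exercised offline with a hypothetical `fake_stream` generator standing in for `session.ask(question, stream=True)`:

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a streamed `Message`."""
    content: str


def fake_stream():
    # Each message repeats everything received so far (assumed behaviour).
    for text in ("Hello", "Hello, I am", "Hello, I am Miss R."):
        yield FakeMessage(content=text)


def deltas(stream):
    """Yield only the text added since the previous message."""
    seen = ""
    for message in stream:
        yield message.content[len(seen):]
        seen = message.content


pieces = list(deltas(fake_stream()))
print("".join(pieces))
```

If the server instead sent only the new fragment in each message, the slicing step would have to be dropped; checking a real stream before relying on either behaviour is advisable.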
