@@ -154,23 +154,23 @@ The chunking method of the dataset to create. Available options:
The parser configuration of the dataset. A `ParserConfig` object's attributes vary based on the selected `chunk_method`: |
- `chunk_method`=`"naive"`: |
-`{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"user_raptor":False}}`.
+`{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False}}`.
- `chunk_method`=`"qa"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"manual"`:
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"table"`: |
`None`
- `chunk_method`=`"paper"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"book"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"laws"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"picture"`: |
`None`
- `chunk_method`=`"presentation"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"one"`: |
`None`
- `chunk_method`=`"knowledge-graph"`: |
@@ -403,21 +403,21 @@ A dictionary representing the attributes to update, with the following keys:
- `"email"`: Email |
- `"parser_config"`: `dict[str, Any]` The parsing configuration for the document. Its attributes vary based on the selected `"chunk_method"`: |
- `"chunk_method"`=`"naive"`: |
-`{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"user_raptor":False}}`.
+`{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False}}`.
- `chunk_method`=`"qa"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"manual"`:
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"table"`: |
`None`
- `chunk_method`=`"paper"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"book"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"laws"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"presentation"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"picture"`: |
`None`
- `chunk_method`=`"one"`: |
@@ -543,21 +543,21 @@ A `Document` object contains the following attributes:
- `status`: `str` Reserved for future use. |
- `parser_config`: `ParserConfig` Configuration object for the parser. Its attributes vary based on the selected `chunk_method`: |
- `chunk_method`=`"naive"`: |
-`{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"user_raptor":False}}`.
+`{"chunk_token_num":128,"delimiter":"\\n","html4excel":False,"layout_recognize":True,"raptor":{"use_raptor":False}}`.
- `chunk_method`=`"qa"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"manual"`:
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"table"`: |
`None`
- `chunk_method`=`"paper"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"book"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"laws"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"presentation"`: |
-`{"raptor": {"user_raptor": False}}`
+`{"raptor": {"use_raptor": False}}`
- `chunk_method`=`"picture"`:
`None`
- `chunk_method`=`"one"`: |
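
The per-method defaults in these hunks can be read as a lookup table. Below is a minimal sketch of that table, using the `use_raptor` key. The names `DEFAULT_PARSER_CONFIGS` and `default_parser_config` are hypothetical helpers for illustration and are not part of the SDK. The `"manual"` spelling is an assumption, and `"knowledge-graph"` is left out because its default is not visible in the hunks.

```python
from copy import deepcopy
from typing import Any, Optional

# Hypothetical lookup table built only from the defaults visible in the hunks.
# "manual" is an assumed spelling; "knowledge-graph" is omitted because its
# default value is not shown.
DEFAULT_PARSER_CONFIGS: dict[str, Optional[dict[str, Any]]] = {
    "naive": {
        "chunk_token_num": 128,
        "delimiter": "\\n",  # copied as written in the documented default
        "html4excel": False,
        "layout_recognize": True,
        "raptor": {"use_raptor": False},
    },
    "qa": {"raptor": {"use_raptor": False}},
    "manual": {"raptor": {"use_raptor": False}},
    "table": None,
    "paper": {"raptor": {"use_raptor": False}},
    "book": {"raptor": {"use_raptor": False}},
    "laws": {"raptor": {"use_raptor": False}},
    "picture": None,
    "presentation": {"raptor": {"use_raptor": False}},
    "one": None,
}


def default_parser_config(chunk_method: str) -> Optional[dict[str, Any]]:
    """Return a fresh copy of the documented default for chunk_method."""
    if chunk_method not in DEFAULT_PARSER_CONFIGS:
        raise KeyError(f"no documented default for chunk_method={chunk_method!r}")
    # Deep copy so callers can modify the result without touching the table.
    return deepcopy(DEFAULT_PARSER_CONFIGS[chunk_method])
```

Returning a deep copy keeps the table itself unchanged if a caller edits the nested `raptor` dictionary.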