---
sidebar_position: 2
slug: /accelerate_question_answering
---

# Accelerate answering

import APITable from '@site/src/components/APITable';

A checklist to speed up question answering.

---

Some settings can add significant time to question answering. If you often find that answers are slow to arrive, here is a checklist to consider:

- In the **Prompt Engine** tab of your **Chat Configuration** dialogue, disabling **Multi-turn optimization** will reduce the time required to get an answer from the LLM.
- In the **Prompt Engine** tab of your **Chat Configuration** dialogue, leaving the **Rerank model** field empty will significantly decrease retrieval time.
- When using a rerank model, ensure you have a GPU for acceleration; otherwise, the reranking process will be *prohibitively* slow.

  :::tip NOTE
  Rerank models are essential in certain scenarios. There is always a trade-off between speed and answer quality; you must weigh the pros against the cons for your specific case.
  :::

- In the **Assistant Setting** tab of your **Chat Configuration** dialogue, disabling **Keyword analysis** will reduce the time to receive an answer from the LLM.
- When chatting with your chat assistant, click the light bulb icon above the *current* dialogue and scroll down the popup window to view the time taken for each task:

  ![enlighten](https://github.com/user-attachments/assets/fedfa2ee-21a7-451b-be66-20125619923c)

```mdx-code-block
<APITable>
```

| Item name         | Description                                                                                   |
| ----------------- | --------------------------------------------------------------------------------------------- |
| Total             | Total time spent on this conversation round, including chunk retrieval and answer generation. |
| Check LLM         | Time to validate the specified LLM.                                                           |
| Create retriever  | Time to create a chunk retriever.                                                             |
| Bind embedding    | Time to initialize an embedding model instance.                                               |
| Bind LLM          | Time to initialize an LLM instance.                                                           |
| Tune question     | Time to optimize the user query using the context of the multi-turn conversation.             |
| Bind reranker     | Time to initialize a reranker model instance for chunk retrieval.                             |
| Generate keywords | Time to extract keywords from the user query.                                                 |
| Retrieval         | Time to retrieve the chunks.                                                                  |
| Generate answer   | Time to generate the answer.                                                                  |

```mdx-code-block
</APITable>
```
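
Once you have the per-task times from the popup, it can help to rank them and match the slowest ones against the checklist. The following is a minimal Python sketch of that idea, not part of the product: the `SUGGESTIONS` mapping and the `slowest_steps` helper are hypothetical, the pairing of item names with checklist entries is one reading of this page, and the numbers in the example are made up and entered by hand.

```python
# Hypothetical helper: rank the per-task times shown in the popup and attach
# the matching checklist hint. Nothing here calls the product itself; the
# timings are typed in by hand from the popup window.

# Assumed pairing of table items with the checklist entries on this page.
SUGGESTIONS = {
    "Tune question": "Disable Multi-turn optimization in the Prompt Engine tab.",
    "Bind reranker": "Leave the Rerank model field empty, or run the reranker on a GPU.",
    "Retrieval": "Leave the Rerank model field empty, or run the reranker on a GPU.",
    "Generate keywords": "Disable Keyword analysis in the Assistant Setting tab.",
}


def slowest_steps(timings, top_n=3):
    """Return the top_n slowest tasks as (name, elapsed, share_of_total, hint)."""
    # "Total" covers the whole round, so it is excluded from the ranking.
    steps = {name: value for name, value in timings.items() if name != "Total"}
    # Fall back to the sum of the tasks if "Total" was not recorded.
    total = timings.get("Total") or sum(steps.values())
    ranked = sorted(steps.items(), key=lambda item: item[1], reverse=True)
    return [
        (name, value, (value / total) if total else 0.0, SUGGESTIONS.get(name))
        for name, value in ranked[:top_n]
    ]


if __name__ == "__main__":
    # Made-up values, in whatever unit the popup displays.
    example = {
        "Total": 10.0,
        "Check LLM": 0.1,
        "Tune question": 2.0,
        "Bind reranker": 0.5,
        "Generate keywords": 1.5,
        "Retrieval": 3.0,
        "Generate answer": 2.9,
    }
    for name, value, share, hint in slowest_steps(example):
        line = f"{name}: {value:.1f} ({share:.0%} of total)"
        print(line + (f" -> {hint}" if hint else ""))
```

Tasks without an entry in `SUGGESTIONS`, such as **Generate answer**, are still ranked but carry no hint, since the checklist above does not name a setting for them.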