Step 1. Create and Upload Content
The knowledge base feature offers a simple way to store and manage external data. It lets agents work with specific datasets, which improves the accuracy and usefulness of model responses.
When data is uploaded, iSiri automatically segments the documents into content fragments. At query time, it retrieves the most relevant fragments using the configured recall method, and the LLM uses the recalled content to generate the final response. In this way, iSiri's knowledge base helps mitigate issues such as model hallucinations and insufficient domain-specific knowledge, which improves response accuracy.
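The segment, recall, and generate flow described above can be illustrated with a minimal sketch. It is not iSiri's implementation: the function names are hypothetical, and a simple word-overlap cosine similarity stands in for whatever recall method the platform actually uses.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(query: str, fragments: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k fragments most similar to the query (hypothetical helper)."""
    q = Counter(tokenize(query))
    ranked = sorted(fragments, key=lambda f: cosine(q, Counter(tokenize(f))), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, recalled: list[str]) -> str:
    """Place the recalled fragments in front of the question for the LLM."""
    context = "\n".join(f"- {frag}" for frag in recalled)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


# Fragments as they might look after segmentation (illustrative data).
fragments = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Shipping to Europe takes 7 to 10 days.",
]
question = "How long do refunds take?"
top = recall(question, fragments, top_k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Because the model only sees the recalled fragments, the quality of segmentation and recall directly limits the quality of the final answer.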
The first step is to upload the content to the knowledge base. iSiri supports importing text and table data, and offers multiple ways to do so. Proper segmentation of uploaded content can improve the relevance of the recalled information, thereby increasing the response accuracy of large models.
Before uploading content, familiarize yourself with the use cases and import methods for each knowledge type. This makes the knowledge base easier to manage.
Text vs Table Comparison
| Aspect | Text Type | Table Type |
| --- | --- | --- |
| Use Case | Text-based knowledge bases allow retrieval and recall of content fragments for applications like Q&A. | Table-based knowledge bases support indexed column matching (row-wise), and can handle NL2SQL queries and calculations. |
| Import Methods | Local files (.txt, .pdf, .doc, .docx), or manual input. | Local tables (.csv, .xlsx), API integration, third-party (Feishu tables), or manual input. |
| Segmentation | Automatic, GraphRAG, or Customize segmentation. | Defaults to row-based segmentation; no further setup is needed. |
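To make row-based segmentation concrete, here is a minimal sketch that turns each row of a CSV table into one segment. The "column: value" serialization and the function name `rows_to_segments` are assumptions for illustration; the source does not describe how iSiri stores rows internally.

```python
import csv
import io


def rows_to_segments(csv_text: str) -> list[str]:
    """Turn each data row into one 'column: value' segment (assumed format)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return ["; ".join(f"{col}: {val}" for col, val in row.items()) for row in reader]


# Illustrative table data, not taken from the product.
sample = "product,price,stock\nKeyboard,49,120\nMouse,19,300\n"
segments = rows_to_segments(sample)
for segment in segments:
    print(segment)
```

Keeping the column name next to each value means a segment still makes sense when it is recalled on its own, without the header row.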
Upload Text Content
Follow these steps to upload text content:
1. Log in to the iSiri platform.
2. In the left navigation panel, choose "Prompt Studio" and click "Knowledge Base" at the bottom.
3. On the "Knowledge Base" page, click "+ Create" at the top right.
4. On the "Create Knowledge" page, fill in the following fields: Name; Tag (makes files easier to sort); Describe (the purpose or function of the content); Permissions (whether the content is public or private); and Embeddings (the model used to vectorize the content). An agent should only search across files that use the same embeddings model.
5. Select the appropriate "File Type" and click "Next step."
6. Choose the "Segment mode." The token count, estimated fee, and Segmented Preview are then displayed.
7. If the Segmented Preview looks as expected, click "Confirm" to finish the creation process.
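The note about using one embeddings model per agent matters because vectors produced by different models are not comparable with each other. The sketch below shows one way to check this before attaching knowledge entries to an agent. The `KnowledgeEntry` class, its field names (which mirror the form fields above), and the model names are all hypothetical; this is not an iSiri API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeEntry:
    """Hypothetical record mirroring the fields on the "Create Knowledge" page."""
    name: str
    tag: str
    describe: str
    permissions: str  # "public" or "private"
    embeddings: str   # name of the embeddings model


def check_same_embeddings(entries: list[KnowledgeEntry]) -> None:
    """Raise ValueError if the entries were vectorized with different models."""
    models = {entry.embeddings for entry in entries}
    if len(models) > 1:
        raise ValueError(f"Mixed embeddings models: {sorted(models)}")


# Placeholder entries and model names for illustration only.
faq = KnowledgeEntry("faq", "support", "Customer FAQ", "private", "model-a")
policy = KnowledgeEntry("policy", "support", "Refund policy", "private", "model-a")
specs = KnowledgeEntry("specs", "product", "Spec sheets", "public", "model-b")

check_same_embeddings([faq, policy])  # passes: both use model-a
```

Calling `check_same_embeddings([faq, specs])` would raise a `ValueError`, since the two entries use different models.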
Segmentation Comparison
| Segment mode | Automatic | GraphRAG | Customize |
| --- | --- | --- | --- |
| Description | iSiri sets the segmentation and preprocessing rules automatically. | The latest RAG segmentation method. It preserves as many of the relationships within the text as possible, at a high cost. | You set parameters such as the segment identifier, segmentation length, and preprocessing rules. |
| Use cases | No particular segmentation requirements. The easiest choice. | Content with complex relationships that must be preserved. The most efficient mode for information search, but also the most expensive. | The structure of the content file is unique and well understood. |
| Recall method (see Retrieval Test for details) | Mixed, Vector, and Full-text recall methods are supported. | Local and Global recall methods are supported. | Mixed, Vector, and Full-text recall methods are supported. |
| Rerank | Supported. | Default ordering only; re-ranking is not available. | Supported. |
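As a rough illustration of what the Customize parameters control, the sketch below splits text on a segment identifier, applies one preprocessing rule (collapsing whitespace), and caps each segment at a maximum length. The function name, the choice of preprocessing rule, and measuring length in characters rather than tokens are assumptions; iSiri's actual behavior may differ.

```python
import re


def custom_segment(text: str, identifier: str = "\n\n", max_length: int = 200) -> list[str]:
    """Split on the identifier, normalize whitespace, then cap segment length."""
    segments = []
    for part in text.split(identifier):
        # Preprocessing rule (assumed): collapse runs of whitespace into one space.
        cleaned = re.sub(r"\s+", " ", part).strip()
        if not cleaned:
            continue  # skip empty parts produced by repeated identifiers
        # Hard cut at max_length characters; a real splitter might cut at word boundaries.
        for start in range(0, len(cleaned), max_length):
            segments.append(cleaned[start:start + max_length])
    return segments


text = "Intro  paragraph.\n\n\n\nSecond   paragraph is longer."
preview = custom_segment(text, identifier="\n\n", max_length=20)
for segment in preview:
    print(repr(segment))
```

Printing the result plays the same role as the Segmented Preview: it lets you confirm that the identifier and length produce sensible fragments before committing to them.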