Chunking balances retrieval recall against context size.
Chunking splits documents into smaller segments using strategies such as fixed-size windows, semantic boundaries, or structural elements. These chunks become the basic units for embedding and retrieval, and they determine how information is packaged for the LLM.
How you chunk documents has a significant effect on retrieval quality and response accuracy. Good chunking preserves context, reduces noise, keeps related information together so that it is retrieved as a unit, and produces chunks that fit within the LLM's context window.
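As a rough illustration of the fixed-size strategy mentioned above, the following is a minimal sketch, not a reference implementation. It assumes sizes are measured in characters (a real pipeline would more likely count tokens), and the function name `chunk_fixed` and its parameters `chunk_size` and `overlap` are hypothetical names chosen for this example.

```python
def chunk_fixed(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Assumption: sizes are in characters, not tokens. Consecutive
    windows share `overlap` characters so that a sentence cut at a
    boundary still appears whole in at least one neighbouring chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made only of overlap is produced.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_fixed("abcdefghij", chunk_size=4, overlap=1):
        print(repr(piece))
```

A larger `chunk_size` tends to keep more context in each chunk but consumes more of the context window per retrieved result; a larger `overlap` reduces the chance of splitting a key statement at the cost of storing duplicated text.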
Configure how the document is split into chunks
Splits text at paragraph boundaries, preserving the natural structure of the document.
When enabled, paragraphs are kept intact and combined until they reach the chunk size limit. When disabled, large paragraphs are split into smaller chunks using the overlap setting.
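The behaviour described above can be sketched as follows. This is an illustrative approximation under stated assumptions, not the demo's actual code: paragraphs are assumed to be separated by blank lines, sizes are counted in characters, and the names `chunk_by_paragraph`, `keep_intact`, `chunk_size`, and `overlap` are hypothetical. The sketch also assumes that short paragraphs are combined in both modes, which the description does not state explicitly.

```python
import re


def _split_with_overlap(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Cut one long paragraph into overlapping character windows."""
    step = chunk_size - overlap
    pieces: list[str] = []
    for start in range(0, len(text), step):
        pieces.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return pieces


def chunk_by_paragraph(text: str, chunk_size: int = 500, overlap: int = 50,
                       keep_intact: bool = True) -> list[str]:
    """Split at blank lines, then pack paragraphs up to chunk_size.

    keep_intact=True:  a paragraph is never cut; an oversized paragraph
                       becomes its own (oversized) chunk.
    keep_intact=False: an oversized paragraph is cut into overlapping windows.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    # Assumption: one or more blank lines mark a paragraph boundary.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if not keep_intact and len(para) > chunk_size:
            if current:  # flush what has been packed so far
                chunks.append(current)
                current = ""
            chunks.extend(_split_with_overlap(para, chunk_size, overlap))
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= chunk_size or not current:
            current = candidate  # still fits, or first paragraph of a chunk
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "aaaa\n\nbbbb\n\n" + "c" * 12
    print(chunk_by_paragraph(sample, chunk_size=10, overlap=2, keep_intact=True))
    print(chunk_by_paragraph(sample, chunk_size=10, overlap=2, keep_intact=False))
```

With the setting enabled, the long third paragraph in the sample stays whole even though it exceeds the limit; with it disabled, that paragraph is cut into two overlapping pieces while the two short paragraphs are still packed together.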