Namespace: Chunk
ragCore.Chunk
Type Aliases
ChunkFunc
Ƭ ChunkFunc: (page: Page, options?: Partial<ChunkOptions>) => Promise<ContentChunk[]>
Type declaration
▸ (page, options?): Promise<ContentChunk[]>
A ChunkFunc is a function that takes a page and returns it in chunks.
Parameters
Name | Type |
---|---|
page | Page |
options? | Partial<ChunkOptions> |
Returns
Promise<ContentChunk[]>
Defined in
mongodb-rag-core/build/chunk/chunkPage.d.ts:8
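As a rough illustration of the ChunkFunc shape, the sketch below implements a paragraph-based chunker. The Page, ContentChunk, and ChunkOptions types are simplified local stand-ins whose fields are assumptions rather than the library's definitions, chunkByParagraph is a hypothetical name, and token counts are approximated by whitespace-separated words.

```typescript
// Simplified stand-ins for the library types; field names are assumptions.
interface Page { url: string; sourceName: string; body: string }
interface ContentChunk {
  url: string;
  sourceName: string;
  text: string;
  tokenCount: number;
  chunkIndex: number;
}
// Only the option used in this sketch.
interface ChunkOptions { maxChunkSize: number }

type ChunkFunc = (
  page: Page,
  options?: Partial<ChunkOptions>
) => Promise<ContentChunk[]>;

// Rough token count: whitespace-separated words (a real tokenizer would differ).
const countTokens = (text: string): number =>
  text.split(/\s+/).filter((w) => w.length > 0).length;

// Hypothetical ChunkFunc: one chunk per paragraph, with paragraphs longer
// than maxChunkSize words split into several chunks.
const chunkByParagraph: ChunkFunc = async (page, options) => {
  const maxChunkSize = options?.maxChunkSize ?? 100;
  const texts: string[] = [];
  for (const paragraph of page.body.split(/\n\s*\n/)) {
    const words = paragraph.split(/\s+/).filter((w) => w.length > 0);
    for (let i = 0; i < words.length; i += maxChunkSize) {
      texts.push(words.slice(i, i + maxChunkSize).join(" "));
    }
  }
  return texts.map((text, chunkIndex) => ({
    url: page.url,
    sourceName: page.sourceName,
    text,
    tokenCount: countTokens(text),
    chunkIndex,
  }));
};
```

Because options is a Partial, callers can pass only the fields they want to override and the function falls back to its own defaults for the rest.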
ChunkMetadataGetter
Ƭ ChunkMetadataGetter<T>: (args: { chunk: Omit<ContentChunk, "tokenCount">; metadata?: T; page: Page; text: string }) => Promise<T>
Type parameters
Name | Type |
---|---|
T | extends Record<string, unknown> = Record<string, unknown> |
Type declaration
▸ (args): Promise<T>
Parameters
Name | Type | Description |
---|---|---|
args | Object | - |
args.chunk | Omit<ContentChunk, "tokenCount"> | - |
args.metadata? | T | Previous metadata, if any. Omitting this from the return value should not overwrite previous metadata. |
args.page | Page | - |
args.text | string | The text of the chunk without metadata. |
Returns
Promise<T>
Defined in
mongodb-rag-core/build/chunk/ChunkTransformer.d.ts:6
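One way to read the signature above is as a function that receives any previously computed metadata and returns an updated object. The sketch below assumes simplified local stand-ins for Page and ContentChunk (their fields are assumptions), and getWordCountMetadata is a hypothetical getter written for illustration, not part of the library.

```typescript
// Simplified stand-ins for the library types; field names are assumptions.
interface Page { url: string; title?: string; body: string }
interface ContentChunk { url: string; text: string; tokenCount: number }

type ChunkMetadataGetter<
  T extends Record<string, unknown> = Record<string, unknown>
> = (args: {
  chunk: Omit<ContentChunk, "tokenCount">;
  metadata?: T;
  page: Page;
  text: string;
}) => Promise<T>;

// Hypothetical getter: carries previous metadata forward and adds two fields.
const getWordCountMetadata: ChunkMetadataGetter = async ({
  metadata,
  page,
  text,
}) => ({
  ...metadata, // keep whatever an earlier getter produced
  pageTitle: page.title,
  wordCount: text.split(/\s+/).filter((w) => w.length > 0).length,
});
```

Spreading the incoming metadata first is a design choice for this sketch: it keeps earlier fields while letting this getter's fields take precedence on a name clash.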
ChunkOptions
Ƭ ChunkOptions: Object
Options for converting a Page into ContentChunk[].
Type declaration
Name | Type | Description |
---|---|---|
chunkOverlap | number | Number of tokens to overlap between chunks. If this is 0, chunks will not overlap. If this is greater than 0, chunks will overlap by this number of tokens. |
maxChunkSize | number | Maximum chunk size before the transform function is applied to it. If a Page has more tokens than this number, it is split into smaller chunks. |
minChunkSize? | number | Minimum chunk size before the transform function is applied to it. If a chunk has fewer tokens than this number, it is discarded before ingestion. You can use this as a vector search optimization to avoid including chunks with very few tokens and thus very little semantic meaning. Example: you might set this to 15 to avoid including chunks that are just a few characters or words. For instance, you likely would not want to ingest a chunk that is just the closing fence of a code block (three backticks), which occurs not infrequently when chunking with the Langchain RecursiveCharacterTextSplitter: Chunk 1 holds the opening fence and the semantically relevant code (for example, a py block containing foo = "bar" and more Python code), while Chunk 2 holds only the closing fence. |
tokenizer | SomeTokenizer | Tokenizer to use to count number of tokens in text. |
transform? | ChunkTransformer | Transform to be applied to each chunk as it is produced. Provides the opportunity to prepend metadata, etc. |
yamlChunkSize? | number | If provided, this will override the maxChunkSize for openapi-yaml pages. This is useful because openapi-yaml pages tend to be very large, and we want to split them into smaller chunks than the default maxChunkSize. |
Defined in
mongodb-rag-core/build/chunk/chunkPage.d.ts:12
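To make the minChunkSize rule concrete, here is a minimal sketch of an options object and a filter that discards undersized chunks. The types are local stand-ins that mirror the documented option names (transform is omitted for brevity), the numeric values are examples rather than library defaults, and wordTokenizer and dropSmallChunks are hypothetical helpers.

```typescript
// Local stand-ins mirroring the documented option names; not imported from the library.
type SomeTokenizer = {
  encode: (text: string) => { bpe: number[]; text: string[] };
};
type ChunkOptions = {
  chunkOverlap: number;
  maxChunkSize: number;
  minChunkSize?: number;
  tokenizer: SomeTokenizer;
  yamlChunkSize?: number;
};

// Toy tokenizer for illustration: one token per whitespace-separated word.
const wordTokenizer: SomeTokenizer = {
  encode: (text) => {
    const words = text.split(/\s+/).filter((w) => w.length > 0);
    return { bpe: words.map((_, i) => i), text: words };
  },
};

// Example values only; these are not the library defaults.
const options: ChunkOptions = {
  chunkOverlap: 0,
  maxChunkSize: 600,
  minChunkSize: 15,
  tokenizer: wordTokenizer,
};

// Sketch of the minChunkSize rule: drop chunks with fewer tokens than the minimum.
function dropSmallChunks(texts: string[], opts: ChunkOptions): string[] {
  const min = opts.minChunkSize ?? 0;
  return texts.filter((t) => opts.tokenizer.encode(t).bpe.length >= min);
}
```

With these example values, a chunk holding only a closing code fence counts as a single token and is filtered out, while a chunk of 15 or more tokens is kept.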
ChunkTransformer
Ƭ ChunkTransformer: (chunk: Omit<ContentChunk, "tokenCount">, details: { page: Page }) => Promise<Omit<ContentChunk, "tokenCount">>
Type declaration
▸ (chunk, details): Promise<Omit<ContentChunk, "tokenCount">>
Parameters
Name | Type |
---|---|
chunk | Omit<ContentChunk, "tokenCount"> |
details | Object |
details.page | Page |
Returns
Promise<Omit<ContentChunk, "tokenCount">>
Defined in
mongodb-rag-core/build/chunk/ChunkTransformer.d.ts:3
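A transformer matching this signature might look like the sketch below, which prepends the page title to each chunk's text. Page and ContentChunk are simplified local stand-ins (their fields are assumptions), and prependTitle is a hypothetical transformer, not one shipped by the library.

```typescript
// Simplified stand-ins for the library types; field names are assumptions.
interface Page { url: string; title?: string; body: string }
interface ContentChunk { url: string; text: string; tokenCount: number }

type ChunkTransformer = (
  chunk: Omit<ContentChunk, "tokenCount">,
  details: { page: Page }
) => Promise<Omit<ContentChunk, "tokenCount">>;

// Hypothetical transformer: prepend the page title to the chunk text.
// Chunks from untitled pages are returned unchanged.
const prependTitle: ChunkTransformer = async (chunk, { page }) => {
  if (!page.title) return chunk;
  return { ...chunk, text: `Title: ${page.title}\n\n${chunk.text}` };
};
```

The transformer's input and output both omit tokenCount, which suggests the count is computed after the transform has changed the text.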
ContentChunk
Ƭ ContentChunk: Omit<EmbeddedContent, "embeddings" | "updated">
Defined in
mongodb-rag-core/build/chunk/chunkPage.d.ts:4
SomeTokenizer
Ƭ SomeTokenizer: Object
Type declaration
Name | Type |
---|---|
encode | (text: string) => { bpe: number[]; text: string[] } |
Defined in
mongodb-rag-core/build/chunk/chunkPage.d.ts:66
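Any object with a matching encode method satisfies this type. The sketch below is a toy word-level tokenizer, assuming that bpe holds token ids and text holds the corresponding token strings (the source does not spell this out); makeWordTokenizer and countTokens are hypothetical helpers.

```typescript
type SomeTokenizer = {
  encode: (text: string) => { bpe: number[]; text: string[] };
};

// Toy tokenizer: assigns each distinct word a stable integer id.
// Real tokenizers work on subword units; this only illustrates the shape.
function makeWordTokenizer(): SomeTokenizer {
  const ids = new Map<string, number>();
  return {
    encode: (text) => {
      const words = text.split(/\s+/).filter((w) => w.length > 0);
      const bpe = words.map((w) => {
        if (!ids.has(w)) ids.set(w, ids.size);
        return ids.get(w) as number;
      });
      return { bpe, text: words };
    },
  };
}

// Token count as the length of the id array (an assumption about how counts are derived).
const countTokens = (tokenizer: SomeTokenizer, text: string): number =>
  tokenizer.encode(text).bpe.length;
```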
Variables
defaultOpenApiSpecYamlChunkOptions
• Const defaultOpenApiSpecYamlChunkOptions: ChunkOptions
Defined in
mongodb-rag-core/build/chunk/chunkOpenApiSpecYaml.d.ts:2
Functions
chunkCode
▸ chunkCode(page, options?): Promise<ContentChunk[]>
chunkCode is a ChunkFunc: a function that takes a page and returns it in chunks.
Parameters
Name | Type |
---|---|
page | Page |
options? | Partial<ChunkOptions> |
Returns
Promise<ContentChunk[]>
Defined in
mongodb-rag-core/build/chunk/chunkPage.d.ts:8
chunkMd
▸ chunkMd(page, options?): Promise<ContentChunk[]>
chunkMd is a ChunkFunc: a function that takes a page and returns it in chunks.
Parameters
Name | Type |
---|---|
page | Page |
options? | Partial<ChunkOptions> |
Returns
Promise<ContentChunk[]>
Defined in
mongodb-rag-core/build/chunk/chunkPage.d.ts:8
chunkOpenApiSpecYaml
▸ chunkOpenApiSpecYaml(page, options?): Promise<ContentChunk[]>
chunkOpenApiSpecYaml is a ChunkFunc: a function that takes a page and returns it in chunks.
Parameters
Name | Type |
---|---|
page | Page |
options? | Partial<ChunkOptions> |
Returns
Promise<ContentChunk[]>
Defined in
mongodb-rag-core/build/chunk/chunkPage.d.ts:8