Module: sources
Interfaces
- Branch
- DevCenterEntryTag
- MakeGitDataSourceParams
- MakeLangChainDocumentLoaderDataSourceParams
- MakeMongoDbUniversityDataSourceParams
- ProjectBase
- SnootyProject
Type Aliases
Cell
Ƭ Cell: Object
Type declaration
Name | Type |
---|---|
columnName? | string |
content | SnootyNode |
Defined in
mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:66
DataSource
Ƭ DataSource: Object
Represents a source of page data.
Type declaration
Name | Type | Description |
---|---|---|
name | string | The unique name among registered data sources. |
fetchPages | () => Promise<Page[]> | Fetches pages in the data source. |
Defined in
mongodb-rag-ingest/src/sources/DataSource.ts:6
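As an illustration of the DataSource shape, here is a minimal sketch of a data source that serves a fixed list of pages. The Page type below is a simplified stand-in, and makeStaticDataSource is a hypothetical helper; neither is taken from the library.

```typescript
// Simplified stand-in for the library's Page type (an assumption for this sketch).
type Page = {
  url: string;
  title?: string;
  body: string;
  format: string;
  sourceName: string;
};

// Mirrors the DataSource type declaration documented above.
type DataSource = {
  name: string;
  fetchPages: () => Promise<Page[]>;
};

// Hypothetical helper: wraps a fixed list of pages in a DataSource.
function makeStaticDataSource(
  name: string,
  pages: Omit<Page, "sourceName">[]
): DataSource {
  return {
    name,
    // Stamp each page with the data source name when it is fetched.
    fetchPages: async () =>
      pages.map((page) => ({ ...page, sourceName: name })),
  };
}

const exampleSource = makeStaticDataSource("example-docs", [
  { url: "https://example.com/a", title: "A", body: "# A", format: "md" },
]);
```

Calling exampleSource.fetchPages() resolves to the configured pages, each carrying the data source name.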
DevCenterEntry
Ƭ DevCenterEntry: Object
Type declaration
Name | Type |
---|---|
calculated_slug | string |
content | string \| null |
description | string |
name | string |
tags | DevCenterEntryTag[] |
type | string |
Defined in
mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:18
DevCenterProjectConfig
Ƭ DevCenterProjectConfig: ProjectBase & { baseUrl: string; collectionName: string; databaseName: string; type: "devcenter" }
Defined in
mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:10
FilterFunc
Ƭ FilterFunc: (path: string) => boolean
Type declaration
▸ (path): boolean
Parameters
Name | Type |
---|---|
path | string |
Returns
boolean
Defined in
mongodb-rag-ingest/src/sources/GitDataSource.ts:159
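A FilterFunc returns true for paths that should be kept. The sketch below is a hypothetical filter that keeps Markdown files and skips vendored packages. It assumes paths carry a leading slash, as noted for the filter option of MakeGitHubDataSourceArgs.

```typescript
// Mirrors the FilterFunc type alias documented above.
type FilterFunc = (path: string) => boolean;

// Hypothetical filter: keep .md/.mdx files, skip anything under /node_modules/.
const onlyMarkdownDocs: FilterFunc = (path) =>
  /\.mdx?$/.test(path) && !path.startsWith("/node_modules/");

// Example paths (assumed to have a leading slash).
const candidatePaths = [
  "/docs/intro.md",
  "/src/index.ts",
  "/node_modules/pkg/README.md",
];

// Only "/docs/intro.md" passes the filter.
const keptPaths = candidatePaths.filter(onlyMarkdownDocs);
```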
GetSnootyProjectsResponse
Ƭ GetSnootyProjectsResponse: Object
Schema for API response from https://snooty-data-api.mongodb.com/prod/projects
Type declaration
Name | Type |
---|---|
data | SnootyProject[] |
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyProjectsInfo.ts:12
HandleHtmlPageFuncOptions
Ƭ HandleHtmlPageFuncOptions: Object
Type declaration
Name | Type | Description |
---|---|---|
extractMetadata? | (domDoc: Document) => Record<string, unknown> | Extract metadata from the page DOM. Added to the Page.metadata field. If a key in the result of extractMetadata() is the same as a key in metadata, the extractMetadata() value overrides it. |
extractTitle? | (domDoc: Document) => string \| undefined | Extract Page.title from page content and path. |
metadata? | PageMetadata | Page.metadata passed from config. Included in all documents. |
pathToPageUrl | (path: string) => string | Construct the Page.url from page path. |
postProcessMarkdown? | (markdown: string) => Promise<string> | Transform Markdown once it has been generated. |
removeElements | (domDoc: Document) => Element[] | Returns an array of DOM elements to be removed from the parsed document. |
Defined in
mongodb-rag-ingest/src/sources/handleHtmlDocument.ts:6
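The precedence between metadata and extractMetadata() described above can be pictured with a small sketch. The mergePageMetadata helper is hypothetical and PageMetadata is simplified to a plain record here; the library's own merge logic may differ in detail.

```typescript
// Simplified stand-in for PageMetadata (an assumption for this sketch).
type PageMetadata = Record<string, unknown>;

// Hypothetical helper mirroring the documented precedence:
// keys produced by extractMetadata() win over keys from the configured metadata.
function mergePageMetadata(
  configMetadata: PageMetadata | undefined,
  extracted: Record<string, unknown> | undefined
): PageMetadata {
  return { ...(configMetadata ?? {}), ...(extracted ?? {}) };
}

const mergedMetadata = mergePageMetadata(
  { productName: "Example", version: "1.0" },
  { version: "2.0", description: "From the page DOM" }
);
// mergedMetadata.version is "2.0" because the extracted value overrides the config value.
```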
HandlePageFunc
Ƭ HandlePageFunc: (path: string, content: string) => Promise<undefined | Omit<Page, "sourceName"> | Omit<Page, "sourceName">[]>
Type declaration
▸ (path, content): Promise<undefined | Omit<Page, "sourceName"> | Omit<Page, "sourceName">[]>
Function to convert a file in the repo into a Page or Page[].
Parameters
Name | Type | Description |
---|---|---|
path | string | Path to file in repo |
content | string | Contents of file in repo |
Returns
Promise<undefined | Omit<Page, "sourceName"> | Omit<Page, "sourceName">[]>
Defined in
mongodb-rag-ingest/src/sources/GitDataSource.ts:20
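A minimal sketch of a HandlePageFunc follows. The Page type is a simplified stand-in, the example.com URL scheme is invented for illustration, and the sketch assumes that returning undefined skips the file, as documented for handleDocumentInRepo.

```typescript
// Simplified stand-in for the library's Page type (an assumption for this sketch).
type Page = {
  url: string;
  title?: string;
  body: string;
  format: string;
  sourceName: string;
};

// Mirrors the HandlePageFunc type alias documented above.
type HandlePageFunc = (
  path: string,
  content: string
) => Promise<undefined | Omit<Page, "sourceName"> | Omit<Page, "sourceName">[]>;

// Hypothetical handler: turns Markdown files into pages and skips everything else.
const handleMarkdownFile: HandlePageFunc = async (path, content) => {
  if (!path.endsWith(".md")) {
    return undefined; // assumed to mean "skip this file"
  }
  // Use the first level-1 heading as the title, if there is one.
  const heading = content.match(/^#\s+(.+)$/m);
  return {
    url: `https://example.com${path.replace(/\.md$/, "")}`,
    title: heading ? heading[1].trim() : undefined,
    body: content,
    format: "md",
  };
};
```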
LocallySpecifiedSnootyProjectConfig
Ƭ LocallySpecifiedSnootyProjectConfig: Omit<SnootyProjectConfig, "baseUrl" | "currentBranch" | "version"> & { baseUrl?: string; currentBranch?: string; versionNameOverride?: string }
Specifies a locally overridable Snooty project configuration.
baseUrl and currentBranch, if undefined, will be filled in by the Snooty Data API GET projects endpoint. You can set them yourself to override the data in the Snooty Data API. When filled in from the API, currentBranch will be the name of the first branch entry with isStableBranch set to true in the Data API response.
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:86
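The fallback rule for currentBranch can be sketched as follows. The Branch shape (in particular the gitBranchName field), the LocalConfig type, and the resolveCurrentBranch helper are assumptions made for illustration; only isStableBranch is named in the description above.

```typescript
// Assumed minimal shape of a branch entry in the Data API response.
type Branch = { gitBranchName: string; isStableBranch: boolean };

// Hypothetical, trimmed-down local project config.
type LocalConfig = { name: string; currentBranch?: string };

// Hypothetical helper: a locally set currentBranch wins; otherwise use the
// first branch entry that is marked as stable.
function resolveCurrentBranch(
  config: LocalConfig,
  branches: Branch[]
): string | undefined {
  if (config.currentBranch !== undefined) {
    return config.currentBranch;
  }
  return branches.find((branch) => branch.isStableBranch)?.gitBranchName;
}

const apiBranches: Branch[] = [
  { gitBranchName: "master", isStableBranch: false },
  { gitBranchName: "v7.0", isStableBranch: true },
  { gitBranchName: "v6.0", isStableBranch: true },
];

const fromApi = resolveCurrentBranch({ name: "docs" }, apiBranches);
const overridden = resolveCurrentBranch(
  { name: "docs", currentBranch: "v6.0" },
  apiBranches
);
```

Here fromApi is the first stable branch in the list, while overridden keeps the locally specified value.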
MakeCodeOnGithubTextDataSourceParams
Ƭ MakeCodeOnGithubTextDataSourceParams: Omit<MakeGitHubDataSourceArgs, "handleDocumentInRepo"> & { metadata?: PageMetadata }
Defined in
mongodb-rag-ingest/src/sources/CodeOnGithubTextDataSource.ts:9
MakeGitHubDataSourceArgs
Ƭ MakeGitHubDataSourceArgs: Object
Type declaration
Name | Type | Description |
---|---|---|
filter? | MakeGitDataSourceParams["filter"] | Filter function to filter out files from the repo. Using this overrides the repoLoaderOptions.ignorePaths option. Note that file paths will have a leading slash (e.g. /somedir/somefile.txt). |
name | string | The data source name. |
repoLoaderOptions? | Partial<GithubRepoLoaderParams> | The branch to fetch. |
repoUrl | string | The GitHub repo URL. |
handleDocumentInRepo | (document: Document<{ source: string }>) => Promise<undefined \| Omit<Page, "sourceName"> \| Omit<Page, "sourceName">[]> | Handle a given file in the repo. Any number of Pages can be returned for a given file. The exact details depend on the given repo. Return undefined to skip this document. Page sourceName will be overridden by the name passed to makeGitHubDataSource. |
Defined in
mongodb-rag-ingest/src/sources/GitHubDataSource.ts:7
MakeMdOnGithubDataSourceParams
Ƭ MakeMdOnGithubDataSourceParams: Omit<MakeGitHubDataSourceArgs, "handleDocumentInRepo"> & { extractMetadata?: (pageContent: string, frontMatter?: Record<string, unknown>) => PageMetadata; extractTitle?: (pageContent: string, frontMatter?: Record<string, unknown>) => string | undefined; filter?: MakeGitHubDataSourceArgs["filter"]; frontMatter?: { format?: string; process: boolean; separator?: string }; metadata?: PageMetadata; pathToPageUrl: (pathInRepo: string, frontMatter?: Record<string, unknown>) => string }
Defined in
mongodb-rag-ingest/src/sources/MdOnGithubDataSource.ts:9
MakeSnootyDataSourceArgs
Ƭ MakeSnootyDataSourceArgs: Object
Type declaration
Name | Type | Description |
---|---|---|
name | string | The data source name. |
project | SnootyProjectConfig | The configuration for the Snooty project. |
snootyDataApiBaseUrl | string | The base URL for Snooty Data API requests. |
version? | string | - |
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:95
Row
Ƭ Row: Object
Type declaration
Name | Type |
---|---|
cells | Cell[] |
Defined in
mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:71
SnootyManifestEntry
Ƭ SnootyManifestEntry: Object
Type declaration
Name | Type |
---|---|
data | unknown |
type | "page" \| "timestamp" \| "metadata" \| "asset" |
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:13
SnootyNode
Ƭ SnootyNode: Object
A node in the Snooty AST.
Index signature
▪ [key: string]: unknown
Type declaration
Name | Type |
---|---|
children? | (SnootyNode \| SnootyTextNode)[] |
options? | Record<string, unknown> |
type | string |
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:29
SnootyPageData
Ƭ SnootyPageData: Object
A page in the Snooty manifest.
Type declaration
Name | Type |
---|---|
ast | SnootyNode |
deleted | boolean |
page_id | string |
tags? | string[] |
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:48
SnootyPageEntry
Ƭ SnootyPageEntry: SnootyManifestEntry & { data: SnootyPageData; type: "page" }
Represents a page entry in a Snooty manifest file.
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:21
SnootyProjectConfig
Ƭ SnootyProjectConfig: ProjectBase & { baseUrl: string; currentBranch: string; type: "snooty" }
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:55
SnootyProjectsInfo
Ƭ SnootyProjectsInfo: Object
Type declaration
Name | Type |
---|---|
getBaseUrl | (args: { branchName: string; projectName: string }) => Promise<string> |
getCurrentBranch | (args: { projectName: string }) => Promise<Branch> |
getCurrentVersionName | (args: { projectName: string }) => Promise<undefined \| string> |
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyProjectsInfo.ts:16
SnootyTextNode
Ƭ SnootyTextNode: SnootyNode & { children: never; type: "text"; value: string }
A Snooty AST node with a text value.
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:39
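To show how text nodes sit inside the Snooty AST, the sketch below walks a node tree and concatenates the value of every text node. The SnootyNode type is simplified, and TextLike, isTextNode, and collectText are hypothetical names used only for this illustration.

```typescript
// Simplified version of SnootyNode (children narrowed to SnootyNode[] for this sketch).
type SnootyNode = {
  type: string;
  children?: SnootyNode[];
  options?: Record<string, unknown>;
  [key: string]: unknown;
};

// Hypothetical stand-in for a text node: type "text" plus a string value.
type TextLike = SnootyNode & { type: "text"; value: string };

// Type guard: true when the node looks like a text node.
function isTextNode(node: SnootyNode): node is TextLike {
  return node.type === "text" && typeof node.value === "string";
}

// Depth-first walk that concatenates the values of all text nodes.
function collectText(node: SnootyNode): string {
  if (isTextNode(node)) {
    return node.value;
  }
  return (node.children ?? []).map(collectText).join("");
}

const paragraph: SnootyNode = {
  type: "paragraph",
  children: [
    { type: "text", value: "Hello, " },
    { type: "emphasis", children: [{ type: "text", value: "world" }] },
  ],
};

const paragraphText = collectText(paragraph);
```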
Table
Ƭ Table: Object
Type declaration
Name | Type |
---|---|
dataRows | Row[] |
headerRows | Row[] |
Defined in
mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:75
Functions
extractHtmlH1
▸ extractHtmlH1(domDoc): undefined | string
Parameters
Name | Type |
---|---|
domDoc | Document |
Returns
undefined | string
Defined in
mongodb-rag-ingest/src/sources/handleHtmlDocument.ts:95
extractTags
▸ extractTags(tags): string[]
Extract relevant tags from dev center entry tags.
Parameters
Name | Type |
---|---|
tags | DevCenterEntryTag[] |
Returns
string[]
Defined in
mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:98
filterOnlyPublicActiveTiCatalogItems
▸ filterOnlyPublicActiveTiCatalogItems(item): boolean
Filter function to only include public, published, and non-legacy MongoDB University content.
⚠️ Important ⚠️
You should include only this content or a subset of it in externally facing applications.
Parameters
Name | Type |
---|---|
item | TiCatalogItem |
Returns
boolean
Defined in
mongodb-rag-ingest/src/sources/mongodb-university/MongoDbUniversityDataSource.ts:38
getAcquitTestsFromGithubRepo
▸ getAcquitTestsFromGithubRepo(repoUrl, repoLoaderOptions): Promise<string[]>
Parameters
Name | Type |
---|---|
repoUrl | string |
repoLoaderOptions | Partial<GithubRepoLoaderParams> |
Returns
Promise<string[]>
Defined in
mongodb-rag-ingest/src/sources/AcquitRequireMdOnGithubDataSource.ts:96
getRelevantFilePathsInDir
▸ getRelevantFilePathsInDir(directoryPath, filter, fileList?): string[]
Parameters
Name | Type | Default value |
---|---|---|
directoryPath | string | undefined |
filter | FilterFunc | undefined |
fileList | string[] | [] |
Returns
string[]
Defined in
mongodb-rag-ingest/src/sources/GitDataSource.ts:161
getRelevantFilesAsStrings
▸ getRelevantFilesAsStrings(«destructured»): Promise<Record<string, string>>
Parameters
Name | Type |
---|---|
«destructured» | Object |
› directoryPath | string |
› filter | FilterFunc |
Returns
Promise<Record<string, string>>
Defined in
mongodb-rag-ingest/src/sources/GitDataSource.ts:182
getRepoLocally
▸ getRepoLocally(«destructured»): Promise<void>
Parameters
Name | Type |
---|---|
«destructured» | Object |
› localPath | string |
› options? | TaskOptions |
› repoPath | string |
Returns
Promise<void>
Defined in
mongodb-rag-ingest/src/sources/GitDataSource.ts:136
getTitleFromSnootyAst
▸ getTitleFromSnootyAst(node): undefined | string
Parameters
Name | Type |
---|---|
node | SnootyNode |
Returns
undefined | string
Defined in
mongodb-rag-ingest/src/sources/snooty/snootyAstToMd.ts:196
getTitleFromSnootyOpenApiSpecAst
▸ getTitleFromSnootyOpenApiSpecAst(node): undefined | string
Parameters
Name | Type |
---|---|
node | SnootyNode |
Returns
undefined | string
Defined in
mongodb-rag-ingest/src/sources/snooty/snootyAstToOpenApiSpec.ts:43
handleHtmlDocument
▸ handleHtmlDocument(path, content, options): Promise<Omit<Page, "sourceName">>
Parameters
Name | Type |
---|---|
path | string |
content | string |
options | HandleHtmlPageFuncOptions |
Returns
Promise<Omit<Page, "sourceName">>
Defined in
mongodb-rag-ingest/src/sources/handleHtmlDocument.ts:30
handlePage
▸ handlePage(page, «destructured»): Promise<Page>
Parameters
Name | Type |
---|---|
page | SnootyPageData |
«destructured» | Object |
› baseUrl | string |
› productName? | string |
› sourceName | string |
› tags | string[] |
› version? | string |
Returns
Promise<Page>
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:264
makeAcquitRequireMdOnGithubDataSource
▸ makeAcquitRequireMdOnGithubDataSource(«destructured»): Promise<DataSource>
Loads an MD/Acquit docs site from a GitHub repo. Acquit is a tool for writing tests in comments, and then extracting them into a test suite. This function loads the tests from the repo, and then transforms the document content to include tests from the test suite in the document. Acquit is used in the Mongoose ODM documentation. This data source assumes that the test files are in the same repo as the docs.
Parameters
Name | Type |
---|---|
«destructured» | Omit<MakeGitHubDataSourceArgs, "handleDocumentInRepo"> & { acquitCodeBlockLanguageReplacement?: string; metadata?: PageMetadata; pathToPageUrl: (pathInRepo: string) => string; testFileLoaderOptions: Partial<GithubRepoLoaderParams> } |
Returns
Promise<DataSource>
Defined in
mongodb-rag-ingest/src/sources/AcquitRequireMdOnGithubDataSource.ts:21
makeCodeOnGithubTextDataSource
▸ makeCodeOnGithubTextDataSource(«destructured»): Promise<DataSource>
Loads source code files from a GitHub repo.
Parameters
Name | Type |
---|---|
«destructured» | MakeCodeOnGithubTextDataSourceParams |
Returns
Promise<DataSource>
Defined in
mongodb-rag-ingest/src/sources/CodeOnGithubTextDataSource.ts:21
makeDevCenterDataSource
▸ makeDevCenterDataSource(«destructured»): Promise<DataSource>
Parameters
Name | Type |
---|---|
«destructured» | DevCenterProjectConfig |
Returns
Promise<DataSource>
Defined in
mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:32
makeDevCenterPage
▸ makeDevCenterPage(document, name, baseUrl): Page
Parameters
Name | Type |
---|---|
document | DevCenterEntry |
name | string |
baseUrl | string |
Returns
Page
Defined in
mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:67
makeDevCenterPageBody
▸ makeDevCenterPageBody(«destructured»): string
Parameters
Name | Type |
---|---|
«destructured» | Object |
› content | string |
› title? | string |
Returns
string
Defined in
mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:109
makeGitDataSource
▸ makeGitDataSource(«destructured»): DataSource
Loads and processes files from a Git repo (can be hosted anywhere).
Parameters
Name | Type |
---|---|
«destructured» | MakeGitDataSourceParams |
Returns
DataSource
Defined in
mongodb-rag-ingest/src/sources/GitDataSource.ts:61
makeGitHubDataSource
▸ makeGitHubDataSource(«destructured»): DataSource
Loads an arbitrary GitHub repo and converts its contents into pages.
Parameters
Name | Type |
---|---|
«destructured» | MakeGitHubDataSourceArgs |
Returns
DataSource
Defined in
mongodb-rag-ingest/src/sources/GitHubDataSource.ts:50
makeLangChainDocumentLoaderDataSource
▸ makeLangChainDocumentLoaderDataSource(«destructured»): DataSource
Creates a data source that loads pages from a LangChain document loader.
Parameters
Name | Type |
---|---|
«destructured» | MakeLangChainDocumentLoaderDataSourceParams |
Returns
DataSource
Defined in
mongodb-rag-ingest/src/sources/LangchainDocumentLoaderDataSource.ts:37
makeMdOnGithubDataSource
▸ makeMdOnGithubDataSource(«destructured»): Promise<DataSource>
Loads an .md/.mdx docs site from a GitHub repo.
Parameters
Name | Type |
---|---|
«destructured» | MakeMdOnGithubDataSourceParams |
Returns
Promise<DataSource>
Defined in
mongodb-rag-ingest/src/sources/MdOnGithubDataSource.ts:72
makeMongoDbUniversityDataSource
▸ makeMongoDbUniversityDataSource(params): DataSource
Data source constructor function for ingesting data from the MongoDB University Data API. (This is an internal API.)
Parameters
Name | Type |
---|---|
params | MakeMongoDbUniversityDataSourceParams |
Returns
DataSource
Defined in
mongodb-rag-ingest/src/sources/mongodb-university/MongoDbUniversityDataSource.ts:69
makeRandomTmp
▸ makeRandomTmp(prefix): string
Parameters
Name | Type | Description |
---|---|---|
prefix | string | prefix for the temporary directory name |
Returns
string
Defined in
mongodb-rag-ingest/src/sources/GitDataSource.ts:127
makeRstOnGitHubDataSource
▸ makeRstOnGitHubDataSource<MetadataType>(«destructured»): DataSource
Loads an rST docs site from a GitHub repo.
Type parameters
Name | Type |
---|---|
MetadataType | extends Record<string, unknown> |
Parameters
Name | Type |
---|---|
«destructured» | Omit<MakeGitHubDataSourceArgs, "handleDocumentInRepo"> & { getMetadata?: (info: { bodyMarkdown: string; document: Document<{ source: string }>; title?: string; url: string }) => undefined \| MetadataType; pathToPageUrl: (pathInRepo: string) => string } |
Returns
DataSource
Defined in
mongodb-rag-ingest/src/sources/RstOnGitHubDataSource.ts:13
makeSnootyDataSource
▸ makeSnootyDataSource(«destructured»): DataSource & { _baseUrl: string; _currentBranch: string; _snootyProjectName: string; _version?: string }
Parameters
Name | Type |
---|---|
«destructured» | MakeSnootyDataSourceArgs |
Returns
DataSource & { _baseUrl: string; _currentBranch: string; _snootyProjectName: string; _version?: string }
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:114
makeSnootyProjectsInfo
▸ makeSnootyProjectsInfo(«destructured»): Promise<SnootyProjectsInfo & { _data: SnootyProject[] }>
Creates a SnootyProjectsInfo object from the Snooty Data API GET projects endpoint.
Parameters
Name | Type |
---|---|
«destructured» | Object |
› snootyDataApiBaseUrl | string |
Returns
Promise<SnootyProjectsInfo & { _data: SnootyProject[] }>
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyProjectsInfo.ts:33
pageBlobUrl
▸ pageBlobUrl(args): string
Parameters
Name | Type |
---|---|
args | Object |
args.branch | string |
args.filePath? | string \| string[] |
args.repoUrl | string |
Returns
string
Defined in
mongodb-rag-ingest/src/sources/CodeOnGithubTextDataSource.ts:73
parseSnootyTable
▸ parseSnootyTable(node): Table
Turns a Snooty AST table into a Table data structure.
Tables in Snooty are represented as lists of lists under a list-table directive. There is no concept of "rows" or "cells" in the AST, which adds friction when working with the Snooty AST directly, especially considering that lists already have rendering logic that differs from tables. Having an intermediate data structure Table with Rows and Cells makes tables easier to work with and render.
Parameters
Name | Type |
---|---|
node | SnootyNode |
Returns
Table
Defined in
mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:90
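To illustrate why the intermediate Table structure is convenient, the sketch below renders a Table as a Markdown table. The tableToMarkdown helper and its renderCell callback are hypothetical and are not the library's renderer; the SnootyNode type is simplified, and the sketch assumes at most one header row.

```typescript
// Simplified stand-in for SnootyNode (an assumption for this sketch).
type SnootyNode = { type: string; [key: string]: unknown };

// Mirror the Cell, Row, and Table type declarations documented in this module.
type Cell = { columnName?: string; content: SnootyNode };
type Row = { cells: Cell[] };
type Table = { headerRows: Row[]; dataRows: Row[] };

// Hypothetical renderer: one Markdown line per row, with a separator line
// after the header. Assumes at most one header row.
function tableToMarkdown(
  table: Table,
  renderCell: (content: SnootyNode) => string
): string {
  const renderRow = (row: Row): string =>
    `| ${row.cells.map((cell) => renderCell(cell.content)).join(" | ")} |`;

  const lines: string[] = table.headerRows.map(renderRow);
  if (table.headerRows.length > 0) {
    const columnCount = table.headerRows[0].cells.length;
    lines.push(`| ${new Array<string>(columnCount).fill("---").join(" | ")} |`);
  }
  lines.push(...table.dataRows.map(renderRow));
  return lines.join("\n");
}

const text = (value: string): SnootyNode => ({ type: "text", value });

const exampleTable: Table = {
  headerRows: [{ cells: [{ content: text("Name") }, { content: text("Type") }] }],
  dataRows: [
    {
      cells: [
        { columnName: "Name", content: text("cells") },
        { columnName: "Type", content: text("Cell[]") },
      ],
    },
  ],
};

// Render each cell by reading its text value (a simplification).
const renderedTable = tableToMarkdown(exampleTable, (node) =>
  typeof node.value === "string" ? node.value : ""
);
```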
prepareSnootySources
▸ prepareSnootySources(«destructured»): Promise<DataSource & { _baseUrl: string; _currentBranch: string; _snootyProjectName: string; _version?: string }[]>
Fill the details of the defined Snooty data sources with the info in the Snooty Data API projects endpoint.
Parameters
Name | Type |
---|---|
«destructured» | Object |
› projects | LocallySpecifiedSnootyProjectConfig[] |
› snootyDataApiBaseUrl | string |
Returns
Promise<DataSource & { _baseUrl: string; _currentBranch: string; _snootyProjectName: string; _version?: string }[]>
Defined in
mongodb-rag-ingest/src/sources/snooty/SnootyProjectsInfo.ts:102
removeMarkdownImagesAndLinks
▸ removeMarkdownImagesAndLinks(content): string
Utility function to remove Markdown images and links from a string. Useful if you do not want to include images and links in content, which can add significantly to the token count when creating embeddings while also diluting the semantic meaning of the content.
Parameters
Name | Type |
---|---|
content | string |
Returns
string
Defined in
mongodb-rag-ingest/src/sources/removeMarkdownImagesAndLinks.ts:7
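The general idea can be sketched with two regular expressions. This is a hypothetical stand-in named stripMarkdownImagesAndLinks, not the library's implementation; in particular, keeping the link text while dropping the URL is an assumption of this sketch, and reference-style links are not handled.

```typescript
// Hypothetical sketch: remove inline images and reduce inline links to their text.
function stripMarkdownImagesAndLinks(content: string): string {
  return (
    content
      // Drop inline images entirely: ![alt](url)
      .replace(/!\[[^\]]*\]\([^)]*\)/g, "")
      // Replace inline links with their link text: [text](url) -> text
      .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1")
  );
}

const strippedContent = stripMarkdownImagesAndLinks(
  "Intro![logo](x.png): read the [docs](https://example.com/docs)."
);
// strippedContent is "Intro: read the docs."
```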
renderCells
▸ renderCells(cells, options): string
Parameters
Name | Type |
---|---|
cells | Cell[] |
options | RenderTableElementOptions |
Returns
string
Defined in
mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:43
renderRows
▸ renderRows(rows, options): string
Parameters
Name | Type |
---|---|
rows | Row[] |
options | RenderTableElementOptions |
Returns
string
Defined in
mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:34
renderSnootyTable
▸ renderSnootyTable(node): string
Returns a Markdown string rendered from a Snooty AST node.
Parameters
Name | Type |
---|---|
node | SnootyNode |
Returns
string
Defined in
mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:10
rstToSnootyAst
▸ rstToSnootyAst(rst): SnootyNode
Parameters
Name | Type |
---|---|
rst | string |
Returns
SnootyNode
Defined in
mongodb-rag-ingest/src/sources/snooty/rstToSnootyAst.ts:5
snootyAstToMd
▸ snootyAstToMd(node): string
Renders a Snooty AST node as Markdown.
Parameters
Name | Type |
---|---|
node | SnootyNode |
Returns
string
Defined in
mongodb-rag-ingest/src/sources/snooty/snootyAstToMd.ts:8
snootyAstToOpenApiSpec
▸ snootyAstToOpenApiSpec(node): Promise<string>
Parameters
Name | Type |
---|---|
node | SnootyNode |
Returns
Promise<string>
Defined in
mongodb-rag-ingest/src/sources/snooty/snootyAstToOpenApiSpec.ts:4