
Module: sources

Interfaces

Type Aliases

Cell

Ƭ Cell: Object

Type declaration

| Name | Type |
| --- | --- |
| columnName? | string |
| content | SnootyNode |

Defined in

mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:66


DataSource

Ƭ DataSource: Object

Represents a source of page data.

Type declaration

| Name | Type | Description |
| --- | --- | --- |
| name | string | The unique name among registered data sources. |
| fetchPages | () => Promise<Page[]> | Fetches pages in the data source. |

Defined in

mongodb-rag-ingest/src/sources/DataSource.ts:6
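As an illustration of the DataSource shape, the following sketch wraps a fixed list of pages in an object with a name and a fetchPages() method. SimplePage, SimpleDataSource, and makeStaticDataSource are hypothetical names defined locally for this example; the library's real Page type likely has more fields than shown here.

```typescript
// Simplified stand-in for the library's Page type (assumption: the real type has more fields).
type SimplePage = {
  url: string;
  title?: string;
  body: string;
  format: "md" | "txt";
  sourceName: string;
};

// Mirrors the DataSource shape documented above.
type SimpleDataSource = {
  name: string;
  fetchPages: () => Promise<SimplePage[]>;
};

// Hypothetical helper: wraps a fixed list of pages as a data source.
function makeStaticDataSource(
  sourceName: string,
  pages: Omit<SimplePage, "sourceName">[]
): SimpleDataSource {
  return {
    name: sourceName,
    // Stamp each page with the data source name at fetch time.
    fetchPages: async () => pages.map((page) => ({ ...page, sourceName })),
  };
}

const staticSource = makeStaticDataSource("my-static-docs", [
  { url: "https://example.com/a", title: "A", body: "# A", format: "md" },
]);
```

Calling staticSource.fetchPages() resolves to the pages with sourceName filled in from the data source name.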


DevCenterEntry

Ƭ DevCenterEntry: Object

Type declaration

| Name | Type |
| --- | --- |
| calculated_slug | string |
| content | string \| null |
| description | string |
| name | string |
| tags | DevCenterEntryTag[] |
| type | string |

Defined in

mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:18


DevCenterProjectConfig

Ƭ DevCenterProjectConfig: ProjectBase & { baseUrl: string ; collectionName: string ; databaseName: string ; type: "devcenter" }

Defined in

mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:10


FilterFunc

Ƭ FilterFunc: (path: string) => boolean

Type declaration

▸ (path): boolean

Parameters

| Name | Type |
| --- | --- |
| path | string |

Returns

boolean

Defined in

mongodb-rag-ingest/src/sources/GitDataSource.ts:159
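A filter of this shape might look like the sketch below. It assumes that returning true means "keep this file", which the type alone does not confirm. PathFilter, onlyMarkdownDocs, and the sample paths are made up for illustration.

```typescript
// Same shape as the FilterFunc type alias documented above.
type PathFilter = (path: string) => boolean;

// Hypothetical filter: keep .md and .mdx files, skip anything under /node_modules/.
// Assumption: returning true keeps the file.
const onlyMarkdownDocs: PathFilter = (path) =>
  /\.mdx?$/.test(path) && !path.startsWith("/node_modules/");

const candidatePaths = [
  "/docs/intro.md",
  "/node_modules/pkg/README.md",
  "/src/index.ts",
  "/guide.mdx",
];

// Apply the filter the same way any (path) => boolean predicate is applied.
const keptPaths = candidatePaths.filter(onlyMarkdownDocs);
```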


GetSnootyProjectsResponse

Ƭ GetSnootyProjectsResponse: Object

Schema for API response from https://snooty-data-api.mongodb.com/prod/projects

Type declaration

| Name | Type |
| --- | --- |
| data | SnootyProject[] |

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyProjectsInfo.ts:12


HandleHtmlPageFuncOptions

Ƭ HandleHtmlPageFuncOptions: Object

Type declaration

| Name | Type | Description |
| --- | --- | --- |
| extractMetadata? | (domDoc: Document) => Record<string, unknown> | Extract metadata from page DOM. Added to the Page.metadata field. If a key in the result of extractMetadata() is the same as a key in metadata, the extractMetadata() key will override it. |
| extractTitle? | (domDoc: Document) => string \| undefined | Extract Page.title from page content and path. |
| metadata? | PageMetadata | Page.metadata passed from config. Included in all documents. |
| pathToPageUrl | (path: string) => string | Construct the Page.url from page path. |
| postProcessMarkdown? | (markdown: string) => Promise<string> | Transform Markdown once it's been generated. |
| removeElements | (domDoc: Document) => Element[] | Returns an array of DOM elements to be removed from the parsed document. |

Defined in

mongodb-rag-ingest/src/sources/handleHtmlDocument.ts:6
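The override rule between metadata and extractMetadata() can be pictured as an object spread in which the extracted keys come last. This is a sketch of the documented rule, not the library's code; mergePageMetadata and the sample values are hypothetical.

```typescript
type MetadataRecord = Record<string, unknown>;

// Sketch of the override rule: keys from extractMetadata() win over config metadata.
function mergePageMetadata(
  configMetadata: MetadataRecord | undefined,
  extractedMetadata: MetadataRecord | undefined
): MetadataRecord {
  // Later spreads overwrite earlier ones, so extracted keys take precedence.
  return { ...configMetadata, ...extractedMetadata };
}

const mergedMetadata = mergePageMetadata(
  { productName: "Example Product", version: "1.0" },
  { version: "2.0", description: "From the page's meta tags" }
);
```

Here mergedMetadata keeps productName from the config metadata, while version takes the extracted value "2.0".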


HandlePageFunc

Ƭ HandlePageFunc: (path: string, content: string) => Promise<undefined | Omit<Page, "sourceName"> | Omit<Page, "sourceName">[]>

Type declaration

▸ (path, content): Promise<undefined | Omit<Page, "sourceName"> | Omit<Page, "sourceName">[]>

Function to convert a file in the repo into a Page or Page[].

Parameters

| Name | Type | Description |
| --- | --- | --- |
| path | string | Path to file in repo |
| content | string | Contents of file in repo |

Returns

Promise<undefined | Omit<Page, "sourceName"> | Omit<Page, "sourceName">[]>

Defined in

mongodb-rag-ingest/src/sources/GitDataSource.ts:20
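A handler with this signature could be sketched as follows. PageWithoutSource is a simplified local stand-in for Omit<Page, "sourceName">, and handleMarkdownFile and the example.com URL are hypothetical. The sketch also assumes that returning undefined skips the file, as is documented for handleDocumentInRepo.

```typescript
// Simplified stand-in for Omit<Page, "sourceName"> (assumption: the real Page type has more fields).
type PageWithoutSource = {
  url: string;
  title?: string;
  body: string;
  format: "md" | "txt";
};

type SimpleHandlePageFunc = (
  path: string,
  content: string
) => Promise<undefined | PageWithoutSource | PageWithoutSource[]>;

// Hypothetical handler: turn Markdown files into pages and skip everything else.
const handleMarkdownFile: SimpleHandlePageFunc = async (path, content) => {
  if (!path.endsWith(".md")) {
    return undefined; // assumed to mean "skip this file"
  }
  // Use the first level-1 heading as the title, if there is one.
  const headingMatch = /^#\s+(.+)$/m.exec(content);
  return {
    url: `https://example.com/docs${path.replace(/\.md$/, "")}`,
    title: headingMatch ? headingMatch[1] : undefined,
    body: content,
    format: "md",
  };
};
```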


LocallySpecifiedSnootyProjectConfig

Ƭ LocallySpecifiedSnootyProjectConfig: Omit<SnootyProjectConfig, "baseUrl" | "currentBranch" | "version"> & { baseUrl?: string ; currentBranch?: string ; versionNameOverride?: string }

Specifies a locally overridable Snooty project configuration.

baseUrl and currentBranch, if undefined, will be filled in by the Snooty Data API GET projects endpoint. You can set them yourself to override the data in the Snooty Data API. currentBranch will be the name of the first branch entry with isStableBranch set to true in the Data API response.

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:86
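The fallback described above for currentBranch can be sketched as a small resolver: use the local value if it is set, otherwise take the first branch marked stable. Only isStableBranch is named in the description; the gitBranchName field, the BranchEntry and LocalOverrides types, and resolveCurrentBranch are assumptions made for this example.

```typescript
// Assumed shape of a branch entry in the Data API response. Only isStableBranch
// is named in the description above; gitBranchName is a hypothetical field name.
type BranchEntry = { gitBranchName: string; isStableBranch: boolean };

type LocalOverrides = { baseUrl?: string; currentBranch?: string };

// Prefer the local override; otherwise fall back to the first stable branch.
function resolveCurrentBranch(
  overrides: LocalOverrides,
  branches: BranchEntry[]
): string | undefined {
  if (overrides.currentBranch !== undefined) {
    return overrides.currentBranch;
  }
  const stable = branches.find((branch) => branch.isStableBranch);
  return stable?.gitBranchName;
}

const apiBranches: BranchEntry[] = [
  { gitBranchName: "master", isStableBranch: false },
  { gitBranchName: "v7.0", isStableBranch: true },
  { gitBranchName: "v6.0", isStableBranch: true },
];

const fromApi = resolveCurrentBranch({}, apiBranches);
const fromOverride = resolveCurrentBranch({ currentBranch: "v5.0" }, apiBranches);
```

With this sample data, fromApi is "v7.0" (the first stable entry) and fromOverride is "v5.0".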


MakeCodeOnGithubTextDataSourceParams

Ƭ MakeCodeOnGithubTextDataSourceParams: Omit<MakeGitHubDataSourceArgs, "handleDocumentInRepo"> & { metadata?: PageMetadata }

Defined in

mongodb-rag-ingest/src/sources/CodeOnGithubTextDataSource.ts:9


MakeGitHubDataSourceArgs

Ƭ MakeGitHubDataSourceArgs: Object

Type declaration

| Name | Type | Description |
| --- | --- | --- |
| filter? | MakeGitDataSourceParams["filter"] | Filter function to filter out files from the repo. Using this overrides the repoLoaderOptions.ignorePaths option. Note that file paths will have a leading slash (e.g. /somedir/somefile.txt). |
| name | string | The data source name. |
| repoLoaderOptions? | Partial<GithubRepoLoaderParams> | Options for the GitHub repo loader, such as the branch to fetch. |
| repoUrl | string | The GitHub repo URL. |
| handleDocumentInRepo | (document: Document<{ source: string }>) => Promise<undefined \| Omit<Page, "sourceName"> \| Omit<Page, "sourceName">[]> | Handle a given file in the repo. Any number of Pages can be returned for a given file. The exact details depend on the given repo. Return undefined to skip this document. Page sourceName will be overridden by the name passed to makeGitHubDataSource. |

Defined in

mongodb-rag-ingest/src/sources/GitHubDataSource.ts:7


MakeMdOnGithubDataSourceParams

Ƭ MakeMdOnGithubDataSourceParams: Omit<MakeGitHubDataSourceArgs, "handleDocumentInRepo"> & { extractMetadata?: (pageContent: string, frontMatter?: Record<string, unknown>) => PageMetadata ; extractTitle?: (pageContent: string, frontMatter?: Record<string, unknown>) => string | undefined ; filter?: MakeGitHubDataSourceArgs["filter"] ; frontMatter?: { format?: string ; process: boolean ; separator?: string } ; metadata?: PageMetadata ; pathToPageUrl: (pathInRepo: string, frontMatter?: Record<string, unknown>) => string }

Defined in

mongodb-rag-ingest/src/sources/MdOnGithubDataSource.ts:9


MakeSnootyDataSourceArgs

Ƭ MakeSnootyDataSourceArgs: Object

Type declaration

| Name | Type | Description |
| --- | --- | --- |
| name | string | The data source name. |
| project | SnootyProjectConfig | The configuration for the Snooty project. |
| snootyDataApiBaseUrl | string | The base URL for Snooty Data API requests. |
| version? | string | - |

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:95


Row

Ƭ Row: Object

Type declaration

| Name | Type |
| --- | --- |
| cells | Cell[] |

Defined in

mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:71


SnootyManifestEntry

Ƭ SnootyManifestEntry: Object

Type declaration

| Name | Type |
| --- | --- |
| data | unknown |
| type | "page" \| "timestamp" \| "metadata" \| "asset" |

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:13


SnootyNode

Ƭ SnootyNode: Object

A node in the Snooty AST.

Index signature

▪ [key: string]: unknown

Type declaration

| Name | Type |
| --- | --- |
| children? | (SnootyNode \| SnootyTextNode)[] |
| options? | Record<string, unknown> |
| type | string |

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:29


SnootyPageData

Ƭ SnootyPageData: Object

A page in the Snooty manifest.

Type declaration

| Name | Type |
| --- | --- |
| ast | SnootyNode |
| deleted | boolean |
| page_id | string |
| tags? | string[] |

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:48


SnootyPageEntry

Ƭ SnootyPageEntry: SnootyManifestEntry & { data: SnootyPageData ; type: "page" }

Represents a page entry in a Snooty manifest file.

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:21


SnootyProjectConfig

Ƭ SnootyProjectConfig: ProjectBase & { baseUrl: string ; currentBranch: string ; type: "snooty" }

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:55


SnootyProjectsInfo

Ƭ SnootyProjectsInfo: Object

Type declaration

| Name | Type |
| --- | --- |
| getBaseUrl | (args: { branchName: string ; projectName: string }) => Promise<string> |
| getCurrentBranch | (args: { projectName: string }) => Promise<Branch> |
| getCurrentVersionName | (args: { projectName: string }) => Promise<undefined \| string> |

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyProjectsInfo.ts:16


SnootyTextNode

Ƭ SnootyTextNode: SnootyNode & { children: never ; type: "text" ; value: string }

A Snooty AST node with a text value.

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:39
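To illustrate how SnootyNode and SnootyTextNode fit together, the sketch below walks a small tree and concatenates the value of each text node. AstNode is a trimmed local copy of the documented shape, and isTextNode, collectText, and the sample tree (including the "heading" and "literal" node types) are hypothetical.

```typescript
// Trimmed local copy of the documented node shape.
type AstNode = {
  type: string;
  children?: AstNode[];
  value?: unknown;
  [key: string]: unknown;
};

// Type guard for text nodes: type is "text" and value is a string.
function isTextNode(
  node: AstNode
): node is AstNode & { type: "text"; value: string } {
  return node.type === "text" && typeof node.value === "string";
}

// Depth-first walk that concatenates the value of every text node.
function collectText(node: AstNode): string {
  if (isTextNode(node)) {
    return node.value;
  }
  return (node.children ?? []).map(collectText).join("");
}

// Made-up tree for illustration.
const headingNode: AstNode = {
  type: "heading",
  children: [
    { type: "text", value: "Connect to " },
    { type: "literal", children: [{ type: "text", value: "mongod" }] },
  ],
};

const headingText = collectText(headingNode);
```

For this sample tree, headingText is "Connect to mongod".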


Table

Ƭ Table: Object

Type declaration

| Name | Type |
| --- | --- |
| dataRows | Row[] |
| headerRows | Row[] |

Defined in

mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:75

Functions

extractHtmlH1

extractHtmlH1(domDoc): undefined | string

Parameters

| Name | Type |
| --- | --- |
| domDoc | Document |

Returns

undefined | string

Defined in

mongodb-rag-ingest/src/sources/handleHtmlDocument.ts:95


extractTags

extractTags(tags): string[]

Extract relevant tags from Dev Center entry tags.

Parameters

| Name | Type |
| --- | --- |
| tags | DevCenterEntryTag[] |

Returns

string[]

Defined in

mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:98


filterOnlyPublicActiveTiCatalogItems

filterOnlyPublicActiveTiCatalogItems(item): boolean

Filter function to only include public, published, and non-legacy MongoDB University content.

⚠️ Important ⚠️

You should include only this content or a subset of it in externally facing applications.

Parameters

| Name | Type |
| --- | --- |
| item | TiCatalogItem |

Returns

boolean

Defined in

mongodb-rag-ingest/src/sources/mongodb-university/MongoDbUniversityDataSource.ts:38


getAcquitTestsFromGithubRepo

getAcquitTestsFromGithubRepo(repoUrl, repoLoaderOptions): Promise<string[]>

Parameters

| Name | Type |
| --- | --- |
| repoUrl | string |
| repoLoaderOptions | Partial<GithubRepoLoaderParams> |

Returns

Promise<string[]>

Defined in

mongodb-rag-ingest/src/sources/AcquitRequireMdOnGithubDataSource.ts:96


getRelevantFilePathsInDir

getRelevantFilePathsInDir(directoryPath, filter, fileList?): string[]

Parameters

| Name | Type | Default value |
| --- | --- | --- |
| directoryPath | string | undefined |
| filter | FilterFunc | undefined |
| fileList | string[] | [] |

Returns

string[]

Defined in

mongodb-rag-ingest/src/sources/GitDataSource.ts:161


getRelevantFilesAsStrings

getRelevantFilesAsStrings(«destructured»): Promise<Record<string, string>>

Parameters

| Name | Type |
| --- | --- |
| «destructured» | Object |
| › directoryPath | string |
| › filter | FilterFunc |

Returns

Promise<Record<string, string>>

Defined in

mongodb-rag-ingest/src/sources/GitDataSource.ts:182


getRepoLocally

getRepoLocally(«destructured»): Promise<void>

Parameters

| Name | Type |
| --- | --- |
| «destructured» | Object |
| › localPath | string |
| › options? | TaskOptions |
| › repoPath | string |

Returns

Promise<void>

Defined in

mongodb-rag-ingest/src/sources/GitDataSource.ts:136


getTitleFromSnootyAst

getTitleFromSnootyAst(node): undefined | string

Parameters

| Name | Type |
| --- | --- |
| node | SnootyNode |

Returns

undefined | string

Defined in

mongodb-rag-ingest/src/sources/snooty/snootyAstToMd.ts:196


getTitleFromSnootyOpenApiSpecAst

getTitleFromSnootyOpenApiSpecAst(node): undefined | string

Parameters

| Name | Type |
| --- | --- |
| node | SnootyNode |

Returns

undefined | string

Defined in

mongodb-rag-ingest/src/sources/snooty/snootyAstToOpenApiSpec.ts:43


handleHtmlDocument

handleHtmlDocument(path, content, options): Promise<Omit<Page, "sourceName">>

Parameters

| Name | Type |
| --- | --- |
| path | string |
| content | string |
| options | HandleHtmlPageFuncOptions |

Returns

Promise<Omit<Page, "sourceName">>

Defined in

mongodb-rag-ingest/src/sources/handleHtmlDocument.ts:30


handlePage

handlePage(page, «destructured»): Promise<Page>

Parameters

| Name | Type |
| --- | --- |
| page | SnootyPageData |
| «destructured» | Object |
| › baseUrl | string |
| › productName? | string |
| › sourceName | string |
| › tags | string[] |
| › version? | string |

Returns

Promise<Page>

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:264


makeAcquitRequireMdOnGithubDataSource

makeAcquitRequireMdOnGithubDataSource(«destructured»): Promise<DataSource>

Loads an MD/Acquit docs site from a GitHub repo. Acquit is a tool for writing tests in comments, and then extracting them into a test suite. This function loads the tests from the repo, and then transforms the document content to include tests from the test suite in the document. Acquit is used in the Mongoose ODM documentation. This data source assumes that the test files are in the same repo as the docs.

Parameters

| Name | Type |
| --- | --- |
| «destructured» | Omit<MakeGitHubDataSourceArgs, "handleDocumentInRepo"> & { acquitCodeBlockLanguageReplacement?: string ; metadata?: PageMetadata ; pathToPageUrl: (pathInRepo: string) => string ; testFileLoaderOptions: Partial<GithubRepoLoaderParams> } |

Returns

Promise<DataSource>

Defined in

mongodb-rag-ingest/src/sources/AcquitRequireMdOnGithubDataSource.ts:21


makeCodeOnGithubTextDataSource

makeCodeOnGithubTextDataSource(«destructured»): Promise<DataSource>

Loads source code files from a GitHub repo.

Parameters

| Name | Type |
| --- | --- |
| «destructured» | MakeCodeOnGithubTextDataSourceParams |

Returns

Promise<DataSource>

Defined in

mongodb-rag-ingest/src/sources/CodeOnGithubTextDataSource.ts:21


makeDevCenterDataSource

makeDevCenterDataSource(«destructured»): Promise<DataSource>

Parameters

| Name | Type |
| --- | --- |
| «destructured» | DevCenterProjectConfig |

Returns

Promise<DataSource>

Defined in

mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:32


makeDevCenterPage

makeDevCenterPage(document, name, baseUrl): Page

Parameters

| Name | Type |
| --- | --- |
| document | DevCenterEntry |
| name | string |
| baseUrl | string |

Returns

Page

Defined in

mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:67


makeDevCenterPageBody

makeDevCenterPageBody(«destructured»): string

Parameters

| Name | Type |
| --- | --- |
| «destructured» | Object |
| › content | string |
| › title? | string |

Returns

string

Defined in

mongodb-rag-ingest/src/sources/DevCenterDataSource.ts:109


makeGitDataSource

makeGitDataSource(«destructured»): DataSource

Loads and processes files from a Git repo (can be hosted anywhere).

Parameters

| Name | Type |
| --- | --- |
| «destructured» | MakeGitDataSourceParams |

Returns

DataSource

Defined in

mongodb-rag-ingest/src/sources/GitDataSource.ts:61


makeGitHubDataSource

makeGitHubDataSource(«destructured»): DataSource

Loads an arbitrary GitHub repo and converts its contents into pages.

Parameters

| Name | Type |
| --- | --- |
| «destructured» | MakeGitHubDataSourceArgs |

Returns

DataSource

Defined in

mongodb-rag-ingest/src/sources/GitHubDataSource.ts:50


makeLangChainDocumentLoaderDataSource

makeLangChainDocumentLoaderDataSource(«destructured»): DataSource

Create a data source that loads pages from a LangChain document loader.

Parameters

| Name | Type |
| --- | --- |
| «destructured» | MakeLangChainDocumentLoaderDataSourceParams |

Returns

DataSource

Defined in

mongodb-rag-ingest/src/sources/LangchainDocumentLoaderDataSource.ts:37


makeMdOnGithubDataSource

makeMdOnGithubDataSource(«destructured»): Promise<DataSource>

Loads an .md/.mdx docs site from a GitHub repo.

Parameters

| Name | Type |
| --- | --- |
| «destructured» | MakeMdOnGithubDataSourceParams |

Returns

Promise<DataSource>

Defined in

mongodb-rag-ingest/src/sources/MdOnGithubDataSource.ts:72


makeMongoDbUniversityDataSource

makeMongoDbUniversityDataSource(params): DataSource

Data source constructor function for ingesting data from the MongoDB University Data API. (This is an internal API.)

Parameters

| Name | Type |
| --- | --- |
| params | MakeMongoDbUniversityDataSourceParams |

Returns

DataSource

Defined in

mongodb-rag-ingest/src/sources/mongodb-university/MongoDbUniversityDataSource.ts:69


makeRandomTmp

makeRandomTmp(prefix): string

Parameters

| Name | Type | Description |
| --- | --- | --- |
| prefix | string | Prefix for the temporary directory name. |

Returns

string

Defined in

mongodb-rag-ingest/src/sources/GitDataSource.ts:127


makeRstOnGitHubDataSource

makeRstOnGitHubDataSource<MetadataType>(«destructured»): DataSource

Loads an rST docs site from a GitHub repo.

Type parameters

| Name | Type |
| --- | --- |
| MetadataType | extends Record<string, unknown> |

Parameters

| Name | Type |
| --- | --- |
| «destructured» | Omit<MakeGitHubDataSourceArgs, "handleDocumentInRepo"> & { getMetadata?: (info: { bodyMarkdown: string ; document: Document<{ source: string }> ; title?: string ; url: string }) => undefined \| MetadataType ; pathToPageUrl: (pathInRepo: string) => string } |

Returns

DataSource

Defined in

mongodb-rag-ingest/src/sources/RstOnGitHubDataSource.ts:13


makeSnootyDataSource

makeSnootyDataSource(«destructured»): DataSource & { _baseUrl: string ; _currentBranch: string ; _snootyProjectName: string ; _version?: string }

Parameters

| Name | Type |
| --- | --- |
| «destructured» | MakeSnootyDataSourceArgs |

Returns

DataSource & { _baseUrl: string ; _currentBranch: string ; _snootyProjectName: string ; _version?: string }

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyDataSource.ts:114


makeSnootyProjectsInfo

makeSnootyProjectsInfo(«destructured»): Promise<SnootyProjectsInfo & { _data: SnootyProject[] }>

Creates a SnootyProjectsInfo object from the Snooty Data API GET projects endpoint.

Parameters

| Name | Type |
| --- | --- |
| «destructured» | Object |
| › snootyDataApiBaseUrl | string |

Returns

Promise<SnootyProjectsInfo & { _data: SnootyProject[] }>

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyProjectsInfo.ts:33


pageBlobUrl

pageBlobUrl(args): string

Parameters

| Name | Type |
| --- | --- |
| args | Object |
| args.branch | string |
| args.filePath? | string \| string[] |
| args.repoUrl | string |

Returns

string

Defined in

mongodb-rag-ingest/src/sources/CodeOnGithubTextDataSource.ts:73


parseSnootyTable

parseSnootyTable(node): Table

Turns a Snooty AST table into a Table data structure.

Tables in Snooty are represented as lists of lists under a list-table directive. The AST has no concept of "rows" or "cells", which adds friction when working with it directly, especially since lists already have rendering logic that differs from tables. An intermediate Table data structure with Rows and Cells is easier to work with and render.

Parameters

| Name | Type |
| --- | --- |
| node | SnootyNode |

Returns

Table

Defined in

mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:90
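To show why an intermediate structure with rows and cells is convenient, the sketch below renders a simplified table to pipe-delimited text. It is not the library's renderer, whose output format may differ. TextCell, TextRow, and TextTable are hypothetical local types in which cell content is a plain string, whereas the documented Cell.content is a SnootyNode.

```typescript
// Simplified shapes: in the documented types, Cell.content is a SnootyNode, not a string.
type TextCell = { columnName?: string; content: string };
type TextRow = { cells: TextCell[] };
type TextTable = { headerRows: TextRow[]; dataRows: TextRow[] };

// Render one row as a pipe-delimited line.
function renderRowLine(row: TextRow): string {
  return `| ${row.cells.map((cell) => cell.content).join(" | ")} |`;
}

// Render header rows, a separator sized to the first header row, then data rows.
function renderTextTable(table: TextTable): string {
  const columnCount = table.headerRows[0]?.cells.length ?? 0;
  const separator = `| ${Array(columnCount).fill("---").join(" | ")} |`;
  return [
    ...table.headerRows.map(renderRowLine),
    separator,
    ...table.dataRows.map(renderRowLine),
  ].join("\n");
}

// Made-up table for illustration.
const optionsTable: TextTable = {
  headerRows: [{ cells: [{ content: "Option" }, { content: "Description" }] }],
  dataRows: [{ cells: [{ content: "--port" }, { content: "Port to listen on" }] }],
};

const renderedTable = renderTextTable(optionsTable);
```

Because each row is already a list of cells, rendering is a pair of map calls rather than a walk over nested list nodes.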


prepareSnootySources

prepareSnootySources(«destructured»): Promise<DataSource & { _baseUrl: string ; _currentBranch: string ; _snootyProjectName: string ; _version?: string }[]>

Fill the details of the defined Snooty data sources with the info in the Snooty Data API projects endpoint.

Parameters

| Name | Type |
| --- | --- |
| «destructured» | Object |
| › projects | LocallySpecifiedSnootyProjectConfig[] |
| › snootyDataApiBaseUrl | string |

Returns

Promise<DataSource & { _baseUrl: string ; _currentBranch: string ; _snootyProjectName: string ; _version?: string }[]>

Defined in

mongodb-rag-ingest/src/sources/snooty/SnootyProjectsInfo.ts:102


removeMarkdownImagesAndLinks

removeMarkdownImagesAndLinks(content): string

Utility function to remove Markdown images and links from a string. Useful if you do not want to include images and links in content, which can add significantly to the token count when creating embeddings while also diluting the semantic meaning of the content.

Parameters

| Name | Type |
| --- | --- |
| content | string |

Returns

string

Defined in

mongodb-rag-ingest/src/sources/removeMarkdownImagesAndLinks.ts:7
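One way to approximate this kind of cleanup is with two regular expressions, as in the sketch below. It is a stand-in and not the library's implementation: stripImagesAndLinks is a hypothetical function, and it assumes that images are dropped entirely while links are replaced by their link text. It only handles inline syntax, not reference-style links.

```typescript
// Sketch only: images are dropped entirely, and links are replaced by their link text.
// The library's actual behavior may differ.
function stripImagesAndLinks(content: string): string {
  return content
    // Remove inline images such as ![alt](src).
    .replace(/!\[[^\]]*\]\([^)]*\)/g, "")
    // Replace inline links such as [text](href) with just the text.
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1");
}

const cleaned = stripImagesAndLinks(
  "See the [install guide](https://example.com/install). ![diagram](./diagram.png)Done."
);
```

For the sample input, cleaned is "See the install guide. Done.", which is shorter and contains no URLs.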


renderCells

renderCells(cells, options): string

Parameters

| Name | Type |
| --- | --- |
| cells | Cell[] |
| options | RenderTableElementOptions |

Returns

string

Defined in

mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:43


renderRows

renderRows(rows, options): string

Parameters

| Name | Type |
| --- | --- |
| rows | Row[] |
| options | RenderTableElementOptions |

Returns

string

Defined in

mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:34


renderSnootyTable

renderSnootyTable(node): string

Return a Markdown string rendered from a Snooty AST node.

Parameters

| Name | Type |
| --- | --- |
| node | SnootyNode |

Returns

string

Defined in

mongodb-rag-ingest/src/sources/snooty/renderSnootyTable.ts:10


rstToSnootyAst

rstToSnootyAst(rst): SnootyNode

Parameters

| Name | Type |
| --- | --- |
| rst | string |

Returns

SnootyNode

Defined in

mongodb-rag-ingest/src/sources/snooty/rstToSnootyAst.ts:5


snootyAstToMd

snootyAstToMd(node): string

Renders a Snooty AST node as Markdown.

Parameters

| Name | Type |
| --- | --- |
| node | SnootyNode |

Returns

string

Defined in

mongodb-rag-ingest/src/sources/snooty/snootyAstToMd.ts:8


snootyAstToOpenApiSpec

snootyAstToOpenApiSpec(node): Promise<string>

Parameters

| Name | Type |
| --- | --- |
| node | SnootyNode |

Returns

Promise<string>

Defined in

mongodb-rag-ingest/src/sources/snooty/snootyAstToOpenApiSpec.ts:4