mongodb-chatbot-evaluation
Interfaces
- BaseGeneratedData
- BaseTestCase
- CheckResponseQuality
- CheckResponseQualityParams
- CommandMetadataStore
- CommandRunMetadata
- ConversationEvalData
- ConversationGeneratedData
- ConversationTestCase
- EvalConfig
- EvalResult
- EvaluateConversationQualityParams
- EvaluationStore
- ExpectedLinks
- GenerateDataAndMetadataParams
- GenerateDataFuncParams
- GeneratedDataStore
- MakeGenerateConversationDataParams
- MakeGenerateLlmConversationDataParams
- MakeMongoDbCommandMetadataStoreParams
- MakeMongoDbEvaluationStoreParams
- MakeMongoDbGeneratedDataStoreParams
- MakeMongoDbReportStoreParams
- MongoDbEvaluationStore
- MongoDbGeneratedDataStore
- MongoDbReportStore
- Report
- ReportEvalFuncParams
- ReportStore
- ResponseQualityExample
Type Aliases
ConfigConstructor
Ƭ ConfigConstructor: () => Promise<EvalConfig>
Type declaration
▸ (): Promise<EvalConfig>
Returns
Promise<EvalConfig>
Defined in
packages/mongodb-chatbot-evaluation/src/EvalConfig.ts:43
ConversationTestCaseData
Ƭ ConversationTestCaseData: z.infer<typeof ConversationTestCaseDataSchema>
Defined in
packages/mongodb-chatbot-evaluation/src/generate/TestCase.ts:32
EvaluateQualityFunc
Ƭ EvaluateQualityFunc: (params: EvaluateQualityFuncParams) => Promise<EvalResult>
Type declaration
▸ (params): Promise<EvalResult>
Parameters
Name | Type |
---|---|
params | EvaluateQualityFuncParams |
Returns
Promise<EvalResult>
Defined in
packages/mongodb-chatbot-evaluation/src/evaluate/EvaluateQualityFunc.ts:10
GenerateDataFunc
Ƭ GenerateDataFunc: (params: GenerateDataFuncParams) => Promise<{ failedCases: SomeTestCase[]; generatedData: SomeGeneratedData[] }>
Type declaration
▸ (params): Promise<{ failedCases: SomeTestCase[]; generatedData: SomeGeneratedData[] }>
Parameters
Name | Type |
---|---|
params | GenerateDataFuncParams |
Returns
Promise<{ failedCases: SomeTestCase[]; generatedData: SomeGeneratedData[] }>
Defined in
packages/mongodb-chatbot-evaluation/src/generate/GenerateDataFunc.ts:16
GenerateReportAndMetadataParams
Ƭ GenerateReportAndMetadataParams: Object
Type declaration
Name | Type |
---|---|
evaluationRunId | ObjectId |
evaluationStore | EvaluationStore |
metadataStore | CommandMetadataStore |
name | string |
reportEvalFunc | ReportEvalFunc |
reportStore | ReportStore |
Defined in
packages/mongodb-chatbot-evaluation/src/report/generateReportAndMetadata.ts:10
LoadConfigArgs
Ƭ LoadConfigArgs: Object
Type declaration
Name | Type |
---|---|
config? | string |
Defined in
packages/mongodb-chatbot-evaluation/src/withConfig.ts:6
Pipeline
Ƭ Pipeline: (generate: PipelineGenerateFunc, evaluate: PipelineEvaluateFunc, report: PipelineReportFunc) => Promise<void>
Type declaration
▸ (generate, evaluate, report): Promise<void>
Parameters
Name | Type |
---|---|
generate | PipelineGenerateFunc |
evaluate | PipelineEvaluateFunc |
report | PipelineReportFunc |
Returns
Promise<void>
Defined in
packages/mongodb-chatbot-evaluation/src/Pipeline.ts:21
ReportEvalFunc
Ƭ ReportEvalFunc: (params: ReportEvalFuncParams) => Promise<Report>
Type declaration
▸ (params): Promise<Report>
Parameters
Name | Type |
---|---|
params | ReportEvalFuncParams |
Returns
Promise<Report>
Defined in
packages/mongodb-chatbot-evaluation/src/report/ReportEvalFunc.ts:12
SomeGeneratedData
Ƭ SomeGeneratedData: ConversationGeneratedData | BaseGeneratedData
Defined in
packages/mongodb-chatbot-evaluation/src/generate/GeneratedDataStore.ts:17
SomeTestCase
Ƭ SomeTestCase: ConversationTestCase | BaseTestCase
Defined in
packages/mongodb-chatbot-evaluation/src/generate/TestCase.ts:41
Variables
ConversationTestCaseDataSchema
• Const ConversationTestCaseDataSchema: ZodObject<{ expectation: ZodOptional<ZodString>; expectedLinks: ZodOptional<ZodArray<ZodString, "many">>; messages: ZodArray<ZodObject<{ content: ZodString; role: ZodEnum<["assistant", "user"]> }, "strip", ZodTypeAny, { content: string; role: "assistant" | "user" }, { content: string; role: "assistant" | "user" }>, "many">; name: ZodString; skip: ZodOptional<ZodBoolean>; tags: ZodOptional<ZodArray<ZodString, "many">> }, "strip", ZodTypeAny, { expectation?: string; expectedLinks?: string[]; messages: { role: "assistant" | "user"; content: string }[]; name: string; skip?: boolean; tags?: string[] }, { expectation?: string; expectedLinks?: string[]; messages: { role: "assistant" | "user"; content: string }[]; name: string; skip?: boolean; tags?: string[] }>
Defined in
packages/mongodb-chatbot-evaluation/src/generate/TestCase.ts:8
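The object shape this schema infers can be sketched with a plain TypeScript type and a hypothetical test case (the type is restated locally here for illustration; in real usage, import the schema and the inferred ConversationTestCaseData type from the package):

```typescript
// Shape inferred from ConversationTestCaseDataSchema, restated as a local type.
type Message = { role: "assistant" | "user"; content: string };

interface ConversationTestCaseData {
  name: string;
  messages: Message[];
  expectation?: string;
  expectedLinks?: string[];
  skip?: boolean;
  tags?: string[];
}

// Hypothetical test case matching the schema; all values are illustrative.
const testCase: ConversationTestCaseData = {
  name: "aggregation-basics",
  messages: [
    { role: "user", content: "How do I group documents in an aggregation pipeline?" },
  ],
  expectation: "Explains the $group stage with an example",
  expectedLinks: ["aggregation"],
  tags: ["aggregation"],
};

console.log(testCase.name);
```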
mongodbResponseQualityExamples
• Const mongodbResponseQualityExamples: ResponseQualityExample[]
Defined in
packages/mongodb-chatbot-evaluation/src/evaluate/checkResponseQuality.ts:128
Functions
checkResponseQuality
▸ checkResponseQuality(«destructured»): Promise<CheckResponseQuality>
Parameters
Name | Type |
---|---|
«destructured» | CheckResponseQualityParams |
Returns
Promise<CheckResponseQuality>
Defined in
packages/mongodb-chatbot-evaluation/src/evaluate/checkResponseQuality.ts:27
convertCommandRunMetadataToJson
▸ convertCommandRunMetadataToJson(command): Object
Parameters
Name | Type |
---|---|
command | CommandRunMetadata |
Returns
Object
Name | Type |
---|---|
_id | string |
command | "generate" | "evaluate" | "report" |
endTime | number |
name | string |
startTime | number |
Defined in
packages/mongodb-chatbot-evaluation/src/CommandMetadataStore.ts:16
evaluateConversationAverageRetrievalScore
▸ evaluateConversationAverageRetrievalScore(params): Promise<EvalResult>
Evaluate average similarity score for the retrieved context information.
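The averaging step can be sketched as follows (a minimal re-implementation of the statistic only, not the library's source, which also handles retrieving the scores and building the EvalResult):

```typescript
// Minimal sketch: average the similarity scores of the retrieved context chunks.
// The real function takes EvaluateQualityFuncParams and resolves to an EvalResult.
function averageRetrievalScore(scores: number[]): number {
  if (scores.length === 0) return 0; // no retrieved context → score of 0
  return scores.reduce((sum, score) => sum + score, 0) / scores.length;
}

console.log(averageRetrievalScore([1, 0.5, 0])); // 0.5
```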
Parameters
Name | Type |
---|---|
params | EvaluateQualityFuncParams |
Returns
Promise<EvalResult>
Defined in
packages/mongodb-chatbot-evaluation/src/evaluate/EvaluateQualityFunc.ts:10
evaluateExpectedLinks
▸ evaluateExpectedLinks(params): Promise<EvalResult>
Evaluates whether the final assistant message contains the expected links.
Skips if there are no expectedLinks in the test case data.
The evaluation checks whether each actual link contains one of the expectedLinks values as a substring. This allows the base URL of a link to change, which can happen if the documentation you're testing against is versioned, so that the link might update.
For example, if expectedLinks includes ["link1", "link2"], this would match the actual links ["https://mongodb.com/foo/v1/link1", "https://docs.mongodb.com/foo/v2/link2"].
The eval result is the portion of the expectedLinks that are present in the final assistant message.
For example, if the expectedLinks are ["link1", "link2"] and the final assistant message only contains ["link1"], the eval result is .5.
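The substring matching and scoring described above can be sketched like this (an illustrative re-implementation, not the library's source):

```typescript
// An expected link "matches" if any actual link contains it as a substring,
// which tolerates changes to the base URL (e.g. versioned docs).
// Returns the portion of expectedLinks found among the actual links.
function scoreExpectedLinks(expectedLinks: string[], actualLinks: string[]): number {
  const matched = expectedLinks.filter((expected) =>
    actualLinks.some((actual) => actual.includes(expected))
  );
  return matched.length / expectedLinks.length;
}

console.log(
  scoreExpectedLinks(["link1", "link2"], ["https://mongodb.com/foo/v1/link1"])
); // 0.5 — only "link1" is present
```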
Parameters
Name | Type |
---|---|
params | EvaluateQualityFuncParams |
Returns
Promise<EvalResult>
Defined in
packages/mongodb-chatbot-evaluation/src/evaluate/EvaluateQualityFunc.ts:10
generateDataAndMetadata
▸ generateDataAndMetadata(«destructured»): Promise<{ failedCases: SomeTestCase[]; generatedData: SomeGeneratedData[]; metadata: { _id: ObjectId = runId; command: "generate" = "generate"; endTime: Date; name: string; startTime: Date } }>
Generate data for test cases and store metadata about the generation.
Parameters
Name | Type |
---|---|
«destructured» | GenerateDataAndMetadataParams |
Returns
Promise<{ failedCases: SomeTestCase[]; generatedData: SomeGeneratedData[]; metadata: { _id: ObjectId = runId; command: "generate" = "generate"; endTime: Date; name: string; startTime: Date } }>
Defined in
packages/mongodb-chatbot-evaluation/src/generate/generateDataAndMetadata.ts:40
generateReportAndMetadata
▸ generateReportAndMetadata(«destructured»): Promise<{ metadata: { _id: ObjectId = runId; command: "report" = "report"; endTime: Date; name: string; startTime: Date }; report: Report }>
Parameters
Name | Type |
---|---|
«destructured» | GenerateReportAndMetadataParams |
Returns
Promise<{ metadata: { _id: ObjectId = runId; command: "report" = "report"; endTime: Date; name: string; startTime: Date }; report: Report }>
Defined in
packages/mongodb-chatbot-evaluation/src/report/generateReportAndMetadata.ts:19
getConversationsTestCasesFromYaml
▸ getConversationsTestCasesFromYaml(yamlData): ConversationTestCase[]
Get conversation test cases from YAML file. Throws if the YAML is not correctly formatted.
Parameters
Name | Type |
---|---|
yamlData | string |
Returns
ConversationTestCase[]
Defined in
packages/mongodb-chatbot-evaluation/src/generate/getConversationsTestCasesFromYaml.ts:12
isConversationTestCase
▸ isConversationTestCase(testCase): testCase is ConversationTestCase
Parameters
Name | Type |
---|---|
testCase | SomeTestCase |
Returns
testCase is ConversationTestCase
Defined in
packages/mongodb-chatbot-evaluation/src/generate/TestCase.ts:43
loadConfig
▸ loadConfig(«destructured»): Promise<EvalConfig>
Parameters
Name | Type |
---|---|
«destructured» | LoadConfigArgs |
Returns
Promise<EvalConfig>
Defined in
packages/mongodb-chatbot-evaluation/src/withConfig.ts:10
makeEvaluateConversationFaithfulness
▸ makeEvaluateConversationFaithfulness(«destructured»): EvaluateQualityFunc
Evaluates whether the assistant's response is faithful to the retrieved context information.
Wraps the LlamaIndex.ts FaithfulnessEvaluator.
Note that in our testing experience, results vary based on the large language model used in the evaluator. For example, on a dataset of ~50 questions, the OpenAI GPT-3.5 model had a higher incidence of false negatives, while the OpenAI GPT-4 model had more false positives.
All evaluation results should be analyzed with a critical eye and you should thoroughly review results before making any decisions based on them.
Parameters
Name | Type |
---|---|
«destructured» | MakeEvaluatorParams |
Returns
EvaluateQualityFunc
Defined in
packages/mongodb-chatbot-evaluation/src/evaluate/evaluateConversationFaithfulness.ts:28
makeEvaluateConversationQuality
▸ makeEvaluateConversationQuality(«destructured»): EvaluateQualityFunc
Constructs an EvaluateQualityFunc that evaluates the quality of a conversation using an OpenAI ChatGPT LLM.
The returned EvalResult has the following properties:
- In EvalResult.result, 1 if the conversation meets quality standards and 0 if it does not.
- In EvalResult.metadata, the reason for the result, as generated by the LLM.
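For illustration, a sketch of consuming such an EvalResult (the field names follow the description above; the library's actual EvalResult interface may include additional fields):

```typescript
// Local sketch of the EvalResult fields described above.
interface EvalResult {
  result: number; // 1 if the conversation meets quality standards, 0 if not
  metadata: { reason: string }; // LLM-generated explanation for the result
}

// Summarize a result as pass/fail with the LLM's reasoning on failure.
function summarize(evalResult: EvalResult): string {
  return evalResult.result === 1 ? "PASS" : `FAIL: ${evalResult.metadata.reason}`;
}

console.log(
  summarize({ result: 0, metadata: { reason: "Response contradicts the expectation." } })
); // FAIL: Response contradicts the expectation.
```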
Parameters
Name | Type |
---|---|
«destructured» | EvaluateConversationQualityParams |
Returns
EvaluateQualityFunc
Defined in
packages/mongodb-chatbot-evaluation/src/evaluate/evaluateConversationQuality.ts:43
makeEvaluateConversationRelevancy
▸ makeEvaluateConversationRelevancy(«destructured»): EvaluateQualityFunc
Evaluates whether the assistant's response is relevant to the user query.
Wraps the LlamaIndex.ts RelevancyEvaluator.
Parameters
Name | Type |
---|---|
«destructured» | MakeEvaluatorParams |
Returns
EvaluateQualityFunc
Defined in
packages/mongodb-chatbot-evaluation/src/evaluate/evaluateConversationRelevancy.ts:20
makeGenerateConversationData
▸ makeGenerateConversationData(«destructured»): GenerateDataFunc
Generate conversation data from test cases.
Parameters
Name | Type |
---|---|
«destructured» | MakeGenerateConversationDataParams |
Returns
GenerateDataFunc
Defined in
packages/mongodb-chatbot-evaluation/src/generate/generateConversationData.ts:49
makeGenerateLlmConversationData
▸ makeGenerateLlmConversationData(«destructured»): GenerateDataFunc
Generate conversation data from test cases using a large language model, not an instance of the chatbot.
This can be useful for evaluating how an LLM performs on a specific task, even before a RAG chatbot is implemented.
Parameters
Name | Type |
---|---|
«destructured» | MakeGenerateLlmConversationDataParams |
Returns
GenerateDataFunc
Defined in
packages/mongodb-chatbot-evaluation/src/generate/generateLlmConversationData.ts:43
makeMongoDbCommandMetadataStore
▸ makeMongoDbCommandMetadataStore(«destructured»): CommandMetadataStore
Parameters
Name | Type |
---|---|
«destructured» | MakeMongoDbCommandMetadataStoreParams |
Returns
CommandMetadataStore
Defined in
packages/mongodb-chatbot-evaluation/src/CommandMetadataStore.ts:36
makeMongoDbEvaluationStore
▸ makeMongoDbEvaluationStore(«destructured»): MongoDbEvaluationStore
Parameters
Name | Type |
---|---|
«destructured» | MakeMongoDbEvaluationStoreParams |
Returns
MongoDbEvaluationStore
Defined in
packages/mongodb-chatbot-evaluation/src/evaluate/EvaluationStore.ts:52
makeMongoDbGeneratedDataStore
▸ makeMongoDbGeneratedDataStore(«destructured»): MongoDbGeneratedDataStore
Parameters
Name | Type |
---|---|
«destructured» | MakeMongoDbGeneratedDataStoreParams |
Returns
MongoDbGeneratedDataStore
Defined in
packages/mongodb-chatbot-evaluation/src/generate/GeneratedDataStore.ts:69
makeMongoDbReportStore
▸ makeMongoDbReportStore(«destructured»): MongoDbReportStore
Parameters
Name | Type |
---|---|
«destructured» | MakeMongoDbReportStoreParams |
Returns
MongoDbReportStore
Defined in
packages/mongodb-chatbot-evaluation/src/report/ReportStore.ts:39
reportAverageScore
▸ reportAverageScore(params): Promise<Report>
Report the average score of an evaluation run.
Parameters
Name | Type |
---|---|
params | ReportEvalFuncParams |
Returns
Promise<Report>
Defined in
packages/mongodb-chatbot-evaluation/src/report/ReportEvalFunc.ts:12
reportStatsForBinaryEvalRun
▸ reportStatsForBinaryEvalRun(params): Promise<Report>
Reports the pass percentage of a binary evaluation (evals that return either 1 or 0) for the eval run evaluationRunId.
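The pass-percentage statistic can be sketched as follows (illustrative only; the real function reads EvalResult documents for the run from the EvaluationStore and writes a Report):

```typescript
// Pass percentage of a binary eval run: the fraction of results equal to 1.
function passPercentage(results: number[]): number {
  const passes = results.filter((result) => result === 1).length;
  return passes / results.length;
}

console.log(passPercentage([1, 0, 1, 1])); // 0.75
```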
Parameters
Name | Type |
---|---|
params | ReportEvalFuncParams |
Returns
Promise<Report>
Defined in
packages/mongodb-chatbot-evaluation/src/report/ReportEvalFunc.ts:12
runPipeline
▸ runPipeline(«destructured»): Promise<void>
Runs a pipeline of commands from a configuration file. This is a useful utility for chaining a group of commands.
For example, you may want to generate one set of conversations, then generate N evaluations for each conversation, and then M reports for each evaluation.
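The generate → evaluate → report chaining pattern can be sketched with stubbed stages (illustrative only; real usage passes RunPipelineParams built from your EvalConfig):

```typescript
// Stubbed sketch of the pipeline stages runPipeline orchestrates.
async function sketchPipeline(): Promise<{ averageScore: number }> {
  // generate: produce conversation data (stubbed)
  const generated = ["conversation-1", "conversation-2"];
  // evaluate: score each generated conversation (stubbed binary eval)
  const evaluations = generated.map((conversation) => ({ conversation, result: 1 }));
  // report: aggregate the evaluation results into a summary statistic
  const averageScore =
    evaluations.reduce((sum, e) => sum + e.result, 0) / evaluations.length;
  return { averageScore };
}

sketchPipeline().then((report) => console.log(report.averageScore)); // 1
```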
Parameters
Name | Type |
---|---|
«destructured» | RunPipelineParams |
Returns
Promise<void>
Defined in
packages/mongodb-chatbot-evaluation/src/Pipeline.ts:53
stringifyConversation
▸ stringifyConversation(messages): string
Parameters
Name | Type |
---|---|
messages | { content: string; role: string }[] |
Returns
string
Defined in
packages/mongodb-chatbot-evaluation/src/evaluate/stringifyConversation.ts:1
withConfig
▸ withConfig<T>(action, args): Promise<void>
Type parameters
Name |
---|
T |
Parameters
Name | Type |
---|---|
action | (config: EvalConfig, args: T) => Promise<void> |
args | LoadConfigArgs & T |
Returns
Promise<void>
Defined in
packages/mongodb-chatbot-evaluation/src/withConfig.ts:62
withConfigOptions
▸ withConfigOptions<T>(args): Argv<T & LoadConfigArgs>
Applies config options to a CLI command.
Type parameters
Name |
---|
T |
Parameters
Name | Type |
---|---|
args | Argv<T> |
Returns
Argv<T & LoadConfigArgs>