You can execute Prompt Experiments in the ABV UI to test different prompt versions from Prompt Management or different language models and compare the results side-by-side. Optionally, you can use LLM-as-a-Judge Evaluators to automatically score the responses based on the expected outputs, which lets you analyze the results at an aggregate level.
Example: Prompt Variables & Dataset Item Keys Mapping
Prompt:
You are an ABV expert. Answer based on:
{{documentation}}

Question: {{question}}
Dataset Item:
{ "documentation": "ABV is an LLM Engineering Platform", "question": "What is ABV?"}
In this example:
The prompt variable {{documentation}} maps to the JSON key "documentation"
The prompt variable {{question}} maps to the JSON key "question"
Both keys must exist in the dataset item’s input JSON for the experiment to run successfully
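To make this mapping concrete, here is a minimal Python sketch of how {{variable}} placeholders could be resolved against a dataset item's input JSON. The substitution logic is illustrative only and is not ABV's actual implementation.
Example (Python):
import re

def resolve_prompt(template: str, item_input: dict) -> str:
    """Replace each {{variable}} in the prompt with the matching dataset item key.

    Raises a KeyError if the dataset item is missing a key the prompt expects,
    which mirrors why every prompt variable must exist in the item's input JSON.
    """
    def replace(match: re.Match) -> str:
        key = match.group(1).strip()
        if key not in item_input:
            raise KeyError(f"Dataset item is missing prompt variable: {key}")
        return str(item_input[key])

    return re.sub(r"\{\{(.*?)\}\}", replace, template)

prompt = "You are an ABV expert. Answer based on:\n{{documentation}}\n\nQuestion: {{question}}"
item = {
    "documentation": "ABV is an LLM Engineering Platform",
    "question": "What is ABV?",
}
print(resolve_prompt(prompt, item))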
Example: Chat Message Placeholder Mapping
In addition to variables, you can also map placeholders in chat message prompts to dataset item keys.
This is useful when the dataset item also contains, for example, a chat message history that should be inserted into the prompt.
Your chat prompt needs to contain a placeholder with a name. Variables within placeholders are not resolved.
Chat Prompt:
Placeholder named: message_history
Dataset Item:
{ "message_history": [ { "role": "user", "content": "What is ABV?" }, { "role": "assistant", "content": "ABV is a tool for tracking and analyzing the performance of language models." } ], "question": "What is ABV?"}
In this example:
The chat prompt placeholder message_history maps to the JSON key "message_history".
The prompt variable {{question}} maps to the JSON key "question" in a variable not within a placeholder message.
Both keys must exist in the dataset item’s input JSON for the experiment to run successfully
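As an illustration of how a named placeholder differs from a regular variable, here is a minimal Python sketch. The chat prompt representation (messages plus a {"type": "placeholder", "name": ...} entry) is an assumption for this example, not ABV's internal format.
Example (Python):
def resolve_chat_prompt(messages: list[dict], item_input: dict) -> list[dict]:
    """Expand placeholder entries into the chat messages stored under the matching key.

    Regular messages are kept as-is; {{variable}} substitution inside placeholder
    messages is intentionally not performed, mirroring the behavior described above.
    """
    resolved = []
    for message in messages:
        if message.get("type") == "placeholder":
            # The dataset item key must hold a list of chat messages.
            resolved.extend(item_input[message["name"]])
        else:
            resolved.append(message)
    return resolved

chat_prompt = [
    {"role": "system", "content": "You are an ABV expert."},
    {"type": "placeholder", "name": "message_history"},
    {"role": "user", "content": "{{question}}"},  # resolved separately as a regular variable
]
item = {
    "message_history": [
        {"role": "user", "content": "What is ABV?"},
        {"role": "assistant", "content": "ABV is a tool for tracking and analyzing the performance of language models."},
    ],
    "question": "What is ABV?",
}
print(resolve_chat_prompt(chat_prompt, item))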
Create a dataset with the inputs and expected outputs you want to use for your prompt experiments. How to create a dataset?
A dataset is usable for prompt experiments when: (1) its dataset items have JSON objects as input and (2) those objects contain JSON keys that match the prompt variables of the prompt(s) you will use. See the example above.
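Before starting an experiment, it can help to check that every dataset item satisfies this requirement. The following Python sketch is illustrative only; the item shape ({"input": ..., "expected_output": ...}) is assumed for this example.
Example (Python):
import re

def check_dataset_items(prompt_template: str, items: list[dict]) -> None:
    """Warn about dataset items whose input JSON would not satisfy the prompt variables."""
    required = set(re.findall(r"\{\{(.*?)\}\}", prompt_template))
    for index, item in enumerate(items):
        item_input = item.get("input")
        if not isinstance(item_input, dict):
            print(f"Item {index}: input is not a JSON object")
            continue
        missing = required - set(item_input)
        if missing:
            print(f"Item {index}: missing keys {sorted(missing)}")

prompt = "You are an ABV expert. Answer based on:\n{{documentation}}\n\nQuestion: {{question}}"
items = [
    {"input": {"documentation": "ABV is an LLM Engineering Platform", "question": "What is ABV?"},
     "expected_output": "ABV is an LLM Engineering Platform."},
    {"input": {"question": "What is ABV?"}},  # missing "documentation"
]
check_dataset_items(prompt, items)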
As your prompt will be executed for each dataset item, you need to configure an LLM connection in the project settings. How to configure an LLM connection?
You can set up an LLM-as-a-judge evaluator to score the responses based on the expected outputs. Make sure to set the target of the LLM-as-a-Judge to “Experiment runs” and filter for the dataset you want to use. How to set up LLM-as-a-judge?
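For intuition, here is a minimal sketch of what an LLM-as-a-Judge evaluation does conceptually: a judge prompt compares the experiment output with the expected output and returns a score. The judge prompt wording and the stubbed call_llm function are illustrative assumptions, not ABV's built-in evaluator.
Example (Python):
import json

JUDGE_PROMPT = """Compare the model output to the expected output.
Expected: {expected}
Output: {output}
Reply with JSON: {{"score": <0.0-1.0>, "reasoning": "<short explanation>"}}"""

def call_llm(prompt: str) -> str:
    # Stubbed for illustration; a real evaluator would call the configured LLM connection.
    return '{"score": 1.0, "reasoning": "Output matches the expected answer."}'

def judge(output: str, expected: str) -> dict:
    reply = call_llm(JUDGE_PROMPT.format(expected=expected, output=output))
    return json.loads(reply)

print(judge("ABV is an LLM Engineering Platform.", "ABV is an LLM Engineering Platform"))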
Set up or select the LLM connection you want to use
Select the dataset you want to use
Optionally select the evaluator you want to use
Click on Create to trigger the Dataset Run
This will trigger the Dataset Run and you will be redirected to the Dataset Runs page. The run might take a few seconds or minutes to complete depending on the prompt complexity and dataset size.