You can execute Prompt Experiments in the ABV UI to test different prompt versions from Prompt Management or language models and compare the results side-by-side. Optionally, you can use LLM-as-a-Judge Evaluators to automatically score the responses based on the expected outputs to further analyze the results on an aggregate level.
Why use Prompt Experiments?
- Quickly test different prompt versions or models
- Structure your prompt testing by using a dataset to test different prompt versions and models
- Quickly iterate on prompts through Dataset Runs
- Optionally use LLM-as-a-Judge Evaluators to score the responses based on the expected outputs from the dataset
- Prevent regressions by running tests when making prompt changes
Prerequisites
1) Create a usable prompt
Create a prompt that you want to test and evaluate. How to create a prompt?
A prompt is usable when its prompt variables match the JSON keys of the dataset items you will use.
2) Create a usable dataset
Create a dataset with the inputs and expected outputs you want to use for your prompt experiments. How to create a dataset?
A dataset is usable when: (1) the dataset items have JSON objects as input and (2) these objects have JSON keys that match the prompt variables of the prompt(s) you will use.
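To make the two conditions concrete, here is a minimal sketch in Python. It assumes prompt variables are written as `{{name}}` placeholders, which is an assumption about the template syntax and not something this page states. The prompt text, the dataset items, and the helper names `prompt_variables` and `is_usable_item` are all hypothetical and are not part of any ABV API.

```python
import re

# Assumption: prompt variables use a {{name}} placeholder syntax.
VARIABLE_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def prompt_variables(prompt: str) -> set:
    """Return the set of variable names referenced in the prompt text."""
    return set(VARIABLE_PATTERN.findall(prompt))


def is_usable_item(prompt: str, item_input: object) -> bool:
    """Check both conditions for a single dataset item.

    (1) The input must be a JSON object (a dict once parsed).
    (2) The object must provide a key for every prompt variable.
    Extra keys are treated as harmless here, which is also an assumption.
    """
    if not isinstance(item_input, dict):
        return False
    return prompt_variables(prompt).issubset(item_input.keys())


# Hypothetical prompt and dataset item inputs.
prompt = "What is the capital of {{country}}? Answer in {{language}}."
matching_input = {"country": "France", "language": "English"}
wrong_keys_input = {"nation": "France"}
plain_string_input = "France"

print(is_usable_item(prompt, matching_input))      # True
print(is_usable_item(prompt, wrong_keys_input))    # False
print(is_usable_item(prompt, plain_string_input))  # False
```

Running a check like this over a dataset before starting an experiment can surface items whose keys do not line up with the prompt variables.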
3) Configure LLM connection
As your prompt will be executed for each dataset item, you need to configure an LLM connection in the project settings. How to configure an LLM connection?
4) Optional: Set up LLM-as-a-judge
You can set up an LLM-as-a-judge evaluator to score the responses based on the expected outputs. Make sure to set the target of the LLM-as-a-Judge to “Experiment runs” and filter for the dataset you want to use. How to set up LLM-as-a-judge?
Trigger a Prompt Experiment
1) Navigate to the dataset
Dataset Runs are currently started from the detail page of a dataset.
- Navigate to Your Project > Datasets
- Click on the dataset you want to start a Dataset Run for
2) Open the setup page
Click on Start Experiment to open the setup page
Click on Create below Prompt Experiment
3) Configure the Dataset Run
- Set a Dataset Run name
- Select the prompt you want to use
- Set up or select the LLM connection you want to use
- Select the dataset you want to use
- Optionally select the evaluator you want to use
- Click on Create to trigger the Dataset Run
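Conceptually, a Dataset Run executes the selected prompt once per dataset item and, if an evaluator is selected, scores each response against the expected output. The sketch below illustrates that loop in Python under stated assumptions: it is not the ABV implementation or API, the `{{name}}` placeholder syntax is assumed, `fake_model` stands in for a real LLM connection, and `exact_match_judge` is a simple string comparison standing in for an LLM-as-a-Judge evaluator. All names and data are hypothetical.

```python
import re
from statistics import mean


def fill_prompt(prompt: str, variables: dict) -> str:
    """Substitute {{name}} placeholders (assumed syntax) with item values."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: str(variables[match.group(1)]),
        prompt,
    )


def run_experiment(prompt, dataset, model, judge) -> dict:
    """Run the prompt on every dataset item and aggregate the scores."""
    rows = []
    for item in dataset:
        output = model(fill_prompt(prompt, item["input"]))
        score = judge(output, item["expected_output"])
        rows.append({"input": item["input"], "output": output, "score": score})
    return {"rows": rows, "mean_score": mean(row["score"] for row in rows)}


def fake_model(text: str) -> str:
    """Hypothetical stand-in for an LLM call; only knows one answer."""
    return "Paris" if "France" in text else "unknown"


def exact_match_judge(output: str, expected: str) -> float:
    """Hypothetical stand-in for an evaluator: 1.0 on a match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


# Hypothetical dataset with JSON-object inputs and expected outputs.
dataset = [
    {"input": {"country": "France"}, "expected_output": "Paris"},
    {"input": {"country": "Japan"}, "expected_output": "Tokyo"},
]

result = run_experiment(
    "What is the capital of {{country}}?", dataset, fake_model, exact_match_judge
)
print(result["mean_score"])  # 0.5
```

Running the same loop with a second prompt version or model and comparing the per-item rows and the mean score mirrors the side-by-side comparison the UI provides.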