A dataset is a collection of inputs and expected outputs and is used to test your application. Before executing your first dataset run, you need to create a dataset.

Why use datasets?

Datasets are a prerequisite for Dataset Runs: they serve as the data input of each run
Create test cases for your application with real production traces
Collaboratively create and collect dataset items with your team
Have a single source of truth for your test data

Get Started

1) Creating a dataset

Datasets have a name which is unique within a project.

ABV UI

Navigate to Your Project > Datasets
Click on + New dataset to create a new dataset.

Python SDK

Install package

pip install abvdev

from abvdev import ABV

abv = ABV(
    api_key="sk-abv-...", # your api key here
    host="https://app.abv.dev", # host="https://eu.app.abv.dev", for EU region
)

abv.create_dataset(
    name="<dataset_name>",
    # optional description
    description="My first dataset",
    # optional metadata
    metadata={
        "author": "Alice",
        "date": "2025-01-01",
        "type": "benchmark"
    }
)

See Python SDK docs for details on how to initialize the Python client.
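To avoid hard-coding the API key, the client arguments can be read from the environment. The following is a minimal sketch, not the SDK's documented behavior: it assumes the variable names `ABV_API_KEY` and `ABV_BASEURL` from the JS/TS section are also what you export for Python, and `abv_kwargs_from_env` is a hypothetical helper name.

```python
import os
from typing import Mapping

US_HOST = "https://app.abv.dev"


def abv_kwargs_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Build keyword arguments for ABV(...) from environment variables.

    Assumption: the variable names mirror the ones used in the JS/TS
    section; this helper reads them explicitly rather than relying on
    the SDK to pick them up.
    """
    api_key = env.get("ABV_API_KEY")
    if not api_key:
        raise RuntimeError("ABV_API_KEY is not set")
    # Fall back to the US host when no base URL is configured.
    return {"api_key": api_key, "host": env.get("ABV_BASEURL", US_HOST)}


# Usage (requires the abvdev package and real credentials):
# from abvdev import ABV
# abv = ABV(**abv_kwargs_from_env())
```

Passing the mapping as a parameter keeps the helper easy to test without touching the real environment.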

JS/TS SDK

npm i @abvdev/client

Environment variables: add your ABV credentials as environment variables, for example with a .env file and the dotenv package to load the values.

npm install dotenv

.env

ABV_API_KEY="sk-abv-..."
ABV_BASEURL="https://app.abv.dev" # US region
# ABV_BASEURL="https://eu.app.abv.dev" # EU region

import { ABVClient } from "@abvdev/client";
 
const abv = new ABVClient();

Alternatively, pass the credentials as constructor parameters:

import { ABVClient } from "@abvdev/client";
 
const abv = new ABVClient({
  apiKey: "sk-abv-...",
  baseUrl: "https://app.abv.dev", // US region
  // baseUrl: "https://eu.app.abv.dev", // EU region
});

Create dataset

import { ABVClient } from "@abvdev/client"
import dotenv from "dotenv";
dotenv.config();
 
const abv = new ABVClient()
 
async function main(){
  await abv.api.datasets.create({
    name: "dataset1", // set dataset name
    // optional description
    description: "My first dataset",
    // optional metadata
    metadata: {
      author: "Alice",
      date: "2025-01-01",
      type: "benchmark",
    },
  });
}

main();

See JS/TS SDK docs for details on how to initialize the JS/TS client.

2) Create new dataset items

Dataset items can be added to a dataset by providing the input and optionally the expected output.

ABV UI

Add item - Add item manually via UI
Import CSV - Import CSV file
Add from trace - Add from the trace view

Python SDK

from abvdev import ABV

abv = ABV(
    api_key="sk-abv-...", # your api key here
    host="https://app.abv.dev", # host="https://eu.app.abv.dev", for EU region
)

abv.create_dataset_item(
    dataset_name="<dataset_name>",
    # any python object or value, optional
    input={
        "text": "hello world"
    },
    # any python object or value, optional
    expected_output={
        "text": "hello world"
    },
    # metadata, optional
    metadata={
        "model": "llama3",
    }
)

See Python SDK docs for details on how to initialize the Python client.
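When many items already live in a spreadsheet, the same `create_dataset_item` call can be driven from a CSV file. This is a rough sketch under assumptions: the column names `input` and `expected_output` and the JSON-encoded cells are a hypothetical layout, not the format expected by the UI's Import CSV feature, and `items_from_csv` is an illustrative helper.

```python
import csv
import io
import json
from typing import Iterator, TextIO


def items_from_csv(csv_file: TextIO, dataset_name: str) -> Iterator[dict]:
    """Yield one create_dataset_item(...) kwargs dict per CSV row.

    Assumed (hypothetical) layout: an "input" column and an optional
    "expected_output" column, each holding a JSON value.
    """
    for row in csv.DictReader(csv_file):
        item = {"dataset_name": dataset_name, "input": json.loads(row["input"])}
        expected = (row.get("expected_output") or "").strip()
        if expected:
            # Only set expected_output when the cell is not empty.
            item["expected_output"] = json.loads(expected)
        yield item


# In-memory sample standing in for a real file opened with open(path, newline="").
sample = io.StringIO(
    'input,expected_output\n'
    '"{""text"": ""hello world""}","{""text"": ""hello world""}"\n'
    '"{""text"": ""no label yet""}",\n'
)
items = list(items_from_csv(sample, "<dataset_name>"))

# With a real client: for item in items: abv.create_dataset_item(**item)
```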

JS/TS SDK

import { ABVClient } from "@abvdev/client";
import dotenv from "dotenv";
dotenv.config();
 
const abv = new ABVClient();
 
async function main() {
  await abv.api.datasetItems.create({
    datasetName: "dataset1",
    // any JS object or value
    input: {
      text: "hello world",
    },
    // any JS object or value, optional
    expectedOutput: {
      text: "hello world",
    },
    // metadata, optional
    metadata: {
      model: "llama3",
    },
  });
}

main();

See JS/TS SDK docs for details on how to initialize the JS/TS client.

Create synthetic datasets

To bootstrap a dataset, you often want synthetic examples to test your application with. LLMs are well suited to generating these: prompt them for common questions or tasks your application should handle.
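One way to sketch this in Python is shown below. The `generate` callable is a hypothetical wrapper around whichever LLM you use (prompt in, text out), and the prompt wording, the `topics` list, and the `source` metadata key are illustrative assumptions; only the keyword names match the `create_dataset_item` call shown earlier.

```python
from typing import Callable, Iterable, List


def synthetic_items(
    dataset_name: str,
    topics: Iterable[str],
    generate: Callable[[str], str],
) -> List[dict]:
    """Turn LLM-generated questions into create_dataset_item(...) kwargs.

    `generate` is a hypothetical wrapper around whichever LLM you use:
    it takes a prompt and returns the generated text.
    """
    items = []
    for topic in topics:
        question = generate(f"Write one realistic user question about {topic}.").strip()
        if not question:
            continue  # skip empty generations
        items.append(
            {
                "dataset_name": dataset_name,
                "input": {"text": question},
                # Mark the item so synthetic data stays distinguishable.
                "metadata": {"source": "synthetic", "topic": topic},
            }
        )
    return items


# With a real client and LLM wrapper:
# for item in synthetic_items("<dataset_name>", ["billing", "login"], my_llm):
#     abv.create_dataset_item(**item)
```

Tagging synthetic items in the metadata makes it possible to tell them apart from items collected from production later on.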

Create items from production data

A common workflow is to select production traces where the application did not perform as expected. Then you let an expert add the expected output to test new versions of your application on the same data.

ABV UI

In the UI, use + Add to dataset on any observation (span, event, generation) of a production trace.

Python SDK

from abvdev import ABV

abv = ABV(
    api_key="sk-abv-...", # your api key here
    host="https://app.abv.dev", # host="https://eu.app.abv.dev", for EU region
)

abv.create_dataset_item(
    dataset_name="<dataset_name>",
    input={ "text": "hello world" },
    expected_output={ "text": "hello world" },
    # link to a trace
    source_trace_id="<trace_id>",
    # optional: link to a specific span, event, or generation
    source_observation_id="<observation_id>"
)
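The selection step of this workflow can be sketched as a small filter. How you export traces is outside the scope of this page, so the trace shape here (a dict with `id`, `input`, and a numeric `score`) and the `max_score` threshold are purely hypothetical; only `dataset_name`, `input`, and `source_trace_id` come from the call above.

```python
from typing import Iterable, List


def items_from_failed_traces(
    dataset_name: str, traces: Iterable[dict], max_score: float = 0.5
) -> List[dict]:
    """Build create_dataset_item(...) kwargs for low-scoring traces.

    Each trace is assumed (hypothetically) to be a dict with "id",
    "input", and a numeric "score"; adapt this to your own export.
    """
    return [
        {
            "dataset_name": dataset_name,
            "input": trace["input"],
            # expected_output is omitted so an expert can add it later.
            "source_trace_id": trace["id"],
        }
        for trace in traces
        if trace["score"] <= max_score
    ]


# With a real client:
# for item in items_from_failed_traces("<dataset_name>", exported_traces):
#     abv.create_dataset_item(**item)
```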

JS/TS SDK

import { ABVClient } from "@abvdev/client";
import dotenv from "dotenv";
dotenv.config();
 
const abv = new ABVClient();
 
async function main() {
  await abv.api.datasetItems.create({
    datasetName: "dataset1",
    input: { text: "hello world" },
    expectedOutput: { text: "hello world" },
    // link to a trace
    sourceTraceId: "<trace_id>",
    // optional: link to a specific span, event, or generation
    sourceObservationId: "<observation_id>",
  });
}

main();

Edit/archive dataset items

You can edit or archive dataset items. Archiving items will remove them from future experiment runs.

ABV UI

In the UI, you can edit the item by clicking on the item id. To archive or delete the item, click on the dots next to the item and select Archive or Delete.

Python SDK

You can upsert items by providing the id of the item you want to update.

from abvdev import ABV

abv = ABV(
    api_key="sk-abv-...", # your api key here
    host="https://app.abv.dev", # host="https://eu.app.abv.dev", for EU region
)

abv.create_dataset_item(
    dataset_name="<dataset_name>",
    id="<item_id>",
    # example: update status to "ARCHIVED"
    status="ARCHIVED"
)
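To archive several items at once, the same upsert call can be wrapped in a loop. This is a minimal sketch: `archive_items` is a hypothetical helper, and `client` is assumed to be any object exposing `create_dataset_item` with the keyword arguments shown above (for example the `abv` client).

```python
from typing import Any, Iterable


def archive_items(client: Any, dataset_name: str, item_ids: Iterable[str]) -> int:
    """Archive each item by upserting it with status="ARCHIVED".

    Returns the number of upsert calls made. Duplicate ids are skipped
    so each item is sent only once.
    """
    seen = set()
    for item_id in item_ids:
        if item_id in seen:
            continue
        seen.add(item_id)
        client.create_dataset_item(
            dataset_name=dataset_name,
            id=item_id,
            status="ARCHIVED",
        )
    return len(seen)


# With a real client:
# archive_items(abv, "<dataset_name>", ["<item_id_1>", "<item_id_2>"])
```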

JS/TS SDK

You can upsert items by providing the id of the item you want to update.

import { ABVClient } from "@abvdev/client";
import dotenv from "dotenv";
dotenv.config();
 
const abv = new ABVClient();
 
async function main() {
  await abv.api.datasetItems.create({
    datasetName: "dataset1",
    id: "dataset_item_id_here",
    // example: update status to "ARCHIVED"
    status: "ARCHIVED",
  });
}

main();

Dataset runs

Once you have created a dataset, you can test and evaluate your application against it. See Native Dataset Runs and Remote Dataset Runs.
