Evaluations
Scores Data Model
In ABV, scores are the data object used to store evaluations. They are a flexible object used across all evaluation methods (see Evaluation Overview) to assign evaluation scores to different objects across the ABV platform. At a high level, you can think of scores as the output of an evaluation method.

Scores

Scores serve as objects for storing evaluation metrics in ABV. Their core properties are:

- Scores reference a trace, observation, session, or datasetrun.
- Each score references exactly one of the above objects.
- Scores are either numeric, categorical, or boolean.
- Scores can optionally be linked to a scoreconfig to ensure they comply with a specific schema.

Common use

| Level | Description |
| --- | --- |
| trace | Used for evaluation of a single interaction (most common). |
| observation | Used for evaluation of a single observation below the trace level. |
| session | Used for comprehensive evaluation of outputs across multiple interactions. |
| dataset run | Used for performance scores of a dataset run (see Dataset Runs Data Model). |

Score object

| Attribute | Type | Description |
| --- | --- | --- |
| name | string | Name of the score, e.g. user feedback, hallucination eval. |
| value | number | Optional. Numeric value of the score. Always defined for numeric and boolean scores. Optional for categorical scores. |
| stringvalue | string | Optional. String equivalent of the score's numeric value for boolean and categorical data types. Automatically set for categorical scores based on the config if the configid is provided. |
| traceid | string | Optional. Id of the trace the score relates to. |
| observationid | string | Optional. Observation (e.g. LLM call) the score relates to. |
| sessionid | string | Optional. Id of the session the score relates to. |
| datasetrunid | string | Optional. Id of the dataset run the score relates to. |
| comment | string | Optional. Evaluation comment, commonly used for user feedback, eval output, or internal notes. |
| id | string | Unique identifier of the score. Auto-generated by SDKs. Can optionally be used as an idempotency key to update scores. |
| source | string | Automatically set based on the source of the score. Can be either api, eval, or annotation. |
| datatype | string | Automatically set based on the config data type when the configid is provided. Otherwise can be defined manually as numeric, categorical, or boolean. |
| configid | string | Optional. Score config id to ensure that the score follows a specific schema. Can be defined in the ABV UI or via API. When provided, the score's datatype is automatically set based on the config. |

Score config

Score configs are used to ensure that your scores follow a specific schema. Using score configs lets you standardize your scoring schema across your team and keeps scores consistent and comparable for future analysis. You can define a scoreconfig in the ABV UI or via the API. Configs are immutable but can be archived (and restored at any time).

A score config includes:

- Score name
- Data type: numeric, categorical, or boolean
- Constraints on the score value range (min/max for numeric data types, custom categories for categorical data types)

Score config object

| Attribute | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier of the score config. |
| name | string | Name of the score config, e.g. user feedback, hallucination eval. |
| datatype | string | Can be either numeric, categorical, or boolean. |
| isarchived | boolean | Whether the score config is archived. Defaults to false. |
| minvalue | number | Optional. Sets the minimum value for numeric scores. If not set, the minimum value defaults to -∞. |
| maxvalue | number | Optional. Sets the maximum value for numeric scores. If not set, the maximum value defaults to +∞. |
| categories | list | Optional. Defines categories for categorical scores. List of objects with label/value pairs. |
| description | string | Optional. Provides further description of the score configuration. |
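
To make the relationship between a score and its config concrete, here is a minimal Python sketch of the checks the data model implies. It is not the ABV SDK or the platform's actual validation logic: the `Score` and `ScoreConfig` classes and the `validate_score` helper are hypothetical, the field names simply mirror the attribute tables, and the rule that boolean values must be 0 or 1 is an assumption.

```python
from dataclasses import dataclass, field
from math import inf
from typing import Dict, List, Optional


@dataclass
class ScoreConfig:
    """Hypothetical mirror of the score config object."""
    id: str
    name: str
    datatype: str                      # "numeric", "categorical", or "boolean"
    minvalue: float = -inf             # unset minimum behaves like -infinity
    maxvalue: float = inf              # unset maximum behaves like +infinity
    categories: List[Dict] = field(default_factory=list)  # [{"label": ..., "value": ...}]
    isarchived: bool = False


@dataclass
class Score:
    """Hypothetical mirror of the score object (subset of attributes)."""
    name: str
    value: Optional[float] = None
    stringvalue: Optional[str] = None
    traceid: Optional[str] = None
    observationid: Optional[str] = None
    sessionid: Optional[str] = None
    datasetrunid: Optional[str] = None
    configid: Optional[str] = None


def validate_score(score: Score, config: ScoreConfig) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []

    # Each score references exactly one trace, observation, session, or dataset run.
    targets = [score.traceid, score.observationid, score.sessionid, score.datasetrunid]
    if sum(t is not None for t in targets) != 1:
        errors.append("score must reference exactly one object")

    if config.datatype == "numeric":
        # The numeric value is always defined and must fall inside the range.
        if score.value is None:
            errors.append("numeric score requires a value")
        elif not (config.minvalue <= score.value <= config.maxvalue):
            errors.append("value outside [minvalue, maxvalue]")
    elif config.datatype == "boolean":
        # Assumption: boolean scores are encoded as 0 or 1.
        if score.value not in (0, 1):
            errors.append("boolean score requires value 0 or 1")
    elif config.datatype == "categorical":
        # The string value must match one of the configured category labels.
        labels = {c["label"] for c in config.categories}
        if score.stringvalue not in labels:
            errors.append("stringvalue is not a configured category")
    else:
        errors.append("unknown datatype: " + config.datatype)

    return errors
```

For example, a `ScoreConfig` with `datatype="numeric"`, `minvalue=0`, and `maxvalue=1` accepts a trace-level score with `value=0.5`, but reports a problem for `value=2` or for a score that sets both `traceid` and `sessionid`.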