
AI Evaluation Platform

Zeno is an interactive platform for evaluating any AI system. You can explore your data, discover failures, and create and share interactive evaluation reports.



Explore data and model outputs with customizable views for any data type


Interactively discover, test and save model behavior for analysis and updates


Create exportable visualizations and charts comparing models and slices

Explore your data

Zeno's modular instance view can be extended to render any data type and model output

Image Classification
Audio Transcription
Activity Recognition
Your custom data type

Create interactive reports

Track and compare performance across slices and models

Slices created in the Exploration page can be used to build interactive visualizations for deeper analyses of model behavior. Visualizations include bar charts for comparing slice performance across models and trend tables for detecting regressions in slice performance.

Zeno charts can be exported as PDFs or PNGs for sharing with other stakeholders, or shared as links for live views of model performance.

Extend Zeno with the Python API

Add new models, metrics, and metadata columns with the Python API

The Python API is used to add models, metrics, and new metadata columns to Zeno.

@model functions wrap Python libraries such as PyTorch, TensorFlow, Keras, and Hugging Face to get model predictions.

@metric functions calculate metrics on slices of data.

@distill functions derive new metadata columns from your data instances.

Audio transcription using the OpenAI Whisper model
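import os

import whisper
from zeno import ZenoOptions, model


@model
def load_model(model_path):
    # model_path is unused here: the "tiny" Whisper checkpoint is loaded by name.
    whisper_model = whisper.load_model("tiny")

    def pred(df, ops: ZenoOptions):
        # Build the full path to each audio file
        files = [os.path.join(ops.data_path, f) for f in df[ops.data_column]]
        return [whisper_model.transcribe(f)["text"] for f in files]

    return pred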
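
The other two function types described above, @distill and @metric, can be pictured with a small sketch in plain Python. It deliberately does not use Zeno's decorators or signatures: the function names, the row layout, and the toy data are all assumptions made for illustration. It shows a distill-style step deriving a metadata value per instance, a slice built from that value, and a metric-style step computed on each slice.

```python
from typing import Dict, List

# Hypothetical row layout: one dict per data instance (not Zeno's format).
Row = Dict[str, str]


def word_count(row: Row) -> int:
    """Distill-style step: derive a metadata value (words in the label)."""
    return len(row["label"].split())


def accuracy(rows: List[Row]) -> float:
    """Metric-style step: fraction of rows whose output equals the label."""
    if not rows:
        return 0.0
    correct = sum(1 for row in rows if row["output"] == row["label"])
    return correct / len(rows)


# Toy data; every value here is made up for illustration.
rows: List[Row] = [
    {"label": "hello world", "output": "hello world"},
    {"label": "good morning", "output": "good evening"},
    {"label": "zeno", "output": "zeno"},
    {"label": "see you later today", "output": "see you later"},
]

# Compute the derived metadata value for every instance.
word_counts = [word_count(row) for row in rows]

# A slice is a filtered subset of the data: short vs. long utterances.
short_slice = [row for row, n in zip(rows, word_counts) if n <= 2]
long_slice = [row for row, n in zip(rows, word_counts) if n > 2]

short_accuracy = accuracy(short_slice)
long_accuracy = accuracy(long_slice)
```

Comparing short_accuracy with long_accuracy is the kind of per-slice comparison that the charts described earlier would plot across models.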