
Using Zeno

We recommend you watch the following video for a quick overview of Zeno's features.


What follows is a written walkthrough of Zeno's features, using a dataset of customer service emails as the running example.

Data and Model Output Exploration

When you first open Zeno, you will see your data instances on the right-hand side and metadata distributions on the left-hand side.

Zeno Screenshot

The metadata distributions on the left show summary visualizations of columns in your dataset:

Metadata view

To filter your data down to the examples that have a particular value for a feature, use the control that matches the feature type:

  • Textual Features: Type a value in the text box and click "set". You can also switch on regular-expression matching or case sensitivity by pressing the buttons next to the text box.
  • Numerical Features: Drag the slider to select a range of values.
  • Categorical Features: Click on a specific value in the bar chart (not displayed above).

For instance, to find all examples that are 100 characters or shorter and contain the string "sorry", you can filter on the "label" and "label_length" features; the examples displayed on the right update to match.
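
Conceptually, combining the two filters keeps only the instances that satisfy both conditions. The sketch below illustrates that logic in plain Python; it is not Zeno's API. The sample rows and the `matches_filters` helper are hypothetical, and only the "label" and "label_length" names come from the walkthrough.

```python
import re

# Hypothetical rows; only the column names come from the walkthrough.
rows = [
    {"label": "Sorry for the delay, your refund is on its way."},
    {"label": "Thanks for reaching out! " + "We are looking into it. " * 10},
    {"label": "Your order has shipped."},
    {"label": "We are sorry to hear that. " * 5},
]

# Derive the numerical feature from the text, as a metadata column would.
for row in rows:
    row["label_length"] = len(row["label"])


def matches_filters(row, pattern="sorry", max_length=100, case_sensitive=False):
    """Return True if the row passes both the text filter and the range filter."""
    flags = 0 if case_sensitive else re.IGNORECASE
    has_text = re.search(pattern, row["label"], flags) is not None
    return has_text and row["label_length"] <= max_length


filtered = [row for row in rows if matches_filters(row)]
for row in filtered:
    print(row["label_length"], row["label"])
```

Toggling `case_sensitive` or passing a regular expression as `pattern` mirrors the buttons next to the text box; the last sample row contains "sorry" but is dropped because it exceeds the length limit.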

Once you've found a subset of the data that you're interested in, you can save it for future analysis and monitoring by clicking the "Create a new Slice" button:

Slice Creation

You can also arrange slices into folders for easier browsing.

Slicing becomes very powerful if you get creative with the features and patterns that you use! To add new features, implement them and add them to the config.py file of the example that you're using.
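
A feature is, in essence, a function that maps each instance to a value you can later filter or slice on. The sketch below shows three such functions in plain Python. The function names and the length thresholds are assumptions made for illustration, and the way features are registered with Zeno inside config.py is deliberately not shown, so check the Zeno documentation for the actual mechanism.

```python
# Hypothetical feature functions: each maps one label string to one value.
def label_length(label):
    """Numerical feature: number of characters in the label."""
    return len(label)


def length_bucket(label, short_max=50, long_min=150):
    """Categorical feature: bucket the label by length (thresholds are assumptions)."""
    n = len(label)
    if n <= short_max:
        return "short"
    if n >= long_min:
        return "long"
    return "medium"


def contains_apology(label):
    """Boolean feature: whether the label appears to contain an apology."""
    text = label.lower()
    return "sorry" in text or "apolog" in text


# Made-up labels, used only to show the shape of the output.
labels = ["Sorry about that!", "We apologize for the inconvenience. " * 3, "Done."]
features = [
    {
        "label_length": label_length(s),
        "length_bucket": length_bucket(s),
        "contains_apology": contains_apology(s),
    }
    for s in labels
]
for row in features:
    print(row)
```

Numerical features like `label_length` lend themselves to slider filters, while categorical ones like `length_bucket` map naturally onto slices such as "short", "medium", and "long".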

Chart Building

Once you have some models to compare and some slices to compare them on, you can start building interactive charts to summarize model performance. To do this, click on the "Charts" button on the left of the page:

Chart tab

This will take you to a page that shows all of your created charts:

Charts

You can create new charts by interactively selecting slices, metrics, and models. For example, you can create a chart comparing model performance across instances with short, medium, or long ground truth labels:
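
Behind a chart like this is a simple aggregation: one metric value per combination of model and slice. The following sketch shows that computation in plain Python, assuming the metric is a per-instance score that is averaged. The model names, scores, and the `mean_by_model_and_slice` helper are made up for illustration and are not part of Zeno.

```python
from collections import defaultdict

# Hypothetical per-instance results: (model, slice, metric value).
results = [
    ("model_a", "short", 0.9), ("model_a", "short", 0.8),
    ("model_a", "medium", 0.6), ("model_a", "medium", 0.8),
    ("model_a", "long", 0.4),
    ("model_b", "short", 0.7), ("model_b", "short", 0.9),
    ("model_b", "medium", 0.5), ("model_b", "medium", 0.7),
    ("model_b", "long", 0.6),
]


def mean_by_model_and_slice(records):
    """Average the metric for every (model, slice) pair."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of scores, count]
    for model, slice_name, score in records:
        entry = totals[(model, slice_name)]
        entry[0] += score
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


chart = mean_by_model_and_slice(results)
for slice_name in ("short", "medium", "long"):
    cells = ", ".join(
        f"{model}={chart[(model, slice_name)]:.2f}" for model in ("model_a", "model_b")
    )
    print(f"{slice_name:>6}: {cells}")
```

Each printed row corresponds to one group of bars in the chart, with one bar per model.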

Chart example

Qualitative Comparison

One final handy feature of Zeno is the ability to compare the outputs of two models on the same examples. You can do this by clicking on the qualitative comparison button:

Qualitative comparison

On that page, you can select the two models you want to compare side-by-side and the metric you'd like to compare them by. Here we choose gpt-3.5-turbo and vicuna and compare them according to the bert_score metric.

Comparison

You can also sort the outputs by the difference in scores between the two systems by clicking on the header of the "difference" column. This allows you to find examples where one of the two systems produces much better outputs than the other, such as the one below, where one model suddenly went off track and produced an incomprehensible output.
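
The sorting step can be pictured as computing a per-example score difference and ordering by it. The sketch below does this in plain Python; the scores are invented for illustration, the `with_difference` helper is hypothetical, and the sign convention (first model minus second) is an assumption rather than a statement about how Zeno computes the column.

```python
# Hypothetical per-example scores for the two systems under one metric.
examples = [
    {"id": 1, "gpt-3.5-turbo": 0.91, "vicuna": 0.89},
    {"id": 2, "gpt-3.5-turbo": 0.88, "vicuna": 0.31},
    {"id": 3, "gpt-3.5-turbo": 0.42, "vicuna": 0.85},
]


def with_difference(rows, model_a, model_b):
    """Return copies of the rows with a 'difference' field (model_a minus model_b)."""
    return [{**row, "difference": row[model_a] - row[model_b]} for row in rows]


scored = with_difference(examples, "gpt-3.5-turbo", "vicuna")

# Largest positive difference first: the first model did much better.
descending = sorted(scored, key=lambda row: row["difference"], reverse=True)
# Most negative difference first: the second model did much better.
ascending = sorted(scored, key=lambda row: row["difference"])

for row in descending:
    print(row["id"], round(row["difference"], 2))
```

Looking at both ends of the sorted list surfaces the cases where either system clearly outperforms the other, while examples near the middle are those where the two are roughly tied.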

Example find