Using Zeno
To highlight Zeno's main features and uses, we walk through an example of using it to explore a dataset of customer service emails, which you can browse in our chatbot report.
You can explore the code that created this Zeno report in the Zeno Build repository.
Data and Model Output Exploration
When you first open Zeno, you will see your data instances on the right-hand side and metadata distributions on the left-hand side.
The metadata distributions on the left show summary visualizations of columns in your dataset:
To filter your data down to examples that have a particular value for a feature, use the control that matches the feature type:
- Textual Features: Type in a value in the text box and click "set". You can also use regexes or adjust case-sensitivity by pressing the buttons.
- Numerical Features: Drag the slider to select a range of values.
- Categorical Features: Click on a specific value in the bar chart (not displayed above).
For instance, to find all examples that are 100 or fewer characters long and contain the string "sorry", you can filter on the "label" and "label_length" features and watch the displayed examples on the right update.
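The same filter can be written down outside the UI to make its logic explicit. This is a minimal Python sketch, not Zeno's implementation: it assumes each instance is a dict with a "label" string and that "label_length" is simply the character count of that label. The function name `matches` and the sample records are invented for illustration.

```python
import re

# Hypothetical sample instances; in the report these come from the dataset.
instances = [
    {"label": "I'm sorry to hear that. Could you share your order number?"},
    {"label": "Thanks for reaching out!"},
    {"label": "We are sorry for the delay. " + "Our team is looking into it. " * 5},
]


def matches(instance, pattern="sorry", max_length=100, case_sensitive=False):
    """Return True if the label contains `pattern` and has at most `max_length` characters."""
    label = instance["label"]
    flags = 0 if case_sensitive else re.IGNORECASE
    return len(label) <= max_length and re.search(pattern, label, flags) is not None


# Keep only short labels that contain "sorry" (case-insensitive by default).
subset = [inst for inst in instances if matches(inst)]
print(len(subset))
```

Because the pattern is passed to `re.search`, it behaves like the regex option in the text box; set `case_sensitive=True` to mimic the case-sensitivity toggle.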
Once you've found a subset of the data that you're interested in, you can save it for future analysis and monitoring by clicking the "Create a new Slice" button:
You can also arrange slices into folders for easier browsing.
This slicing is very powerful if you get creative with the features and patterns that you use! If you want to add new features, you can implement them and add them to the config.py file in the example that you're using.
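At its simplest, a feature is a function that maps one instance to a value you can filter on. The sketch below shows two such functions in plain Python. The function names, the dict layout of an instance, and the apology pattern are all assumptions made for illustration; how functions are actually registered is defined by the config.py of the example you are using, so check that file for the real mechanism.

```python
import re


def label_length(instance: dict) -> int:
    """Numerical feature: number of characters in the ground-truth label."""
    return len(instance["label"])


def contains_apology(instance: dict) -> bool:
    """Boolean feature: whether the label contains an apology word (case-insensitive)."""
    pattern = r"\b(sorry|apologi[sz]e)\b"
    return re.search(pattern, instance["label"], re.IGNORECASE) is not None


# Hypothetical instance, used only to show the functions in action.
example = {"label": "We apologize for the inconvenience."}
print(label_length(example), contains_apology(example))
```

A numerical feature like `label_length` lends itself to slider filters, while a boolean one like `contains_apology` gives you a ready-made slice.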
Chart Building
Once you have some models to compare and some slices to compare them on, you can start building interactive charts to summarize model performance. To do this, click on the "Charts" button on the left of the page:
This will take you to a page that shows all of your created charts:
You can create new charts by interactively selecting slices, metrics, and models. For example, you can create a chart comparing model performance across instances with short, medium, or long ground truth labels:
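Conceptually, such a chart is an average of one metric for every combination of model and slice. The following sketch shows that aggregation in plain Python. It is not how Zeno computes charts: the length thresholds (50 and 150 characters), the model names, and the scores are made-up assumptions.

```python
from collections import defaultdict
from statistics import mean


def length_bucket(label, short_max=50, long_min=150):
    """Assign a label to a length bucket; the thresholds are arbitrary assumptions."""
    n = len(label)
    if n <= short_max:
        return "short"
    if n >= long_min:
        return "long"
    return "medium"


def mean_score_by_slice(rows):
    """Average the metric for every (model, length bucket) pair."""
    grouped = defaultdict(list)
    for row in rows:
        key = (row["model"], length_bucket(row["label"]))
        grouped[key].append(row["score"])
    return {key: mean(scores) for key, scores in grouped.items()}


# Made-up rows for illustration only; these are not results from the report.
rows = [
    {"model": "model_a", "label": "a" * 20, "score": 0.5},
    {"model": "model_a", "label": "a" * 30, "score": 1.0},
    {"model": "model_a", "label": "a" * 200, "score": 0.25},
    {"model": "model_b", "label": "a" * 100, "score": 0.75},
]
for (model, bucket), value in sorted(mean_score_by_slice(rows).items()):
    print(model, bucket, value)
```

Each key of the returned dict corresponds to one bar or point in the chart: one model evaluated on one slice.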
Qualitative Comparison
One final handy feature of Zeno is the ability to compare the outputs of two models on the same examples. You can do this by clicking on the qualitative comparison button:
On the page, you can then select the two models you want to compare side-by-side, and select the metric you'd like to compare them by. Here we choose gpt-3.5-turbo and vicuna and compare them according to the bert_score metric.
You can also sort the outputs by the difference in scores between the two systems by clicking on the header of the "difference" column. This allows you to find examples where one of the two systems produces much better outputs than the other, such as the one below, where one model suddenly went off track and produced an incomprehensible output.
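Sorting by the "difference" column amounts to computing a per-example score gap and ordering by it. Here is a minimal sketch of that idea, assuming you already have one score per example for each model. The function name `rank_by_difference`, the descending sort direction, and the numbers are illustrative assumptions, not values from the report.

```python
def rank_by_difference(scores_a, scores_b):
    """Return (index, score_a, score_b, difference) tuples, largest difference first."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must be scored on the same examples")
    rows = [
        (i, a, b, a - b)
        for i, (a, b) in enumerate(zip(scores_a, scores_b))
    ]
    # Sort on the difference column; examples where model A wins by most come first.
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Made-up per-example metric values for two models.
model_a_scores = [0.5, 1.0, 0.25]
model_b_scores = [0.5, 0.25, 1.0]

for index, a, b, diff in rank_by_difference(model_a_scores, model_b_scores):
    print(index, a, b, diff)
```

The top of the ranking shows where the first model is much better, and the bottom shows where the second one is; both ends are good places to look for outputs that went off track.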