Example Use Cases
Explore evaluation setups for a range of AI applications. For each use case we show how to load data, calculate metrics and features, and perform evaluation using Zeno.
Each use case has two buttons to explore further. The first takes you to an online example of the project in Zeno. The second links to the full reproducible Jupyter notebook for the example.
Transcribe audio into text. In this example, we compare the performance of different transcription models on the Speech Accent Archive, a dataset of diverse English speakers saying the same phrase.
Evaluate LLMs on tax questions. In this example, we'll upload already generated evaluation results from the taxeval project.
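
For the transcription use case above, comparing models usually comes down to scoring each model's output against a reference transcript. The sketch below shows one common choice, word error rate (WER), written in plain Python. It is an assumption that WER is the metric of interest here, since this page does not name one, and the function name `word_error_rate`, the model names, and the sample transcripts are all hypothetical, made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference and model outputs, for illustration only.
reference = "please call stella"
outputs = {
    "model_a": "please call stella",
    "model_b": "please call stellar",
}
scores = {name: word_error_rate(reference, text) for name, text in outputs.items()}
print(scores)
```

A per-sample score like this is the kind of value that can be stored as a column next to each model output and then aggregated per model. Lower is better, and a value above 1.0 is possible when the hypothesis contains many inserted words.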