Agent as Judge
How to use an agent as a judge
Lytix supports evaluating agentic workflows. This means not only do we evaluate the input/output of the flow, but also can pass in a data source (e.g. repository) to further evaluate the output.
🚨 Prerequisite Login & setup your lytix
CLI here.
Create a Test Suite
The first step is to create a test suite. This is a group of tests that will be run together.
Update config.json
After creating the test suite, you’ll need to update the config.json
file to define what repository you’d like to evaluate.
Note We currently only support public GitHub repositories. Please reach out to support@lytix.com if you’d like to evaluate a private repository.
Create a Test in a Suite
Update start.py
To collect test data, we want to allow full flexibility. Thus, we create a start.py
file in the folder of the test. This file will be executed to collect the data.
The only requirement is that the start.py
file prints the following JSON object to stdout:
Where messages
is an array of {role: "user" | "assistant", content: "..."}
and sources
is an array of file paths that contain the data we want to evaluate the output against.
Note Currently we only support a single user
message.