Custom rubrics in FlexEval
In addition to the built-in rubrics, you can write your own rubrics.
The eval_run.yaml schema used in Basic CLI usage shows an example EvalRun using a custom rubric.
The easiest way to provide custom rubrics is in a YAML file; here’s vignettes/custom_rubrics.yaml:
assistant_asks_a_question:
  notes: |-
    Rubric to ensure the assistant asks follow-up questions.
    The content of this note aren't part of the prompt in any way, this is just a convenient place to write documentation.
  prompt: |-
    Your Role:
    You are a helpful assistant. You have solid knowledge in K-12 math instruction.

    Context:
    A K-12 student learns math using an online tutoring system.
    During the session, the student (user) asks the tutor (assistant) for help with some math problems.

    Your Task:
    The tutor (assistant) is supposed to provide explicit follow-up or clarification questions in response to the student.
    Your job is to determine whether the tutor (assistant) asked a question in their response.

    Data:
    The following contains messages from the tutor (assistant).

    [BEGIN DATA]
    ***
    {content}
    ***
    [END DATA]

    __start rubric__
    YES: If the message(s) contain a question in response to the student.
    NO: If the message has no question.

    Note:
    If there is no question, then print "NO".
    __end rubric__

    Output:
    First, report your reasoning for your decision.
    Second, print your decision.
    IMPORTANT: After your reasoning, print the choice string of "YES" or "NO" on a separate line with NO OTHER TEXT on that line.

  choice_scores:
    "YES": 1
    "NO": 0
You then need to specify the path to that YAML file in your EvalRun configuration:
rubric_paths:
- vignettes/custom_rubrics.yaml
Then, you can use those custom rubrics in your rubrics definition.
Writing rubrics
Rubrics consist of a prompt and a set of choice_scores.
choice_scores maps each LLM output that counts as a decision (for example "YES" or "NO") to a numeric score.
prompt is a template that will be formatted and passed to the rubric LLM (see Eval) for scoring.
See the Rubric Guide for additional information on writing rubrics.
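To make the role of choice_scores concrete, here is a minimal Python sketch of how a rubric LLM's reply could be turned into a number. It is an illustration only: `score_completion` is a hypothetical helper, not part of FlexEval, and FlexEval's own parsing may differ. It assumes the reply follows the output instructions in the example prompt, with the choice string alone on its own line.

```python
from typing import Dict, Optional


# Hypothetical helper for illustration; this is not FlexEval's API.
def score_completion(completion: str, choice_scores: Dict[str, float]) -> Optional[float]:
    """Return the score of the last line that exactly matches a choice string.

    Lines are scanned from the end because the example prompt asks for
    reasoning first and the decision last. Returns None if no line matches.
    """
    for line in reversed(completion.strip().splitlines()):
        # Tolerate surrounding whitespace and quotation marks around the choice.
        choice = line.strip().strip('"')
        if choice in choice_scores:
            return choice_scores[choice]
    return None


choice_scores = {"YES": 1, "NO": 0}
completion = "The tutor asks what the student has tried so far.\nYES"
print(score_completion(completion, choice_scores))  # prints 1
```

Matching a whole line, rather than searching for "YES" anywhere in the reply, is one reason a prompt would ask for the choice string on a separate line with no other text.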
Supported template parameters
The following parameters can be used in a rubric prompt, written in braces (for example, {content}), and are replaced when the prompt is formatted:
context
content
In the future, we hope to support templated inputs from other metrics.
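Because only these two parameters are supported, it can be useful to check a prompt for other placeholders before running an evaluation. The sketch below assumes the template uses Python `str.format`-style braces, as the `{content}` placeholder in the example suggests. `SUPPORTED_PARAMETERS` and `unsupported_parameters` are hypothetical names, not part of FlexEval, and the sample values are invented for illustration.

```python
from string import Formatter

# Assumption: the two parameters listed in this section are the full set.
SUPPORTED_PARAMETERS = {"context", "content"}


# Hypothetical helper for illustration; this is not FlexEval's API.
def unsupported_parameters(prompt: str) -> set:
    """Return placeholder names used in the prompt that are not supported."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for trailing literal text with no placeholder.
    fields = {name for _, name, _, _ in Formatter().parse(prompt) if name}
    return fields - SUPPORTED_PARAMETERS


prompt = "Conversation so far:\n{context}\n\n[BEGIN DATA]\n{content}\n[END DATA]"
print(unsupported_parameters(prompt))  # prints set()

# Filling the template by hand, with made-up values:
filled = prompt.format(
    context="Student: What is 3/4 + 1/8?",
    content="What have you tried so far?",
)
print(filled)
```

Under the same assumption, a prompt that needs literal braces would have to double them (`{{` and `}}`) so they are not read as placeholders.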