Custom rubrics in FlexEval

In addition to using the built-in rubrics, you can write your own.

The eval_run.yaml schema used in Basic CLI usage shows an example EvalRun using a custom rubric.

The easiest way to provide custom rubrics is in a YAML file; here’s vignettes/custom_rubrics.yaml:

assistant_asks_a_question:
  notes: |-
    Rubric to ensure the assistant asks follow-up questions.
    The contents of this note aren't part of the prompt in any way; this is just a convenient place to write documentation.
  prompt: |-
    Your Role:
      You are a helpful assistant. You have solid knowledge in K-12 math instruction. 

    Context:
      A K-12 student learns math using an online tutoring system. 
      During the session, the student (user) asks the tutor (assistant) for help with some math problems. 

    Your Task:
      The tutor (assistant) is supposed to provide explicit follow-up or clarification questions in response to the student.
      Your job is to determine whether the tutor (assistant) asked a question in their response. 

    Data:
      The following contains messages from the tutor (assistant).
      
      [BEGIN DATA]
      ***
      {content}
      ***
      [END DATA]

    __start rubric__
    YES: If the message(s) contain a question in response to the student.
    NO: If the message has no question.

    Note:
    If there is no question, then print "NO".
    __end rubric__

    Output:
      First, report your reasoning for your decision. 
      Second, print your decision.
      IMPORTANT: After your reasoning, print the choice string of "YES" or "NO" on a separate line with NO OTHER TEXT on that line.

  choice_scores:
    "YES": 1
    "NO": 0

You then need to specify the path to that YAML file in your EvalRun configuration:

rubric_paths:
 - vignettes/custom_rubrics.yaml

Then, you can use those custom rubrics in your rubrics definition.
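Before pointing an EvalRun at a rubric file, it can help to sanity-check each entry. The following is a minimal sketch, not part of FlexEval: `check_rubric` and `ALLOWED_PARAMS` are hypothetical names, and the input is assumed to be the plain dictionary you would get after loading the YAML above with a YAML parser.

```python
import string

# Hypothetical helper, not a FlexEval API. The allowed names are the
# template parameters listed on this page.
ALLOWED_PARAMS = {"context", "content"}


def check_rubric(name, rubric):
    """Return a list of human-readable problems found in one rubric entry."""
    problems = []

    prompt = rubric.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append(f"{name}: missing or empty 'prompt'")
        prompt = ""

    scores = rubric.get("choice_scores")
    if not isinstance(scores, dict) or not scores:
        problems.append(f"{name}: missing or empty 'choice_scores'")
        scores = {}

    for choice in scores:
        # Unquoted YES/NO keys become booleans under YAML 1.1 parsers,
        # which is why the example file quotes them.
        if not isinstance(choice, str):
            problems.append(f"{name}: choice {choice!r} is not a string")
            continue
        # A choice the prompt never mentions is unlikely to be printed.
        if choice not in prompt:
            problems.append(f"{name}: choice {choice!r} not mentioned in prompt")

    # Collect every {placeholder} that appears in the prompt template.
    try:
        fields = {f for _, f, _, _ in string.Formatter().parse(prompt) if f}
    except ValueError as exc:
        problems.append(f"{name}: malformed template ({exc})")
        fields = set()
    for field in sorted(fields - ALLOWED_PARAMS):
        problems.append(f"{name}: unsupported template parameter {{{field}}}")

    return problems
```

An empty list means the entry passed these checks; it says nothing about whether the prompt itself is a good rubric.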

Writing rubrics

Rubrics consist of a prompt and a set of choice_scores.

choice_scores maps each output string the LLM may print to the numeric score that output receives.
prompt is a template that will be formatted and passed to the rubric LLM (see Eval) for scoring.
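
To make the role of choice_scores concrete, here is a rough sketch of how a rubric LLM's reply could be turned into a score. It assumes the reply follows the example prompt's instruction to print the choice alone on its own line; `score_completion` is a hypothetical function, and FlexEval's actual parsing may differ.

```python
def score_completion(completion, choice_scores):
    """Find the last line that is exactly one choice string.

    Returns (choice, score), or (None, None) when no line matches.
    Illustrative only; not FlexEval's implementation.
    """
    for line in reversed(completion.splitlines()):
        candidate = line.strip()
        if candidate in choice_scores:
            return candidate, choice_scores[candidate]
    return None, None


choice_scores = {"YES": 1, "NO": 0}
reply = "The tutor asks the student what x equals.\nYES"
choice, score = score_completion(reply, choice_scores)
```

Scanning from the last line means reasoning text that happens to mention a choice earlier in the reply does not override the final decision.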

See the Rubric Guide for additional information on writing rubrics.

Supported template parameters

The following parameters can be used in a rubric prompt; each is replaced with its value when the prompt is formatted:

context
content

In the future, we hope to support templated inputs from other metrics.
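
As an illustration of the substitution, the sketch below fills the two supported placeholders in a prompt template. It is not FlexEval's code: `fill_prompt` is a hypothetical helper, and it assumes `{content}` carries the message(s) being evaluated (as in the example rubric) and `{context}` carries any preceding conversation.

```python
import re


def fill_prompt(template, **values):
    """Replace {context} and {content} in a single pass.

    Placeholders without a supplied value are left untouched, and text
    inserted from a value is never rescanned for further placeholders.
    """
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))

    return re.sub(r"\{(context|content)\}", substitute, template)


template = "[BEGIN DATA]\n***\n{content}\n***\n[END DATA]"
filled = fill_prompt(template, content="Can you tell me what you tried so far?")
```

A single-pass regular expression is used here instead of `str.format` so that other braces in a prompt, or braces inside a student's message, do not raise errors or get substituted by accident.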
