Basic CLI usage#

All of FlexEval’s features can be accessed using YAML configuration files.

Running an eval is as simple as pointing to that configuration file and using the run() command:

flexeval run vignettes/eval_run.yaml

The file vignettes/eval_run.yaml demonstrates a complete configuration using the EvalRun schema.

 1data_sources:
 2  - type: file
 3    path: vignettes/conversations.jsonl
 4database_path: eval_results.db
 5eval:
 6  metrics:
 7    function:
 8      - name: has_question_mark
 9      - name: is_role
10        kwargs:
11          role: assistant
12    rubric:
13      - name: assistant_asks_a_question
14        depends_on:
15          - name: is_role
16            kwargs:
17              role: assistant
18            metric_min_value: 1
19  grader_llm:
20    function_name: echo_completion
21    kwargs:
22      response: Reasoning.\nNO
23config:
24  clear_tables: True
25  logs_path: vignettes/log
26rubric_paths:
27 - vignettes/custom_rubrics.yaml
28function_modules:
29 - vignettes/custom_functions.py

Use the summarize_metrics() command to print a result summary:

python -m flexeval summarize-metrics vignettes/eval_run.yaml

See Metric Analysis for a more in-depth vignette analyzing FlexEval outputs.