Getting started#

Installation#

FlexEval is available on PyPI as python-flexeval.

Install using pip:

pip install python-flexeval

Install using poetry:

poetry add python-flexeval

Install using uv:

uv add python-flexeval

Usage#

Create and run an evaluation:

import flexeval
from flexeval.schema import Eval, EvalRun, FileDataSource, Metrics, FunctionItem, Config

data_sources = [FileDataSource(path="vignettes/conversations.jsonl")]
eval = Eval(metrics=Metrics(function=[FunctionItem(name="flesch_reading_ease")]))
config = Config(clear_tables=True)
eval_run = EvalRun(
    data_sources=data_sources,
    database_path="eval_results.db",
    eval=eval,
    config=config,
)
flexeval.run(eval_run)

This example computes Flesch reading ease for every turn in a list of conversations provided in JSONL format. The metric values are stored in an SQLite database called eval_results.db.

The basic approach:
  • Create an Eval defining the functions and metrics that should be computed on the inputs.

  • Create an EvalRun defining the input and output data sources.

  • Invoke run() to execute the evaluation.

For more information about using FlexEval, continue on to the User guide.

For usage examples, consult the Vignettes.