flexeval.run_utils

Utilities for the runner.

Functions

_cleanup_stale_dataset(dataset)

Delete a partially-loaded Dataset and its child rows.

build_eval_set_run(runner)

build_evalsetrun_datasets(evalrun, evalsetrun)

create_dataset(data_source)

find_dataset_by_name(name)

Return the loaded Dataset with this name, or None if no such dataset exists.

load_datasets(evalrun)

set_datasets_for_evalsetrun(datasets, evalsetrun)

flexeval.run_utils.build_eval_set_run(runner: EvalRunner) → EvalSetRun
flexeval.run_utils.build_evalsetrun_datasets(evalrun: EvalRun, evalsetrun: EvalSetRun) → list[Dataset]
flexeval.run_utils.create_dataset(data_source: DataSource) → Dataset
flexeval.run_utils.find_dataset_by_name(name: str) → Dataset | None

Return the loaded Dataset with this name, or None if no such dataset exists.

If a Dataset with this name exists but is not marked is_loaded (the remnant of a prior load that crashed), it is treated as stale: it is cleaned up via _cleanup_stale_dataset() and None is returned, so the caller can proceed as if no dataset existed.

Raises:

ValueError – If more than one Dataset has this name, or if a stale unloaded Dataset has derived rows (metrics or eval-run links) that suggest a genuine integrity problem; see _cleanup_stale_dataset().
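
The lookup rules above can be illustrated with a minimal in-memory sketch. This is not the flexeval implementation: the `Dataset` dataclass, its `child_rows` and `derived_rows` fields, and the `store` list standing in for the database are all hypothetical, and the real functions take no `store` argument. Only the control flow (duplicate check, stale cleanup, `None` on a missing or stale dataset) follows the description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare by identity, like distinct database rows
class Dataset:
    """Hypothetical stand-in for the real Dataset model."""
    name: str
    is_loaded: bool = False
    child_rows: list = field(default_factory=list)    # assumed: partially loaded rows
    derived_rows: list = field(default_factory=list)  # assumed: metrics / eval-run links


def cleanup_stale_dataset(store: list, dataset: Dataset) -> None:
    """Delete a partially loaded dataset and its child rows (sketch)."""
    if dataset.derived_rows:
        # Derived rows on an unloaded dataset suggest an integrity problem.
        raise ValueError(f"Stale dataset {dataset.name!r} has derived rows")
    dataset.child_rows.clear()
    store.remove(dataset)


def find_dataset_by_name(store: list, name: str) -> Optional[Dataset]:
    """Return the loaded dataset called `name`, or None if there is none."""
    matches = [d for d in store if d.name == name]
    if len(matches) > 1:
        raise ValueError(f"Multiple datasets named {name!r}")
    if not matches:
        return None
    dataset = matches[0]
    if not dataset.is_loaded:
        # Remnant of a crashed load: clean up and report "not found".
        cleanup_stale_dataset(store, dataset)
        return None
    return dataset


# Usage: a find-or-create pattern on the caller's side.
store = [Dataset("train", is_loaded=True), Dataset("dev", child_rows=["row-1"])]
existing = find_dataset_by_name(store, "dev")  # stale, so it is removed
if existing is None:
    store.append(Dataset("dev", is_loaded=True))  # caller reloads from scratch
```

Because a stale dataset is reported as `None`, the caller needs only one branch for both "never loaded" and "load crashed partway", which is the point of treating the remnant as if it did not exist.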

flexeval.run_utils.load_datasets(evalrun: EvalRun) → list[Dataset]
flexeval.run_utils.set_datasets_for_evalsetrun(datasets: list[Dataset], evalsetrun: EvalSetRun)