
orchestrator

This module contains the main user-facing entry points for running evaluations.

evaluate(predicted_records, gold_records, run_config)

Purpose:

  • select the evaluator class matching run_config.task_type
  • run the evaluation
  • return a ResultBundle

Parameters:

  • predicted_records: list of Pydantic models representing predictions
  • gold_records: list of Pydantic models representing gold data
  • run_config: RunConfig controlling task type and comparison behavior

Returns:

  • ResultBundle

Error conditions:

  • unsupported task_type raises ValueError
  • indexed task types raise ValueError if index_key_name is missing
  • single-feature evaluation raises ValueError if more than one feature rule is supplied

Side effects:

  • none
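The dispatch that evaluate() performs can be sketched with stdlib stand-ins. The stub TaskType enum and the mapping below are illustrative only (the real evaluator classes live in extraction_testing); the sketch shows the selection logic and the ValueError raised for an unsupported task type:

```python
from enum import Enum

# Stand-in for extraction_testing's TaskType; member names mirror the source.
class TaskType(Enum):
    MULTI_ENTITY = "multi_entity"
    SINGLE_ENTITY = "single_entity"
    SINGLE_FEATURE = "single_feature"

# Hypothetical mapping from task type to evaluator class name, standing in
# for the if/elif chain in evaluate().
EVALUATOR_BY_TASK_TYPE = {
    TaskType.MULTI_ENTITY: "MultiEntityExtractionTest",
    TaskType.SINGLE_ENTITY: "SingleEntityExtractionTest",
    TaskType.SINGLE_FEATURE: "SingleFeatureExtractionTest",
}

def select_evaluator_name(task_type):
    """Mirror evaluate()'s dispatch: unknown task types raise ValueError."""
    try:
        return EVALUATOR_BY_TASK_TYPE[task_type]
    except KeyError:
        raise ValueError(f"Unsupported task type: {task_type}")
```

In the real function the selected class is instantiated with run_config and its test() method produces the ResultBundle.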

build_run_context(run_config)

Purpose:

  • create a RunContext for logging and traceability

Returns:

  • RunContext containing a run identifier, start timestamp, and configuration hash

Side effects:

  • none
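A minimal self-contained sketch of what the returned RunContext carries, assuming plausible stand-ins for the project's timestamp_string and hash_configuration helpers (their exact formats are assumptions, not the library's API):

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime

@dataclass
class RunContext:
    run_identifier: str
    started_at: str
    configuration_hash: str

def timestamp_string():
    # Assumption: a compact, sortable timestamp serves as the run identifier.
    return datetime.now().strftime("%Y%m%d-%H%M%S")

def hash_configuration(config):
    # Hash a canonical JSON rendering so identical configs hash identically.
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def build_run_context(config):
    """Sketch of build_run_context over a plain dict instead of RunConfig."""
    return RunContext(
        run_identifier=timestamp_string(),
        started_at=datetime.now().isoformat(timespec="seconds"),
        configuration_hash=hash_configuration(config),
    )
```

Hashing a canonically serialized config makes the hash stable across runs, which is what allows it to be used for traceability.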

Generated API details

build_run_context(run_config)

Create a run context with identifier, timestamp, and config hash.

Source code in src/extraction_testing/orchestrator.py
def build_run_context(run_config: RunConfig) -> RunContext:
    """Create a run context with identifier, timestamp, and config hash."""
    run_identifier_value = timestamp_string()
    started_at_timestamp_value = datetime.now().isoformat(timespec="seconds")
    configuration_hash_value = hash_configuration(model_to_dict(run_config))
    return RunContext(run_identifier_value, started_at_timestamp_value, configuration_hash_value)

evaluate(predicted_records, gold_records, run_config)

Convenience entry point to evaluate based on task type.

Source code in src/extraction_testing/orchestrator.py
def evaluate(predicted_records: List[BaseModel], gold_records: List[BaseModel], run_config: RunConfig) -> ResultBundle:
    """Convenience entry point to evaluate based on task type."""
    if run_config.task_type == TaskType.MULTI_ENTITY:
        tester = MultiEntityExtractionTest(run_config)
    elif run_config.task_type == TaskType.SINGLE_ENTITY:
        tester = SingleEntityExtractionTest(run_config)
    elif run_config.task_type == TaskType.SINGLE_FEATURE:
        tester = SingleFeatureExtractionTest(run_config)
    else:
        raise ValueError(f"Unsupported task type: {run_config.task_type}")
    return tester.test(predicted_records, gold_records)