Evaluate a Single-Entity Task¶
Scenario¶
You have one structured entity per record and each entity has several fields. A common example is extracting multiple fields from one article, form, or document.
Runnable example¶
from typing import Optional
from pydantic import BaseModel
from extraction_testing import FeatureRule, RunConfig, TaskType, evaluate
class ArticleRecord(BaseModel):
row_identifier: int
headline_text: str
author_name: str
publish_date: Optional[str]
predicted_records = [
ArticleRecord(
row_identifier=1,
headline_text="breaking market rally",
author_name="J. Smith",
publish_date="2024-06-02",
),
ArticleRecord(
row_identifier=2,
headline_text="Local Sports Win",
author_name="Alex Coach",
publish_date="2024-06-05",
),
]
gold_records = [
ArticleRecord(
row_identifier=1,
headline_text="Breaking: Market Rally",
author_name="John Smith",
publish_date="2024-06-01",
),
ArticleRecord(
row_identifier=2,
headline_text="Local Sports Win",
author_name="A. Coach",
publish_date="2024-06-01",
),
]
run_config = RunConfig(
task_type=TaskType.SINGLE_ENTITY,
feature_rules=[
FeatureRule(feature_name="headline_text", feature_type="text"),
FeatureRule(
feature_name="author_name",
feature_type="text",
alias_map={"J. Smith": "John Smith", "Alex Coach": "A. Coach"},
),
FeatureRule(
feature_name="publish_date",
feature_type="date",
date_tolerance_days=1,
),
],
index_key_name="row_identifier",
)
result_bundle = evaluate(predicted_records, gold_records, run_config)
print(result_bundle.per_feature_metrics_data_frame)
print(result_bundle.total_metrics_data_frame)
print("row_accuracy:", result_bundle.row_accuracy_value)
What to expect¶
- the two rows are aligned by
row_identifier - headline and author should match on both rows after normalization and aliasing
publish_dateshould match on row1because it is within one daypublish_dateshould fail on row2because it is four days away
That means:
row_accuracy_valueshould be0.5, because only the first row is fully correctpublish_dateshould have weaker metrics thanheadline_textandauthor_name
When this guide applies¶
Use this workflow when:
- each record maps to one known entity
- you need per-field scoring across several fields
- alignment should happen by an explicit key, not similarity matching