Supported Field Types

Dataclasses

from dataclasses import dataclass

@dataclass
class Product:
    name: str
    price: float
    tags: list[str]

    @property
    def display_price(self) -> str:
        return f"${self.price:.2f}"

extractor = MetadataExtractor()
metadata = extractor.extract(Product)

assert metadata["name"].field_type == str
assert metadata["price"].numeric is True
assert metadata["tags"].multivalued is True
assert metadata["tags"].items_type == str
assert metadata["display_price"].computed is True
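
As a rough illustration of how declared fields and computed properties can be told apart, the sketch below uses only the standard library. It is not the library's implementation; `list_members` is a hypothetical helper introduced here for illustration.

```python
import inspect
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Product:
    name: str
    price: float

    @property
    def display_price(self) -> str:
        return f"${self.price:.2f}"


def list_members(cls: type) -> dict[str, bool]:
    """Map each member name to True if it is a computed property.

    Hypothetical helper, not part of the library.
    """
    if not is_dataclass(cls):
        raise TypeError(f"{cls.__name__} is not a dataclass")
    # Declared dataclass fields are stored data, so they are not computed.
    members = {f.name: False for f in fields(cls)}
    # Properties defined on the class are treated as computed members.
    for name, _ in inspect.getmembers(cls, lambda m: isinstance(m, property)):
        members[name] = True
    return members


members = list_members(Product)
```

Here `members` maps `name` and `price` to `False` and `display_price` to `True`, mirroring the `computed` flag in the assertions above.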

Pydantic Models

from pydantic import BaseModel, computed_field

class User(BaseModel):
    username: str
    email: str

    @computed_field
    @property
    def email_domain(self) -> str:
        return self.email.split("@")[1]

metadata = MetadataExtractor().extract(User)
assert metadata["email_domain"].computed is True

Literal Types

Literal types are unwrapped to the type of their values, and the categorical classification follows that type:

from typing import Literal

@dataclass
class Task:
    status: Literal["pending", "done"]    # → str, categorical
    priority: Literal[1, 2, 3]           # → int, categorical
    score: Literal[1.0, 2.0]             # → float, non-categorical
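
The unwrapping described above can be approximated with `typing.get_origin` and `typing.get_args`. This is a minimal sketch under the assumption that all values of a Literal share one type; `unwrap_literal` is a hypothetical name, and the handling of mixed value types is an assumption rather than documented library behavior.

```python
from typing import Literal, get_args, get_origin


def unwrap_literal(annotation):
    """Return the shared type of a Literal's values (hypothetical helper)."""
    if get_origin(annotation) is not Literal:
        return annotation  # not a Literal, nothing to unwrap
    value_types = {type(v) for v in get_args(annotation)}
    if len(value_types) != 1:
        # Assumption: mixed value types are rejected in this sketch.
        raise TypeError(f"mixed value types in {annotation!r}")
    return value_types.pop()


status_type = unwrap_literal(Literal["pending", "done"])
priority_type = unwrap_literal(Literal[1, 2, 3])
score_type = unwrap_literal(Literal[1.0, 2.0])
```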

Nested Structures

The library traverses nested structures and builds dot-separated paths automatically:

@dataclass
class Department:
    name: str
    budget: float

@dataclass
class Company:
    name: str
    departments: list[Department]

metadata = MetadataExtractor().extract(Company)

assert "departments.name" in metadata
assert "departments.budget" in metadata
assert metadata["departments.name"].parent_field == metadata["departments"]
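
To make the path-building idea concrete, here is a standard-library sketch that walks nested dataclasses and produces dot-separated paths. It only looks through `list[...]` and is not the library's traversal; `collect_paths` is a hypothetical helper.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import get_args, get_origin, get_type_hints


@dataclass
class Department:
    name: str
    budget: float


@dataclass
class Company:
    name: str
    departments: list[Department]


def collect_paths(cls: type, prefix: str = "") -> list[str]:
    """Return dot-separated paths for every field, depth first."""
    hints = get_type_hints(cls)
    paths = []
    for f in fields(cls):
        path = f"{prefix}{f.name}"
        paths.append(path)
        tp = hints[f.name]
        # Look through list[...] to reach the item type.
        if get_origin(tp) is list:
            (tp,) = get_args(tp)
        # Recurse into nested dataclasses, extending the prefix.
        if is_dataclass(tp):
            paths.extend(collect_paths(tp, prefix=f"{path}."))
    return paths


paths = collect_paths(Company)
```

The resulting list contains `departments.name` and `departments.budget` alongside the top-level `name` and `departments` entries.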

Optional and Union Types

from typing import Optional

@dataclass
class Config:
    timeout: int | None        # optional=True, field_type=int
    retries: Optional[int]     # optional=True, field_type=int

A union of more than one non-None type (e.g. int | str) raises InvalidTypeUnionError.

Categorical vs Non-Categorical

Fields are automatically classified:

| Type | categorical |
| --- | --- |
| str, int, bool | True |
| float, Decimal, datetime, date, time, timedelta | False |
| Composite or NonCategorical annotation | False |
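
The classification table can be read as a small decision function. The sketch below is an approximation of that rule, not the library's code; `is_categorical` and its `non_categorical` flag (standing in for the NonCategorical annotation) are hypothetical.

```python
from dataclasses import dataclass

# Types the table lists as categorical. Membership is checked exactly,
# so subclasses are not implicitly treated as categorical.
CATEGORICAL_TYPES = (str, int, bool)


def is_categorical(field_type: type, non_categorical: bool = False) -> bool:
    """Classify a field type following the table (hypothetical helper)."""
    if non_categorical:
        # An explicit NonCategorical annotation overrides the default.
        return False
    # float, Decimal, date/time types, and composite types all fall through.
    return field_type in CATEGORICAL_TYPES


@dataclass
class Address:
    city: str


results = {
    "str": is_categorical(str),
    "float": is_categorical(float),
    "composite": is_categorical(Address),
    "annotated_int": is_categorical(int, non_categorical=True),
}
```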

Multivalued Annotations

When a field is a collection, annotations can be placed on the collection or on the items. If both are present, the annotation on the collection takes precedence:

from typing import Annotated

@dataclass
class Dataset:
    outer: Annotated[list[int], DocInfo("Outer doc")]           # doc = "Outer doc"
    inner: list[Annotated[int, DocInfo("Inner doc")]]           # doc = "Inner doc"
    both: Annotated[list[Annotated[int, DocInfo("Inner")]], DocInfo("Outer")]  # doc = "Outer" (outer wins)
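
The precedence shown in the comments can be sketched with `typing.get_type_hints(..., include_extras=True)`. The `DocInfo` class below is a local stand-in so the example runs on its own, and `resolve_doc` is a hypothetical helper; the library's actual resolution logic may differ.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin, get_type_hints


@dataclass(frozen=True)
class DocInfo:
    """Local stand-in for the library's DocInfo annotation."""
    text: str


@dataclass
class Dataset:
    outer: Annotated[list[int], DocInfo("Outer doc")]
    inner: list[Annotated[int, DocInfo("Inner doc")]]
    both: Annotated[list[Annotated[int, DocInfo("Inner")]], DocInfo("Outer")]


def _first_doc(annotation):
    """Return (doc or None, unwrapped type) for one annotation level."""
    if get_origin(annotation) is Annotated:
        base, *extras = get_args(annotation)
        docs = [e.text for e in extras if isinstance(e, DocInfo)]
        return (docs[0] if docs else None), base
    return None, annotation


def resolve_doc(annotation):
    """Prefer the collection-level doc, then fall back to the item-level doc."""
    outer_doc, collection = _first_doc(annotation)
    if outer_doc is not None:
        return outer_doc
    if get_origin(collection) is list:
        inner_doc, _ = _first_doc(get_args(collection)[0])
        return inner_doc
    return None


hints = get_type_hints(Dataset, include_extras=True)
docs = {name: resolve_doc(tp) for name, tp in hints.items()}
```

Checking the outer level first is what makes the collection-level annotation win when both are present.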