Data specifications by type hints
Typespecs is a lightweight Python library that leverages typing.Annotated to embed, extract, and manage metadata (such as units, categories, and descriptions) directly within your data structures.
It keeps your code clean by binding specifications directly to your type hints.
The extracted specifications are returned as a transparent subclass of pandas.DataFrame, making it instantly compatible with the rich PyData ecosystem.
pip install typespecs

You can attach metadata to your class fields using Annotated and the typespecs.Spec object.
The Spec object acts as a read-only dictionary, ensuring your metadata remains immutable and safe from runtime modifications.
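To illustrate what "read-only dictionary" means in practice, here is a minimal sketch using the standard library's types.MappingProxyType. It is only an analogy for the behavior described above, not how Spec is implemented, and the spec variable is a hypothetical stand-in.

```python
from types import MappingProxyType

# Hypothetical stand-in for a spec: a plain dict wrapped in a read-only view.
spec = MappingProxyType({"category": "data", "name": "Temperature", "units": "K"})

# Reading works like a normal dictionary.
units = spec["units"]

# Writing raises TypeError because the view does not support item assignment.
try:
    spec["units"] = "degC"
    mutated = True
except TypeError:
    mutated = False
```

Because the mapping cannot be changed after creation, the same metadata object can be shared safely between several annotations.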
Once your data structure is defined, use typespecs.from_annotated to parse the instance and extract both the actual data and its associated metadata into a DataFrame object.
from dataclasses import dataclass
from typespecs import ITSELF, Spec, from_annotated
from typing import Annotated as Ann, TypeVar


@dataclass
class Weather:
    temp: Ann[list[float], Spec(category="data", name="Temperature", units="K")]
    wind: Ann[list[float], Spec(category="data", name="Wind speed", units="m/s")]
    loc: Ann[str, Spec(category="info", name="Observed location")]


weather = Weather([273.15, 280.15], [5.0, 10.0], "Tokyo")
specs = from_annotated(weather)
print(specs)

      category              data               name           type  units
temp      data  [273.15, 280.15]        Temperature    list[float]      K
wind      data       [5.0, 10.0]         Wind speed    list[float]    m/s
loc       info             Tokyo  Observed location  <class 'str'>   <NA>
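For readers curious about the underlying mechanism, the sketch below shows how metadata attached with Annotated can be read back using only the standard library. It is not the typespecs implementation: plain dicts stand in for Spec objects, and collect_metadata is a hypothetical helper that returns a nested dict rather than a DataFrame.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin, get_type_hints


@dataclass
class Weather:
    # Plain dicts stand in for typespecs.Spec objects in this sketch.
    temp: Annotated[list[float], {"name": "Temperature", "units": "K"}]
    loc: Annotated[str, {"name": "Observed location"}]


def collect_metadata(obj):
    """Map each field name to its value, bare type, and attached metadata."""
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    hints = get_type_hints(type(obj), include_extras=True)
    rows = {}
    for field, hint in hints.items():
        row = {"data": getattr(obj, field), "type": hint}
        if get_origin(hint) is Annotated:
            # get_args returns (bare_type, metadata_1, metadata_2, ...).
            bare, *extras = get_args(hint)
            row["type"] = bare
            for extra in extras:
                if isinstance(extra, dict):
                    row.update(extra)
        rows[field] = row
    return rows


rows = collect_metadata(Weather([273.15, 280.15], "Tokyo"))
```

Each entry of rows corresponds loosely to one row of the table above, with keys that a field does not define (such as units for loc) simply absent.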
Typespecs simplifies working with nested types.
You can easily create reusable type aliases with built-in specifications.
Furthermore, by using the special typespecs.ITSELF object, the library dynamically captures the subtype (e.g., float in list[float]) as a metadata value.
T = TypeVar("T")
Dtype = Ann[T, Spec(dtype=ITSELF)]


@dataclass
class Weather:
    temp: Ann[list[Dtype[float]], Spec(category="data", name="Temperature", units="K")]
    wind: Ann[list[Dtype[float]], Spec(category="data", name="Wind speed", units="m/s")]
    loc: Ann[str, Spec(category="info", name="Observed location")]


weather = Weather([273.15, 280.15], [5.0, 10.0], "Tokyo")
specs = from_annotated(weather)
print(specs)

      category              data            dtype               name           type  units
temp      data  [273.15, 280.15]  <class 'float'>        Temperature    list[float]      K
wind      data       [5.0, 10.0]  <class 'float'>         Wind speed    list[float]    m/s
loc       info             Tokyo             <NA>  Observed location  <class 'str'>   <NA>
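The reusable alias relies on a standard typing feature: subscripting a generic Annotated alias substitutes the type variable while keeping the metadata. The sketch below demonstrates that with the standard library only. The SELF sentinel and the resolve helper are hypothetical stand-ins that mimic the role described for ITSELF, under the assumption that a sentinel is replaced by the annotated type; the library's actual resolution logic may differ.

```python
from typing import Annotated, TypeVar, get_args

T = TypeVar("T")

# Hypothetical sentinel playing the role that ITSELF plays in typespecs.
SELF = object()

# Generic alias: the TypeVar is replaced when the alias is subscripted.
Dtype = Annotated[T, {"dtype": SELF}]


def resolve(hint):
    """Return the metadata of an Annotated hint with the sentinel replaced."""
    bare, *extras = get_args(hint)
    resolved = {}
    for extra in extras:
        for key, value in extra.items():
            # Swap the sentinel for the type the metadata is attached to.
            resolved[key] = bare if value is SELF else value
    return resolved


# The element type of list[Dtype[float]] is Annotated[float, {...}].
element_hint = get_args(list[Dtype[float]])[0]
dtype_spec = resolve(element_hint)
```

Here dtype_spec maps "dtype" to float, which matches the dtype column in the output above.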
By default, missing metadata values are filled with pandas.NA.
You can override this behavior and specify a custom fallback value by using the default parameter in from_annotated.
specs = from_annotated(weather, default=None)
print(specs)

      category              data            dtype               name           type  units
temp      data  [273.15, 280.15]  <class 'float'>        Temperature    list[float]      K
wind      data       [5.0, 10.0]  <class 'float'>         Wind speed    list[float]    m/s
loc       info             Tokyo             None  Observed location  <class 'str'>   None
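Conceptually, filling missing values means giving every row the same set of columns and using the fallback wherever a field did not define a key. The following standard-library sketch shows that idea on plain dicts; fill_missing and the sample specs are hypothetical and do not reflect how from_annotated is implemented.

```python
# Hypothetical per-field metadata with differing keys.
specs = {
    "temp": {"category": "data", "units": "K"},
    "loc": {"category": "info"},
}


def fill_missing(specs, default=None):
    """Give every row the same keys, using default where a key is absent."""
    # Columns are sorted alphabetically, as in the printed tables above.
    columns = sorted({key for row in specs.values() for key in row})
    return {
        field: {column: row.get(column, default) for column in columns}
        for field, row in specs.items()
    }


filled = fill_missing(specs, default="n/a")
```

After the call, filled["loc"] contains a units entry holding the fallback value, while rows that already define units are left unchanged.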
By default, typespecs neatly merges nested metadata (e.g., float in list[float]) into a single parent row.
If you need to inspect the exact structural hierarchy of your annotations, set merge=False in from_annotated.
This unpacks the tree, distinguishing between the parent collection and its elements.
specs = from_annotated(weather, merge=False)
print(specs)

        category              data            dtype               name             type  units
temp        data  [273.15, 280.15]             <NA>        Temperature      list[float]      K
temp/0      <NA>              <NA>  <class 'float'>               <NA>  <class 'float'>   <NA>
wind        data       [5.0, 10.0]             <NA>         Wind speed      list[float]    m/s
wind/0      <NA>              <NA>  <class 'float'>               <NA>  <class 'float'>   <NA>
loc         info             Tokyo             <NA>  Observed location    <class 'str'>   <NA>
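The unmerged view can be thought of as a walk over the annotation tree, where each nested type argument gets its own path such as temp/0. The sketch below reproduces that idea with the standard library. The walk function is hypothetical, plain strings stand in for Spec objects, and the path format simply mirrors the index labels in the output above rather than the library's internals.

```python
from typing import Annotated, get_args, get_origin


def walk(hint, path):
    """Yield (path, bare_type, metadata) for a hint and its nested arguments."""
    if get_origin(hint) is Annotated:
        bare, *extras = get_args(hint)
    else:
        bare, extras = hint, []
    yield path, bare, extras
    # Recurse into type arguments, e.g. the float in list[float].
    for index, arg in enumerate(get_args(bare)):
        yield from walk(arg, f"{path}/{index}")


# Strings stand in for Spec objects in this sketch.
hint = Annotated[list[Annotated[float, "dtype"]], "Temperature"]
nodes = list(walk(hint, "temp"))
paths = [path for path, _, _ in nodes]
```

The walk visits the parent collection first and then its element type, so paths contains "temp" followed by "temp/0". Merging would then amount to folding each child's metadata back into its parent row.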