Description
Is your feature request related to a problem? Please describe.
With PR #5666, OnDemandFeatureView supports aggregations, but aggregations and transformations are mutually exclusive.
The reason is that infer_schema breaks when a feature view defines both.
This can be solved by reworking infer_schema. The current implementation:
def infer_features(self) -> None:
    random_input = self._construct_random_input(singleton=self.singleton)
    inferred_features = self.feature_transformation.infer_features(
        random_input=random_input, singleton=self.singleton
    )
Schema inference doesn't run the aggregation step - it only:
- Looks at the source feature view's schema
- Creates fake random data matching those columns
- Passes it directly to the UDF
But at runtime, the UDF receives aggregated columns, not the original source columns.
So the UDF would need to be written for aggregated input:
def my_odfv(inputs):
    return {"trips_doubled": inputs["sum_trips"] * 2}  # expects sum_trips
But schema inference passes it:
{"trips": [10], "rating": [4.5]}
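One possible direction, sketched below in plain Python rather than against Feast's real classes: before calling the UDF, reshape the random input so its columns look like the aggregated output. The names Agg, aggregated_column_name and build_inference_input are hypothetical, and the "<function>_<column>" naming is an assumption taken from the sum_trips example above. The UDF here uses a list comprehension because the sketch works on plain lists instead of a DataFrame.

```python
from dataclasses import dataclass


# Hypothetical stand-in for an aggregation definition; not Feast's actual class.
@dataclass(frozen=True)
class Agg:
    column: str
    function: str  # e.g. "sum", "mean"


def aggregated_column_name(agg: Agg) -> str:
    # Assumed naming convention "<function>_<column>", e.g. "sum_trips".
    return f"{agg.function}_{agg.column}"


def build_inference_input(source_sample: dict, aggs: list) -> dict:
    """Shape the fake input like the aggregated data the UDF sees at runtime."""
    if not aggs:
        # No aggregations defined: the UDF receives the source columns as-is.
        return source_sample
    # Reuse the random source values as placeholders under the aggregated names.
    # A real fix would also map dtypes per function (e.g. mean of ints is float).
    return {
        aggregated_column_name(agg): list(source_sample[agg.column])
        for agg in aggs
    }


def my_odfv(inputs):
    # Written for aggregated input, as in the example above.
    return {"trips_doubled": [v * 2 for v in inputs["sum_trips"]]}


source_sample = {"trips": [10], "rating": [4.5]}
aggs = [Agg(column="trips", function="sum")]

# Today's behaviour: the UDF is called with the raw source columns and fails.
try:
    my_odfv(source_sample)
    raw_input_failed = False
except KeyError:
    raw_input_failed = True

# Sketched behaviour: the UDF is called with aggregated column names.
inferred = my_odfv(build_inference_input(source_sample, aggs))
```

With this shape, infer_features would only need to know the aggregation definitions, not run the aggregation itself, since inference cares about column names and types rather than values.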