Make OnDemandFeatureView work for having both aggregations and transformations #5689

@HaoXuAI

Is your feature request related to a problem? Please describe.

With PR #5666, OnDemandFeatureView supports aggregations, but they are mutually exclusive with transformations.
The reason is that infer_schema will break if both are defined on the same feature view.
This can be solved by reworking infer_schema.

    def infer_features(self) -> None:
        random_input = self._construct_random_input(singleton=self.singleton)
        inferred_features = self.feature_transformation.infer_features(
            random_input=random_input, singleton=self.singleton
        )

Schema inference doesn't run the aggregation step. It only:

  1. Looks at the source feature view's schema
  2. Creates fake random data matching those columns
  3. Passes it directly to the UDF

But at runtime, the UDF receives aggregated columns, not the original source columns.
So the UDF would need to be written for aggregated input:

    def my_odfv(inputs):
        return {"trips_doubled": inputs["sum_trips"] * 2}  # expects sum_trips

But schema inference passes it:

    {"trips": [10], "rating": [4.5]}

That input has no sum_trips column, so the UDF fails during inference.
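To make the mismatch concrete, here is a minimal, self-contained sketch in plain Python. It does not use Feast's API: `aggregate`, `infer_feature_names`, and the list-based column format are hypothetical stand-ins, and the UDF uses a list comprehension in place of the column arithmetic above. It reproduces the failure and shows one possible direction for the rework, namely applying the aggregation to the random input before calling the UDF.

```python
# Hypothetical stand-ins for illustration only; none of this is Feast's API.

def my_odfv(inputs):
    # UDF written for aggregated input: it expects a "sum_trips" column.
    return {"trips_doubled": [v * 2 for v in inputs["sum_trips"]]}


def aggregate(raw):
    # Toy aggregation step: collapse the "trips" column into "sum_trips".
    return {"sum_trips": [sum(raw["trips"])]}


def infer_feature_names(udf, random_input, aggregate_fn=None):
    # Optionally run the aggregation first, then read the UDF's output columns.
    udf_input = aggregate_fn(random_input) if aggregate_fn else random_input
    return sorted(udf(udf_input))


# Random input shaped like the source feature view's schema.
random_input = {"trips": [10], "rating": [4.5]}

# Current behaviour: the UDF is handed source columns and cannot find sum_trips.
try:
    infer_feature_names(my_odfv, random_input)
    missing_column = None
except KeyError as exc:
    missing_column = exc.args[0]

# Possible rework: aggregate the random input before passing it to the UDF.
inferred = infer_feature_names(my_odfv, random_input, aggregate_fn=aggregate)

print(missing_column)  # sum_trips
print(inferred)        # ['trips_doubled']
```

Whether the real fix should run the actual aggregation or only derive the aggregated column names and types from the aggregation definitions is an open design choice; the sketch only shows that inference needs to see the post-aggregation schema.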

