Improve performance of in_list expressions #3027

@andygrove

What is the problem the feature request solves?

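Results from CometComparisonExpressionBenchmark (#3026) show that in_list and not_in_list run at about 0.8X the speed of Spark with Comet (Scan + Exec), while Comet (Scan) alone is on par with Spark: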

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
in_list:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                30             32           3         35.2          28.4       1.0X
Comet (Scan)                                         30             32           1         35.1          28.5       1.0X
Comet (Scan + Exec)                                  37             38           1         28.4          35.2       0.8X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
not_in_list:                              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                30             32           1         34.7          28.8       1.0X
Comet (Scan)                                         31             33           1         34.2          29.3       1.0X
Comet (Scan + Exec)                                  37             38           1         28.6          34.9       0.8X
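
To put the 0.8X figure in more concrete terms, the sketch below recomputes the gap from the Per Row(ns) column of the two tables above. It assumes the Relative column is the Spark baseline divided by the case being measured; that is an assumption about the benchmark harness, not something stated in this issue. The helper names `relative_speed` and `extra_time_pct` are hypothetical and exist only for this illustration.

```python
# Per-row times in nanoseconds, copied from the "Per Row(ns)" column
# of the in_list and not_in_list tables above.
per_row_ns = {
    "in_list": {"Spark": 28.4, "Comet (Scan + Exec)": 35.2},
    "not_in_list": {"Spark": 28.8, "Comet (Scan + Exec)": 34.9},
}


def relative_speed(baseline_ns: float, candidate_ns: float) -> float:
    """Speed relative to the baseline; values below 1.0 mean slower.

    Assumption: this mirrors how the Relative column is derived.
    """
    return baseline_ns / candidate_ns


def extra_time_pct(baseline_ns: float, candidate_ns: float) -> float:
    """Additional time per row, as a percentage of the baseline."""
    return (candidate_ns / baseline_ns - 1.0) * 100.0


summary = {}
for name, rows in per_row_ns.items():
    spark = rows["Spark"]
    comet = rows["Comet (Scan + Exec)"]
    summary[name] = (
        round(relative_speed(spark, comet), 2),
        round(extra_time_pct(spark, comet), 1),
    )
    print(f"{name}: {summary[name][0]}X relative, "
          f"{summary[name][1]}% more time per row")
```

On these numbers, Comet (Scan + Exec) spends roughly 24% more time per row than Spark for in_list and roughly 21% more for not_in_list. The tables report rounded values, so the recomputed ratios are approximate.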

Describe the potential solution

No response

Additional context

No response
