Skip to content

Optimize the performance of IN operator for indexed properties #8698

@nalexn

Description

@nalexn

Problem

The issue was first reported over 6 years ago on stack overflow, but is still actual for latest realm versions.

When you use the predicate key IN %@ and supply a collection with a substantially large number of values (over a thousand), the filtering happens SO slowly that doing filtering outside the query element-by-element outperforms it by 50x, despite being inefficient on its own.

let values = Set(["value_1", "value_2", ..., "value_40000"])

// 1. Complete query with IN operator: over 25secs

let results = realm.objects(MyObject.self)
       .filter("key IN %@", values)
       // same for new syntax:
       // .where { $0.key.in(values) }

// 2. Filter outside realm: under 0.45secs

let results = realm.objects(MyObject.self)
      .filter { values.contains($0.key) }

The use case when this is needed:
We run a complex query on a large data set, which is done on a background thread so the UI isn't blocked. It passes a set of resulting objects' ids to the main thread, where we can quickly query objects by ids without blocking the UI.

Solution

It looks like the operator <key> IN <collection> is translated into <key> == value_1 OR <key> == value_2 OR ..., which works well on small sets of values, but quickly degrades when the collection is large enough.

I hope there is a way to optimize this part of the algorithm, at least when the supplied collection supports O(1) element presence check, like for Set<String>.

Alternatives

There is really no alternative to this API other than hacks like supplying a subset of the collection for values, or filtering manually after the query

How important is this improvement for you?

Would be a major improvement

Feature would mainly be used with

Local Database only

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions