Fixes #39323 - Optimize grouped authorization filter handling#10988
Fixes #39323 - Optimize grouped authorization filter handling#10988pablomh wants to merge 6 commits into
Conversation
Group granular filters by effective taxonomy scope and derive granularity from the resource class so large filter sets avoid repeated per-filter authorization metadata checks. Co-authored-by: Cursor <cursoragent@cursor.com>
adamruzicka
left a comment
There was a problem hiding this comment.
Some at-a-glance styled notes, more coming later
| def normalize_taxonomy_group_condition(condition) | ||
| matches = condition.to_s.match(/\A(?<key>\w+_id) \^ \((?<ids>[\d,\s]+)\)\z/) | ||
| return condition if matches.blank? | ||
|
|
||
| ids = matches[:ids].split(',').map(&:to_i).uniq.sort | ||
| QueryBuilder.key_value_in(matches[:key], ids) | ||
| end |
There was a problem hiding this comment.
If condition is expected to be taxonomy_search that is generated by us, then it would be better to modify the code that generates it to make sure it will always be sorted.
There was a problem hiding this comment.
Agreed — addressed in c2e36f3. Added .sort in Filter#build_taxonomy_search_string_from_ids so new filters produce canonical taxonomy_search strings going forward.
For existing rows that may have unsorted IDs, I kept a lightweight normalize_taxonomy_search in the Authorizer grouping path so legacy data still collapses correctly without requiring a migration.
There was a problem hiding this comment.
I kept a lightweight normalize_taxonomy_search in the Authorizer grouping path so legacy data still collapses correctly without requiring a migration.
While this reads like the intent is for this to be temporary, things like that have a tendency to stick around. How about adding a migration to recalculate (or just reorder) all the taxonomy searches to put everyone on the happy path?
There was a problem hiding this comment.
Agreed — replaced the runtime normalization with a data migration in f7d9bdd.
The initial approach kept normalization at runtime because we wanted to avoid the risk of a migration touching every filters.taxonomy_search row in production — a bad regex could silently mangle search strings and broaden or restrict permissions. We opted for a read-only normalizer that could not corrupt data.
Now that the write path is canonical (.sort in Filter#build_taxonomy_search_string_from_ids), a one-time migration is the cleaner solution.
The migration (CanonicalizeFilterTaxonomySearch) sorts the IDs inside each persisted taxonomy_search value. It strictly matches only the three formats that Filter#build_taxonomy_search writes today:
(organization_id ^ (...))(location_id ^ (...))((organization_id ^ (...)) AND (location_id ^ (...)))
Rows with unrecognized formats are skipped and counted (logged via say), so any unexpected legacy data is preserved rather than silently rewritten. The migration uses a local model (MigrationFilter < ApplicationRecord) to avoid coupling to the app Filter class.
With that in place:
normalize_taxonomy_searchis removed from Authorizer- Grouping is now a plain
filters.group_by(&:taxonomy_search).values - Added a test in
filter_test.rbprovingFilterwrites sorted IDs on save
- Sort taxonomy IDs at generation time in Filter#build_taxonomy_search_string_from_ids so new filters produce canonical taxonomy_search strings - Extract Filter.granular_for_resource? class method, delegate from Filter#granular? and Authorizer -- single source of truth for granularity - Return [] instead of nil for universal grant in grouped_granular_filter_conditions to match the original code shape - Keep normalization at consumption in Authorizer for legacy data compatibility Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Style/PerlBackrefs: use Regexp.last_match instead of $1/$2 - Layout/FirstArrayElementIndentation: fix array indentation - Layout/ArgumentAlignment: fix multi-line argument alignment Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
4243e7d to
812a7cb
Compare
adamruzicka
left a comment
There was a problem hiding this comment.
Test setup: 161 organizations, 1,814 locations, 400 roles with 33 filters each, non-admin user assigned to all roles.
How are the organizations and locations assigned? I would expect this patch to bring biggest improvements for situations where many orgs/locs are assigned to the roles themselves, right? This wouldn't really address a case where a user has a single role with 1 org/loc but the user itself is in all the orgs/locs
Also, test failures are related
|
In our reproduction, all 400 roles are assigned the same 36 organizations and 36 locations (out of 161/1,814 total). This is the best case for the grouping optimization — all 13,200 filters share the same taxonomy scope and collapse into a single group.
Correct — the grouping optimization helps when multiple filters share the same taxonomy scope. A single role with unique taxonomy assignments wouldn't benefit from grouping. However, the other two improvements in this PR still help regardless of taxonomy diversity:
We could extend the test setup with varied per-role taxonomy assignments to measure the diverse case. Would that help? |
When only one taxonomy group exists, return the condition directly
instead of wrapping it in QueryBuilder.join('OR', [single_item])
which adds unnecessary parenthesization.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a migration that canonicalizes existing filter taxonomy_search values by sorting the IDs inside each organization_id/location_id clause. This allows the Authorizer to group filters directly by their persisted taxonomy_search value without runtime normalization. - Migration uses strict pattern matching against the four known taxonomy_search formats, skips unrecognized rows with a count - Migration-local model avoids coupling to the app Filter class - Remove normalize_taxonomy_search from Authorizer - Simplify grouped_granular_filters to group_by(&:taxonomy_search) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
f9cab5d to
68618db
Compare
…apping - FilterTest: assign orgs/locs through the role and enforce_inherited_taxonomies instead of passing :organizations directly to Filter (Filter has no such setter) - Remove single-condition shortcut that skipped QueryBuilder.join parenthesization, breaking the expected output format for downstream callers Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
68618db to
7d5d6dc
Compare
Summary
Filter#granular?→resource_type→filterings.first.permission.resource_typetaxonomy_searchvalueContext
With many roles and filters (e.g., 400 roles × 33 filters = 13,200 filters), the authorization path was spending 5-7 seconds on per-filter Ruby-side work before even building the scoped search query. The repeated
Filter#granular?calls triggered lazy association loading for each filter, and identical taxonomy clauses were duplicated across all filters instead of being grouped.Community thread: https://community.theforeman.org/t/46388
Results
Test setup: 161 organizations, 1,814 locations, 400 roles with 33 filters each, non-admin user assigned to all roles.
/api/hosts/api/hosts?per_page/api/job_templates/api/job_templates?searchAuthorization overhead dropped from 5-7 seconds to 57-75ms.
Test plan
🤖 Generated with Claude Code