Fixes #39298 - Use bulk insert for new fact values on persisted hosts#10979
Fixes #39298 - Use bulk insert for new fact values on persisted hosts#10979pablomh wants to merge 1 commit into
Conversation
pablomh
left a comment
There was a problem hiding this comment.
The key difference is which method each PR targets: #8823 applied upsert_all to update_facts (existing facts changing value), where the current code already uses update_all with no callbacks — so there was no measurable gain. Our PR applies insert_all to add_new_facts (facts being created for the first time), where the current code calls create! per fact with individual INSERTs.
During registration this is the dominant path — all facts are new (100 per RHEL 8 host, 117 per RHEL 9/10 host) — and at concurrent scale the per-fact INSERTs compound. Measured on a 16-CPU Satellite with only this patch applied (no other changes), mixed workload (50% RHEL 8, 25% RHEL 9, 25% RHEL 10):
| Concurrency | INSERTs/host (before → after) | AR P50 (before → after) | AR P95 (before → after) |
|---|---|---|---|
| 152 | 100 (el8) / 117 (el9,el10) → 1 | 2,610ms → 1,297ms (−50%) | 8,214ms → 3,350ms (−59%) |
| 304 | 100 (el8) / 117 (el9,el10) → 1 | 4,579ms → 2,004ms (−56%) | 11,922ms → 6,232ms (−48%) |
| 456 | 100 (el8) / 117 (el9,el10) → 1 | 8,492ms → 2,619ms (−69%) | 13,036ms → 6,884ms (−47%) |
| 608 | 100 (el8) / 117 (el9,el10) → 1 | 8,701ms → 3,215ms (−63%) | 13,963ms → 7,566ms (−46%) |
| 760 | 100 (el8) / 117 (el9,el10) → 1 | 9,789ms → 3,666ms (−63%) | 14,138ms → 7,120ms (−50%) |
ekohl
left a comment
There was a problem hiding this comment.
These items may not be needed, but I wrote down my thoughts and notes when reviewing the code.
Replace per-record create! with insert_all for new fact values when the host already exists in the database. For new (unsaved) hosts, continue using build to defer persistence until the host is saved. Key changes: - bulk_insert_new_fact_values uses host.fact_values.insert_all with record_timestamps: true, letting Rails handle host_id, created_at, and updated_at automatically - facts_to_create changed from Array to Hash via Hash#except, carrying both name and value to avoid redundant lookups - add_new_fact simplified to always use build (only called for new records now) Verified on foremanctl deployment: 117 fact_values created with correct host_id and timestamps via association-scoped insert_all. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
f61d8bb to
82ff1ea
Compare
|
Ping? |
|
I took another pass over your suggestion and I think the nicest end state would be to apply the Concretely, I mean:
So roughly: def add_new_facts
if host.new_record?
facts_to_create.each_key { |name| add_new_fact(name) }
else
bulk_insert_new_fact_values(facts_to_create)
end
@counters[:added] = facts_to_create.size
end
def bulk_insert_new_fact_values(new_facts)
return if new_facts.empty?
records = new_facts.map do |name, value|
{ value: value, fact_name_id: fact_names[name].id }
end
host.fact_values.insert_all(records, unique_by: [:fact_name_id, :host_id], record_timestamps: true)
endand in db_facts_names = db_facts.map(&:first) & fact_names.keys
@facts_to_create = facts.except(*db_facts_names)I like that shape because it keeps the current no-partition flow for compose facts, removes repeated If we go this way, I’d also add focused coverage for Does that match what you had in mind? |
Summary
Replace per-fact
create!loop with a singleinsert_allfor new facts on persisted hosts, reducing DB round-trips by ~94% (108 INSERTs → 6).Changes
fact_importer.rb:add_new_factspartitions into compose facts (nil values, handled individually) and regular facts (bulk inserted viainsert_all)bulk_insert_new_fact_values: new method usingFactValue.insert_allwithunique_by: [:fact_name_id, :host_id]@counters = { added: 0, updated: 0, deleted: 0 }prevents nil accessHow to test
bundle exec rake test TEST=test/unit/fact_importer_test.rb