Cleaning up batch tests #2
base: batch-poc
Conversation
* Introduces a "batch" concept, similar to the batches in Sidekiq Pro and GoodJob
* Batches monitor a set of jobs and, when those jobs are completed, can fire off a final job
* This introduces a SolidQueue::JobBatch model, as well as the ability to enqueue jobs and associate them with the batch (see the sketch below)
* There are still more ideas to figure out, but this provides basic batch scaffolding to spark discussion
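A minimal usage sketch of the concept described above. The `on_finish:` keyword, the block form of `enqueue`, and the job classes are illustrative assumptions rather than the PR's confirmed API:

```ruby
# Hypothetical jobs used only for illustration.
class SleepyJob < ApplicationJob
  def perform(seconds)
    sleep seconds
  end
end

class BatchFinishedJob < ApplicationJob
  # The completion job receives the batch so it can inspect the finished set.
  def perform(batch)
    Rails.logger.info "Batch #{batch.id} finished"
  end
end

# Monitor a set of jobs; once they have all completed, fire off the final job.
SolidQueue::JobBatch.enqueue(on_finish: BatchFinishedJob) do
  5.times { SleepyJob.perform_later(1) }
end
```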
* This means we can store the arguments and settings by letting the user do `BatchJob.new(arguments).set(options)` (see the sketch below)
* Yield the batch in `enqueue` in case someone needs info from it
* When you serialize and then deserialize an ActiveJob instance, the arguments live in the serialized_arguments field and can only be transferred over by the private method `deserialize_arguments_if_needed`. This is pretty janky, so there is probably something I'm missing
* `perform_all_later` lets us do a perform_later even with instances, which does not seem to be possible on the instances themselves
* Make sure `batch` is still the first argument of the batch callback
* Add a spec for adding arguments and options to the batch callback
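A sketch of the instance-based form described above, assuming the completion job is handed over as a pre-built instance (which something like ActiveJob's `perform_all_later` can enqueue) and that the block yields the batch. `ProcessRowJob` is assumed to exist:

```ruby
class BatchCompletionJob < ApplicationJob
  # The batch stays the first argument of the callback; user arguments follow.
  def perform(batch, report_name)
    Rails.logger.info "#{report_name} finished in batch #{batch.id}"
  end
end

# Arguments and options travel with the pre-built instance.
completion = BatchCompletionJob.new("monthly-report").set(queue: :background)

SolidQueue::JobBatch.enqueue(on_finish: completion) do |batch|
  # The batch is yielded in case the enqueued jobs need information from it.
  ProcessRowJob.perform_later(batch.id, row: 1)
  ProcessRowJob.perform_later(batch.id, row: 2)
end
```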
* Support Ruby < 3.2 by removing the implicit key/variable syntax
* on_failure fires the first time any of the jobs fails, even once
* on_success only fires if all jobs work (after retries); see the sketch below
* Remove unused job_id
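A sketch of those callback semantics, with the keyword names taken from the description above (the exact signature and the job classes are assumptions):

```ruby
SolidQueue::JobBatch.enqueue(
  on_success: ImportSucceededJob.new, # fires only once every job has succeeded (after retries)
  on_failure: ImportFailedJob.new     # fires the first time any job fails
) do
  10.times { |i| ImportRowJob.perform_later(i) }
end
```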
* Allows enqueueing a job within a job, as part of the batch
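One possible shape of that, purely illustrative: a running job fanning out more work into its own batch. The `batch` accessor on the job and its `enqueue` block are assumptions based on the description, not a confirmed API:

```ruby
class FanOutJob < ApplicationJob
  def perform(record_ids)
    record_ids.each do |id|
      # Enqueue additional work as part of the same batch this job belongs to.
      batch.enqueue { ProcessOneJob.perform_later(id) }
    end
  end
end
```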
* Support nested batches (see the sketch below)
* Parent batches will not complete until all child batches have been completed
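A sketch of the nesting behaviour, with the same assumed API shape as the earlier sketches:

```ruby
SolidQueue::JobBatch.enqueue(on_finish: ParentFinishedJob.new) do
  TopLevelJob.perform_later

  # A child batch enqueued inside the parent's block: the parent batch will not
  # complete until this child batch (and all of its jobs) has completed too.
  SolidQueue::JobBatch.enqueue(on_finish: ChildFinishedJob.new) do
    NestedJob.perform_later
  end
end
```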
* Attach success jobs to the parent batch, not to the current batch (which has already finished at this point)
* Thanks to Mikael Henriksson for his work in rails#590. His work decentralizes management of batch status by moving it to the BatchUpdateJob, and tracks status using counts rather than querying specific job statuses after the fact. This is a much simpler approach to tracking the jobs, and allows us to avoid a constantly polling set of queries in the dispatcher. Also add arbitrary metadata to allow tracking data from the start to the end of execution. This also means enqueueing a BatchUpdateJob based on callbacks in two different kinds of Batchable, which are included when a job is updated and finished, or when a FailedExecution is created (since failed jobs never "finish").
* This batch feature already took some inspiration from the GoodJob batch implementation (https://github.com/bensheldon/good_job). Now we lean on that further by adopting some of the buffering and abstractions in a similar form to GoodJob. To discourage heavy reliance on the JobBatch model, it has been renamed to BatchRecord, and a separate Batch interface is how you interact with batches, with some delegation to the core model.
* A new Buffer class (also modeled after GoodJob) was added specifically for batches, primarily to support enqueue_after_transaction_commit. We now override the ActiveJob #enqueue method so we can keep track of which jobs are attempting to enqueue. When enqueue_after_transaction_commit is on, those jobs do not enqueue until all transactions commit. By tracking them at the high-level enqueue and keeping a buffer of jobs, we can ensure that the jobs get tracked even when their creation is deferred until the transaction is committed (see the sketch below). The side benefit is that we get to enqueue all the jobs together, probably offering some performance advantage. This buffer also keeps track of child batches for the same reason.
* To support triggering a callback/BatchUpdateJob when a job finishes, the update to finished_at needed to become an update! call
* As a simplification, on_failure is now only fired after all jobs finish, rather than the first time a job fails
* The adapter logic itself also needed to be updated to support the buffer and enqueue_after_transaction_commit. If a job is coming from a batch enqueue, we ignore it here and allow the batching process to enqueue_all at the end of the enqueue block. If the job is originally from a batch but is retrying, we make sure the job counts in the batch stay updated. I don't love this addition, since it adds a lot of complication to the adapter code, all solely oriented around batches
* Batches benefit from keeping jobs until the batch has finished. As such, we ignore the preserve-jobs setting, but if it is set to false, we enqueue a cleanup job once the batch has finished and clear out finished jobs
* Implement the preserved-jobs test and remove the TODO
* Idempotent updates with pessimistic locks
* Check if it finished before we acquired the lock
* Use enqueue_all directly rather than passing through ActiveJob for completion jobs

Co-authored-by: Mikael Henriksson <[email protected]>
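A sketch of the buffering behaviour described in the list above, using the boolean `enqueue_after_transaction_commit` toggle that also shows up in the test setup further down. The `Order` model and the job classes are illustrative:

```ruby
ApplicationJob.enqueue_after_transaction_commit = true

ActiveRecord::Base.transaction do
  order = Order.create!(total_cents: 10_00)

  SolidQueue::Batch.enqueue(on_finish: OrderFollowUpJob.new) do
    ChargeCardJob.perform_later(order.id)
    SendReceiptJob.perform_later(order.id)
  end

  # Nothing has been written to the queue yet: the overridden ActiveJob enqueue
  # keeps these jobs in the batch's buffer so the batch can still track them.
end
# After the transaction commits, the buffered jobs are enqueued together
# (enqueue_all), and deferred child batches are handled the same way.
```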
* BatchExecution allows us to know for sure that we only ever run completion for a job once. We destroy it and update the counts in a transaction. We can also remove the batch_processed_at field from jobs, which are meant to be touched as little as possible, with relevant states reflected in the `*_execution` models
* It also gives us a slightly cleaner interface in the batchable classes
* Updated some table naming and pruned unused fields/indexes
* Increase the child batch count as new batches are enqueued, even in existing batches
* Refactor to a unified Batch interface
* It was overly complicated to split Batch and BatchRecord apart just to keep a stricter interface
* That concept was taken from GoodJob, but it didn't feel in the spirit of the simplicity of the SolidQueue project. It was a lot of concepts to juggle in your head
* Also moved some files around (like the cleanup and empty jobs) to the more appropriate app/jobs
* Use a BatchPreparable module to add batch executions and update the batch totals after creating ready/claimed records
* Remove the buffer logic: this makes it harder to implement empty batches, but results in code that is much simpler to understand. Batches will also always exist now, whether the jobs inside ever get enqueued or not
* Move the logic for getting status and querying by status scopes to a Trackable module
* total_child_batches wasn't being used for much, so remove it
* Making it explicit is the easiest option, and the most in alignment with Solid Queue
* Fix errors around upserting across providers. SQLite and Postgres share identical syntax (at least for this use case), while MySQL works differently (see the sketch below)
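A hedged sketch of the provider difference being worked around: PostgreSQL and SQLite share the `ON CONFLICT` syntax behind ActiveRecord's `unique_by:` option, while MySQL only offers `ON DUPLICATE KEY UPDATE` and rejects `unique_by:`. Model, table, and column names here are illustrative:

```ruby
def insert_batch_executions(batch, jobs)
  rows = jobs.map { |job| { batch_id: batch.id, job_id: job.id } }

  if SolidQueue::BatchExecution.connection.adapter_name.match?(/mysql/i)
    # MySQL path: relies on ON DUPLICATE KEY UPDATE plus the table's unique index.
    SolidQueue::BatchExecution.upsert_all(rows)
  else
    # PostgreSQL / SQLite path: identical INSERT ... ON CONFLICT (job_id) DO NOTHING.
    SolidQueue::BatchExecution.insert_all(rows, unique_by: :job_id)
  end
end
```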
* Reduce the load from each callback, and make the checks less susceptible to race conditions
* Make sure monitor jobs can run, even in the absence of an ApplicationJob
* Allow setting the queue on the maintenance jobs
* Bring back EmptyJob for empty queues
* We still track it, but it was causing a lot of race conditions while trying to keep it exclusively in callbacks. Running it in a job or in the worker/dispatcher works easily, but adds more overhead to the code and processing
* Move to explicit timestamp fields instead of status fields so it's easier to track the specifics of batch transitions
* Move batches lower in the schema, after the current models
* By querying the batch executions remaining, the query times stay very fast
* When we constantly update the counts on the single batch row, it becomes a hotspot. Fast-executing jobs quickly accumulate and slow down overall job processing (processing a few thousand jobs goes from ~10 seconds to ~40 seconds). This approach still adds a bit of overhead, but significantly less (~10 seconds to ~15 seconds)
* Handle batch completion in an after_commit to make sure the transaction is visible before checking executions. This may mean we need to introduce some monitoring for the cases where an after_commit fails to fire due to network issues or a database issue
* Batch execution is managed through the BatchExecution model, which is destroyed via `dependent: :destroy` when jobs are destroyed (see the sketch below)
* Since batch completion is checked in an `after_commit on: :destroy`, it already gets checked even when the job is not preserved
* Because we rely on batch executions and counts, we don't need the jobs to stick around to properly run a batch
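A rough sketch of the relationships and callback described above; the association, column, and method names are assumptions rather than the PR's exact code:

```ruby
class SolidQueue::Job < SolidQueue::Record
  # The execution row goes away with the job, whether or not the job is preserved.
  has_one :batch_execution, dependent: :destroy
end

class SolidQueue::BatchExecution < SolidQueue::Record
  belongs_to :batch
  belongs_to :job

  # Completion is checked after the destroying transaction commits, so the
  # check sees a consistent view of the remaining executions.
  after_commit :check_batch_completion, on: :destroy

  private
    def check_batch_completion
      batch.finish! unless batch.batch_executions.exists?
    end
end
```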
* Without always updating it on the fly, it's always the same as total_jobs, or is 0. So it's not really useful as a distinct column
* Remove the multi-step job example, since we don't handle hierarchy for the time being
* This is a placeholder for a later Rails-based abstraction
* Use bigint references like everything else and relate directly to the record (see the sketch below)
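An illustrative migration shape for the bigint-reference point above (`t.references` produces bigint foreign key columns by default); the table and column names are assumptions:

```ruby
create_table :solid_queue_batch_executions do |t|
  t.references :batch, null: false, index: true
  t.references :job, null: false, index: { unique: true }
  t.datetime :created_at, null: false
end
```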
* GoodJob and Sidekiq do not require an explicit keyword; they just accept any other arguments as metadata and set them that way
* Also make it so we always return a hash for metadata, and that hash uses with_indifferent_access (see the sketch below)
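A sketch of that metadata behaviour; the keyword handling and the `metadata` reader are assumptions based on the description:

```ruby
batch = SolidQueue::Batch.enqueue(on_finish: ReportDoneJob.new, user_id: 42, source: "import") do
  BuildReportJob.perform_later
end

# Any extra keyword arguments become metadata, readable with either key style:
batch.metadata[:user_id]  # => 42
batch.metadata["source"]  # => "import"
```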
ApplicationJob.enqueue_after_transaction_commit = false if defined?(ApplicationJob.enqueue_after_transaction_commit)
SolidQueue.preserve_finished_jobs = true
SolidQueue::Batch.maintenance_queue_name = nil
end
these removals in setup and teardown are happening because this is already being handled either in test_helper.rb or by the use of transactional tests
    JobBuffer.add "Hi failure #{batch.id}!"
  end
end
just moving those jobs from the bottom to the top of the file
def job!(active_job)
  skip_active_record_query_cache do
    SolidQueue::Job.find_by!(active_job_id: active_job.job_id)
  end
end
this should be fine but I just wanted to make sure we are skipping active record query cache, as we do in other tests
class SolidQueue::BatchTest < ActiveSupport::TestCase
  self.use_transactional_tests = false

  class BatchCompletionJob < ApplicationJob
I moved BatchCompletionJob here instead of a separate file, since it's only being used in this test
test/dummy/app/jobs/sleepy_job.rb
wasn't being used anywhere
moved to test/models/solid_queue/batch_test.rb
Following up on our discussion in rails#142
These are some minimal improvements to the organization of the tests. Great work with the batches, @jpcamara!