Dead-letter Kafka records individually within a batch by mcruzdev · Pull Request #33 · kubesmarts/logic-apps

mcruzdev · 2026-06-17T17:48:35Z

Changes

The consumer now reads records in batches (Message<ConsumerRecords>). Batch ack/nack apply to the whole batch, so a single bad record cannot be nacked on its own without affecting the rest.

Instead of nacking, each failed record is produced to the data-index-events-dlq topic via a dedicated emitter, and the batch is acked exactly once after all dead-letter writes complete.

The batch offset is committed only after every dead-letter write has finished, so a failed record is never dropped: if a dead-letter write itself fails the batch is not committed (failure-strategy=fail) and the records are reprocessed.

Successfully processed records remain committed via the same ack.

Replaces the incoming dead-letter-queue failure strategy with an outgoing data-index-events-dlq channel.

Closes #32
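
For reviewers, the ack/dead-letter ordering described above can be sketched in plain Java without any Kafka or SmallRye dependency. This is only an illustration under assumptions: `BatchDeadLetterSketch`, `handleBatch`, `handler`, and `dlqSend` are hypothetical names and not the code in this PR. The return value stands in for "ack the batch" (`true`) versus "leave it uncommitted so it is redelivered" (`false`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch; names are illustrative, not taken from the PR.
final class BatchDeadLetterSketch {

    /**
     * Processes every record in the batch. A record whose handler throws is
     * sent to the dead-letter sink instead of failing the whole batch.
     * Returns true (ack once) only if every dead-letter write completed;
     * returns false (no ack, batch is reprocessed) if any write failed.
     */
    static boolean handleBatch(List<String> batch,
                               Consumer<String> handler,
                               Function<String, CompletableFuture<Void>> dlqSend) {
        List<CompletableFuture<Void>> writes = new ArrayList<>();
        for (String rec : batch) {
            try {
                handler.accept(rec);
            } catch (RuntimeException e) {
                // Failed record: dead-letter it individually.
                writes.add(dlqSend.apply(rec));
            }
        }
        try {
            // Wait for all dead-letter writes before deciding on the ack.
            CompletableFuture.allOf(writes.toArray(new CompletableFuture[0])).join();
        } catch (CompletionException e) {
            return false; // a dead-letter write failed: do not commit the batch
        }
        return true; // single ack covering good and dead-lettered records
    }

    public static void main(String[] args) {
        List<String> dlq = new ArrayList<>();
        boolean acked = handleBatch(
                List.of("ok-1", "bad-2", "ok-3"),
                r -> { if (r.startsWith("bad")) throw new IllegalArgumentException(r); },
                r -> { dlq.add(r); return CompletableFuture.completedFuture(null); });
        System.out.println("acked=" + acked + " dlq=" + dlq); // prints acked=true dlq=[bad-2]
    }
}
```

The trade-off in this shape is that a failing dead-letter write causes the whole batch to be reprocessed, so record handling needs to be idempotent.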


gmunozfe · 2026-06-23T11:06:21Z

@mcruzdev I have added PR mcruzdev#2, which batches both workflow and task persistence.

In my benchmark, this fix gave a large improvement in the "fork10" scenario at 40 requests/second:

Before persistence batch:

Kafka consumer ~700 events/sec
DB materialization ~290 rows/sec
lag at finish ~331k

After persistence batch:

Kafka consumer ~3.6k events/sec (5x)
DB materialization ~1.31k rows/sec (4.5x)
lag at finish ~9k and lag=0 after ~11s (almost real time, polling window is 10s)

The batched task upsert (persistBatch) cannot create the placeholder workflow needed when a task event arrives before its workflow, so the whole batch failed with a foreign key violation and was retried forever. On an FK violation, roll back and fall back to per-record persistence, which already creates placeholders idempotently.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Signed-off-by: Matheus Cruz <matheuscruz.dev@gmail.com>
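
A minimal sketch of the fallback described in this commit message, using in-memory sets in place of the real tables. Only `persistBatch` is a name from the commit message; `BatchPersistFallbackSketch`, `persistOne`, `persist`, `TaskEvent`, and `ForeignKeyViolation` are hypothetical stand-ins, and the real code presumably uses database transactions rather than staged sets.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; in-memory sets stand in for the workflow and task tables.
final class BatchPersistFallbackSketch {

    /** Stand-in for the database's foreign key error. */
    static final class ForeignKeyViolation extends RuntimeException {
        ForeignKeyViolation(String message) { super(message); }
    }

    /** A task event that references its parent workflow by id. */
    static final class TaskEvent {
        final String taskId;
        final String workflowId;
        TaskEvent(String taskId, String workflowId) {
            this.taskId = taskId;
            this.workflowId = workflowId;
        }
    }

    final Set<String> workflows = new HashSet<>();
    final Set<String> tasks = new HashSet<>();

    /** Batched upsert: all-or-nothing, and unable to create a missing parent workflow. */
    void persistBatch(List<TaskEvent> events) {
        Set<String> staged = new HashSet<>();
        for (TaskEvent e : events) {
            if (!workflows.contains(e.workflowId)) {
                // Nothing staged is applied, which models the rollback.
                throw new ForeignKeyViolation("no workflow " + e.workflowId);
            }
            staged.add(e.taskId);
        }
        tasks.addAll(staged);
    }

    /** Per-record path: creates a placeholder workflow if absent, so it is idempotent. */
    void persistOne(TaskEvent e) {
        workflows.add(e.workflowId);
        tasks.add(e.taskId);
    }

    /** Tries the batch first; on an FK violation falls back to per-record persistence. */
    boolean persist(List<TaskEvent> events) {
        try {
            persistBatch(events);
            return false; // fast path, no fallback needed
        } catch (ForeignKeyViolation fk) {
            for (TaskEvent e : events) {
                persistOne(e);
            }
            return true; // fallback was used
        }
    }
}
```

In this model the slow path only runs for batches that contain an out-of-order task event, so the common case keeps the single batched write.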

mcruzdev force-pushed the kafka-batch-per-record-dlq branch from f069f64 to e108a5e on June 17, 2026 17:55

mcruzdev marked this pull request as ready for review June 18, 2026 01:04

gmunozfe and others added 2 commits June 23, 2026 09:33

Add also persistence batches to execute fewer commits (#2)

7a73fa7

fjtirado reviewed Jun 24, 2026

Comment thread ...main/java/org/kubesmarts/logic/dataindex/ingestion/kafka/service/KafkaLifecycleConsumer.java Outdated

fjtirado reviewed Jun 24, 2026

Comment thread ...main/java/org/kubesmarts/logic/dataindex/ingestion/kafka/service/KafkaLifecycleConsumer.java

Do a clean up

eebb719

Signed-off-by: Matheus Cruz <matheuscruz.dev@gmail.com>

fjtirado self-requested a review June 25, 2026 09:58

fjtirado approved these changes Jun 25, 2026

mcruzdev wants to merge 4 commits into kubesmarts:main from mcruzdev:kafka-batch-per-record-dlq
