Description
The benchmark exercises the following ingestion path:
Quarkus Flow → Kafka flow-lifecycle-out → Data Index ingestion → PostgreSQL
The pipeline is functionally working, but the initial benchmark results show a significant Data Index ingestion bottleneck when consuming raw lifecycle events from Kafka.
We need to evaluate whether grouped lifecycle events, similar to SonataFlow Data Index grouping, are required to make the ingestion path scalable and comparable.
Current status
Mode 3 was validated with a single fork10 request.
A single fork10 request produces the expected Kafka event volume:
flow-lifecycle-out high watermark = 90
This matches the Quarkus Flow raw lifecycle event model:
workflow events:
- 11 workflows × 4 events = 44
task events:
- 46
total:
- 44 + 46 = 90 Kafka lifecycle events/request
So the Mode 3 path is functionally correct.
Issue detected
A 10-minute fork10 RATE=40 benchmark was executed with one Kafka partition and one Data Index consumer.
k6 result:
requests: 23,996
rate: ~39.99 req/s
http failures: 0
p95 latency: ~24 ms
Expected Kafka events:
23,996 requests × 90 events/request = 2,159,640 events
Kafka state after the run:
TOPIC               PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
flow-lifecycle-out  0          356,221         2,159,730       1,803,509
This means the consumer group was stable and consuming, but it was far behind the producer.
Approximate ingestion rate:
356,221 consumed events / 600s = ~594 events/sec
Producer rate during the benchmark:
2,159,640 produced events / 600s = ~3,599 events/sec
So the producer was generating events at roughly 6x the rate at which Data Index was ingesting them from Kafka.
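The arithmetic above can be reproduced with a short script. This is only a sketch built from the figures reported in this issue; the drain-time estimate additionally assumes the consumer keeps its observed average rate once the producer stops, which was not measured.

```python
# Figures copied from the benchmark run described in this issue.
RUN_SECONDS = 600
REQUESTS = 23_996
EVENTS_PER_REQUEST = 90
CURRENT_OFFSET = 356_221      # consumer group position after the run
LOG_END_OFFSET = 2_159_730    # topic high watermark after the run

produced = REQUESTS * EVENTS_PER_REQUEST
lag = LOG_END_OFFSET - CURRENT_OFFSET

producer_rate = produced / RUN_SECONDS        # events/sec written to Kafka
consumer_rate = CURRENT_OFFSET / RUN_SECONDS  # events/sec ingested by Data Index
ratio = producer_rate / consumer_rate

# Assumption: the consumer sustains the same average rate after the run ends.
drain_seconds = lag / consumer_rate

print(f"produced events : {produced:,}")
print(f"lag             : {lag:,}")
print(f"producer rate   : {producer_rate:,.0f} events/sec")
print(f"consumer rate   : {consumer_rate:,.0f} events/sec")
print(f"producer/consumer ratio: {ratio:.2f}x")
print(f"estimated time to drain the lag: {drain_seconds / 60:.1f} min")
```

Under that assumption, a single consumer would need considerably longer than the benchmark itself to catch up, which is why the event volume per request matters.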
Solution
As in SonataFlow, try grouping the lifecycle events before publishing them:
fork10:
90 messages/request → ~1 grouped message/request
~90x fewer Kafka records
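To illustrate the idea only, the sketch below collapses the raw lifecycle events of one request into a single grouped message. The field names (`rootInstanceId`, `groupKey`, `count`, `events`) and the function `group_lifecycle_events` are hypothetical and are not taken from the Quarkus Flow or SonataFlow event schemas; the real grouping key and envelope would have to follow whatever Data Index expects.

```python
from collections import defaultdict


def group_lifecycle_events(events, key_field="rootInstanceId"):
    """Collapse raw lifecycle events into one grouped message per key.

    `key_field` is a hypothetical field identifying the request (root
    workflow instance) that each raw event belongs to.
    """
    grouped = defaultdict(list)
    for event in events:
        grouped[event[key_field]].append(event)
    # One outgoing record per key instead of one record per raw event.
    return [
        {"groupKey": key, "count": len(batch), "events": batch}
        for key, batch in grouped.items()
    ]


# Synthetic stand-in for the 90 raw events of a single fork10 request.
raw_events = [{"rootInstanceId": "req-1", "seq": i} for i in range(90)]
grouped_messages = group_lifecycle_events(raw_events)

print(len(raw_events), "raw events ->", len(grouped_messages), "grouped message")
```

Grouping reduces the number of Kafka records, but each record carries more payload, so the benchmark should be rerun to confirm how much of the ingestion cost is per record rather than per event.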