Add MODE 3 implementation (ingestion with Kafka consumer) by mcruzdev · Pull Request #25 · kubesmarts/logic-apps

mcruzdev · 2026-05-26T03:50:30Z

This pull request introduces production-ready support and documentation for MODE 3 (Kafka-based ingestion) in the Data Index project, alongside improvements to the documentation build process. The main changes add comprehensive guidance, architecture, and code references for Kafka ingestion, clarify when to use each mode, and update the documentation system to support both Maven and npm workflows.

MODE 3 (Kafka) Support and Documentation:

Added a detailed architecture document for MODE 3 (Kafka ingestion) at data-index/data-index-docs/modules/ROOT/pages/architecture/kafka-mode.adoc, covering event pipeline, components, persistence, idempotency, out-of-order handling, error processing, and a comparison with other modes.
Updated CLAUDE.md to reflect MODE 3 as production ready, describe its architecture, clarify when to use it, and list new key components, code structure, and integration tests for Kafka ingestion.
Added references to new code directories and documentation for MODE 3, including ingestion processor/service modules and their READMEs.

Documentation Navigation and Structure:

Added navigation links for Kafka production deployment and architecture in the documentation sidebar (nav.adoc).
Updated documentation structure and instructions to support both Maven and npm workflows, including new commands for building and serving docs, and enabling auto-rebuild with nodemon.

These changes collectively enable and document the Kafka-based ingestion (MODE 3) as a first-class, production-ready option, and modernize the documentation workflow for contributors.

gmunozfe

Two things to fix:

Reject blank workflow/task identifiers. The current workflow path defaults missing data.name to "", and processors only reject null IDs, so invalid events can create an empty workflow ID.
Align status mapping with the existing Data Index model. task.started.v1 currently maps to STARTED, while other modes use RUNNING.
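
A minimal sketch of the two requested fixes, under stated assumptions: the class name `LifecycleValidation`, its method names, and the `Status` enum are hypothetical stand-ins rather than the project's actual types. Only the `task.started.v1` to `RUNNING` mapping comes from the review comment.

```java
import java.util.Map;

// Hypothetical helper; names are illustrative and not taken from the PR.
final class LifecycleValidation {

    // Illustrative subset of status values; the real Data Index model may differ.
    enum Status { RUNNING, COMPLETED }

    // Event type -> status. Only this one mapping is stated in the review.
    private static final Map<String, Status> TASK_STATUS_BY_TYPE =
            Map.of("task.started.v1", Status.RUNNING);

    private LifecycleValidation() {
    }

    // Rejects null, empty, and whitespace-only identifiers, so a missing
    // data.name that was defaulted to "" cannot become a workflow ID.
    static String requireIdentifier(String field, String value) {
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException(field + " must not be null or blank");
        }
        return value;
    }

    // Maps a task event type to a status, failing loudly on unknown types.
    static Status taskStatusFor(String eventType) {
        // Map.of(...) maps throw NullPointerException on get(null), so guard first.
        Status status = eventType == null ? null : TASK_STATUS_BY_TYPE.get(eventType);
        if (status == null) {
            throw new IllegalArgumentException("Unsupported task event type: " + eventType);
        }
        return status;
    }
}
```

Under these assumptions, a processor would call `requireIdentifier` before persisting anything, so an invalid event fails fast instead of creating a row keyed by an empty string.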

ricardozanini · 2026-05-29T15:46:02Z

+
+import java.time.OffsetDateTime;
+
+public class LifecycleEvent {


Don't we have this already in the SDK?

mcruzdev · 2026-05-30

Yes, we have a lot of classes (WorkflowCompletedCEData, WorkflowStartedCEData, TaskCancelledCEData) that represent each kind of lifecycle event, but this class aims to merge everything into a single, centralized data transfer object to be parsed.

fjtirado · 2026-06-01

Why not deserialize to the proper SDK POJO based on the CloudEvent type?
There is a clear and predictable mapping between each type and each POJO as defined in the SDK, so there is no need to create a union class holding all the structs.
Also, now that the SDK is a hierarchy after the latest changes, you can add common code to handle the different types generically (WorkflowCEEvent and TaskCEEvent) and just an extension to deal with the different date field names.
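
The type-driven approach suggested here could look roughly like the sketch below. Everything in it is an assumption made for illustration: the records are stand-ins for the SDK POJOs (not the SDK's real classes), the event type strings are modeled on the `task.started.v1` name mentioned earlier in the review, and a `Map<String, String>` stands in for the already-parsed event payload.

```java
import java.time.OffsetDateTime;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: one parser per CloudEvent type instead of a union DTO.
final class LifecycleEventDispatch {

    // Stand-in for a common SDK supertype; generic code depends only on this.
    interface WorkflowEvent {
        String instanceId();
        OffsetDateTime timestamp();
    }

    // Each concrete type only adapts its own date field name.
    record WorkflowStarted(String instanceId, OffsetDateTime startedAt) implements WorkflowEvent {
        public OffsetDateTime timestamp() { return startedAt; }
    }

    record WorkflowCompleted(String instanceId, OffsetDateTime completedAt) implements WorkflowEvent {
        public OffsetDateTime timestamp() { return completedAt; }
    }

    // Assumed type names; the real CloudEvent types may be spelled differently.
    private static final Map<String, Function<Map<String, String>, WorkflowEvent>> PARSERS = Map.of(
            "workflow.started.v1",
            d -> new WorkflowStarted(d.get("instanceId"), OffsetDateTime.parse(d.get("startedAt"))),
            "workflow.completed.v1",
            d -> new WorkflowCompleted(d.get("instanceId"), OffsetDateTime.parse(d.get("completedAt"))));

    private LifecycleEventDispatch() {
    }

    // Picks the parser from the event type and fails on unknown types.
    static WorkflowEvent parse(String type, Map<String, String> data) {
        Function<Map<String, String>, WorkflowEvent> parser = type == null ? null : PARSERS.get(type);
        if (parser == null) {
            throw new IllegalArgumentException("Unsupported event type: " + type);
        }
        return parser.apply(data);
    }
}
```

With a real JSON mapper, each lambda would presumably become a call that deserializes the payload into the matching SDK class; the lookup by event type stays the same.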

Signed-off-by: Matheus Cruz <matheuscruz.dev@gmail.com>

fjtirado · 2026-06-03T16:31:51Z

I would use InputStreamReader and BufferedReader#readAllAsString() here.
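
For illustration, reading a whole stream through `InputStreamReader` and `BufferedReader` might look like the sketch below. The class and method names are hypothetical. As far as I know, `readAllAsString()` is only available on recent JDKs, so this sketch assumes an older runtime and joins `BufferedReader#lines()` instead; on a JDK that has it, the body of the `try` could be a single `readAllAsString()` call.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Hypothetical helper; not the LoadSQL class from the PR.
final class SqlText {

    private SqlText() {
    }

    // Reads the stream as UTF-8 text and closes it when done.
    // Note: lines() drops line terminators, so they are re-joined with "\n"
    // and a trailing newline in the input is not preserved.
    static String read(InputStream in) {
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Passing the charset explicitly avoids depending on the platform default encoding when the SQL file contains non-ASCII characters.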

Signed-off-by: Matheus Cruz <matheuscruz.dev@gmail.com>

gmunozfe

Looks good to me, great work @mcruzdev
Probably after analyzing performance we can try some kind of grouping like in sonataflow

ricardozanini · 2026-06-09T17:02:01Z

Looks good to me, great work @mcruzdev Probably after analyzing performance we can try some kind of grouping like in sonataflow

@gmunozfe, care to open an issue in Quarkus Flow to track this, please?

mcruzdev marked this pull request as ready for review May 26, 2026 20:10

mcruzdev requested review from fjtirado and ricardozanini May 26, 2026 20:11

ricardozanini requested a review from gmunozfe May 27, 2026 00:43

ricardozanini reviewed May 27, 2026

gmunozfe requested changes May 27, 2026

mcruzdev marked this pull request as draft May 28, 2026 03:31

mcruzdev force-pushed the issue-23 branch 4 times, most recently from 05b7844 to a92af73 Compare May 29, 2026 05:49

mcruzdev marked this pull request as ready for review May 29, 2026 15:30

ricardozanini reviewed May 29, 2026

mcruzdev force-pushed the issue-23 branch from 90e1d24 to e23ec78 Compare May 30, 2026 06:15

gmunozfe self-requested a review May 31, 2026 22:08

gmunozfe reviewed May 31, 2026

Comment thread ...in/java/org/kubesmarts/logic/dataindex/ingestion/kafka/processor/TaskExecutionProcessor.java Outdated

Comment thread ...ice/src/main/java/org/kubesmarts/logic/dataindex/ingestion/kafka/service/LifecycleEvent.java Outdated

Add MODE 3 implementation (ingestion with Kafka consumer)

4521521

Signed-off-by: Matheus Cruz <matheuscruz.dev@gmail.com>

mcruzdev force-pushed the issue-23 branch from 9112875 to 4521521 Compare June 3, 2026 16:07

fjtirado reviewed Jun 3, 2026

Comment thread .../main/java/org/kubesmarts/logic/dataindex/ingestion/kafka/processor/persistence/LoadSQL.java Outdated

Apply @fjtirado suggestion

2cdab7e

Signed-off-by: Matheus Cruz <matheuscruz.dev@gmail.com>

fjtirado approved these changes Jun 5, 2026

mcruzdev requested review from gmunozfe and ricardozanini June 9, 2026 14:20

gmunozfe approved these changes Jun 9, 2026

ricardozanini approved these changes Jun 9, 2026

ricardozanini merged commit a6d8622 into kubesmarts:main Jun 9, 2026
2 checks passed

