Skip to content

[schema] Bump PartitionState v23 + StoreMetaValue v43 for prc record-count verification#2776

Merged
ymuppala merged 2 commits intolinkedin:mainfrom
ymuppala:prc-schema-bumps
May 7, 2026
Merged

[schema] Bump PartitionState v23 + StoreMetaValue v43 for prc record-count verification#2776
ymuppala merged 2 commits intolinkedin:mainfrom
ymuppala:prc-schema-bumps

Conversation

@ymuppala
Copy link
Copy Markdown
Collaborator

@ymuppala ymuppala commented May 6, 2026

Problem

Part 2
A Spark-based VPJ KIF repush silently wrote batch data to the wrong datacenter (incident-10468). KAFKA_INPUT_BROKER_URL leaked into the writer config and overwrote KAFKA_BOOTSTRAP_SERVERS. Two of three DCs lost batch data while the push job reported success — the store served partial data for hours before anyone noticed.

The root cause was an unguarded trust boundary: Venice servers count records as they ingest, but never compared their count against what the producer claimed it sent. Any failure mode that drops records between VPJ and the server — wrong-DC writes, partial broker connectivity loss, intermittent produce errors not surfaced to VPJ — would replay 10468.

Catching this end-to-end requires:

  1. Producer side: VPJ tells the server "I sent N records to partition P". Already done in [vpj] counting records produced per partition as part of push and adding it to the EOP message #2758, which attaches a prc (per-partition record count) PubSub header to every EOP control message.
  2. Consumer side: server counts what it actually receives, compares at EOP, and surfaces the deficit. This is the work coming next.
  3. Failure plumbing: server-side failures must propagate as ExecutionStatus.ERROR to the controller and as a tagged checkpoint to the VPJ runner, so the push job fails fast instead of completing silently with missing data.

Solution

Schema-only PR — adds two of the three Avro schema versions needed for the upcoming consumer-side verification, without any Java consumer code yet. AdminOperation v98 is split into a separate follow-up PR to stay under the enforce-lines-added 2000-line cap (two avsc files at 328 + 455 = 783 lines fit comfortably; AdminOperation alone is 1309 lines).

The two follow-up PRs:

  1. AdminOperation v98 schema-only PR — same shape as this one, separately because the avsc is large.
  2. Code PR (rebased from [server][common] server-side batch-push record count verification at EOP #2774) — removes the versionOverrides pins from this PR + the AdminOperation PR, bumps AvroProtocolDefinition constants in lockstep, wires the actual consumer code (server-side counting, EOP comparison, per-store throw path, controller statusDetails propagation, VPJ checkpoint detection).

Why each schema gets a new version

The schemas live in different layers, and each has its own reason to bump:

Schema Version Field added Why it needs a new version
PartitionState v22 → v23 batchPushRecordCount (long, default 0) This is the on-disk per-partition checkpoint blob (RocksDB metadata partition). The server needs to count batch-push records as they arrive and persist that count alongside other replica state (offsets, DIV state, activeKeyCount, etc.) so it survives restarts and rides along with blob-transfer snapshots. Without a schema bump the count has nowhere to live persistently — OffsetRecord (the runtime wrapper) is just a façade around PartitionState.
StoreMetaValue v42 → v43 batchPushRecordCountVerificationEnabled (boolean, default false) This is the value schema of the meta system store, which clients/operators read to discover store-level config. The throw-on-deficit behavior is per-store opt-in — operators flip it surgically via UpdateStoreQueryParams and roll it back the same way. The flag has to be visible alongside other store-level configs (batchPushEnabled, chunkingEnabled, etc.), and clients reading store metadata for any reason (admin tools, dashboards, push-job validation) need to see it.

Both changes here are strictly additive with safe defaults (0 / false). Forward-compatibility: a v23-aware reader can still consume v22 records (missing field reads as default). Backward-compatibility: a v22-aware reader consumes v23 records by ignoring the extra field. No coordinated cutover required.

Test plan

  • ./gradlew :internal:venice-common:compileAvro — log confirms (OVERRIDE) annotations on v22 / v42 (the pinned versions actually picked).
  • ./gradlew :internal:venice-common:compileJava — Java still compiles cleanly with the old generated schema classes (because protocol constants are unchanged).
  • CI: AvroCompatibilityTests verifies the new schemas are forward/backward compatible with all prior versions.

Backwards compatibility

  • No. The new fields are additive with safe defaults; older readers treat them as default values (0 / false).
  • This PR alone introduces zero runtime behavior change. Behavior changes arrive in the follow-up consumer-code PR.

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings May 6, 2026 21:35
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces new Avro protocol versions (schema-only, additive) needed for upcoming server-side batch-push record-count verification, while keeping the build pinned to the prior schema versions so existing generated classes/protocol constants remain unchanged until the follow-up wiring lands.

Changes:

  • Add PartitionState v23 with batchPushRecordCount (long, default 0).
  • Add StoreMetaValue v43 and AdminOperation v98 with batchPushRecordCountVerificationEnabled (boolean, default false).
  • Pin compileAvro.versionOverrides in build.gradle to continue generating code from v22/v42/v97 until the follow-up PR removes the pins and updates protocol constants.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.

File Description
services/venice-controller/src/main/resources/avro/AdminOperation/v98/AdminOperation.avsc Adds the per-store verification enablement flag to the admin topic protocol (UpdateStore payload) in a new schema version.
internal/venice-common/src/main/resources/avro/StoreMetaValue/v43/StoreMetaValue.avsc Adds the per-store verification enablement flag to the store metadata system store schema in a new version.
internal/venice-common/src/main/resources/avro/PartitionState/v23/PartitionState.avsc Adds persisted batchPushRecordCount to support consumer-side record count tracking in a new version.
build.gradle Pins Avro codegen to prior schema versions via versionOverrides to avoid runtime/protocol mismatches until follow-up code lands.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@ymuppala ymuppala changed the title [schema] Bump PartitionState v23, StoreMetaValue v43, AdminOperation v98 for prc record-count verification [schema] [VALIDATION_OVERRIDE] Bump PartitionState v23, StoreMetaValue v43, AdminOperation v98 for prc record-count verification May 6, 2026
@ymuppala ymuppala changed the title [schema] [VALIDATION_OVERRIDE] Bump PartitionState v23, StoreMetaValue v43, AdminOperation v98 for prc record-count verification [schema] Bump PartitionState v23 + StoreMetaValue v43 for prc record-count verification May 6, 2026
…count

verification

Schema-only PR. Adds two of the three schema versions needed for upcoming
server-side batch-push record-count verification (prc) work. AdminOperation
v98 lands in a separate follow-up PR (split to stay under the
enforce-lines-added 2000-line cap).

Schemas added:
- PartitionState v22 -> v23: adds `batchPushRecordCount` (long, default 0).
  Persists the server-side count of batch-push data records, used at EOP
  to verify against the producer-side `prc` PubSub header (PR linkedin#2758).
- StoreMetaValue v42 -> v43: adds `batchPushRecordCountVerificationEnabled`
  (boolean, default false). Per-store opt-in for the throw path on
  record-count deficits.

Both additive with safe defaults — backward-compatible.

`build.gradle` pins compileAvro to v22/v42 via `versionOverrides`. The
follow-up consumer-code PR will remove the pins and bump
`AvroProtocolDefinition.PARTITION_STATE` (22->23) and
`METADATA_SYSTEM_SCHEMA_STORE` (42->43) constants in lockstep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ymuppala ymuppala force-pushed the prc-schema-bumps branch from 027dfd7 to 62964a5 Compare May 6, 2026 21:48
ymuppala added a commit to ymuppala/venice that referenced this pull request May 6, 2026
Schema-only PR. Adds AdminOperation v98 to carry the new per-store
`batchPushRecordCountVerificationEnabled` flag on the SetStore admin op,
so parent controllers can propagate it to child controllers via the
admin topic.

Stacked on top of PR linkedin#2776 (PartitionState v23 + StoreMetaValue v43)
because both PRs touch the same `versionOverrides` block in build.gradle.
Stacking ensures the override list grows monotonically with no merge
conflict when both PRs land. Once linkedin#2776 merges, this PR's branch will be
rebased on new main and the diff will collapse to just the AdminOperation
additions (the v22 / v42 pins from linkedin#2776 will already be in main).

Field added (additive, default false):
  batchPushRecordCountVerificationEnabled (boolean) on SetStore.

`build.gradle` extends the existing `versionOverrides` list
(PartitionState/v22
and StoreMetaValue/v42 from linkedin#2776) with AdminOperation/v97. The follow-up
consumer-code PR removes all three pins together when bumping
`AvroProtocolDefinition.PARTITION_STATE` / METADATA_SYSTEM_SCHEMA_STORE /
ADMIN_OPERATION constants in lockstep with the controller wiring.

Backward-compatible: v97-aware deserializers reading v98 records treat
the unknown field as default `false`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ymuppala ymuppala enabled auto-merge (squash) May 7, 2026 02:28
Copilot AI review requested due to automatic review settings May 7, 2026 18:51
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

@ymuppala ymuppala merged commit aabc6f0 into linkedin:main May 7, 2026
110 checks passed
ymuppala added a commit to ymuppala/venice that referenced this pull request May 7, 2026
linkedin#2776 (PartitionState v23 + StoreMetaValue v43) and linkedin#2777 (AdminOperation
v98) merged with versionOverrides pinning compileAvro to the prior
versions (v22, v42, v97). This PR consumes those new schemas via the
AvroProtocolDefinition bumps (already in earlier commits on this branch),
so the pins must come down in lockstep.

After this change, compileAvro picks v23/v43/v98 (no OVERRIDE) and the
generated SpecificRecord classes contain the new fields. The unrelated
KafkaMessageEnvelope/v13 pin (from linkedin#2778) is preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sushantmane added a commit to sushantmane/venice that referenced this pull request May 8, 2026
Main landed PR linkedin#2776 (PartitionState v23 with batchPushRecordCount, pinned
to v22 until prc record-count consumer code lands), so my donor-clock
freshness field can no longer claim v23. Move it to a new v24 that carries
v23 + donorHeartbeatTimestampMs together.

Bumps the PartitionState avro override from v22 → v24 so consumer code in
this PR can read/write the new field. v23's batchPushRecordCount is
exposed as a side effect, but no consumer reads/writes it yet — existing
data sees default 0 and behaves unchanged. The prc record-count team's
follow-up PR will replace this override (or remove it once v24 is the
real latest) when their consumer code lands.
sushantmane added a commit to sushantmane/venice that referenced this pull request May 8, 2026
…in-place

Earlier I bumped to v24 to avoid colliding with PR linkedin#2776's v23
(batchPushRecordCount), but v23 has no production data — PR linkedin#2776 pinned
the avro override at v22 because its consumer code wasn't ready, so v23
has only ever existed as a schema file. That makes it safe to extend
v23 in-place with one more additive field instead of carrying an unused
intermediate version.

Result:
  v22 (active wire format today)
   └─ v23 = batchPushRecordCount (PR linkedin#2776) + donorHeartbeatTimestampMs (thi
s PR)

Avro override bumps from v22 → v23. AvroProtocolDefinition.PARTITION_STATE
goes from (24, 24) to (24, 23). v24 directory deleted.

The prc record-count team's follow-up PR no longer needs to coordinate a
v24/v23 swap — they just remove the override (or land their consumer
code on v23 alongside this PR's). batchPushRecordCount is now part of
the active schema; default 0 means replicas behave unchanged until the
prc consumer code reads/writes it.

Verified: TestOffsetRecord (all pass), LeaderFollowerStoreIngestionTaskTest
blob-transfer suite (all 7 pass), reportIfCatchUpVersionTopicOffset and
testReadyToServePartitionValidateIngestionSuccessWithPriorState (pass for
AA_ON / AA_OFF). Generated PartitionState class carries both fields.
sushantmane added a commit to sushantmane/venice that referenced this pull request May 9, 2026
Main landed PR linkedin#2776 (PartitionState v23 with batchPushRecordCount, pinned
to v22 until prc record-count consumer code lands), so my donor-clock
freshness field can no longer claim v23. Move it to a new v24 that carries
v23 + donorHeartbeatTimestampMs together.

Bumps the PartitionState avro override from v22 → v24 so consumer code in
this PR can read/write the new field. v23's batchPushRecordCount is
exposed as a side effect, but no consumer reads/writes it yet — existing
data sees default 0 and behaves unchanged. The prc record-count team's
follow-up PR will replace this override (or remove it once v24 is the
real latest) when their consumer code lands.
sushantmane added a commit to sushantmane/venice that referenced this pull request May 9, 2026
…in-place

Earlier I bumped to v24 to avoid colliding with PR linkedin#2776's v23
(batchPushRecordCount), but v23 has no production data — PR linkedin#2776 pinned
the avro override at v22 because its consumer code wasn't ready, so v23
has only ever existed as a schema file. That makes it safe to extend
v23 in-place with one more additive field instead of carrying an unused
intermediate version.

Result:
  v22 (active wire format today)
   └─ v23 = batchPushRecordCount (PR linkedin#2776) + donorHeartbeatTimestampMs (thi
s PR)

Avro override bumps from v22 → v23. AvroProtocolDefinition.PARTITION_STATE
goes from (24, 24) to (24, 23). v24 directory deleted.

The prc record-count team's follow-up PR no longer needs to coordinate a
v24/v23 swap — they just remove the override (or land their consumer
code on v23 alongside this PR's). batchPushRecordCount is now part of
the active schema; default 0 means replicas behave unchanged until the
prc consumer code reads/writes it.

Verified: TestOffsetRecord (all pass), LeaderFollowerStoreIngestionTaskTest
blob-transfer suite (all 7 pass), reportIfCatchUpVersionTopicOffset and
testReadyToServePartitionValidateIngestionSuccessWithPriorState (pass for
AA_ON / AA_OFF). Generated PartitionState class carries both fields.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants