HIVE-29643: Add missing primary keys to backend tables across differe…#6521
saihemanth-cloudera wants to merge 4 commits into
…nt types of vendors
Pull request overview
This PR updates the standalone metastore schema (Hive 4.3.0) to add missing primary keys to several transactional/compaction-related backend tables across supported RDBMS vendors, which is needed for enabling/operating HA mode reliably.
Changes:
- Add new surrogate identity/serial primary keys to previously keyless transactional tables (e.g., `TXN_COMPONENTS`, `WRITE_SET`, etc.) across Postgres/Oracle/MySQL/MSSQL/Derby schemas.
- Replace certain unique indexes with proper primary keys (e.g., `TXN_TO_WRITE_ID`, `NEXT_WRITE_ID`) in both base schema and upgrade scripts.
- Update test DB cleanup initialization inserts to specify column lists for sequence tables whose schemas gained new PK columns.
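The last point can be illustrated with a minimal sketch. Once a sequence table gains an extra identity column, a positional `INSERT` no longer matches the column count, so the insert has to name its target column. The column names `NL_NEXT` and `NCQ_NEXT` are assumptions for illustration and are not quoted from this PR's diff.

```sql
-- Positional form: fails once the table has an additional PK column,
-- because one value is supplied for two columns.
-- INSERT INTO NEXT_LOCK_ID VALUES (1);

-- Explicit column list: the generated PK column is filled in by the database.
-- NL_NEXT and NCQ_NEXT are assumed column names.
INSERT INTO NEXT_LOCK_ID (NL_NEXT) VALUES (1);
INSERT INTO NEXT_COMPACTION_QUEUE_ID (NCQ_NEXT) VALUES (1);
```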
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/TestTxnDbUtil.java | Updates inserts into NEXT_LOCK_ID / NEXT_COMPACTION_QUEUE_ID to work with new PK columns. |
| standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-4.2.0-to-4.3.0.postgres.sql | Adds PKs/identity columns and converts some unique indexes to PKs for Postgres upgrades. |
| standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.3.0.postgres.sql | Adds PKs/identity columns and moves certain uniqueness from indexes to PK constraints in the Postgres base schema. |
| standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-4.2.0-to-4.3.0.oracle.sql | Adds PKs/identity columns and converts some unique indexes to PKs for Oracle upgrades. |
| standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.3.0.oracle.sql | Adds PKs/identity columns and converts some unique indexes to PKs in the Oracle base schema. |
| standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-4.2.0-to-4.3.0.mysql.sql | Adds auto-increment PKs and converts some unique indexes to PKs for MySQL upgrades. |
| standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.3.0.mysql.sql | Adds auto-increment PKs and replaces some unique indexes with PK constraints in the MySQL base schema. |
| standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-4.2.0-to-4.3.0.mssql.sql | Adds IDENTITY PKs and replaces some unique indexes with PK constraints for SQL Server upgrades. |
| standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.3.0.mssql.sql | Adds IDENTITY PKs and replaces some unique indexes with PK constraints in the SQL Server base schema. |
| standalone-metastore/metastore-server/src/main/sql/derby/upgrade-4.2.0-to-4.3.0.derby.sql | Adds identity PKs and converts some unique indexes to PKs for Derby upgrades. |
| standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.3.0.derby.sql | Adds identity PKs and replaces some unique indexes with PK constraints in the Derby base schema. |
ALTER TABLE TXN_COMPONENTS ADD (TC_ID NUMBER(19) GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY);
ALTER TABLE COMPLETED_TXN_COMPONENTS ADD (CTC_ID NUMBER(19) GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY);
ALTER TABLE COMPACTION_METRICS_CACHE ADD (CMC_ID NUMBER(19) GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY);
ALTER TABLE WRITE_SET ADD (WS_ID NUMBER(19) GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY);
DROP INDEX NEXT_WRITE_ID_IDX;
ALTER TABLE NEXT_WRITE_ID ADD PRIMARY KEY (NWI_DATABASE, NWI_TABLE);

ALTER TABLE MIN_HISTORY_WRITE_ID ADD (MH_ID NUMBER(19) GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY);
ALTER TABLE "TXN_COMPONENTS" ADD COLUMN "TC_ID" bigserial PRIMARY KEY;
ALTER TABLE "COMPLETED_TXN_COMPONENTS" ADD COLUMN "CTC_ID" bigserial PRIMARY KEY;
ALTER TABLE "COMPACTION_METRICS_CACHE" ADD COLUMN "CMC_ID" bigserial PRIMARY KEY;
ALTER TABLE "WRITE_SET" ADD COLUMN "WS_ID" bigserial PRIMARY KEY;
Customers need to plan a maintenance window for busy metastores, as backfill duration scales with TXN_COMPONENTS/WRITE_SET row counts.
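To gauge how long the backfill might take, the row counts of the affected tables can be checked before the upgrade. A minimal sketch in Postgres syntax, using the quoted identifiers from the hunk above; the counts give only a rough indication, since actual duration also depends on hardware and load.

```sql
-- Row counts for the two tables named in the comment above.
SELECT 'TXN_COMPONENTS' AS table_name, COUNT(*) AS row_count FROM "TXN_COMPONENTS"
UNION ALL
SELECT 'WRITE_SET', COUNT(*) FROM "WRITE_SET";
```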
ALTER TABLE TXN_COMPONENTS ADD TC_ID bigint NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
ALTER TABLE COMPLETED_TXN_COMPONENTS ADD CTC_ID bigint NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
ALTER TABLE COMPACTION_METRICS_CACHE ADD CMC_ID bigint NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
ALTER TABLE WRITE_SET ADD WS_ID bigint NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
ALTER TABLE TXN_COMPONENTS ADD TC_ID bigint IDENTITY(1,1) NOT NULL;
ALTER TABLE TXN_COMPONENTS ADD CONSTRAINT TXN_COMPONENTS_PK PRIMARY KEY CLUSTERED (TC_ID);

ALTER TABLE COMPLETED_TXN_COMPONENTS ADD CTC_ID bigint IDENTITY(1,1) NOT NULL;
ALTER TABLE COMPLETED_TXN_COMPONENTS ADD CONSTRAINT COMPLETED_TXN_COMPONENTS_PK PRIMARY KEY CLUSTERED (CTC_ID);

ALTER TABLE COMPACTION_METRICS_CACHE ADD CMC_ID bigint IDENTITY(1,1) NOT NULL;
ALTER TABLE COMPACTION_METRICS_CACHE ADD CONSTRAINT COMPACTION_METRICS_CACHE_PK PRIMARY KEY CLUSTERED (CMC_ID);

ALTER TABLE WRITE_SET ADD WS_ID bigint IDENTITY(1,1) NOT NULL;
ALTER TABLE WRITE_SET ADD CONSTRAINT WRITE_SET_PK PRIMARY KEY CLUSTERED (WS_ID);
ALTER TABLE TXN_COMPONENTS ADD COLUMN TC_ID BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY;
ALTER TABLE COMPLETED_TXN_COMPONENTS ADD COLUMN CTC_ID BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY;
ALTER TABLE COMPACTION_METRICS_CACHE ADD COLUMN CMC_ID BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY;
ALTER TABLE WRITE_SET ADD COLUMN WS_ID BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY;
ALTER TABLE TXN_COMPONENTS ADD (TC_ID NUMBER(19) GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY);
ALTER TABLE COMPLETED_TXN_COMPONENTS ADD (CTC_ID NUMBER(19) GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY);
ALTER TABLE COMPACTION_METRICS_CACHE ADD (CMC_ID NUMBER(19) GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY);
ALTER TABLE WRITE_SET ADD (WS_ID NUMBER(19) GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY);
Sounds like the AI comments are valid, can you check them please?



…nt types of vendors
What changes were proposed in this pull request?
Added primary keys to the backend schema tables
Why are the changes needed?
To support HA mode in the DB, the existing tables need primary keys.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing tests