Is there an existing issue for this?
Current Behavior
When cdcSnapshotSettings is used with snapshotType set to historical and sourceType set to table, table identifier parsing is inconsistent and can fail for valid table identifiers, including those used as examples in the Feature Samples. In affected runs, the pipeline raises a ValueError during table parsing, before any data is read, which blocks historical CDC snapshot ingestion from table sources.
The bug reproduces on both serverless and dedicated compute.
Expected Behavior
Historical CDC snapshot table sources should accept supported identifier formats consistently and parse them correctly.
Expected accepted formats are:
- schema.table (resolved with the pipeline catalog by default)
- catalog.schema.table
The pipeline should not fail for valid identifiers and should proceed to read the configured source table and version column.
The pipeline should fail only when the identifier matches neither of these formats.
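The expected behavior above can be sketched as a small parser. This is only an illustration, not the framework's actual implementation: the function name parse_table_identifier and the default_catalog parameter are hypothetical, and the sketch assumes identifiers are plain dot-separated names with no backtick-quoted parts that themselves contain dots.

```python
from typing import Optional, Tuple


def parse_table_identifier(
    identifier: str, default_catalog: Optional[str] = None
) -> Tuple[Optional[str], str, str]:
    """Split a table identifier into (catalog, schema, table).

    Hypothetical helper for illustration only. Accepts:
      - schema.table          -> catalog falls back to default_catalog
      - catalog.schema.table  -> all three parts taken from the identifier
    Anything else raises ValueError.
    """
    parts = [part.strip() for part in identifier.split(".")]

    # Reject empty segments such as "schema..table" or ".table".
    if any(not part for part in parts):
        raise ValueError(f"Invalid table identifier: {identifier!r}")

    if len(parts) == 2:
        schema, table = parts
        return default_catalog, schema, table

    if len(parts) == 3:
        catalog, schema, table = parts
        return catalog, schema, table

    raise ValueError(
        f"Invalid table identifier: {identifier!r}. "
        "Expected 'schema.table' or 'catalog.schema.table'."
    )
```

With this sketch, parse_table_identifier("sales.orders", default_catalog="main") and parse_table_identifier("main.sales.orders") both resolve to the same three parts, while a bare table name such as "orders" raises ValueError.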
Steps To Reproduce
MRE pipeline and pytest test cases reproducing the error are available by applying the attached patch: CDC_identifier_bug_MRE.patch
Channel
CURRENT
Relevant log output
pyspark.errors.exceptions.captured.AnalysisException: [TABLE_OR_VIEW_NOT_FOUND] The table or view `bug_repro`.`customer_historical_snapshot_source`.`customer_historical_snapshot_source` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS. SQLSTATE: 42P01;