[BUG]: Table CDC snapshot 2-part identifier is incorrectly handled #48

@lp-ae

Description

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

When cdcSnapshotSettings is configured with snapshotType set to historical and sourceType set to table, table identifier parsing is inconsistent and can fail for valid identifiers, including those used as examples in the Feature Samples. In affected runs, the pipeline raises a ValueError during table parsing, before any data is read. This blocks historical CDC snapshot ingestion from table sources.

The bug reproduces on both serverless and dedicated compute.

Expected Behavior

Historical CDC snapshot table sources should accept supported identifier formats consistently and parse them correctly.
Expected accepted formats are:

  • schema.table (resolved with the pipeline catalog by default)
  • catalog.schema.table

The pipeline should not fail for valid identifiers and should proceed to read the configured source table and version column.
The pipeline should still fail when the identifier does not match either of the supported formats.
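
To make the expected behavior concrete, here is a minimal sketch of the parsing rules described above. It is not the framework's actual implementation: `parse_table_identifier`, `TableRef`, and the `default_catalog` parameter are hypothetical names used for illustration only. It also assumes that a plain split on `.` is sufficient, so backtick-quoted names that themselves contain dots are not handled.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    """Hypothetical container for a fully qualified table name."""
    catalog: str
    schema: str
    table: str


def parse_table_identifier(identifier: str, default_catalog: str) -> TableRef:
    """Hypothetical helper: resolve a 2- or 3-part identifier.

    - schema.table          -> catalog defaults to `default_catalog`
    - catalog.schema.table  -> used as given
    Anything else raises ValueError.
    """
    # Assumption: segments never contain dots, so a plain split is enough.
    parts = [p.strip().strip("`") for p in identifier.split(".")]

    if any(not p for p in parts):
        raise ValueError(f"Empty segment in table identifier: {identifier!r}")

    if len(parts) == 2:
        schema, table = parts
        return TableRef(default_catalog, schema, table)

    if len(parts) == 3:
        catalog, schema, table = parts
        return TableRef(catalog, schema, table)

    raise ValueError(
        f"Expected 'schema.table' or 'catalog.schema.table', got {identifier!r}"
    )
```

With this sketch, `parse_table_identifier("my_schema.my_table", "my_catalog")` resolves the catalog from the pipeline default, `parse_table_identifier("my_catalog.my_schema.my_table", "other")` keeps the explicit catalog, and a 1-part or 4-part identifier raises ValueError. In no case is a segment reused to fill a missing position.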

Steps To Reproduce

MRE pipeline and pytest test cases reproducing the error are available by applying the attached patch: CDC_identifier_bug_MRE.patch

Channel

CURRENT

Relevant log output

pyspark.errors.exceptions.captured.AnalysisException: [TABLE_OR_VIEW_NOT_FOUND] The table or view `bug_repro`.`customer_historical_snapshot_source`.`customer_historical_snapshot_source` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.
To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS. SQLSTATE: 42P01;
