
feat: regression test scripts and ci #486

Open
gnolong wants to merge 11 commits into main from support/regression-test


Contributor

@gnolong gnolong commented May 6, 2026

Summary

  1. Build a dedicated integration test CI workflow and a reusable local/CI runner for all supported suites.
  2. Fix a set of regressions exposed by that workflow across MongoDB, Redis, Kafka, checker/sql generation, MySQL/PG/Redis data handling, and several flaky integration cases.

Main changes

1. Integration test CI was reworked into a dedicated workflow

  • Added a standalone workflow, .github/workflows/integration-tests.yml.
  • Added a discover job that reads the enabled test matrix directly from dt-tests/scripts/run-integration-tests.sh --list-suites-json, so CI suite selection stays in sync with the script.
  • Added per-suite matrix execution with:
    • service startup
    • readiness waiting
    • cargo nextest test execution
    • failure log dump
    • artifact upload
    • service teardown
  • Added path filters and explicit workflow trigger fixes so PRs trigger the integration workflow more predictably.

2. Added a unified integration test runner for local and CI usage

  • Introduced dt-tests/scripts/run-integration-tests.sh as the main integration orchestration entrypoint.
  • Added support for:
    • suite discovery/listing
    • per-suite docker startup and health waiting
    • nextest execution
    • serial per-suite execution
    • log collection and artifact-friendly output
    • --keep-going, --down-each-suite, --logs-on-failure, --show-test-output
  • The runner also enforces --test-threads 1, which helps avoid cross-test interference in integration suites.

3. Added dedicated integration docker environment

  • Added dt-tests/docker-compose.integration.yml for the integration matrix.
  • Expanded service coverage for MySQL, PostgreSQL, Kafka, Redis, MongoDB, ClickHouse, TiDB, etc.
  • Added MongoDB replica-set/bootstrap related files:
    • dt-tests/docker/mongo-init/start-mongo-rs.sh
    • dt-tests/docker/mongo-init/20-create-mongo-dst-user.js
    • dt-tests/docker/mongo-init/mongo-keyfile
  • Added PostgreSQL init SQL for additional charset/database setup:
    • dt-tests/docker/postgres-init/10-create-euc-cn-dbs.sql

Follow-up fixes after the initial CI commit

4. Checker / partitioner / merger target resolution was fixed

  • TaskUtil now supports creating an RDB meta manager from checker target config, not only the main sinker config.
  • ParallelizerUtil now returns an explicit error when it cannot build the RDB meta manager for the merger/partitioner, instead of panicking through unwrap().
  • This fixes checker-related integration failures when the effective target is the checker DB instead of the primary sink.
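
A minimal sketch of the idea, not the code in this PR: every type and function name below (TargetConfig, RdbMetaManager, build_rdb_meta_manager, meta_manager_for_merger) is a hypothetical stand-in, and the rule "prefer the checker target when one is configured" is an assumption made for illustration. It shows a missing meta manager being reported as an error value rather than a panic.

```rust
use std::fmt;

// Hypothetical stand-ins for the real config and meta manager types.
#[derive(Debug, Clone, PartialEq)]
pub enum TargetKind {
    Rdb,
    Other,
}

#[derive(Debug, Clone)]
pub struct TargetConfig {
    pub kind: TargetKind,
    pub url: String,
}

#[derive(Debug, PartialEq)]
pub struct RdbMetaManager {
    pub url: String,
}

#[derive(Debug, PartialEq)]
pub enum BuildError {
    NoRdbTarget,
}

impl fmt::Display for BuildError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BuildError::NoRdbTarget => {
                write!(f, "cannot build RDB meta manager: effective target is not an RDB")
            }
        }
    }
}

impl std::error::Error for BuildError {}

/// Assumption: the checker target, when present, is the effective target;
/// otherwise the main sinker config is used.
pub fn build_rdb_meta_manager(
    checker: Option<&TargetConfig>,
    sinker: &TargetConfig,
) -> Option<RdbMetaManager> {
    let effective = checker.unwrap_or(sinker);
    match effective.kind {
        TargetKind::Rdb => Some(RdbMetaManager {
            url: effective.url.clone(),
        }),
        TargetKind::Other => None,
    }
}

/// Converts the "no manager" case into an explicit error instead of calling unwrap().
pub fn meta_manager_for_merger(
    checker: Option<&TargetConfig>,
    sinker: &TargetConfig,
) -> Result<RdbMetaManager, BuildError> {
    build_rdb_meta_manager(checker, sinker).ok_or(BuildError::NoRdbTarget)
}
```

With this shape, a caller can propagate the error with `?` and the task fails with a readable message when neither target is an RDB.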

5. PostgreSQL replication URL parsing was fixed

  • pg_cdc_client now strips wrapped query params like options[...] when building replication connection config.
  • Added coverage for replication URLs containing options[statement_timeout]=....
  • This prevents replication connection parsing failures under integration config variants.
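
As an illustration only, the stripping step could look roughly like the sketch below. The function name strip_wrapped_params is hypothetical, and the rule it applies (drop any query parameter whose key has the form `name[...]`) is an assumption based on the options[statement_timeout] example above; percent-encoded brackets are not handled.

```rust
/// Removes query parameters whose key looks like `name[...]`,
/// e.g. `options[statement_timeout]=10s`, and keeps everything else.
pub fn strip_wrapped_params(url: &str) -> String {
    // No query string: nothing to strip.
    let Some((base, query)) = url.split_once('?') else {
        return url.to_string();
    };

    let kept: Vec<&str> = query
        .split('&')
        .filter(|pair| !pair.is_empty())
        .filter(|pair| {
            // The key is everything before the first '='.
            let key = pair.split('=').next().unwrap_or("");
            !(key.contains('[') && key.ends_with(']'))
        })
        .collect();

    if kept.is_empty() {
        base.to_string()
    } else {
        format!("{}?{}", base, kept.join("&"))
    }
}
```

The remaining parameters keep their original order, so a URL without wrapped params passes through unchanged.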

6. Raw string / binary value handling was corrected across Avro, Lua, Redis, and row conversion

  • Added ColValue::to_utf8_string() and to_utf8_or_hex_string().
  • RowData::convert_raw_string() now prefers UTF-8 text and falls back to hex for non-UTF8 bytes.
  • AvroConverter now serializes UTF-8 RawString as string, and binary RawString as bytes.
  • LuaProcessor now preserves non-editable binary-like values safely:
    • Blob remains preserved instead of being accidentally overwritten
    • RawString is exposed as Lua string only if valid UTF-8; otherwise preserved as raw bytes
  • RedisSinker and Redis test runner now use consistent RawString formatting, avoiding mismatches between expected DB values and actual Redis content.
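
The UTF-8-first, hex-fallback rule can be sketched as free functions over a byte slice. This is an assumption-laden illustration: the real methods live on ColValue, the lowercase unprefixed hex format is a guess, and AvroLike is a hypothetical stand-in for the Avro value type.

```rust
/// Returns the bytes as text when they are valid UTF-8, otherwise None.
pub fn to_utf8_string(bytes: &[u8]) -> Option<String> {
    std::str::from_utf8(bytes).ok().map(str::to_owned)
}

/// Prefers UTF-8 text and falls back to a hex rendering for other bytes.
/// Assumption: lowercase hex without a prefix.
pub fn to_utf8_or_hex_string(bytes: &[u8]) -> String {
    match to_utf8_string(bytes) {
        Some(text) => text,
        None => bytes.iter().map(|b| format!("{:02x}", b)).collect(),
    }
}

/// Hypothetical stand-in for the two Avro encodings mentioned above.
#[derive(Debug, PartialEq)]
pub enum AvroLike {
    String(String),
    Bytes(Vec<u8>),
}

/// UTF-8 raw strings become Avro strings; anything else stays as bytes.
pub fn raw_string_to_avro(bytes: &[u8]) -> AvroLike {
    match to_utf8_string(bytes) {
        Some(text) => AvroLike::String(text),
        None => AvroLike::Bytes(bytes.to_vec()),
    }
}
```

Using one shared helper for the sinker and the test runner is what keeps expected and actual values in the same format.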

7. PG SQL literal generation was fixed for special numeric values

  • RdbQueryBuilder now formats PostgreSQL float/double/decimal literals correctly for:
    • NaN
    • Infinity
    • -Infinity
  • This avoids invalid SQL generation in checker / SQL assertion paths.
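
For context, PostgreSQL accepts these special values only as quoted strings (for example 'NaN'::float8), so a bare NaN or Infinity token in generated SQL is invalid. The sketch below shows one way to produce such literals; the function name pg_float_literal and the fixed float8 cast are assumptions, and the actual RdbQueryBuilder logic may differ (a decimal column would need a numeric cast, for instance).

```rust
/// Formats an f64 as a PostgreSQL literal, quoting the special values.
/// Assumption: the target column is float8.
pub fn pg_float_literal(value: f64) -> String {
    if value.is_nan() {
        "'NaN'::float8".to_string()
    } else if value == f64::INFINITY {
        "'Infinity'::float8".to_string()
    } else if value == f64::NEG_INFINITY {
        "'-Infinity'::float8".to_string()
    } else {
        // Finite values can be written as plain numeric literals.
        value.to_string()
    }
}
```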

8. Struct/check pipeline behavior was corrected

  • BasePipeline now runs struct checking after sink_struct(...).
  • BasePipeline's sink-method selection no longer forces DML into the Raw sink path for Redis sinks.
  • RdbSqlTestRunner now clears and reads SQL logs from both possible checker/sql log locations, fixing missing SQL assertion artifacts in tests.
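
The "clear and read from both locations" behaviour might be sketched as follows. The helper names read_sql_logs and clear_sql_logs are hypothetical, the actual log paths are not given here so they are passed in by the caller, and treating a missing file as "no logs" is an assumption.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Reads every candidate log file that exists and returns all lines in order.
/// A missing file is skipped; any other I/O error is returned to the caller.
pub fn read_sql_logs(candidates: &[&Path]) -> io::Result<Vec<String>> {
    let mut lines = Vec::new();
    for path in candidates {
        match fs::read_to_string(path) {
            Ok(content) => lines.extend(content.lines().map(str::to_owned)),
            Err(err) if err.kind() == io::ErrorKind::NotFound => continue,
            Err(err) => return Err(err),
        }
    }
    Ok(lines)
}

/// Removes every candidate log file, ignoring files that do not exist.
pub fn clear_sql_logs(candidates: &[&Path]) -> io::Result<()> {
    for path in candidates {
        match fs::remove_file(path) {
            Ok(()) => {}
            Err(err) if err.kind() == io::ErrorKind::NotFound => {}
            Err(err) => return Err(err),
        }
    }
    Ok(())
}
```

A test runner would call clear_sql_logs before the task starts and read_sql_logs afterwards, so assertions see the SQL regardless of which location the task wrote to.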

fix #487

@gnolong gnolong marked this pull request as ready for review May 7, 2026 08:56


Development

Successfully merging this pull request may close these issues.

[Feature] integration test scripts and CI workflow
