docs: avoiding backfills during a database failover or host change #3086

Merged
jwhartley merged 3 commits into master from docs/aurora-failover-backfill on Jul 2, 2026
@jwhartley jwhartley commented Jun 29, 2026

Contributor

What

Documents how to avoid a full backfill when a CDC source moves to a new writer, including failovers, standby promotions, instance migrations, and Amazon Aurora Global Database regional switchovers.

The existing "Preventing backfills during database upgrades" section already covers the pause-writes / "Only Changes" procedure, but it was framed entirely around in-place upgrades (same host). It never mentioned failover or a host change, and had no MySQL specifics. A user searching for failover behavior landed nowhere, even though the procedure is the right answer.

Approach

Rather than add a parallel section, this generalizes the existing one. The new-host case is a step variant, not a separate procedure.

  • guides/backfilling-data.md: renamed the heading to "Preventing backfills during database upgrades and failovers"; generalized the intro to cover failovers/promotions/migrations (Aurora Global Database as one example); split step 4 into 4a (same host, no extra change) and 4b (new host: also repoint the capture address, because the new writer does not carry the old CDC position). Updated the internal anchor reference.
  • reference/Connectors/.../PostgreSQL.md and .../MySQL.md: short pointers into that section. MySQL had no failover guidance at all.
  • private-byoc/privatelink.md: note in the cross-region section about pre-registering another region's endpoint ahead of a host change, cross-linked back.
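The 4a/4b split described above can be sketched as a small decision function. This is only an illustration of the "step variant" idea: the function name, the address strings, and the wording of the returned action are hypothetical and are not taken from the guide itself.

```python
from typing import List


def extra_step4_actions(old_address: str, new_address: str) -> List[str]:
    """Return the additional actions step 4 needs, beyond the shared procedure.

    Hypothetical helper: 4a (same host) needs nothing extra, while
    4b (new host) also needs the capture address repointed.
    """
    if old_address == new_address:
        # 4a: in-place upgrade on the same host, no extra change.
        return []
    # 4b: the new writer does not carry the old CDC position,
    # so the capture must also be pointed at the new address.
    return [f"repoint capture address: {old_address} -> {new_address}"]
```

Keeping the new-host case as one extra action on top of the shared steps mirrors the choice to extend the existing section rather than write a parallel procedure.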

Why these specifics are correct

  • Postgres: logical replication slots are not replicated to a promoted standby or Aurora Global Database secondary cluster; a new slot can only start at the current WAL position. The publication and watermarks table do carry over with the storage volume.
  • MySQL: binlog coordinates are per-server, so the stored file+offset is invalid on the new writer (ERROR 1236); capture must use the writer endpoint (readers report log_bin = OFF).
  • Neither connector validates server identity on reconnect, so editing the address alone resumes from the stale cursor and fails. Verified against the connector source.
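To make the last point concrete, here is a minimal sketch of the kind of identity check that would catch a stale cursor. It is not what the connectors do (per the note above, they perform no such validation); `Checkpoint`, `server_id`, and `resume_position` are hypothetical names, and the position is treated as an opaque string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Checkpoint:
    """A stored CDC cursor plus the identity of the server that issued it (hypothetical)."""
    server_id: str  # opaque identity of the server the position was read from
    position: str   # opaque cursor, e.g. a binlog file+offset or a WAL position


def resume_position(checkpoint: Checkpoint, current_server_id: str) -> Optional[str]:
    """Return the stored position only if it belongs to the current server.

    A position recorded on one server is meaningless on another, so a
    mismatch returns None instead of a cursor that would fail on resume.
    """
    if checkpoint.server_id == current_server_id:
        return checkpoint.position
    return None


if __name__ == "__main__":
    cp = Checkpoint(server_id="server-a", position="binlog.000042:1337")
    print(resume_position(cp, "server-a"))  # same server: cursor is reusable
    print(resume_position(cp, "server-b"))  # new writer: no usable cursor
```

Without a check like this, the only safe way to change hosts is the documented procedure: let the capture catch up with writes paused, then repoint it so that it starts from the new server's current position.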

Notes

No new page; this extends and cross-links existing content. Prompted by a customer asking for a repeatable regional-failover pattern for an Aurora Global Database.

…nal cutover

Extends the existing 'Preventing backfills during database upgrades' guidance to
cover planned failovers, including promoting an Aurora Global Database secondary
region. Adds the address-repoint step (the new writer does not carry the old CDC
position) and the per-engine specifics: Postgres slots are not replicated to a
promoted cluster, MySQL binlog coordinates are per-server and require the writer
endpoint. Cross-links the PostgreSQL and MySQL connector pages and the
cross-region PrivateLink section (pre-registering a standby region for DR).
@github-actions


Generalize rather than create a parallel failover section: the new-host case
(failover, standby promotion, instance migration) is a step variant (4b), not a
separate procedure. Rename the heading to '...upgrades and failovers', keep one
section, drop the Aurora-specific framing to an example. Update cross-link anchors.
@jwhartley jwhartley changed the title from "docs: avoiding backfills during a database failover or regional cutover" to "docs: avoiding backfills during a database failover or host change" Jun 29, 2026
Drop the Aurora Global Database examples and the MySQL-specific error detail from
the general backfill guide; state the new-host failure reason generically (resume
from a CDC position that does not exist on the new server). Make the PrivateLink
note about pre-registering any standby/backup endpoint for upgrades, failover, or
DR rather than cross-region specifically.
@jwhartley jwhartley requested a review from aeluce June 30, 2026 02:08

@aeluce aeluce left a comment

Collaborator

Thanks! I see the PR description notes that the Postgres publication and watermarks table will carry over with the storage volume on failover. I didn't see that specified in the docs; would it be useful to include, or is it extraneous?

@jwhartley

Contributor Author

Yeah, I trimmed those out as they were adding unnecessary bulk.

@jwhartley jwhartley merged commit a1b004e into master Jul 2, 2026
8 checks passed
@jwhartley jwhartley deleted the docs/aurora-failover-backfill branch July 2, 2026 23:05
2 participants