fix: disable HA and PITR on target instance during migration #294
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem
Migration fails when the source Cloud SQL instance has any of these features enabled (#240):
Root cause
DefineInstance()usesDeepCopy()to copy the source instance spec to the target. This copies all settings, including HA and PITR. The target instance is then created with these settings via the helper app.During setup, DMS calls
demoteTargetInstance()to convert the target to a read replica. This fails because:Fix
Changes in
instance.go:CreateInstance()– After callingDefineInstance(), strip HA and PITR from the helper app spec before creating the target instance. This prevents GCP from creating an instance with incompatible settings.PrepareTargetInstance()– Also explicitly disable PITR (pointInTimeRecoveryEnabled = false) and setavailabilityType = ZONALon the CNRM spec as a safety net.ValidateSourceInstance()– Log clear warnings when the source has HA or PITR enabled, explaining they will be temporarily disabled during migration.UpdateTargetInstanceAfterPromotion()– After DMS promote completes, explicitly restore HA (availabilityType=REGIONAL) and PITR from the source CNRM spec onto the target instance.Changes in
promote.go:UpdateTargetInstanceAfterPromotion()call to pass source instance for settings restoration.Changes in
database.go:SetDatabasePassword()– ClearDatabaseRolesfrom the user object before callingUsers.Update. The Cloud SQL Admin API returnsdatabaseRolesin GET responses but rejects them in Update requests with"Invalid request to update database roles". This is a pre-existing bug unrelated to HA/PITR but was discovered during testing.Re-enabling after migration
After promotion,
UpdateTargetInstanceAfterPromotion()explicitly restores HA and PITR settings by reading the source CNRM spec and applying them to the target. Additionally,UpdateApplicationInstance()applies the original app spec via naiserator reconciliation.Testing
Verified end-to-end with a real migration in
nav-dev/nais:Source instance:
migrator-test-ha-pitrTarget instance:
migrator-test-targetResults
What about audit logging?
The
cloudsql.enable_pgauditdatabase flag is copied viaDeepCopy()and was not stripped. Testing confirmed it does not break the DMS migration – the flag survived the full setup→promote cycle without issues.Fixes #240
/cc @mortenlj