Phased Migrations

Problem & Solution

Changing stateful systems (databases, secrets, certificates) in a single step risks downtime or data corruption. A phased migration maintains backward compatibility at every stage so each deploy is independently reversible.

Pattern

The Six Phases

Add new schema -- Deploy the new column, table, or secret alongside the existing one
Dual write, old read -- Write to both old and new schema, but the application still reads from the old schema only
Migrate data -- Copy existing data from the old schema into the new one
Dual write, new read -- Write to both old and new schema, but the application now reads from the new schema only
New schema only -- Once verified, stop writing to the old schema entirely and read and write only the new schema
Remove old schema -- Drop the old column, table, or secret

Not every migration needs all six phases. Column renames, type changes, and secret rotations each follow the same principle -- never break compatibility in a single deploy -- but the specific steps vary.

When to Apply

Database schema migrations (column renames, type changes, table splits)
Secret, password, or certificate rotation
API version transitions
Configuration format changes

When NOT to Apply

Additive-only changes (new nullable column, new table) that don't touch existing data or behavior -- these are safe in a single deploy
Development or staging environments where downtime is acceptable

Examples

Column Type Change: `char` to `boolean`

From a real migration that converted a char(1) include column (values "X" and "") to a boolean:

Phase 1 -- Add new column alongside old:

sql

ALTER TABLE accounts
ADD COLUMN include_bool boolean NULL;

UPDATE accounts
SET include_bool = CASE
  WHEN include = "X" THEN 1
  ELSE 0
END;

Phase 2 -- Application writes to both columns. Deploy code that writes both include and include_bool on every update, but still reads from include.

Phase 3 -- Backfill. The UPDATE above handles existing rows. For ongoing migrations, a background job catches any rows missed during the deploy gap.

Phase 4 -- Switch reads to new column. Deploy code that reads from include_bool but still writes both.

Phase 5 -- Drop old column, rename new column:

sql

ALTER TABLE accounts DROP COLUMN include;

ALTER TABLE accounts
CHANGE include_bool include boolean NOT NULL DEFAULT 1;

The old include column must be dropped before renaming include_bool to include. Each phase is a separate deploy. If phase 4 reveals data issues, roll back to reading from include without data loss.

Primary Key Swap

Replacing a legacy primary key (aid) with a new auto-increment id column:

sql

-- Phase 1: Add new column
ALTER TABLE accounts ADD COLUMN id int NULL;
UPDATE accounts SET id = aid WHERE id IS NULL;

-- Phase 5: Swap keys (after application code is updated)
ALTER TABLE accounts CHANGE aid aid int NOT NULL;
ALTER TABLE accounts DROP PRIMARY KEY;
ALTER TABLE accounts CHANGE id id int NOT NULL AUTO_INCREMENT;
ALTER TABLE accounts ADD PRIMARY KEY (id);

Between phases 1 and 5, the application writes both aid and id, giving time to update foreign keys in other tables before removing the old key.

Table Migration with Data Copy

Creating a normalized replacement table while the old one is still in use:

sql

-- Phase 1: Create new table
CREATE TABLE Account (
  id int NOT NULL AUTO_INCREMENT,
  name varchar(191) NOT NULL,
  user_id int NOT NULL,
  type ENUM('ASSET', 'LIABILITY', 'INCOME', 'EXPENSE', 'EQUITY'),
  PRIMARY KEY (id),
  CONSTRAINT Account_user_id_fk FOREIGN KEY (user_id) REFERENCES User (id)
);

-- Phase 3: Migrate data with transformation
INSERT INTO Account (id, name, user_id, type)
SELECT id, name, user_id,
  CASE accttype
    WHEN 'CHECKING' THEN 'ASSET'
    WHEN 'SAVINGS'  THEN 'ASSET'
    WHEN 'CREDITLINE' THEN 'LIABILITY'
  END
FROM legacy_accounts;

The old legacy_accounts table stays live through phases 2-4 while the application transitions.

Foreign Key Addition

Adding a foreign key constraint to an existing table:

sql

-- Phase 1: Add nullable column
ALTER TABLE accounts ADD COLUMN user_id int NULL;

-- Phase 3: Populate from existing data
UPDATE accounts
SET user_id = (SELECT id FROM User WHERE username = 'target_user');

-- Phase 5: Make NOT NULL after verification
ALTER TABLE accounts CHANGE user_id user_id int NOT NULL;

-- Phase 5: Add constraint
ALTER TABLE accounts
ADD CONSTRAINT user_id_fk FOREIGN KEY (user_id) REFERENCES User (id);

Adding the constraint before populating data would fail. Adding it as NOT NULL before populating would also fail. The phased approach avoids both problems.

Secret Rotation

Add new secret to config (both old and new available)
Deploy code that tries new secret first, falls back to old
Rotate the actual secret on the external system
Verify new secret works in production
Deploy code that only uses new secret
Remove old secret from config

Reversible Migrations

Every migration should have a reverse script. A column type change from char to boolean reverses cleanly:

sql

-- unapply: boolean back to char
ALTER TABLE accounts
CHANGE include include char(1) NULL DEFAULT "X";

UPDATE accounts
SET include = CASE
  WHEN include = "1" THEN "X"
  WHEN include = "0" THEN ""
END;

Keeping reverse scripts alongside forward scripts means any phase can be rolled back without improvising under pressure.

Checks

[ ] Each phase can be rolled back independently
[ ] No single deploy causes downtime
[ ] Verification step exists between read switchover and cleanup
[ ] Reverse/unapply scripts exist for every forward migration
[ ] Nullable columns are used during transition, made NOT NULL only after data is fully populated

Phased Migrations ​

Problem & Solution ​

Pattern ​

The Six Phases ​

When to Apply ​

When NOT to Apply ​

Examples ​

Column Type Change: char to boolean ​

Primary Key Swap ​

Table Migration with Data Copy ​

Foreign Key Addition ​

Secret Rotation ​

Reversible Migrations ​

Checks ​

See also ​