Skip to content

Phased Migrations

Problem & Solution

Changing stateful systems (databases, secrets, certificates) in a single step risks downtime or data corruption. A phased migration maintains backward compatibility at every stage so each deploy is independently reversible.

Pattern

The Six Phases

  1. Add new schema -- Deploy the new column, table, or secret alongside the existing one
  2. Dual write, old read -- Write to both old and new schema, but the application still reads from the old schema only
  3. Migrate data -- Copy existing data from the old schema into the new one
  4. Dual write, new read -- Write to both old and new schema, but the application now reads from the new schema only
  5. New schema only -- Once verified, stop writing to the old schema entirely and read and write only the new schema
  6. Remove old schema -- Drop the old column, table, or secret

Not every migration needs all six phases. Column renames, type changes, and secret rotations each follow the same principle -- never break compatibility in a single deploy -- but the specific steps vary.

When to Apply

  • Database schema migrations (column renames, type changes, table splits)
  • Secret, password, or certificate rotation
  • API version transitions
  • Configuration format changes

When NOT to Apply

  • Additive-only changes (new nullable column, new table) that don't touch existing data or behavior -- these are safe in a single deploy
  • Development or staging environments where downtime is acceptable

Examples

Column Type Change: char to boolean

From a real migration that converted a char(1) include column (values "X" and "") to a boolean:

Phase 1 -- Add new column alongside old:

sql
ALTER TABLE accounts
ADD COLUMN include_bool boolean NULL;

UPDATE accounts
SET include_bool = CASE
  WHEN include = "X" THEN 1
  ELSE 0
END;

Phase 2 -- Application writes to both columns. Deploy code that writes both include and include_bool on every update, but still reads from include.

Phase 3 -- Backfill. The UPDATE above handles existing rows. For ongoing migrations, a background job catches any rows missed during the deploy gap.

Phase 4 -- Switch reads to new column. Deploy code that reads from include_bool but still writes both.

Phase 5 -- Drop old column, rename new column:

sql
ALTER TABLE accounts DROP COLUMN include;

ALTER TABLE accounts
CHANGE include_bool include boolean NOT NULL DEFAULT 1;

The old include column must be dropped before renaming include_bool to include. Each phase is a separate deploy. If phase 4 reveals data issues, roll back to reading from include without data loss.

Primary Key Swap

Replacing a legacy primary key (aid) with a new auto-increment id column:

sql
-- Phase 1: Add new column
ALTER TABLE accounts ADD COLUMN id int NULL;
UPDATE accounts SET id = aid WHERE id IS NULL;

-- Phase 5: Swap keys (after application code is updated)
ALTER TABLE accounts CHANGE aid aid int NOT NULL;
ALTER TABLE accounts DROP PRIMARY KEY;
ALTER TABLE accounts CHANGE id id int NOT NULL AUTO_INCREMENT;
ALTER TABLE accounts ADD PRIMARY KEY (id);

Between phases 1 and 5, the application writes both aid and id, giving time to update foreign keys in other tables before removing the old key.

Table Migration with Data Copy

Creating a normalized replacement table while the old one is still in use:

sql
-- Phase 1: Create new table
CREATE TABLE Account (
  id int NOT NULL AUTO_INCREMENT,
  name varchar(191) NOT NULL,
  user_id int NOT NULL,
  type ENUM('ASSET', 'LIABILITY', 'INCOME', 'EXPENSE', 'EQUITY'),
  PRIMARY KEY (id),
  CONSTRAINT Account_user_id_fk FOREIGN KEY (user_id) REFERENCES User (id)
);

-- Phase 3: Migrate data with transformation
INSERT INTO Account (id, name, user_id, type)
SELECT id, name, user_id,
  CASE accttype
    WHEN 'CHECKING' THEN 'ASSET'
    WHEN 'SAVINGS'  THEN 'ASSET'
    WHEN 'CREDITLINE' THEN 'LIABILITY'
  END
FROM legacy_accounts;

The old legacy_accounts table stays live through phases 2-4 while the application transitions.

Foreign Key Addition

Adding a foreign key constraint to an existing table:

sql
-- Phase 1: Add nullable column
ALTER TABLE accounts ADD COLUMN user_id int NULL;

-- Phase 3: Populate from existing data
UPDATE accounts
SET user_id = (SELECT id FROM User WHERE username = 'target_user');

-- Phase 5: Make NOT NULL after verification
ALTER TABLE accounts CHANGE user_id user_id int NOT NULL;

-- Phase 5: Add constraint
ALTER TABLE accounts
ADD CONSTRAINT user_id_fk FOREIGN KEY (user_id) REFERENCES User (id);

Adding the constraint before populating data would fail. Adding it as NOT NULL before populating would also fail. The phased approach avoids both problems.

Secret Rotation

  1. Add new secret to config (both old and new available)
  2. Deploy code that tries new secret first, falls back to old
  3. Rotate the actual secret on the external system
  4. Verify new secret works in production
  5. Deploy code that only uses new secret
  6. Remove old secret from config

Reversible Migrations

Every migration should have a reverse script. A column type change from char to boolean reverses cleanly:

sql
-- unapply: boolean back to char
ALTER TABLE accounts
CHANGE include include char(1) NULL DEFAULT "X";

UPDATE accounts
SET include = CASE
  WHEN include = "1" THEN "X"
  WHEN include = "0" THEN ""
END;

Keeping reverse scripts alongside forward scripts means any phase can be rolled back without improvising under pressure.

Checks

  • [ ] Each phase can be rolled back independently
  • [ ] No single deploy causes downtime
  • [ ] Verification step exists between read switchover and cleanup
  • [ ] Reverse/unapply scripts exist for every forward migration
  • [ ] Nullable columns are used during transition, made NOT NULL only after data is fully populated

See also

  • Node.js SQL Patterns: SQL query organization and database access patterns for implementing migration queries.
  • Refactoring: The same principle — small verified steps — applied to code structure instead of stateful systems.