Appearance
Phased Migrations
Problem & Solution
Changing stateful systems (databases, secrets, certificates) in a single step risks downtime or data corruption. A phased migration maintains backward compatibility at every stage so each deploy is independently reversible.
Pattern
The Six Phases
- Add new schema -- Deploy the new column, table, or secret alongside the existing one
- Dual write, old read -- Write to both old and new schema, but the application still reads from the old schema only
- Migrate data -- Copy existing data from the old schema into the new one
- Dual write, new read -- Write to both old and new schema, but the application now reads from the new schema only
- New schema only -- Once verified, stop writing to the old schema entirely and read and write only the new schema
- Remove old schema -- Drop the old column, table, or secret
Not every migration needs all six phases. Column renames, type changes, and secret rotations each follow the same principle -- never break compatibility in a single deploy -- but the specific steps vary.
When to Apply
- Database schema migrations (column renames, type changes, table splits)
- Secret, password, or certificate rotation
- API version transitions
- Configuration format changes
When NOT to Apply
- Additive-only changes (new nullable column, new table) that don't touch existing data or behavior -- these are safe in a single deploy
- Development or staging environments where downtime is acceptable
Examples
Column Type Change: char to boolean
From a real migration that converted a char(1) include column (values "X" and "") to a boolean:
Phase 1 -- Add new column alongside old:
sql
ALTER TABLE accounts
ADD COLUMN include_bool boolean NULL;
UPDATE accounts
SET include_bool = CASE
WHEN include = "X" THEN 1
ELSE 0
END;Phase 2 -- Application writes to both columns. Deploy code that writes both include and include_bool on every update, but still reads from include.
Phase 3 -- Backfill. The UPDATE above handles existing rows. For ongoing migrations, a background job catches any rows missed during the deploy gap.
Phase 4 -- Switch reads to new column. Deploy code that reads from include_bool but still writes both.
Phase 5 -- Drop old column, rename new column:
sql
ALTER TABLE accounts DROP COLUMN include;
ALTER TABLE accounts
CHANGE include_bool include boolean NOT NULL DEFAULT 1;The old include column must be dropped before renaming include_bool to include. Each phase is a separate deploy. If phase 4 reveals data issues, roll back to reading from include without data loss.
Primary Key Swap
Replacing a legacy primary key (aid) with a new auto-increment id column:
sql
-- Phase 1: Add new column
ALTER TABLE accounts ADD COLUMN id int NULL;
UPDATE accounts SET id = aid WHERE id IS NULL;
-- Phase 5: Swap keys (after application code is updated)
ALTER TABLE accounts CHANGE aid aid int NOT NULL;
ALTER TABLE accounts DROP PRIMARY KEY;
ALTER TABLE accounts CHANGE id id int NOT NULL AUTO_INCREMENT;
ALTER TABLE accounts ADD PRIMARY KEY (id);Between phases 1 and 5, the application writes both aid and id, giving time to update foreign keys in other tables before removing the old key.
Table Migration with Data Copy
Creating a normalized replacement table while the old one is still in use:
sql
-- Phase 1: Create new table
CREATE TABLE Account (
id int NOT NULL AUTO_INCREMENT,
name varchar(191) NOT NULL,
user_id int NOT NULL,
type ENUM('ASSET', 'LIABILITY', 'INCOME', 'EXPENSE', 'EQUITY'),
PRIMARY KEY (id),
CONSTRAINT Account_user_id_fk FOREIGN KEY (user_id) REFERENCES User (id)
);
-- Phase 3: Migrate data with transformation
INSERT INTO Account (id, name, user_id, type)
SELECT id, name, user_id,
CASE accttype
WHEN 'CHECKING' THEN 'ASSET'
WHEN 'SAVINGS' THEN 'ASSET'
WHEN 'CREDITLINE' THEN 'LIABILITY'
END
FROM legacy_accounts;The old legacy_accounts table stays live through phases 2-4 while the application transitions.
Foreign Key Addition
Adding a foreign key constraint to an existing table:
sql
-- Phase 1: Add nullable column
ALTER TABLE accounts ADD COLUMN user_id int NULL;
-- Phase 3: Populate from existing data
UPDATE accounts
SET user_id = (SELECT id FROM User WHERE username = 'target_user');
-- Phase 5: Make NOT NULL after verification
ALTER TABLE accounts CHANGE user_id user_id int NOT NULL;
-- Phase 5: Add constraint
ALTER TABLE accounts
ADD CONSTRAINT user_id_fk FOREIGN KEY (user_id) REFERENCES User (id);Adding the constraint before populating data would fail. Adding it as NOT NULL before populating would also fail. The phased approach avoids both problems.
Secret Rotation
- Add new secret to config (both old and new available)
- Deploy code that tries new secret first, falls back to old
- Rotate the actual secret on the external system
- Verify new secret works in production
- Deploy code that only uses new secret
- Remove old secret from config
Reversible Migrations
Every migration should have a reverse script. A column type change from char to boolean reverses cleanly:
sql
-- unapply: boolean back to char
ALTER TABLE accounts
CHANGE include include char(1) NULL DEFAULT "X";
UPDATE accounts
SET include = CASE
WHEN include = "1" THEN "X"
WHEN include = "0" THEN ""
END;Keeping reverse scripts alongside forward scripts means any phase can be rolled back without improvising under pressure.
Checks
- [ ] Each phase can be rolled back independently
- [ ] No single deploy causes downtime
- [ ] Verification step exists between read switchover and cleanup
- [ ] Reverse/unapply scripts exist for every forward migration
- [ ] Nullable columns are used during transition, made
NOT NULLonly after data is fully populated
See also
- Node.js SQL Patterns: SQL query organization and database access patterns for implementing migration queries.
- Refactoring: The same principle — small verified steps — applied to code structure instead of stateful systems.