Database Migrations

When to use

Use this pattern when you need versioned, reversible schema changes for a SQL database and you want to own the runner rather than adopt a framework like Knex, node-pg-migrate, or Umzug. It keeps each migration as plain, reviewable SQL, tracks applied state in the database itself, and drives everything through a few CLI subcommands.

It pairs naturally with the SQL Queries in Separate Files and Named SQL Parameters patterns: the runner's own bookkeeping queries live in .sql files and use :name placeholders. The Phased Migrations methodology covers how to sequence a schema change so each deploy stays reversible (add column, dual-write, backfill, cut over, drop); this topic covers the concrete Node.js machinery that applies those phases.

The pattern

Migrations are directories, not files

Each migration is a directory whose name is a UTC timestamp followed by a human-readable description. The timestamp uses a sortable, filesystem-safe format (yyyyMMdd'T'HHmmss'Z'), so a plain lexicographic sort of the directory names yields chronological order:

migrations/
├── 20210522T022536Z Initial schema/
│   ├── apply.sql
│   └── unapply.sql
├── 20210522T030739Z Add column `users.id`/
│   ├── apply.sql
│   └── unapply.sql
└── 20210522T034419Z Populate empty `users.id` columns/
    ├── apply.sql
    └── unapply.sql

The description is part of the directory name, so the migration list is self-documenting: git log and ls both read like a changelog. Backticks in a description are legal directory characters and survive fine.

Each migration is a forward and a reverse script

A migration directory holds exactly two SQL files. apply.sql makes the change; unapply.sql reverses it. Each is a small, focused script with a leading comment describing intent:

sql

-- apply.sql
-- Adds a column `id` to the `users` table. This column is eventually intended
-- to replace the `uid` column.
ALTER TABLE users ADD COLUMN id integer;

sql

-- unapply.sql
-- Removes the `users.id` column
ALTER TABLE users DROP COLUMN id;

Writing the reverse script at the same time as the forward script forces you to think about rollback up front, while the change is fresh, instead of improvising it during an incident.

Applied state lives in the database

A Migration table records which migrations have run and when. The runner creates it on demand (idempotent CREATE TABLE IF NOT EXISTS) before reading or writing it:

sql

-- create-migration-table.sql
CREATE TABLE IF NOT EXISTS "Migration" (
  id       bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  name     text NOT NULL,
  applied  timestamptz NOT NULL DEFAULT now()
);

The remaining bookkeeping queries are tiny and live in their own .sql files, using named parameters:

sql

-- get-migrations.sql: most-recently-applied first
SELECT * FROM "Migration" ORDER BY applied DESC, id DESC;

sql

-- insert-migration.sql
INSERT INTO "Migration" (name) VALUES (:name);

sql

-- delete-migration.sql
DELETE FROM "Migration" WHERE name = :name;

Order by applied DESC, id DESC, not by applied alone. Postgres now() returns the transaction start time, so several migrations recorded in one apply n run share an identical applied value and tie. The monotonic id is the tiebreaker that keeps "most-recently-applied first" deterministic, which matters because unapply reverses migrations in exactly this order.

Unapplied migrations are a set difference

The runner never stores a "current version" counter. Instead it computes what to run by diffing two lists: every migration directory on disk, and every migration name recorded in the Migration table. Whatever is on disk but not in the table is unapplied:

javascript

class Migration {
  // Every migration directory on disk, sorted chronologically by name.
  static async all() {
    const entries = await fs.promises.readdir(migrationsRoot, {
      withFileTypes: true,
    });
    return entries
      .filter(entry => entry.isDirectory())
      .map(entry => new Migration(entry.name))
      .sort((a, b) => a.name.localeCompare(b.name));
  }
  
  // Names recorded in the `Migration` table.
  static async applied() {
    await Migration.ensureDbTable();
    const rows = await query(file("./get-migrations.sql"));
    return Array.from(rows).map(row => new Migration(row.name));
  }
  
  // On disk but not yet recorded as applied.
  static async unapplied() {
    const all = await Migration.all();
    const applied = await Migration.applied();
    const appliedNames = new Set(applied.map(m => m.name));
    return all.filter(m => !appliedNames.has(m.name));
  }
}

Because state is derived, the system self-heals: drop a new migration directory into the folder and it is automatically "unapplied" without touching any counter. Two branches that each add a migration merge cleanly, since order is determined by timestamp, not by an incrementing number that would collide.

Applying and unapplying record their own bookkeeping

apply runs the next n unapplied migrations (one by default), in chronological order. Each migration runs its apply.sql, and only on success records itself in the Migration table, so a failed migration is never marked applied and will be retried next time. Crucially, the batch stops at the first failure instead of pressing on: later migrations routinely assume earlier ones succeeded, so running them against a half-migrated schema can corrupt data or record a later migration as applied on top of a broken one.

javascript

class Migration {
  static async apply(n = 1) {
    const migrations = (await Migration.unapplied()).slice(0, n);
    for (const migration of migrations) {
      const succeeded = await migration.apply();
      // Stop the batch on the first failure; do not run later migrations
      // against a schema the failed one was supposed to change.
      if (!succeeded) break;
    }
  }
  
  async apply() {
    const sql = (await this.getScripts()).apply;
    try {
      await query(sql);
      await this.record();    // INSERT into "Migration"
      return true;
    } catch (error) {
      console.error(error);
      return false;            // not recorded, so it stays unapplied
    }
  }
}

unapply is the mirror image: it takes the most-recently-applied migrations, runs each unapply.sql, and removes the row from the Migration table on success.

Wrap any migration with more than one statement in an explicit transaction, so a failure partway through rolls the whole script back instead of leaving the schema half-changed. The "record only on success" logic above protects the Migration table, but it cannot undo statements that already committed inside a multi-step apply.sql.

Serialize concurrent applies with an advisory lock

apply reads the unapplied set and then runs it, and those are two steps with a window between them. A single developer at a terminal never fills that window. Automation can: overlapping CI/CD deploys, or several app instances each running migrations on boot. Two runners then compute the same pending set and both run the same migration:

Process A: reads table -> "0001 is pending"
Process B: reads table -> "0001 is pending"   (A hasn't recorded it yet)
Process A: runs 0001's apply.sql  -> ok
Process B: runs 0001's apply.sql  -> error: column already exists

The "record only on success" rule does not help here, because both processes decided to run before either recorded anything. Serialize the whole operation behind a database advisory lock instead: a second runner blocks until the first finishes, then re-reads the unapplied set and correctly sees nothing pending.

javascript

static async apply(n = 1) {
  await withLock("migrations", async () => {
    const migrations = (await Migration.unapplied()).slice(0, n);
    for (const migration of migrations) {
      const succeeded = await migration.apply();
      if (!succeeded) break;
    }
  });
}

withLock wraps the engine's own advisory-lock primitive, pg_advisory_lock(...) in Postgres, the same approach mature runners take (Rails' with_advisory_lock). SQLite and D1 serialize writers themselves, so a separate lock is moot there. This is the safeguard to add once migrations run from automation rather than by hand, not a fix for a defect in the core pattern.

Creating a migration copies a template

create <description> builds the timestamped directory name, makes the folder, and copies two starter scripts into it. Generating the timestamp at creation time is what guarantees ordering:

javascript

static async create(description) {
  const datetime = DateTime.utc().toFormat("yyyyMMdd'T'HHmmss'Z'");
  const name = `${datetime} ${description}`;
  const dir = resolve(PROJECT_ROOT, "migrations", name);
  await fs.promises.mkdir(dir, { recursive: true });
  await Promise.all([
    fs.promises.copyFile(applyTemplate, resolve(dir, "apply.sql")),
    fs.promises.copyFile(unapplyTemplate, resolve(dir, "unapply.sql")),
  ]);
}

The templates are intentionally minimal: a comment line marking where the forward and reverse SQL go. You fill them in by hand, which keeps every migration plain SQL with no ORM or DSL in between.

The CLI is one subcommand per verb

The runner is wired up with the CLI Subcommands pattern: a migrate command delegates to create, apply, and unapply, each in its own directory alongside the SQL files it uses.

javascript

// migrate/index.js
program.command("create", "create new migration", {
  executableFile: "create/index.js",
});
program.command("apply", "apply 1 or more migrations", {
  executableFile: "apply/index.js",
});
program.command("unapply", "unapply 1 or more migrations", {
  executableFile: "unapply/index.js",
});

Usage reads naturally:

bash

# scaffold a new migration directory with empty apply/unapply scripts
db migrate create "Add column \`users.id\`"

# apply the next unapplied migration (or the next n)
db migrate apply
db migrate apply 3

# roll back the most recent applied migration (or the last n)
db migrate unapply

A leaf subcommand that opens a database pool must call process.exit() when done, because the pool's open connections otherwise keep the process alive.

Back up and journal before each migration

Before each apply or unapply, the runner exports the current database state and appends a record to a per-database .journal file: the operation, the database instance, the migration name, the path to the pre-migration export, and the result. The export path is the rollback handle if a migration corrupts data in a way unapply.sql cannot cleanly reverse:

DATABASE MIGRATION
Time: 2021-05-22 at 03:07:39 UTC
Operation: apply
Database: db463997968
Migration: 20210522T030739Z Add column `users.id`
Export: gs://.../db463997968-20210522T030739Z.sql.gz
Result: succeeded
---

The journal is an append-only audit log committed alongside the migrations, so the history of what ran against each environment is reviewable in version control. Be deliberate about that last part: a journal records environment identifiers and export locations (the gs:// path above), which are infrastructure details you may not want in a public or widely shared repo. If that applies, keep the .journal out of version control or redact those fields, and rely on the export storage itself as the durable record.

Rollback has three independent layers

"Rolling back" is three different mechanisms, and conflating them leads to using the dangerous one by reflex. Keep them distinct:

Code rollback redeploys a previous build. It is fast and lossless, and it is the automatic response to a bad deploy.
Schema rollback runs one migration's unapply.sql. It is surgical and manual: it reverses a single schema change and nothing else.
Data restore recovers from the pre-migration export. It is break-glass only, for migrations that unapply.sql cannot cleanly reverse.

The backup from the previous section is that third layer, not the primary rollback. The Populate empty users.id columns migration above is exactly why it has to exist: once the column is populated, its unapply.sql cannot recover the pre-populate state, so the export is the only way back.

This is also why auto-revert should mean code, never schema. Automatically running unapply.sql on a failed deploy is unsafe precisely because a migration can be lossy or irreversible. A code-only rollback is safe only when every migration is backward-compatible (expand and contract), so the previous build still runs against the new schema. Changes that would break the old code get split into independently reversible phases; that sequencing is what the Phased Migrations methodology covers.

Treat losslessness as a hard requirement, not a nice-to-have: a data migration must not drop a row, column, or value, and you verify that with row and column counts and a content check, not by eyeballing it. A one-shot transactional rebuild of a table is fine if it is provably lossless.

Verify the live schema matches what the code expects

Tracking which migrations ran proves only that they ran. It cannot catch an out-of-band ALTER or a hand-edited production database, where the recorded state and the real shape have drifted apart. Add a shape-based check: a schema contract that asserts the structure the code actually depends on.

Keep the contract as a single SQL test file (for example schema.test.sql) and run it from one source in two places:

in CI, against a freshly migrated copy of the database, and
at runtime on startup, refusing to serve if the live schema does not match.

This is the same idea as Hibernate's hbm2ddl.auto=validate or Rails' check_pending!, expressed in plain SQL. A workable convention: each statement returns one row per problem it finds, and the runner throws if any statement returns a row. That sidesteps the fact that SQLite and most engines cannot cleanly RAISE from inside a SELECT. For richer assertions, frameworks like pgTAP and tSQLt formalize the same pattern.

Notes for SQLite and Cloudflare D1

The examples above use Postgres syntax, but the pattern ports to SQLite (and to Cloudflare D1, which is SQLite) with a few engine-specific adjustments:

The Migration table DDL differs. SQLite has no GENERATED ... AS IDENTITY or timestamptz. Use id integer PRIMARY KEY AUTOINCREMENT, name text NOT NULL, and store applied as text NOT NULL DEFAULT (datetime('now')). The bookkeeping queries are otherwise unchanged; keep the ORDER BY applied DESC, id DESC tiebreaker, since datetime('now') is only second-resolution and ties just as readily.
ALTER TABLE is limited. SQLite only supports ADD COLUMN, DROP COLUMN, RENAME COLUMN, and RENAME TABLE. Changing a type, a constraint, or the primary key needs either an add, copy, drop sequence at the column level or a full table rebuild. RENAME COLUMN is what lets unapply.sql collapse a rename back to the original name into one statement.
D1 has no native down or rollback (it is a planned feature), so you hand-roll unapply.sql exactly as here. D1 also already keeps a d1_migrations table that records applied migrations by name, so you can read that instead of maintaining a separate Migration table.
The runner shells out to wrangler d1 execute rather than holding a connection pool, so the process.exit() caveat from the CLI section does not apply on D1: there is no open pool keeping the process alive.
Single-instance apps can skip the dual-column dance. With no concurrent old client to protect, the expand and contract phasing is often unnecessary, and a one-shot transactional table rebuild is enough (as long as it is provably lossless).
Tests get a real binding. @cloudflare/vitest-pool-workers provides a live D1 binding in tests, so the migration run and the schema.test.sql contract both execute without a bespoke harness.

Trade-offs

Plain SQL, no abstraction. Migrations are raw .sql files, so you get full database-specific syntax and editor tooling, but no cross-database portability and no schema-builder DSL. This is a deliberate trade for the database engine a given project actually targets.
Derived state, not a version counter. Computing unapplied migrations as a set difference avoids merge collisions and self-heals when files are added, but it means the Migration table and the migrations/ directory must agree. A migration manually run outside the runner will not be recorded, and a recorded migration whose directory was deleted is invisible.
You write rollback up front. Authoring unapply.sql alongside apply.sql costs effort for changes you may never reverse, but it is what makes each migration independently reversible. Some changes (a destructive DROP, a lossy type narrowing) cannot be fully reversed by SQL alone, which is why the pre-migration export exists.
One migration at a time by default. Applying a single migration per invocation makes failures easy to localize and re-run, at the cost of being more deliberate than a single "migrate to latest" command. Pass n to apply several at once.
Backup and journal add latency and storage. Exporting the whole database before every migration is slow on large databases and accumulates export artifacts, but it guarantees a recoverable point even when a migration goes wrong in an unforeseen way.

Database Migrations ​

When to use ​

The pattern ​

Migrations are directories, not files ​

Each migration is a forward and a reverse script ​

Applied state lives in the database ​

Unapplied migrations are a set difference ​

Applying and unapplying record their own bookkeeping ​

Serialize concurrent applies with an advisory lock ​

Creating a migration copies a template ​

The CLI is one subcommand per verb ​

Back up and journal before each migration ​

Rollback has three independent layers ​

Verify the live schema matches what the code expects ​

Notes for SQLite and Cloudflare D1 ​

Trade-offs ​

Database Migrations

When to use

The pattern

Migrations are directories, not files

Each migration is a forward and a reverse script

Applied state lives in the database

Unapplied migrations are a set difference

Applying and unapplying record their own bookkeeping

Serialize concurrent applies with an advisory lock

Creating a migration copies a template

The CLI is one subcommand per verb

Back up and journal before each migration

Rollback has three independent layers

Verify the live schema matches what the code expects

Notes for SQLite and Cloudflare D1

Trade-offs