Tactics·May 27, 2026

Database Migration: A Three-Phase Playbook for Production Systems

A dev.to post details a three-phase "Expanding-Contract Pattern" for database migrations. This playbook prevents production outages by breaking schema changes into small, independently rollable steps, even with billions of rows.

The post opens with a production incident in which a database migration locked a table for 45 minutes. On systems managing "billions of rows," that downtime shows the risk of running default ActiveRecord::Migration operations unmodified at scale. The author's answer is the three-phase "Expanding-Contract Pattern," built from small changes that can each be rolled back independently.

Small Changes Prevent Production Lockouts

The core problem, as the dev.to author frames it, is that migrations written for development, such as add_index :orders, :user_id in ActiveRecord, lock the table while they run (on PostgreSQL, a plain index build blocks writes to the table until it finishes). That is acceptable for small datasets but causes significant downtime in production with "50 million orders." The author's rule: "Every production-safe migration follows the same pattern: make the change in small, non-breaking steps that can be rolled back independently." Each step must be deployable and reversible without affecting application functionality.

Phase 1: Expand with Nullable Columns

The first step adds the new column as nullable, with no default. On most modern relational databases this is a metadata-only change that holds its lock only briefly. The dev.to post provides the SQL: ALTER TABLE orders ADD COLUMN user_email VARCHAR(255);. After this schema change, the application code is updated to write to both the old and new columns. Dual writes keep newly written rows consistent during the transition, so the application continues to function while the migration progresses. This code change is deployed before any backfill begins.
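A minimal sketch of Phase 1, not the dev.to author's code. It uses Python's built-in sqlite3 module as a stand-in for a production database, and it assumes a hypothetical users table with an email column as the existing source of the value. The function names expand_schema and create_order are illustrative only.

```python
import sqlite3


def expand_schema(conn: sqlite3.Connection) -> None:
    """Phase 1 schema change: add the new column as nullable, with no default."""
    conn.execute("ALTER TABLE orders ADD COLUMN user_email VARCHAR(255)")
    conn.commit()


def create_order(conn: sqlite3.Connection, user_id: int) -> int:
    """Dual write: populate user_email when the order is created.

    Assumption: users.email is the existing source of truth. Rows created
    before this code shipped keep user_email = NULL until the backfill runs.
    """
    row = conn.execute(
        "SELECT email FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    email = row[0] if row else None
    cur = conn.execute(
        "INSERT INTO orders (user_id, user_email) VALUES (?, ?)",
        (user_id, email),
    )
    conn.commit()
    return cur.lastrowid
```

Because the column is nullable, older application versions that never set user_email keep working, which is what makes this step reversible on its own.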

Phase 2: Batched Backfills for Data Integrity

With the new column in place and the application performing dual writes, historical rows still need to be populated. Updating "billions of rows" in a single transaction would hold locks for the duration of one very long transaction and stall concurrent writes. The dev.to author recommends a batched backfill instead. The provided SQL example uses a DO $$ DECLARE block to iterate through records in batches of 10000, setting user_email by joining orders with users on user_id. The batch size is a parameter to tune against system load and row complexity: smaller batches hold locks for less time, while larger ones finish the backfill sooner.
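The batching idea can be sketched as an application-side loop. This is an illustration under stated assumptions, not the post's DO $$ block: it uses sqlite3 as a stand-in database, assumes orders has an integer primary key named id and users has an email column, and the function name and the pause_seconds parameter are hypothetical. Each batch commits separately so that locks are released between batches.

```python
import sqlite3
import time


def backfill_user_email(
    conn: sqlite3.Connection,
    batch_size: int = 10_000,
    pause_seconds: float = 0.0,
) -> int:
    """Fill orders.user_email from users.email in primary-key order.

    Returns the number of rows visited. Each batch is its own short
    transaction, so no lock is held across the whole table.
    """
    visited = 0
    last_id = 0
    while True:
        # Find the next batch of rows that still need a value.
        ids = [
            r[0]
            for r in conn.execute(
                "SELECT id FROM orders "
                "WHERE id > ? AND user_email IS NULL "
                "ORDER BY id LIMIT ?",
                (last_id, batch_size),
            )
        ]
        if not ids:
            break
        conn.execute(
            "UPDATE orders SET user_email = "
            "(SELECT email FROM users WHERE users.id = orders.user_id) "
            "WHERE id BETWEEN ? AND ? AND user_email IS NULL",
            (ids[0], ids[-1]),
        )
        conn.commit()  # release locks before starting the next batch
        visited += len(ids)
        last_id = ids[-1]  # advance even if some rows had no matching user
        time.sleep(pause_seconds)  # optional breathing room for production traffic
    return visited
```

Walking the table by primary key, rather than re-querying for any NULL row, means the loop terminates even when some orders have no matching user, and an interrupted run can be restarted without redoing finished rows.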

Phase 3: Contract and Finalize Schema

After the batched backfill completes and the new column is fully populated, the application is updated to read exclusively from the new user_email column. Once this read change is deployed and verified, the old column can be safely dropped. The dev.to post implies adding necessary constraints, such as NOT NULL or unique indexes, on the new column as the final step. This "contract" phase removes the redundant column, cleans up the schema, and enforces data integrity rules that could not be applied during the initial nullable column addition.
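Before contracting, it can help to confirm that the new column really is fully populated. The check below is a hedged sketch rather than anything the dev.to post prescribes. It uses sqlite3 as a stand-in database and assumes user_email is meant to mirror a hypothetical users.email column. The names contract_blockers and safe_to_contract are illustrative.

```python
import sqlite3


def contract_blockers(conn: sqlite3.Connection) -> dict:
    """Count rows that would make a NOT NULL constraint or a column drop unsafe."""
    # Rows the backfill or the dual-write path never filled in.
    missing = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE user_email IS NULL"
    ).fetchone()[0]
    # Rows whose copied value no longer matches the assumed source column.
    stale = conn.execute(
        "SELECT COUNT(*) FROM orders "
        "JOIN users ON users.id = orders.user_id "
        "WHERE orders.user_email IS NOT NULL "
        "AND orders.user_email <> users.email"
    ).fetchone()[0]
    return {"missing": missing, "stale": stale}


def safe_to_contract(conn: sqlite3.Connection) -> bool:
    """True only when no row would violate the constraints added in Phase 3."""
    return all(count == 0 for count in contract_blockers(conn).values())
```

A non-zero missing count after the backfill has finished usually points at a code path that still writes without the new column, which is worth fixing before adding NOT NULL.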

Safe Index Additions

Adding an index can also lock tables. The dev.to post addresses this implicitly through its focus on non-blocking operations. When an index is required, it should be built concurrently if the database supports it (e.g., CREATE INDEX CONCURRENTLY in PostgreSQL), so that index creation does not block reads or writes and the application stays available. In PostgreSQL a concurrent build cannot run inside a transaction block; in ActiveRecord that means calling disable_ddl_transaction! in the migration and passing algorithm: :concurrently to add_index.
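As a small illustration, the helper below builds the PostgreSQL statement for a concurrent index. It is a sketch with assumptions: the function name is hypothetical, the index name follows ActiveRecord's index_<table>_on_<column> convention, and the identifier check is a deliberately strict simplification. The statement must be executed outside a transaction block, for example on a connection in autocommit mode.

```python
import re

# Deliberately strict: lowercase snake_case identifiers only.
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def concurrent_index_sql(table: str, column: str) -> str:
    """Build a CREATE INDEX CONCURRENTLY statement for PostgreSQL.

    Run the result outside a transaction block; PostgreSQL rejects
    CONCURRENTLY inside one.
    """
    for name in (table, column):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    index_name = f"index_{table}_on_{column}"
    return f"CREATE INDEX CONCURRENTLY {index_name} ON {table} ({column})"
```

For example, concurrent_index_sql("orders", "user_id") yields the concurrent counterpart of the add_index :orders, :user_id migration quoted earlier.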

WHAT WE'D CHANGE

This "Expanding-Contract Pattern" is robust for large-scale, high-traffic systems, particularly those with "billions of rows" or "50 million orders." For indie founders operating with smaller datasets, the overhead of three distinct migration phases and the corresponding application code changes might be disproportionate to the risk. A startup with fewer than a million records might find that a standard add_column or add_index operation completes within seconds, which makes the multi-step approach over-engineering.

The dev.to post focuses on raw SQL examples, which implies direct database access or a low-level ORM, even though ActiveRecord is mentioned. Founders using higher-level ORMs or frameworks might need to adapt these principles to their specific migration tooling. Some ORMs offer built-in features for non-blocking migrations or batched operations that abstract away the raw SQL. Relying solely on raw SQL for complex migrations also increases the potential for human error, because the guardrails of an ORM are absent.

Furthermore, the playbook assumes a single, monolithic application. In a microservices architecture, coordinating the application code changes (dual-write, read-new, drop-old) across multiple services that interact with the same database could introduce significant deployment complexity and potential for desynchronization. The "deploy this first" instruction for dual-writes requires careful orchestration across services. The dev.to post does not detail how to manage rollbacks of application code changes in conjunction with database migration rollbacks, which is a critical consideration for production safety.

The dev.to author's "Expanding-Contract Pattern" offers a direct solution to the problem of database migrations causing production outages. By segmenting schema changes into distinct, non-blocking steps (expanding the schema, backfilling data in batches, then contracting the old schema), founders can maintain application availability even with "billions of rows." This disciplined approach requires more upfront planning and code changes, but it prioritizes system stability over migration simplicity, a trade-off essential for high-stakes production environments.

Pull quote: “Every production-safe migration follows the same pattern: make the change in small, non-breaking steps that can be rolled back independently.”

Sources · how we verified
  1. Database Migration Strategies That Actually Work in Production

Reported by the Maya desk on Founderr Pulse’s Tactics beat. Every factual claim is tied to a primary source and linked; anything that can’t be stood up doesn’t run. Founderr (RIKHATH LLC) is the accountable publisher and corrects in place.