Migration file conventions, templates, multi-tenant schema migration runner, zero-downtime strategies, and rollback procedures for PostgreSQL 18 schema-per-tenant SaaS platform. Covers golang-migrate naming convention, :schema placeholder substitution in all SQL, mandatory CONCURRENTLY index creation outside transaction blocks, three-phase column rename, bounded worker-pool migration runner (MigrateAllTenantSchemas), rollback checklist, and audit trigger setup. No FK constraints in any migration. Every migration applied to ALL active tenant schemas via the runner. Keywords: migration, golang-migrate, schema-per-tenant, CONCURRENTLY, zero-downtime, rollback, :schema placeholder, MigrateAllTenantSchemas, worker pool, audit trigger, PostgreSQL 18.
SCOPE: DAL SERVICE ONLY — These patterns run INSIDE the data-access-layer service. Entity services NEVER use pgxpool or AcquireForTenant, and never execute SQL directly. Entity services use the /dal-service-patterns skill to communicate with the DAL via NATS. If you are implementing an entity service (identity, incident-core, etc.), STOP and use /dal-service-patterns instead.
Used by: database-architect, code-generator (DAL service only)
migrations/
├── 000001_create_incidents.up.sql
├── 000001_create_incidents.down.sql
├── 000002_add_incident_sla_fields.up.sql
├── 000002_add_incident_sla_fields.down.sql
├── 000003_create_audit_log.up.sql
└── 000003_create_audit_log.down.sql
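The naming convention below can be checked mechanically before a migration is accepted. A minimal sketch — the `parseMigrationName` helper and its regexp are illustrative, not part of golang-migrate's API:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// migrationNameRe matches <6-digit version>_<snake_case_name>.<up|down>.sql
var migrationNameRe = regexp.MustCompile(`^(\d{6})_([a-z0-9_]+)\.(up|down)\.sql$`)

// parseMigrationName extracts version, name, and direction from a filename,
// rejecting anything that violates the convention.
func parseMigrationName(filename string) (version int, name, direction string, err error) {
	m := migrationNameRe.FindStringSubmatch(filename)
	if m == nil {
		return 0, "", "", fmt.Errorf("invalid migration filename: %q", filename)
	}
	version, _ = strconv.Atoi(m[1]) // regexp guarantees all digits
	return version, m[2], m[3], nil
}
```

Pairing (every up has a down) can then be enforced by grouping parsed entries by version and requiring both directions.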
| Rule | Convention |
|---|---|
| Prefix | 6-digit zero-padded: 000001, 000002 |
| Name | Snake-case description of the change |
| Suffix | .up.sql for apply, .down.sql for rollback |
| Pairing | Every up MUST have a matching down |
| Idempotency | IF NOT EXISTS / IF EXISTS everywhere |
| Schema | Always use :schema placeholder — replaced at runtime |
| FK | Never — no FK constraints in any migration |
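The :schema placeholder is substituted textually before execution. A minimal sketch of what the substitution step might look like — the `renderMigrationSQL` name and the identifier check are assumptions, not shown elsewhere in this skill:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// schemaIdentRe accepts only lowercase identifiers: the schema name is
// spliced into SQL text, so it must never come from untrusted input.
var schemaIdentRe = regexp.MustCompile(`^[a-z_][a-z0-9_]*$`)

// renderMigrationSQL replaces every :schema placeholder with a validated
// tenant schema name.
func renderMigrationSQL(sql, schema string) (string, error) {
	if !schemaIdentRe.MatchString(schema) {
		return "", fmt.Errorf("unsafe schema name: %q", schema)
	}
	return strings.ReplaceAll(sql, ":schema", schema), nil
}
```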
-- 000001_create_incidents.up.sql
BEGIN;
CREATE TABLE IF NOT EXISTS :schema.incidents (
    id            UUID        NOT NULL DEFAULT gen_random_uuid(),
    title         TEXT        NOT NULL
        CONSTRAINT chk_incidents_title_len CHECK (char_length(title) <= 1000),
    description   TEXT,
    status        TEXT        NOT NULL DEFAULT 'open'
        CONSTRAINT chk_incidents_status_valid
            CHECK (status IN ('open','in_progress','resolved','closed')),
    priority      TEXT        NOT NULL DEFAULT 'medium'
        CONSTRAINT chk_incidents_priority_valid
            CHECK (priority IN ('critical','high','medium','low')),
    assignee_id   UUID,  -- app-managed ref, no FK
    asset_id      UUID,  -- app-managed ref, no FK
    custom_fields JSONB       NOT NULL DEFAULT '{}',
    created_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
    created_by    UUID        NOT NULL,
    updated_by    UUID        NOT NULL,
    is_deleted    BOOLEAN     NOT NULL DEFAULT false,
    deleted_at    TIMESTAMPTZ,
    version       INTEGER     NOT NULL DEFAULT 1,
    CONSTRAINT pk_incidents PRIMARY KEY (id)
    -- NO FOREIGN KEY constraints
);
COMMENT ON TABLE :schema.incidents IS
'ITSM incident entity. Schema-per-tenant. No FK constraints.';
COMMIT;
-- Indexes: OUTSIDE transaction — CONCURRENTLY avoids write lock
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_incidents_status_created
ON :schema.incidents (status, created_at DESC)
WHERE is_deleted = false;
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_incidents_assignee
ON :schema.incidents (assignee_id)
WHERE is_deleted = false AND assignee_id IS NOT NULL;
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_incidents_custom_fields
ON :schema.incidents USING GIN (custom_fields jsonb_path_ops)
WHERE is_deleted = false;
-- 000001_create_incidents.down.sql
DROP TABLE IF EXISTS :schema.incidents;
-- Phase 1: 000010_add_incident_due_date_nullable.up.sql
-- Nullable column add is instant — no lock
ALTER TABLE :schema.incidents
ADD COLUMN IF NOT EXISTS due_date TIMESTAMPTZ;
-- Phase 2: 000011_backfill_incident_due_date.up.sql
-- Records the backfill requirement. Actual backfill runs as a Go background job
-- in batches to avoid long-running UPDATE locks.
COMMENT ON COLUMN :schema.incidents.due_date IS
'SLA due date. Backfill job: created_at + 7 days for historical rows.';
-- Phase 3: 000012_enforce_incident_due_date_notnull.up.sql
-- Only after all rows are backfilled. A bare SET NOT NULL scans the whole
-- table under ACCESS EXCLUSIVE; validate a NOT VALID check constraint first
-- (SHARE UPDATE EXCLUSIVE only), so SET NOT NULL can skip the scan.
ALTER TABLE :schema.incidents
    ADD CONSTRAINT chk_incidents_due_date_notnull
    CHECK (due_date IS NOT NULL) NOT VALID;
ALTER TABLE :schema.incidents
    VALIDATE CONSTRAINT chk_incidents_due_date_notnull;
ALTER TABLE :schema.incidents
    ALTER COLUMN due_date SET NOT NULL,
    ALTER COLUMN due_date SET DEFAULT now() + INTERVAL '7 days';
ALTER TABLE :schema.incidents
    DROP CONSTRAINT chk_incidents_due_date_notnull;
| Phase | SQL Action | App Code Change |
|---|---|---|
| 1 | Add new column, backfill, add sync trigger | Write to both old and new |
| 2 | Switch reads to new column | Read from new only |
| 3 | Drop old column and trigger | Remove old references |
-- 000009_add_incident_priority_index.up.sql
-- No BEGIN/COMMIT — CONCURRENTLY cannot run inside a transaction
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_incidents_priority
ON :schema.incidents (priority)
WHERE is_deleted = false;
-- 000009_add_incident_priority_index.down.sql
DROP INDEX CONCURRENTLY IF EXISTS :schema.idx_incidents_priority;
When a new migration is deployed, run it against every active tenant schema. Uses a bounded worker pool — never one goroutine per schema.
// internal/provisioning/migration_runner.go
package provisioning
import (
	"context"
	"fmt"
	"strings"
	"sync"

	"github.com/jackc/pgx/v5/pgxpool"
	"go.uber.org/zap"
)

// MigrateAllTenantSchemas applies migrations to all active tenant schemas.
// maxWorkers: goroutine concurrency cap — never 0, never unbounded.
func MigrateAllTenantSchemas(
	ctx context.Context,
	pool *pgxpool.Pool,
	migrations []Migration,
	maxWorkers int,
	logger *zap.Logger,
) error {
	// Fetch active schemas from the registry.
	rows, err := pool.Query(ctx,
		`SELECT schema_name FROM _mgmt.tenant_registry WHERE status = 'active'`)
	if err != nil {
		return fmt.Errorf("listing active schemas: %w", err)
	}
	defer rows.Close()

	var schemas []string
	for rows.Next() {
		var s string
		if err := rows.Scan(&s); err != nil {
			return fmt.Errorf("scanning schema name: %w", err)
		}
		schemas = append(schemas, s)
	}
	if err := rows.Err(); err != nil {
		return fmt.Errorf("iterating schema rows: %w", err)
	}
	if len(schemas) == 0 {
		logger.Info("no active tenant schemas to migrate")
		return nil
	}
	if maxWorkers <= 0 {
		maxWorkers = 4
	}

	// Bounded worker pool: a fixed number of goroutines drain the job channel.
	jobs := make(chan string, len(schemas))
	for _, s := range schemas {
		jobs <- s
	}
	close(jobs)

	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []string
	)
	for i := 0; i < maxWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for schemaName := range jobs {
				if err := applyMigrationsToSchema(ctx, pool, schemaName, migrations); err != nil {
					logger.Error("migration failed",
						zap.String("schema", schemaName), zap.Error(err))
					mu.Lock()
					errs = append(errs, fmt.Sprintf("%s: %v", schemaName, err))
					mu.Unlock()
				} else {
					logger.Info("migration applied", zap.String("schema", schemaName))
				}
			}
		}()
	}
	wg.Wait()

	if len(errs) > 0 {
		return fmt.Errorf("%d schema(s) failed: %s", len(errs), strings.Join(errs, "; "))
	}
	return nil
}
-- 000003_create_audit_log.up.sql
BEGIN;
CREATE TABLE IF NOT EXISTS :schema.audit_log (
    id          UUID        NOT NULL DEFAULT gen_random_uuid(),
    table_name  TEXT        NOT NULL,
    record_id   UUID        NOT NULL,
    action      TEXT        NOT NULL
        CONSTRAINT chk_audit_action CHECK (action IN ('INSERT','UPDATE','DELETE')),
    old_data    JSONB,
    new_data    JSONB,
    changed_by  UUID,
    changed_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    CONSTRAINT pk_audit_log PRIMARY KEY (id, changed_at)
    -- NO FK constraints
) PARTITION BY RANGE (changed_at);
CREATE TABLE IF NOT EXISTS :schema.audit_log_default
    PARTITION OF :schema.audit_log DEFAULT;
-- :schema-qualify audit_log inside the function body too — it must not
-- depend on the caller's search_path.
CREATE OR REPLACE FUNCTION :schema.audit_trigger_func()
RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO :schema.audit_log (table_name, record_id, action, new_data, changed_by)
        VALUES (TG_TABLE_NAME, NEW.id, 'INSERT', to_jsonb(NEW),
                current_setting('app.current_user_id', true)::uuid);
        RETURN NEW;
    ELSIF TG_OP = 'UPDATE' THEN
        INSERT INTO :schema.audit_log (table_name, record_id, action, old_data, new_data, changed_by)
        VALUES (TG_TABLE_NAME, NEW.id, 'UPDATE', to_jsonb(OLD), to_jsonb(NEW),
                current_setting('app.current_user_id', true)::uuid);
        RETURN NEW;
    ELSIF TG_OP = 'DELETE' THEN
        INSERT INTO :schema.audit_log (table_name, record_id, action, old_data, changed_by)
        VALUES (TG_TABLE_NAME, OLD.id, 'DELETE', to_jsonb(OLD),
                current_setting('app.current_user_id', true)::uuid);
        RETURN OLD;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
COMMIT;
-- Indexes: audit_log is partitioned, and CREATE INDEX CONCURRENTLY is not
-- supported on partitioned parents. The table is created empty in this same
-- migration, so plain CREATE INDEX takes no meaningful lock here.
CREATE INDEX IF NOT EXISTS idx_audit_log_changed
    ON :schema.audit_log (changed_at DESC);
CREATE INDEX IF NOT EXISTS idx_audit_log_record
    ON :schema.audit_log (table_name, record_id);
-- Attach trigger to a table (separate migration, applied per table)
CREATE TRIGGER incidents_audit
AFTER INSERT OR UPDATE OR DELETE ON :schema.incidents
FOR EACH ROW EXECUTE FUNCTION :schema.audit_trigger_func();
Rollback checklist:
- Every .down.sql reverses ALL DDL in its .up.sql.
- Rollback is exercised by running MigrateAllTenantSchemas with the down migrations.
- After rollback, _mgmt.tenant_registry migration state is correct.

Never:
- Omit the :schema placeholder — the migration runs against the wrong schema or public.
- CREATE INDEX inside a BEGIN/COMMIT block — locks the table for writes. Always CONCURRENTLY, outside a transaction.
- Ship a migration without a .down.sql — blocks rollback across all tenant schemas at deploy time.
- Spawn one go func() per schema — use the bounded worker pool.
- ENUM types — adding values requires an ACCESS EXCLUSIVE lock. Use TEXT + CHECK instead.