Name: Merge Database Performance Tuning
Author: datasurface

# Kubernetes pod/node network (adjust CIDR to match your cluster)
host    all    all    192.168.4.0/24    scram-sha-256

# Tailscale / overlay network (if applicable)
host    all    all    100.64.0.0/10     scram-sha-256

# Kubernetes pod CIDR (common defaults - adjust to your cluster)
host    all    all    10.244.0.0/16     scram-sha-256
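Since the CIDR ranges above have to be adjusted per cluster, it can help to check a pod or node address against them before reloading PostgreSQL. The following is a minimal sketch using only Python's standard `ipaddress` module; the helper name `matching_network` and the sample addresses are illustrative assumptions, not part of the original configuration.

```python
import ipaddress

# CIDR ranges copied from the pg_hba.conf entries above.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.168.4.0/24"),  # pod/node network
    ipaddress.ip_network("100.64.0.0/10"),   # Tailscale / overlay network
    ipaddress.ip_network("10.244.0.0/16"),   # Kubernetes pod CIDR
]


def matching_network(client_ip):
    """Return the first allowed network containing client_ip, or None.

    Hypothetical helper: it only mirrors the address-matching part of
    pg_hba.conf, not the database, user, or auth-method columns.
    """
    addr = ipaddress.ip_address(client_ip)
    for net in ALLOWED_NETWORKS:
        if addr in net:
            return net
    return None


if __name__ == "__main__":
    # Sample addresses are made up for illustration.
    for ip in ("10.244.3.17", "100.100.1.2", "172.16.0.5"):
        print(ip, "->", matching_network(ip))
```

An address that maps to `None` would be rejected by these rules, which usually means the pod CIDR in the file does not match the cluster.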

-- Check dead tuple buildup on staging tables
SELECT relname, n_live_tup, n_dead_tup,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname LIKE '%\_s'   -- "\_" matches a literal underscore; a bare "_" is a single-character wildcard
ORDER BY n_dead_tup DESC;

# In postgresql.conf (or per-table with ALTER TABLE ... SET)
autovacuum_vacuum_scale_factor = 0.05    # default 0.2 — vacuum sooner on high-churn tables
autovacuum_analyze_scale_factor = 0.02   # default 0.1 — re-analyze more often
autovacuum_vacuum_cost_delay = 2ms       # default 2ms since PostgreSQL 12 (20ms before); lower it if I/O headroom exists
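To see what the lower scale factors mean in practice: PostgreSQL triggers an autovacuum on a table once dead tuples exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * (number of tuples)`. The sketch below works through that arithmetic, assuming the default `autovacuum_vacuum_threshold` of 50 and a hypothetical one-million-row staging table; the function name is illustrative.

```python
def vacuum_trigger_point(table_rows, scale_factor, base_threshold=50):
    """Dead-tuple count above which autovacuum considers the table.

    base_threshold stands in for autovacuum_vacuum_threshold (default 50).
    """
    return round(base_threshold + scale_factor * table_rows)


if __name__ == "__main__":
    rows = 1_000_000  # assumed table size, for illustration only
    for factor in (0.2, 0.05):
        print(f"scale_factor={factor}: vacuum after "
              f"{vacuum_trigger_point(rows, factor)} dead tuples")
```

Under these assumptions the default of 0.2 waits for roughly 200,000 dead tuples, while 0.05 triggers at roughly 50,000, so a staging table that is emptied and refilled every cycle gets cleaned up about four times sooner.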

-- Buffer hit ratio (should be > 99%)
SELECT
  sum(blks_hit) * 100.0 / nullif(sum(blks_hit) + sum(blks_read), 0) AS hit_ratio
FROM pg_stat_database
WHERE datname = 'merge_db';

-- Temp file usage (should be 0 for normal operation)
SELECT temp_files, temp_bytes
FROM pg_stat_database WHERE datname = 'merge_db';

-- Table bloat from staging churn
SELECT relname, n_live_tup, n_dead_tup,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
WHERE relname LIKE '%\_s' OR relname LIKE '%\_m'   -- "\_" matches a literal underscore
ORDER BY pg_total_relation_size(relid) DESC;

-- Connection count (should be well under max_connections)
SELECT count(*) FROM pg_stat_activity WHERE datname = 'merge_db';
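The thresholds in the comments above can be folded into one small check. This is a sketch only: it assumes the query results have already been fetched into a plain dictionary, and the 80% connection warning level is an assumed reading of "well under max_connections", not a value from the original. The function names are hypothetical.

```python
def hit_ratio(blks_hit, blks_read):
    """Buffer hit ratio in percent, or None when no blocks were touched."""
    total = blks_hit + blks_read
    if total == 0:
        return None  # same role as nullif(..., 0) in the SQL above
    return blks_hit * 100.0 / total


def check_merge_db(stats, max_connections=200, conn_warn_fraction=0.8):
    """Return warning strings for one snapshot of the monitoring metrics.

    conn_warn_fraction is an assumed threshold, not taken from the source.
    """
    warnings = []
    ratio = hit_ratio(stats["blks_hit"], stats["blks_read"])
    if ratio is not None and ratio <= 99.0:
        warnings.append(f"buffer hit ratio {ratio:.2f}% is not above 99%")
    if stats["temp_files"] > 0:
        warnings.append(
            f"{stats['temp_files']} temp files ({stats['temp_bytes']} bytes)"
        )
    if stats["connections"] > max_connections * conn_warn_fraction:
        warnings.append(
            f"{stats['connections']} connections, limit is {max_connections}"
        )
    return warnings


if __name__ == "__main__":
    # Made-up sample values for illustration.
    sample = {"blks_hit": 90, "blks_read": 10, "temp_files": 3,
              "temp_bytes": 1024, "connections": 190}
    for line in check_merge_db(sample):
        print("WARN:", line)
```

Note that the counters in `pg_stat_database` are cumulative since the last statistics reset, so comparing two snapshots taken some time apart gives a more current picture than a single reading.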

	Airflow Metadata DB	Merge DB
Access pattern	Read-heavy (UI queries), frequent small updates	Write-heavy (bulk inserts/deletes per batch)
Connection pooling	Through PgBouncer (transaction mode with `server_reset_query = DISCARD ALL`)	Direct connections from K8s job pods
Deadlock risk	High (scheduler vs UI on `dag` table)	Low (jobs operate on separate stream tables)
Autovacuum pressure	Moderate	High (staging table churn)
WAL volume	Moderate	High

-- Enable optimistic concurrency (MVCC) — MANDATORY
-- Without this, SQL Server uses pessimistic locking that causes deadlocks
-- under concurrent access from multi-threaded CQRS sync or parallel ingestion
ALTER DATABASE [your_database] SET READ_COMMITTED_SNAPSHOT ON;

-- For CQRS targets only: enable delayed durability for higher write throughput
-- Batches transaction log writes (small risk of losing last few ms on crash)
-- Safe because CQRS data can be re-synced from the source
ALTER DATABASE [your_cqrs_database] SET DELAYED_DURABILITY = FORCED;

SELECT name, is_read_committed_snapshot_on, delayed_durability_desc
FROM sys.databases WHERE name = 'your_database';

Parameter	Value	Notes
`max_connections`	200	Ingestion and merge jobs connect directly (not through PgBouncer). Each concurrent DAG run can hold 1-2 connections during its batch.
`shared_buffers`	4 GB	~25% of system RAM. SCD2 staging tables are heavily churned (insert + delete each cycle), so the buffer cache is critical.
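The two values in the table follow from simple rules of thumb, which the sketch below makes explicit. It assumes a host with 16 GB of RAM (implied by 4 GB being about 25%) and treats every DAG run as holding the upper bound of 2 connections; both the function name and these inputs are illustrative assumptions rather than a sizing formula from the original.

```python
def suggest_settings(system_ram_gb, concurrent_dag_runs,
                     conns_per_run=2, buffer_fraction=0.25):
    """Rough starting values for the two parameters in the table above.

    conns_per_run=2 takes the upper end of "1-2 connections" per DAG run;
    buffer_fraction=0.25 is the "~25% of system RAM" guideline.
    """
    return {
        "shared_buffers_gb": system_ram_gb * buffer_fraction,
        "max_connections": concurrent_dag_runs * conns_per_run,
    }


if __name__ == "__main__":
    # Assumed small-tier host: 16 GB RAM, up to 100 concurrent DAG runs.
    print(suggest_settings(system_ram_gb=16, concurrent_dag_runs=100))
```

With those inputs the result matches the table: 4 GB of `shared_buffers` and 200 connections. Superuser and monitoring connections are not counted here, so treat the connection figure as a starting point.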

Merge Database Performance Tuning

PostgreSQL

Sizing Tier: Small System (< 100 DAGs/Streams)

postgresql.conf

pg_hba.conf

Autovacuum Tuning

Monitoring

Key Differences from Airflow Metadata DB

SQL Server

Required Settings

Verification

VM/Hypervisor Disk Configuration
