Designs production-grade database schemas, migration plans, indexing strategies, and query optimizations across Firestore, PostgreSQL, SQLite/Room, and Redis. Every recommendation considers data volume, access patterns, consistency requirements, and Cure Consulting Group's Firebase-first but multi-database approach.
Pre-Processing (Auto-Context)
Before starting, gather project context silently:
Read PORTFOLIO.md if it exists in the project root or parent directories for product/team context
Feature flags: gate new schema reads behind feature flags; roll back by disabling the flag
Data Backfill Scripts
Write idempotent backfill scripts that can be re-run safely
Process in batches (1000–5000 rows per batch) to avoid locking and memory issues
Log progress and support resumption from the last processed ID
Run backfills during low-traffic windows; monitor database load during execution
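The batching, idempotency, and resumption rules above can be sketched as a minimal backfill script. This is illustrative only — the `users` table, `display_name` column, and SQLite backend are assumptions, not part of the spec:

```python
import sqlite3

BATCH_SIZE = 1000  # within the 1000-5000 range recommended above

def backfill_display_name(conn: sqlite3.Connection) -> int:
    """Copy `name` into a new `display_name` column in batches.
    Idempotent: only touches rows where display_name IS NULL.
    Resumable: advances by last processed id, logged per batch."""
    total = 0
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id FROM users WHERE id > ? AND display_name IS NULL "
            "ORDER BY id LIMIT ?",
            (last_id, BATCH_SIZE),
        ).fetchall()
        if not rows:
            break
        ids = [r[0] for r in rows]
        placeholders = ",".join("?" * len(ids))
        conn.execute(
            f"UPDATE users SET display_name = name WHERE id IN ({placeholders})",
            ids,
        )
        conn.commit()  # commit per batch to keep locks short
        last_id = ids[-1]
        total += len(ids)
        print(f"backfilled up to id={last_id} ({total} rows)")
    return total
```

Because the script filters on `display_name IS NULL` and orders by `id`, a crashed run can simply be restarted: already-processed rows are skipped and work resumes past the last committed batch.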
Testing Migrations
Test every migration against a snapshot of production data (anonymized if necessary)
Verify both forward migration and rollback
Measure migration execution time on production-scale data
Include migration tests in CI pipeline
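A CI migration test can be sketched as a forward-plus-rollback roundtrip. The migration pair below is hypothetical (real projects would load versioned migration files), and SQLite stands in for the target database:

```python
import sqlite3

# Hypothetical migration pair; names are illustrative.
def migrate_up(conn):
    conn.execute("CREATE TABLE user_emails (user_id INTEGER, email TEXT)")

def migrate_down(conn):
    conn.execute("DROP TABLE user_emails")

def table_exists(conn, name):
    return conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name=?",
        (name,),
    ).fetchone()[0] == 1

def test_migration_roundtrip():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    migrate_up(conn)
    assert table_exists(conn, "user_emails")      # forward migration applied
    migrate_down(conn)
    assert not table_exists(conn, "user_emails")  # rollback restores schema
```

In a real pipeline the same test would run against the anonymized production snapshot rather than an empty in-memory database, and would record execution time.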
Step 6: Indexing Strategy
General Principles
Index columns that appear in WHERE, JOIN, ORDER BY, and GROUP BY clauses
Avoid over-indexing — each index slows writes and consumes storage
Review and drop unused indexes quarterly
Composite Indexes
Column order matters: place equality conditions first, then range conditions, then sort columns
A composite index on (a, b, c) supports queries on (a), (a, b), and (a, b, c) but not (b, c) alone
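The leftmost-prefix rule can be observed directly with a query planner. This sketch uses SQLite's EXPLAIN QUERY PLAN as a stand-in (PostgreSQL's planner differs in detail but applies the same prefix rule); table and index names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (a INTEGER, b INTEGER, c INTEGER, payload TEXT)")
conn.execute("CREATE INDEX idx_events_abc ON events (a, b, c)")

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry the plan detail in the fourth column.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Leftmost-prefix queries (a) and (a, b) can seek on the index...
print(plan("SELECT * FROM events WHERE a = 1 AND b = 2"))
# ...but a query on (b, c) alone cannot, and falls back to a table scan.
print(plan("SELECT * FROM events WHERE b = 2 AND c = 3"))
```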
Covering Indexes
Include all columns needed by a query in the index to enable index-only scans
PostgreSQL: use INCLUDE clause for non-key columns in the index
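An index-only scan can be demonstrated the same way. SQLite has no INCLUDE clause, so this sketch simply puts both columns in the key; the equivalent PostgreSQL form is noted in a comment. Table and index names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "total REAL, notes TEXT)")
# PostgreSQL equivalent with a non-key payload column:
#   CREATE INDEX idx_orders_user_total ON orders (user_id) INCLUDE (total);
conn.execute("CREATE INDEX idx_orders_user_total ON orders (user_id, total)")

# The query reads only indexed columns, so no table lookup is needed.
detail = conn.execute(
    "EXPLAIN QUERY PLAN SELECT user_id, total FROM orders WHERE user_id = 7"
).fetchall()[0][3]
print(detail)  # SQLite reports this as a COVERING INDEX search
```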
Firestore Indexes
Automatic indexes: Firestore auto-indexes every field; no action needed for single-field queries
Composite indexes: required for queries with multiple where clauses or where + orderBy on different fields; define in firestore.indexes.json
Exempt fields: exempt large string/array fields from automatic indexing to save costs
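A hedged sketch of a `firestore.indexes.json` covering both rules above — one composite index for a where + orderBy query, and one field exemption. Collection and field names (`orders`, `userId`, `createdAt`, `notes`) are illustrative:

```json
{
  "indexes": [
    {
      "collectionGroup": "orders",
      "queryScope": "COLLECTION",
      "fields": [
        { "fieldPath": "userId", "order": "ASCENDING" },
        { "fieldPath": "createdAt", "order": "DESCENDING" }
      ]
    }
  ],
  "fieldOverrides": [
    {
      "collectionGroup": "orders",
      "fieldPath": "notes",
      "indexes": []
    }
  ]
}
```

The empty `indexes` array in the field override exempts the large `notes` field from automatic single-field indexing.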
PostgreSQL EXPLAIN ANALYZE
Run EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) on every slow query
Look for: sequential scans on large tables, nested loop joins on large sets, high buffer reads
Target: all frequent queries should use index scans or bitmap index scans
Monitor with pg_stat_user_indexes for unused indexes and pg_stat_statements for slow queries
Index Maintenance
Schedule REINDEX or pg_repack for heavily updated tables
Monitor index bloat with pgstattuple extension
For Firestore, review composite index usage in the Firebase console and remove unused indexes
Step 7: Query Optimization
N+1 Detection
Identify loops that issue one query per iteration — replace with batch reads or joins
Firestore: use whereIn (max 30 values) or getAll() for batch document fetches
PostgreSQL: use JOIN or IN with a subquery; prefer EXISTS over IN when the subquery returns a large result set
Room: use @Transaction with @Relation to load parent + children in one call
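The loop-per-iteration anti-pattern and its batched replacement can be sketched side by side. SQLite and the `users`/`posts` tables are stand-ins; the same shape applies to a Firestore `in` query or a SQL JOIN:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users VALUES (1, 'ana'), (2, 'bo'), (3, 'cy');
    INSERT INTO posts VALUES (1, 1, 'p1'), (2, 1, 'p2'), (3, 3, 'p3');
""")

def posts_n_plus_one(user_ids):
    # Anti-pattern: one query per user (N+1 round trips).
    out = {}
    for uid in user_ids:
        out[uid] = [r[0] for r in conn.execute(
            "SELECT title FROM posts WHERE user_id = ?", (uid,))]
    return out

def posts_batched(user_ids):
    # Replacement: a single IN query for all users.
    marks = ",".join("?" * len(user_ids))
    out = {uid: [] for uid in user_ids}
    for uid, title in conn.execute(
            f"SELECT user_id, title FROM posts WHERE user_id IN ({marks})",
            list(user_ids)):
        out[uid].append(title)
    return out
```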
Batch Reads
Firestore: batch reads with getAll() (max 500 documents); for larger sets, paginate
PostgreSQL: use ANY(ARRAY[...]) for parameterized batch lookups; limit batch size to 1000
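Both batch limits reduce to the same helper: split the key list into chunks that respect the per-call cap. A minimal sketch:

```python
def chunked(ids, size):
    """Split a list of keys into batches respecting a per-call limit,
    e.g. 30 for a Firestore `in` query or 1000 for a SQL batch lookup."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]
```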
Pagination Patterns
Cursor-based (preferred): use startAfter(lastDocument) in Firestore, WHERE id > :lastId ORDER BY id LIMIT :pageSize in SQL; stable under concurrent writes
Offset-based (use sparingly): OFFSET + LIMIT in SQL; degrades on large offsets; acceptable for admin/backoffice UIs with <100k rows
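The cursor-based pattern can be sketched against SQLite (the `items` table is illustrative); the loop carries the last seen id forward as the cursor:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items (id, name) VALUES (?, ?)",
                 [(i, f"item{i}") for i in range(1, 26)])

def page(last_id, page_size=10):
    """Cursor-based page: stable under concurrent inserts, unlike OFFSET."""
    return conn.execute(
        "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size)).fetchall()

cursor = 0
pages = []
while True:
    rows = page(cursor)
    if not rows:
        break
    pages.append([r[0] for r in rows])
    cursor = rows[-1][0]  # last id on this page becomes the next cursor
```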
Denormalization Trade-offs
Denormalize when read frequency is at least 10x the write frequency for the same data
Track all locations of denormalized data in documentation
Use Cloud Functions (Firestore) or database triggers (PostgreSQL) to keep copies in sync
Monitor for data drift — schedule periodic consistency checks
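The periodic consistency check can be sketched as a drift scan comparing the canonical record to its denormalized copies. The `author_name`-on-posts shape is a hypothetical example, not a prescribed schema:

```python
def find_drift(source, denormalized, key="author_name"):
    """Compare a canonical users map against author names denormalized onto
    posts; return the ids of posts whose copy has drifted out of sync."""
    drifted = []
    for post_id, post in denormalized.items():
        canonical = source[post["author_id"]]["name"]
        if post[key] != canonical:
            drifted.append(post_id)
    return drifted
```

In production this scan would run on a schedule (e.g. a nightly job), with drifted documents either repaired automatically or reported for review.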
Read Replicas
Route read-heavy analytics and reporting queries to read replicas
Accept replication lag (typically <1s for PostgreSQL streaming replication)
Never route writes to read replicas
Use connection pooling (PgBouncer) to manage replica connections efficiently
Step 8: Backup & Disaster Recovery
Automated Backups
Firestore: enable daily automated exports to Cloud Storage; use gcloud firestore export
PostgreSQL: configure continuous WAL archiving + daily pg_dump base backups; retain 30 days minimum
SQLite/Room: back up database file on app update and before migrations; sync to cloud storage for critical data
Redis: enable RDB snapshots + AOF persistence; schedule BGSAVE during low-traffic periods
Point-in-Time Recovery
PostgreSQL: configure WAL archiving for PITR; test recovery to a specific timestamp quarterly
Firestore: use timestamped exports; restore by importing a specific export
Document Recovery Time Objective (RTO) and Recovery Point Objective (RPO) for each database
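The PostgreSQL WAL-archiving side of PITR reduces to a few settings. A minimal sketch of the relevant `postgresql.conf` fragment — the archive path is illustrative and belongs on durable storage in practice:

```
# postgresql.conf — continuous WAL archiving for PITR (paths are illustrative)
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /backups/wal/%f && cp %p /backups/wal/%f'
archive_timeout = 300   # force a segment switch at least every 5 minutes
```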
Export Strategies
Maintain automated export pipelines to BigQuery or data warehouse for analytics
Export anonymized datasets for development and testing environments
Version export scripts alongside migration scripts
Cross-Region Replication
Firestore: use multi-region locations (nam5, eur3) for automatic cross-region replication
PostgreSQL: configure streaming replication to a secondary region; automate failover with Patroni or Cloud SQL HA
Redis: use Redis Sentinel or Redis Cluster for cross-region replication and automatic failover
Test failover procedures quarterly; document runbooks for manual failover
Step 9: Output Templates
For every database architecture recommendation, deliver:
Schema diagram — ASCII or Mermaid ER diagram showing entities, relationships, and cardinality
Migration script — versioned, idempotent SQL/script with rollback counterpart
Index recommendations — table of recommended indexes with justification and expected query improvement
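As a sketch of the schema-diagram deliverable, a minimal Mermaid ER diagram with illustrative entities and cardinalities:

```mermaid
erDiagram
    USER ||--o{ ORDER : places
    ORDER ||--|{ ORDER_ITEM : contains
    PRODUCT ||--o{ ORDER_ITEM : "appears in"
```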