PostgreSQL database analysis, performance tuning, and health monitoring. You MUST read this entire skill document before executing any PostgreSQL operations — it contains mandatory workflows, safety constraints, and two-phase execution rules that prevent common errors like hallucinated column names and unsafe queries.
What discovery provides:
get_database_overview():
- schemas: List of database schemas
- tables: Tables with names, row counts, sizes, and column information
- indexes: Index information and statistics
- relationships: Foreign key relationships between tables
- size_info: Database and table size metrics
- security: Security score and findings
- performance_hotspots: Tables with high sequential scans, dead tuples, bloat, or high modification rates

Discovery options (script defaults, NOT MCP server defaults):
- --max-tables N: Limit tables analyzed (script default: 50, MCP server default: 500)
- --sampling true: Use sampling for large tables (script default: false, MCP server default: true)
- --timeout N: Timeout in seconds (script default: 60, MCP server default: 300)
- --alias <name>: Target a specific postgres instance (required for multi-instance workspaces)

Why run discovery:
ALWAYS use format() for output (40-60% token savings):
```typescript
import { format } from "@connections/_utils/format";
console.log(format(result)); // CORRECT
console.log(JSON.stringify(result, null, 2)); // WRONG
```
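The savings come mostly from dropping pretty-printed JSON's indentation and newlines; a rough self-contained illustration (the real `format()` helper does more than this):

```typescript
// Compare compact vs pretty-printed JSON size for a nested result.
const result = { tables: [{ name: "users", rows: 1000 }, { name: "orders", rows: 5000 }] };
const compact = JSON.stringify(result);         // one line, no indentation
const pretty = JSON.stringify(result, null, 2); // newline + indent per nesting level
// Pretty-printing inflates the character (and therefore token) count considerably.
console.log(`compact: ${compact.length} chars, pretty: ${pretty.length} chars`);
```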
Phase 1 - Discovery Script:
```typescript
const schema = await get_object_details({ schema_name: "public", object_name: "user", object_type: "table" });
console.log(format(schema));
// ⛔ STOP - End script here, read output
```
[CHECKPOINT: Read output, identify actual column names]
Phase 2 - Query Script (NEW execution):
```typescript
// Use ONLY verified column names from Phase 1
const results = await execute_sql({ sql: `SELECT verified_col FROM ...` });
```
❌ FORBIDDEN: get_object_details() + execute_sql() in same script
</critical>
| Bad ❌ | Good ✅ | Why |
|---|---|---|
| Discovery + query in one script | Two separate scripts | Prevents hallucinated column names |
| Assuming user_id exists | Run get_object_details() first | Foreign key naming varies |
| Assuming created_at, updated_at exist | Verify columns exist | May be named differently |
| Writing JOINs without discovery | Discover ALL tables first | Relationships vary |
Schema (3): list_schemas(), list_objects(schema, type?), get_object_details(schema, name, type?)
Query (1): execute_sql(sql) - read-only in restricted mode
Performance (4):
- explain_query(sql, analyze?, hypothetical_indexes?) - CONSTRAINT: analyze + hypothetical_indexes cannot be used together; hypothetical_indexes requires the HypoPG extension
- get_top_queries(sort_by?, limit?) - default sort_by: "resources" (multi-dimensional blend, not just time)
- analyze_workload_indexes(max_size_mb?, method?) - analyzes top queries from pg_stat_statements
- analyze_query_indexes(queries[], max_size_mb?, method?) - CONSTRAINT: max 10 queries per call

Health (3):
- analyze_db_health(type?) - accepts comma-separated types (e.g. "index,vacuum")
- get_blocking_queries() - works even without active blocking (returns deadlock analysis, contention hotspots, proactive recommendations)
- analyze_vacuum_requirements() - 6-phase analysis with severity levels

Advanced (2):
- get_database_overview(max_tables?, sampling_mode?, timeout?) - includes security score, performance hotspots, relationship mapping
- analyze_schema_relationships() - inter-schema dependency analysis with visual representation data

Discover:
```typescript
const schemas = await list_schemas();
const tables = await list_objects({ schema_name: "public", object_type: "table" });
const details = await get_object_details({ schema_name: "public", object_name: "users", object_type: "table" });
```
Top resource-intensive queries (recommended default):
```typescript
const top = await get_top_queries({ sort_by: "resources" });
// Returns queries consuming >5% of ANY resource dimension (CPU, I/O, WAL, etc.)
```
Hypothetical Index (requires HypoPG, cannot use with analyze:true):
```typescript
const baseline = await explain_query({ sql: "SELECT * FROM orders WHERE customer_id = $1" });
const withIndex = await explain_query({
  sql: "SELECT * FROM orders WHERE customer_id = $1",
  hypothetical_indexes: [{ table: "orders", columns: ["customer_id"], using: "btree" }]
});
// Use $1, $2 etc. for parameterized queries (bind variables)
```
Targeted health check:
```typescript
const health = await analyze_db_health({ health_type: "index,vacuum,buffer" });
// Only run specific checks instead of "all" to reduce noise
```
Schema relationships:
```typescript
const rels = await analyze_schema_relationships();
// Cross-schema FK dependencies, hub tables, isolated schemas
```
Slow query (decision tree):
1. explain_query({ sql, analyze: false }) - get estimated plan
2. explain_query({ sql, analyze: true }) - get actual timings
3. hypothetical_indexes (requires HypoPG)
4. analyze_query_indexes({ queries: [sql] }) for a formal recommendation

Workload optimization (use resources sort):
1. get_top_queries({ sort_by: "resources" }) - find queries consuming >5% of any resource (CPU, buffer reads, dirty pages, WAL)
2. explain_query() to identify the bottleneck
3. analyze_workload_indexes({ method: "dta" }) for index recommendations across the workload

Health check (targeted deep-dives):
1. analyze_db_health({ health_type: "all" }) - initial scan
2. analyze_db_health({ health_type: "index" }) shows invalid/duplicate/bloated/unused indexes
3. analyze_vacuum_requirements() for 6-phase bloat analysis with maintenance commands
4. analyze_db_health({ health_type: "connection" }) for pool utilization
5. analyze_db_health({ health_type: "buffer" }) for cache hit rates (index + table)
6. analyze_db_health({ health_type: "replication" }) for lag and slot health

New database assessment:
1. get_database_overview()
2. analyze_schema_relationships() - understand cross-schema dependencies
3. get_top_queries({ sort_by: "resources" }) - identify workload hotspots
4. analyze_db_health({ health_type: "all" }) - full health scan
5. analyze_workload_indexes({ method: "dta" }) - index optimization opportunities

Blocking queries: get_blocking_queries() - run immediately when investigating locks; also useful proactively (returns deadlock stats, contention hotspots, and recommendations even when no active blocking exists)
Multi-table query: Phase 1: get_object_details() for ALL tables → [CHECKPOINT] → Phase 2: query with verified columns
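The multi-table flow above, in the document's two-script style (table names and the JOIN condition are placeholders — stop after Script 1 and verify columns before writing Script 2):

```typescript
// Script 1 (discovery ONLY - end here and read the output)
for (const t of ["orders", "customers"]) {
  console.log(format(await get_object_details({ schema_name: "public", object_name: t, object_type: "table" })));
}
// [CHECKPOINT: note the actual column names for every table]

// Script 2 (NEW execution - verified columns only; join condition is a placeholder)
const rows = await execute_sql({ sql: `SELECT o.id FROM orders o JOIN customers c ON ...` });
console.log(format(rows));
```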
Incident triage (general entry point):
1. get_blocking_queries() — check active lock contention first (time-sensitive, always safe)
2. analyze_db_health({ health_type: "connection,buffer,vacuum" }) — quick health snapshot

Performance incident:
1. get_blocking_queries() — active locks and deadlock stats
2. analyze_db_health({ health_type: "connection" }) — connection pool saturation
3. get_top_queries({ sort_by: "resources" }) — resource-heavy queries
4. explain_query({ sql, analyze: false }) — plan without adding load
5. analyze_vacuum_requirements() — maintenance recommendations

Data investigation (usage checks, missing data, verification):
1. get_object_details() for ALL relevant tables — [CHECKPOINT: verify actual column names]
2. execute_sql() — query using ONLY verified columns from step 1
3. analyze_schema_relationships() — FK chains, cascade effects
4. analyze_db_health({ health_type: "replication" }) — check lag

explain_query:
- sql (string) - supports bind variables: $1, $2 for parameterized queries
- analyze (bool, default: false) - runs the query for real statistics
- hypothetical_indexes ([{table, columns, using?}]) - requires the HypoPG extension
- CONSTRAINT: analyze: true + hypothetical_indexes cannot be used together (returns an error)

get_top_queries:
- sort_by (default: "resources") - "resources" uses a multi-dimensional blend: includes queries where ANY of 5 fractions (exec time, buffer access, buffer reads, dirty pages, WAL bytes) exceeds 5% of the workload total; "total_time" and "mean_time" rank by execution time only
- limit (int, default: 10) - only applies to total_time and mean_time sorts

analyze_db_health:
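The "resources" blend can be modeled as follows — an illustrative sketch of the rule described above, not the server's actual implementation:

```typescript
// One record per query: each field is that query's share-relevant raw total.
interface QueryStats {
  execTime: number;
  bufferAccess: number;
  bufferReads: number;
  dirtyPages: number;
  walBytes: number;
}

// A query qualifies if ANY of the five fractions (query value / workload total)
// exceeds the threshold (5% by default).
function isResourceHeavy(q: QueryStats, totals: QueryStats, threshold = 0.05): boolean {
  const dims: (keyof QueryStats)[] = ["execTime", "bufferAccess", "bufferReads", "dirtyPages", "walBytes"];
  return dims.some((d) => totals[d] > 0 && q[d] / totals[d] > threshold);
}
```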
health_type (default: "all") - comma-separated list of checks:
- index: invalid, duplicate, bloated, and unused indexes (4 sub-checks)
- connection: connection count and pool utilization
- vacuum: transaction ID wraparound danger
- sequence: sequences approaching max value
- replication: replication lag and slot health
- buffer: cache hit rates for both indexes and tables (2 sub-checks)
- constraint: invalid (not-validated) constraints
- all: runs all of the above

analyze_vacuum_requirements (6 phases):
analyze_query_indexes:
- queries (string[]) - max 10 queries per call
- max_index_size_mb (int, default: 10000)
- method ("dta" | "llm", default: "dta")

analyze_*_indexes: method = dta (Pareto-optimal) | llm
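Since `analyze_query_indexes` caps batches at 10 queries, larger workloads can be split client-side; the helper below is a generic sketch (the commented call shows the intended use):

```typescript
// Split an array into consecutive chunks of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Intended use (assumes `analyze_query_indexes` is bound in scope):
// for (const batch of chunk(allQueries, 10)) {
//   console.log(format(await analyze_query_indexes({ queries: batch })));
// }
```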
| Risk | Operations | Behavior |
|---|---|---|
| LOW | list_*, get_object_details, explain(analyze:false), get_blocking_queries | Always safe |
| MEDIUM | get_database_overview, analyze_db_health, analyze_schema_relationships | Use sampling on large DBs |
| HIGH | explain(analyze:true) | Check cost first with analyze:false |
| CRITICAL | execute_sql(INSERT/UPDATE/DELETE/DDL) | Require confirmation |
| Error | Fix |
|---|---|
| column X does not exist | Run get_object_details() - never guess column names |
| relation does not exist | Refresh table list with list_objects() |
| permission denied | Use read-only queries |
| pg_stat_statements not found | Request DBA to enable extension |
| timeout | Reduce scope: max_tables: 100, use sampling |
| HypoPG not installed | hypothetical_indexes requires the HypoPG extension - request DBA to install, or skip hypothetical analysis |
| Cannot use analyze and hypothetical indexes together | Remove analyze: true when using hypothetical_indexes - they are mutually exclusive |
| up to 10 queries to analyze | analyze_query_indexes accepts max 10 queries - split larger batches into multiple calls |
Present results as a structured report:
```
Analyzing Postgres Report
═════════════════════════
Resources discovered: [count]

Resource   Status     Key Metric   Issues
──────────────────────────────────────────────
[name]     [ok/warn]  [value]      [findings]

Summary: [total] resources | [ok] healthy | [warn] warnings | [crit] critical
Action Items: [list of prioritized findings]
```
Target ≤50 lines of output. Use tables for multi-resource comparisons.
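One way to assemble that report shape — the row names and structure here are assumptions for illustration, not part of any tool API:

```typescript
// One line per analyzed resource in the report table.
interface ResourceRow {
  name: string;
  status: "ok" | "warn" | "crit";
  metric: string;
  issues: string;
}

function renderReport(rows: ResourceRow[]): string {
  const lines = [
    "Analyzing Postgres Report",
    "═".repeat(25),
    `Resources discovered: ${rows.length}`,
    "Resource  Status  Key Metric  Issues",
    "─".repeat(46),
  ];
  for (const r of rows) lines.push(`${r.name}  ${r.status}  ${r.metric}  ${r.issues}`);
  const count = (s: ResourceRow["status"]) => rows.filter((r) => r.status === s).length;
  lines.push(`Summary: ${rows.length} resources | ${count("ok")} healthy | ${count("warn")} warnings | ${count("crit")} critical`);
  return lines.join("\n");
}
```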
--help output.

| Shortcut | Counter | Why |
|---|---|---|
| "I'll skip discovery and check known resources" | Always run Phase 1 discovery first | Resource names change, new resources appear — assumed names cause errors |
| "The user only asked for a quick check" | Follow the full discovery → analysis flow | Quick checks miss critical issues; structured analysis catches silent failures |
| "Default configuration is probably fine" | Audit configuration explicitly | Defaults often leave logging, security, and optimization features disabled |
| "Metrics aren't needed for this" | Always check relevant metrics when available | API/CLI responses show current state; metrics reveal trends and intermittent issues |
| "I don't have access to that" | Try the command and report the actual error | Assumed permission failures prevent useful investigation; actual errors are informative |