SQL writing and database engineering patterns, standards, and procedures. Use for designing database schemas, writing performant SQL queries, normalisation strategies, indexing, joins optimisation, locking mechanics, transactions, query debugging with EXPLAIN, and ORM integration. Applies to PostgreSQL, MySQL, MariaDB, SQL Server, and Oracle. Covers ORM usage with TypeORM, Prisma, Doctrine, Eloquent, Entity Framework, Hibernate, and GORM.
This skill provides standards, patterns, and procedures for database schema design, writing performant SQL, query debugging, and database engineering best practices. It is database-engine-agnostic with notes on engine-specific behavior where critical.
| Principle | Application |
|---|---|
| Data Integrity First | Enforce constraints at the database level (NOT NULL, UNIQUE, FK, CHECK). Never rely solely on application-level validation. |
| Least Privilege | Database users/roles should have only the permissions they need. Application connections should never use the superuser account. |
| Explicit Over Implicit | Always specify column lists in SELECT, INSERT, and JOIN clauses. Avoid SELECT * in production code. |
| Measure Before Optimising | Use EXPLAIN ANALYZE to identify actual bottlenecks before adding indexes or restructuring queries. |
| Schema as Code |
| All schema changes go through versioned migration files. Never modify production schemas manually. |
| Element | Convention | Example |
|---|---|---|
| Tables | snake_case, plural nouns | order_items, user_addresses |
| Columns | snake_case | first_name, created_at |
| Primary keys | id (preferred) or <table_singular>_id | id, user_id |
| Foreign keys | <referenced_table_singular>_id | user_id, order_id |
| Indexes | idx_<table>_<columns> | idx_orders_user_id_status |
| Unique constraints | uq_<table>_<columns> | uq_users_email |
| Check constraints | chk_<table>_<description> | chk_orders_total_positive |
| Junction/pivot tables | <table1>_<table2> (alphabetical) | products_tags, roles_users |
Every table MUST include:
CREATE TABLE orders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(), -- or BIGSERIAL for auto-increment
-- ... domain columns ...
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW()
);
When soft deletes are required by business rules:
deleted_at TIMESTAMP WITH TIME ZONE DEFAULT NULL
| Strategy | When to Use | Trade-offs |
|---|---|---|
UUID v4 / UUID v7 | Distributed systems, microservices, public-facing IDs | Larger storage, random UUIDs cause index fragmentation (prefer UUID v7 for ordered inserts) |
ULID | When you need sortable unique IDs with good index locality | Not natively supported in all databases |
BIGSERIAL / IDENTITY | Single-database monoliths, internal IDs | Sequential, predictable (security concern if exposed), not portable across DB instances |
Rule: Never expose auto-increment IDs in public APIs. Use UUIDs or ULIDs for external identifiers.
| Need | Use | Avoid |
|---|---|---|
| Monetary values | NUMERIC(19,4) or DECIMAL(19,4) | FLOAT, DOUBLE (precision loss) |
| Timestamps | TIMESTAMP WITH TIME ZONE | TIMESTAMP without timezone |
| Boolean flags | BOOLEAN | TINYINT, CHAR(1) |
| Short text (name, email) | VARCHAR(n) with appropriate limit | Unbounded TEXT for structured fields |
| Long text (descriptions) | TEXT | VARCHAR(10000) |
| Enums | VARCHAR with CHECK constraint or native ENUM | Magic integers |
| JSON/semi-structured | JSONB (PostgreSQL) or JSON | Storing relational data as JSON |
| IP addresses | INET (PostgreSQL) or VARCHAR(45) | VARCHAR(15) (IPv6 won't fit) |
Prefer CHECK constraints over native ENUM types for portability and ease of migration:
CREATE TABLE orders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
status VARCHAR(20) NOT NULL DEFAULT 'pending'
CONSTRAINT chk_orders_status CHECK (status IN ('pending', 'confirmed', 'shipped', 'delivered', 'cancelled')),
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW()
);
Adding a new enum value is a simple ALTER TABLE ... DROP CONSTRAINT + ADD CONSTRAINT, no table rewrite required.
Schema design checklist:
- [ ] Every table has a primary key (UUID or BIGSERIAL)
- [ ] Every table has created_at and updated_at columns
- [ ] All foreign keys have corresponding indexes
- [ ] All columns have appropriate NOT NULL constraints
- [ ] Unique constraints are defined where business rules require uniqueness
- [ ] CHECK constraints enforce valid value ranges and enums
- [ ] Naming follows snake_case conventions consistently
- [ ] Data types are chosen precisely (no FLOAT for money, no TEXT for emails)
- [ ] Soft delete (deleted_at) is used only where business rules require it
- [ ] No business logic is embedded in database triggers or stored procedures (keep logic in the application layer)
| Normal Form | Rule | Violation Example | Fix |
|---|---|---|---|
| 1NF | Every column holds atomic (indivisible) values. No repeating groups. | tags: "php,sql,go" in a single column | Create a separate tags table with a junction table |
| 2NF | 1NF + every non-key column depends on the entire primary key (relevant for composite keys) | order_items(order_id, product_id, product_name) — product_name depends only on product_id | Move product_name to the products table |
| 3NF | 2NF + no transitive dependencies (non-key column depends on another non-key column) | employees(id, department_id, department_name) — department_name depends on department_id, not on id | Move department_name to a departments table |
| BCNF | 3NF + every determinant is a candidate key | Rare in practice; address when composite keys create functional dependency issues | Decompose the table so every determinant is a key |
Target 3NF by default. This eliminates redundancy while keeping the schema manageable.
Use 2NF only as an intermediate step when refactoring legacy schemas — never as a design target.
Use BCNF when you have composite primary keys with overlapping candidate keys (uncommon in application databases).
Unnormalised (0NF):