技能檔案

Database Patterns

Name: Database Patterns
Author: softspark

Loaded when user asks about database schema or query optimization

softspark136 星標2026年4月8日

職業: 資料庫架構師
分類: SQL 數據庫

技能內容

Database Patterns Skill

ORM Selection

Scenario	ORM
Node.js, type-safe	Prisma
Node.js, SQL-first	Drizzle
Python, async	SQLAlchemy 2.0
Python, simple	SQLModel
PHP	Doctrine, Eloquent

Schema Design

Naming Conventions

-- Tables: plural, snake_case
CREATE TABLE user_profiles (...);

-- Columns: snake_case
user_id, created_at, is_active

-- Indexes: idx_{table}_{columns}
CREATE INDEX idx_users_email ON users(email);

-- Foreign keys: fk_{table}_{ref_table}
CONSTRAINT fk_orders_users FOREIGN KEY (user_id) REFERENCES users(id)

相關技能

Database Patterns | Skills Pool

ALTER TABLE users ADD COLUMN deleted_at TIMESTAMP NULL;

-- Query active records
SELECT * FROM users WHERE deleted_at IS NULL;

created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
created_by UUID REFERENCES users(id),
updated_by UUID REFERENCES users(id)

-- Good: matches query pattern
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
SELECT * FROM orders WHERE user_id = 1 AND created_at > '2024-01-01';

-- Index used for:
-- WHERE user_id = 1
-- WHERE user_id = 1 AND created_at > ...

-- Index NOT used for:
-- WHERE created_at > '2024-01-01' (missing leading column)

EXPLAIN ANALYZE
SELECT * FROM users WHERE email = '[email protected]';

# Bad: N+1
for user in users:
    print(user.orders)  # Query per user

# Good: Eager loading
users = User.query.options(joinedload(User.orders)).all()

-- Add column (safe)
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- Add NOT NULL column (safe pattern)
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
UPDATE users SET phone = '' WHERE phone IS NULL;
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;

-- Rename column (use application-level)
-- 1. Add new column
-- 2. Copy data
-- 3. Update application
-- 4. Remove old column

[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

# Sync client
client = QdrantClient(host="localhost", port=6333)

# Async client
from qdrant_client import AsyncQdrantClient
async_client = AsyncQdrantClient(host="localhost", port=6333)

# Create collection (single vector)
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE)
)

# Create collection (multi-vector)
from qdrant_client.models import VectorParams

client.create_collection(
    collection_name="multimodal",
    vectors_config={
        "text": VectorParams(size=384, distance=Distance.COSINE),
        "image": VectorParams(size=512, distance=Distance.EUCLID),
    }
)

# Single upsert
client.upsert(
    collection_name="documents",
    points=[
        PointStruct(
            id=1,
            vector=[0.1, 0.2, 0.3, ...],  # 384-dim vector
            payload={"title": "Doc 1", "category": "tech"}
        )
    ]
)

# Batch upsert
points = [
    PointStruct(id=i, vector=vectors[i], payload=payloads[i])
    for i in range(len(vectors))
]
client.upsert(collection_name="documents", points=points, batch_size=100)

from qdrant_client.models import Filter, FieldCondition, MatchValue

# Basic search
results = client.search(
    collection_name="documents",
    query_vector=[0.1, 0.2, ...],
    limit=10
)

# Search with filter
results = client.search(
    collection_name="documents",
    query_vector=[0.1, 0.2, ...],
    query_filter=Filter(
        must=[
            FieldCondition(key="category", match=MatchValue(value="tech"))
        ]
    ),
    limit=10,
    with_payload=True,
    score_threshold=0.7
)

# Search with range filter
from qdrant_client.models import Range

results = client.search(
    collection_name="documents",
    query_vector=query_vector,
    query_filter=Filter(
        must=[
            FieldCondition(key="price", range=Range(gte=10, lte=100))
        ]
    ),
    limit=10
)

# Create payload index for faster filtering
client.create_payload_index(
    collection_name="documents",
    field_name="category",
    field_schema="keyword"  # or "integer", "float", "bool"
)

Aspect	Recommendation
Batch Size	100-1000 points per upsert
Vector Dim	Match your embedding model (384, 768, 1536)
Filters	Index frequently filtered fields
Distance	COSINE for normalized, EUCLID for raw
Sharding	Use for >1M vectors

Excuse	Why It's Wrong
"We'll add indexes later when it's slow"	Missing indexes on production tables cause outages, not slowdowns — index from design
"The ORM handles performance"	ORMs generate queries, they don't optimize them — always check the query plan
"NoSQL is faster"	NoSQL trades consistency for speed — if you need joins, use a relational DB
"We don't need migrations, we'll update the schema directly"	Direct schema changes are irreversible and untestable — migrations are the safety net
"One big table is simpler"	Denormalization without measurement creates update anomalies — normalize first, denormalize with data

Use Case	Type
Internal only	SERIAL/BIGSERIAL
External/distributed	UUID
Human readable	SERIAL with prefix

Type	Use Case
B-tree	Equality, range (default)
Hash	Equality only
GIN	Arrays, JSONB, full-text
GiST	Geometric, full-text
BRIN	Large sequential data

Issue	Solution
Seq Scan on large table	Add index
High row estimate	Update statistics
Nested Loop on large sets	Consider hash join
Sort in memory	Increase work_mem

Framework	Pool Size Formula
General	(cores * 2) + disk spindles
Read-heavy	cores * 4
Write-heavy	cores * 2

Metric	Best For	Normalized
COSINE	Text embeddings	Yes
EUCLID	Image embeddings	No
DOT	When vectors pre-normalized	Yes

Database Patterns

Database Patterns Skill

ORM Selection

Schema Design

Naming Conventions

Database Patterns

Database Patterns Skill

ORM Selection

Schema Design

Naming Conventions

Common Patterns

Soft Delete

Audit Columns

UUID vs Serial

Index Strategies

When to Index

Index Types

Composite Index Order

Query Optimization

Explain Analyze

Common Issues

N+1 Prevention

Migration Best Practices

Safe Migrations

Migration Checklist

Connection Pooling

PgBouncer Settings

Application Settings

Vector Database (Qdrant) Patterns

Client Setup

Collection Management

Upserting Vectors

Searching Vectors

Payload Indexing

Best Practices

Distance Metrics

Common Rationalizations

Postgres Patterns

Postgres Patterns

Database Migrations

Postgres Patterns

Postgres Patterns

Jpa Patterns