Name: pgvector for Semantic Search
Author: timescale

pgvector for Semantic Search

Use this skill for setting up vector similarity search with pgvector for AI/ML embeddings, RAG applications, or semantic search.

**Trigger when user asks to:**
- Store or search vector embeddings in PostgreSQL
- Set up semantic search, similarity search, or nearest neighbor search
- Create HNSW or IVFFlat indexes for vectors
- Implement RAG (Retrieval Augmented Generation) with PostgreSQL
- Optimize pgvector performance, recall, or memory usage
- Use binary quantization for large vector datasets

**Keywords:** pgvector, embeddings, semantic search, vector similarity, HNSW, IVFFlat, halfvec, cosine distance, nearest neighbor, RAG, LLM, AI search

Covers: halfvec storage, HNSW index configuration (m, ef_construction, ef_search), quantization strategies, filtered search, bulk loading, and performance tuning.

timescale · 1,690 stars · 2026/04/07

Category: LLM / AI

Semantic search finds content by meaning rather than exact keywords. An embedding model converts text into high-dimensional vectors, where similar meanings map to nearby points. pgvector stores these vectors in PostgreSQL and uses approximate nearest neighbor (ANN) indexes to find the closest matches quickly—scaling to millions of rows without leaving the database. Store your text alongside its embedding, then query by converting your search text to a vector and returning the rows with the smallest distance.
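As a rough illustration of "return the rows with the smallest distance", the sketch below does an exact nearest-neighbor search in plain Python. It assumes cosine distance is defined as 1 minus cosine similarity, which is what the `<=>` operator computes. The 3-dimensional vectors and the `rows` data are toy values made up for the example, not output from a real embedding model, and an ANN index would approximate this result rather than scan every row.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest(query_vec, rows, k=3):
    """Exact search: sort every (text, embedding) row by distance to the query."""
    return sorted(rows, key=lambda row: cosine_distance(query_vec, row[1]))[:k]

# Toy data: (text, embedding) pairs with hypothetical 3-dim embeddings.
rows = [
    ("how to reset a password", [0.9, 0.1, 0.0]),
    ("account recovery steps", [0.8, 0.3, 0.1]),
    ("quarterly sales report", [0.0, 0.2, 0.9]),
]

for text, _ in nearest([1.0, 0.2, 0.0], rows, k=2):
    print(text)
```

In a real deployment the sort is replaced by an `ORDER BY ... LIMIT` query against an indexed column; the ranking logic stays the same.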

This guide covers pgvector setup and tuning—not embedding model selection or text chunking, which significantly affect search quality. Requires pgvector 0.8.0+ for all features (halfvec, binary_quantize, iterative scan).

Golden Path (Default Setup)

Use this configuration unless you have a specific reason not to.

Embedding column data type: halfvec(N) where N is your embedding dimension (must match everywhere). Examples use 1536; replace with your dimension N.
Distance: cosine (<=>)
Index: HNSW (m = 16, ef_construction = 64). Use halfvec_cosine_ops and query with <=>.
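The defaults above could be written out roughly as follows. This is a sketch, not the guide's own code: the table name `documents`, its columns, and the index name are hypothetical, and the `%s` placeholder assumes a psycopg-style driver. The statements are held as Python strings so they can be passed to whichever client you use.

```python
# Hypothetical schema; replace DIM with your embedding dimension.
DIM = 1536

SETUP_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    f"""CREATE TABLE IF NOT EXISTS documents (
        id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        contents text NOT NULL,
        embedding halfvec({DIM}) NOT NULL
    )""",
    # The operator class must match the distance operator used in queries.
    f"""CREATE INDEX IF NOT EXISTS documents_embedding_idx
        ON documents USING hnsw (embedding halfvec_cosine_ops)
        WITH (m = 16, ef_construction = 64)""",
]

# ORDER BY <distance> LIMIT n is the query shape an ANN index can serve.
# The explicit cast keeps the query vector the same type as the column.
SEARCH_QUERY = f"""
SELECT id, contents
FROM documents
ORDER BY embedding <=> %s::halfvec({DIM})
LIMIT 10
"""

def to_vector_literal(values):
    """Format floats as pgvector text input, for example '[0.1,2.0]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

for statement in SETUP_STATEMENTS:
    print(statement + ";")
print(SEARCH_QUERY)
```

Passing `to_vector_literal(query_embedding)` as the single query parameter keeps the dimension check on the database side, where a mismatch raises an error instead of silently returning poor matches.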


HNSW Parameters

Parameter	Default	Description
`m`	16	Max connections per layer. Higher = better recall, more memory
`ef_construction`	64	Build-time candidate list. Higher = better graph quality, slower build
`hnsw.ef_search`	40	Query-time candidate list. Higher = better recall, slower queries. Should be ≥ LIMIT.
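The note that `hnsw.ef_search` should be at least the query `LIMIT` can be applied mechanically. The helper names below are hypothetical, and the sketch assumes you want the setting scoped to one transaction, which is what `SET LOCAL` does in PostgreSQL.

```python
DEFAULT_EF_SEARCH = 40  # default listed in the table above

def ef_search_for(limit, floor=DEFAULT_EF_SEARCH):
    """Return an ef_search value that is never below the LIMIT or the default."""
    if limit < 1:
        raise ValueError("LIMIT must be a positive integer")
    return max(limit, floor)

def set_ef_search_sql(limit):
    """Build a transaction-scoped SET statement for the given LIMIT."""
    return f"SET LOCAL hnsw.ef_search = {ef_search_for(limit)}"

print(set_ef_search_sql(10))
print(set_ef_search_sql(100))
```

Raising the value only as far as the `LIMIT` requires keeps queries with small result sets on the faster default.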

`hnsw.ef_search`	Approx. recall	Relative speed
40	lower (~95% on some benchmarks)	1x (baseline)
100	higher	~2x slower
200	very high	~4x slower
400	near-exact	~8x slower
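Because the recall figures above are approximate, it may help to measure recall on your own data. A minimal sketch, assuming you can obtain two ID lists for the same query: one from the HNSW index and one from an exact scan of the same table. The function name is hypothetical.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k results that the ANN query also returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)

# Example: the index returned 3 of the 4 true nearest neighbors.
print(recall_at_k([1, 2, 3, 9], [1, 2, 3, 4]))
```

Averaging this over a sample of representative queries at each `hnsw.ef_search` value gives a recall-versus-latency curve specific to your dataset.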

Guidelines for 1536-dim vectors

RAM	Approx. max halfvec vectors
16 GB	~2–3M vectors
32 GB	~4–6M vectors
64 GB	~8–12M vectors
128 GB	~16–25M vectors
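For a rough sense of where capacity figures like these come from, the sketch below estimates raw vector storage only. It assumes pgvector's documented per-value sizes (4 bytes per dimension plus 8 for `vector`, 2 bytes per dimension plus 8 for `halfvec`). It is a lower bound: an HNSW index keeps its own copy of the vectors plus graph links, and the table holds other columns as well.

```python
def vector_bytes(dim):
    """Storage for one single-precision vector value."""
    return 4 * dim + 8

def halfvec_bytes(dim):
    """Storage for one half-precision halfvec value."""
    return 2 * dim + 8

def raw_storage_gib(n_vectors, dim, bytes_fn=halfvec_bytes):
    """Raw embedding storage in GiB, excluding index and row overhead."""
    return n_vectors * bytes_fn(dim) / 2**30

for n in (1_000_000, 3_000_000, 10_000_000):
    half = raw_storage_gib(n, 1536)
    full = raw_storage_gib(n, 1536, vector_bytes)
    print(f"{n:>10,} vectors: halfvec {half:.1f} GiB, vector {full:.1f} GiB")
```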

Performance by Dataset Size

Scale	Vectors	Config	Notes
Small	<100K	Defaults	Index optional but improves tail latency
Medium	100K–5M	Defaults	Monitor p95 latency; most common production range
Large	5M+	`ef_construction=100+`	Memory residency critical
Very Large	10M+	Binary quantization + re-ranking	Add RAM or partition first if possible
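The "binary quantization + re-ranking" approach in the last row can be sketched in plain Python. This is an illustration of the idea, not pgvector's implementation: it assumes quantization keeps one bit per dimension (1 where the component is positive), that the coarse pass ranks by Hamming distance, and that an oversampled candidate set is then re-ranked by exact cosine distance. The `oversample` factor and the toy data are made up for the example.

```python
import math

def binary_quantize(vec):
    """One bit per dimension: 1 where the component is positive, else 0."""
    return [1 if x > 0 else 0 for x in vec]

def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def search(query, rows, k, oversample=4):
    """Coarse pass on bits, then exact re-rank of k * oversample candidates."""
    qbits = binary_quantize(query)
    coarse = sorted(rows, key=lambda r: hamming(qbits, binary_quantize(r[1])))
    candidates = coarse[: k * oversample]
    reranked = sorted(candidates, key=lambda r: cosine_distance(query, r[1]))
    return [row_id for row_id, _ in reranked[:k]]

rows = [
    ("a", [1.0, 1.0]),
    ("b", [1.0, 0.9]),
    ("c", [-1.0, -1.0]),
    ("d", [1.0, -1.0]),
]
print(search([1.0, 0.8], rows, k=1, oversample=2))
```

The coarse pass cannot tell "a" from "b" because both quantize to the same bits; the re-rank step is what recovers the ordering, which is why the candidate set needs to be larger than the final `LIMIT`.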

Common Issues (Symptom → Fix)

Symptom	Likely Cause	Fix
Query does not use ANN index	Missing `ORDER BY` + `LIMIT`, operator mismatch, or implicit casts	Use `ORDER BY` with a distance operator that matches the index ops class; explicitly cast query vectors
Fewer results than expected (filtered query)	HNSW stops early due to filter	Enable iterative scan; increase `hnsw.max_scan_tuples`; or prefilter (B-tree), use partial indexes, or partition
Fewer results than expected (unfiltered query)	ANN recall too low	Increase `hnsw.ef_search`
High latency with low CPU usage	HNSW index not resident in memory	Use `halfvec`, reduce `m`/`ef_construction`, add RAM, partition, or use binary quantization
Slow index builds	Insufficient build memory or parallelism	Increase `maintenance_work_mem` and `max_parallel_maintenance_workers`; build after bulk load
Out-of-memory errors	Index too large for available RAM	Use `halfvec`, reduce index parameters, or switch to binary quantization with re-ranking
Zero or missing results	NULL or zero vectors	Avoid NULL embeddings; do not use zero vectors with cosine distance
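The last row can be enforced before rows are written. A minimal sketch, assuming embeddings arrive as Python lists of floats; the function name and messages are hypothetical. It also checks the dimension, since the column dimension must match everywhere.

```python
def embedding_problem(embedding, expected_dim):
    """Return a description of what is wrong with an embedding, or None if it is usable."""
    if embedding is None:
        return "embedding is NULL"
    if len(embedding) != expected_dim:
        return f"expected {expected_dim} dimensions, got {len(embedding)}"
    if all(x == 0 for x in embedding):
        # Cosine distance divides by the vector norm, which is zero here.
        return "zero vector is not meaningful under cosine distance"
    return None

print(embedding_problem(None, 3))
print(embedding_problem([0.0, 0.0, 0.0], 3))
print(embedding_problem([0.1, 0.0, 0.2], 3))
```

Rejecting or re-embedding such rows at ingestion time avoids having to diagnose missing results at query time.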

pgvector for Semantic Search

Golden Path (Default Setup)

Core Rules

Type Rules

Standard Pattern

HNSW Index

HNSW Parameters

IVFFlat Index (Generally Not Recommended)

Quantization Strategies

Guidelines for 1536-dim vectors

Binary Quantization (For Very Large Datasets)

Performance by Dataset Size

Filtering Best Practices

Iterative scan (recommended when filters are selective)

Choose the right filtering strategy

Key rules

Alternative: pgvectorscale for label-based filtering

Bulk Loading

Maintenance

Monitoring & Debugging

Common Issues (Symptom → Fix)
