Expert guidance on vector embeddings, vector databases, and retrieval-augmented generation (RAG) systems. Use when users ask about choosing embedding models (OpenAI, Voyage AI, Cohere, open-source), vector database selection (pgvector/Supabase, Pinecone, Qdrant, Weaviate, Milvus/Zilliz, Chroma), chunking strategies for documents, HNSW or IVFFlat index tuning, hybrid search (BM25 + vector), re-ranking with cross-encoders, Matryoshka/MRL dimension reduction, embedding quantization, RAG pipeline architecture, contextual retrieval, GraphRAG vs traditional RAG, pgvector performance tuning, Supabase vector search functions, similarity metrics (cosine, dot product, L2), metadata filtering, multi-tenancy with RLS, production scaling, cost optimization, or any question about storing, indexing, and searching vector embeddings. Also use when building or debugging RAG applications, designing semantic search systems, or migrating between vector databases.
Expert guidance for vector embeddings, vector databases, and RAG systems. Current as of early 2025.
Embedding model selection:
| Use case | Model | Cost/1M tokens |
|---|---|---|
| Budget RAG | voyage-3.5-lite or text-embedding-3-small | $0.02 |
| Quality RAG | voyage-3.5 or text-embedding-3-large | $0.06–$0.13 |
| Multilingual | Cohere Embed v4 or BGE-M3 | $0.12 / free |
| Code retrieval | voyage-code-3 | $0.18 |
| On-premise / privacy | BGE-M3 or Qwen3-Embedding | Free (self-hosted) |
| Multimodal (text+images) | Cohere Embed v4 | $0.12 text / $0.47 images |
Full details: See references/embedding-models.md
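For the hosted models, embedding is one API call. A minimal sketch with the OpenAI Python SDK and text-embedding-3-small; the Voyage and Cohere clients follow the same batch-in, vectors-out shape:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(texts: list[str]) -> list[list[float]]:
    """Embed a batch of texts with text-embedding-3-small (1536 dims)."""
    resp = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    # Results come back in input order, one vector per input string.
    return [d.embedding for d in resp.data]

vectors = embed(["What is pgvector?", "HNSW index tuning"])
print(len(vectors), len(vectors[0]))  # 2 1536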
Vector database selection:
| Scale (vectors) | Recommended | Why |
|---|---|---|
| Prototyping (<100K) | Chroma or Qdrant free tier | Zero setup, free |
| Production (<10M) | pgvector (Supabase) | SQL-native, hybrid search, JOINs, RLS, $25/mo |
| Growth (10M–100M) | pgvector or Qdrant Cloud | Proven at scale, $100–300/mo |
| Enterprise (100M+) | Milvus self-hosted or Zilliz | Distributed, GPU-accelerated |
| Billion-scale | Milvus/Zilliz Cloud | Only credible option at this scale |
Full details: See references/vector-databases.md
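For the pgvector tiers, a minimal setup sketch with psycopg; the connection string, table, and column names are illustrative, and `<=>` is pgvector's cosine-distance operator:

```python
import psycopg

# Connection string is illustrative; on Supabase, use your project's pooler URL.
conn = psycopg.connect(
    "postgresql://postgres:postgres@localhost:5432/postgres",
    autocommit=True,
)

with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            id bigserial PRIMARY KEY,
            content text,
            embedding vector(1536)  -- match your embedding model's output dims
        )
    """)

def search(query_embedding: list[float], k: int = 5):
    """Top-k nearest rows by cosine distance (<=> is pgvector's cosine operator)."""
    literal = "[" + ",".join(map(str, query_embedding)) + "]"
    with conn.cursor() as cur:
        cur.execute(
            "SELECT id, content, embedding <=> %s::vector AS distance "
            "FROM documents ORDER BY distance LIMIT %s",
            (literal, k),
        )
        return cur.fetchall()
```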
Chunking strategy selection:
| Content type | Strategy | Chunk size |
|---|---|---|
| General documents | Recursive character splitting | 512 tokens, 75-token overlap |
| Factoid Q&A | Smaller chunks | 256–512 tokens |
| Analytical/summary | Larger chunks | 512–1024 tokens |
| Code | Function/class boundaries | Varies |
| High-stakes accuracy | Contextual retrieval | 512 tokens + LLM context prefix |
Full details: See references/chunking-strategies.md
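Recursive character splitting tries progressively finer separators (paragraphs, then lines, sentences, words) until every piece fits the budget. A dependency-free sketch sized in characters (roughly 4 characters per token, so 512 tokens ≈ 2048 characters); production libraries such as LangChain's RecursiveCharacterTextSplitter implement the same idea with token-aware length functions:

```python
SEPARATORS = ["\n\n", "\n", ". ", " "]  # paragraphs, lines, sentences, words

def recursive_split(text: str, max_chars: int = 2048) -> list[str]:
    """Split on the coarsest separator whose pieces fit within max_chars.

    Overlap is omitted for brevity; in practice, prepend the tail of the
    previous chunk (roughly 75 tokens' worth) to each chunk.
    """
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    for sep in SEPARATORS:
        if sep not in text:
            continue
        chunks: list[str] = []
        current = ""
        for piece in text.split(sep):
            candidate = f"{current}{sep}{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                # Flush what we have; an oversized piece recurses onto
                # the finer separators below this one.
                chunks.extend(recursive_split(current, max_chars))
                current = piece
        chunks.extend(recursive_split(current, max_chars))
        return chunks
    # No separator present at all: hard cut as a last resort.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```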
Index selection:
| Scenario | Index | Key params |
|---|---|---|
| Default production | HNSW | M=16, ef_construction=64 |
| High-accuracy (legal/medical) | HNSW | M=32, ef_construction=128 |
| Static data, fast build | IVFFlat | lists=rows/1000, probes=√lists |
| Billion-scale single node | DiskANN | Via pgvectorscale or Milvus |
| Exact results (<50K vectors) | Flat/brute-force | None |
Full details: See references/indexing-and-search.md
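These presets translate directly into pgvector DDL. A sketch reusing the hypothetical `documents` table and `conn` from the example above (autocommit lets VACUUM run outside a transaction block):

```python
with conn.cursor() as cur:
    # Default production preset: HNSW with cosine ops, M=16, ef_construction=64.
    cur.execute("""
        CREATE INDEX IF NOT EXISTS documents_embedding_hnsw
        ON documents USING hnsw (embedding vector_cosine_ops)
        WITH (m = 16, ef_construction = 64)
    """)
    # IVFFlat alternative for static data: analyze first so the index trains
    # on real rows, then size lists ≈ rows/1000 (100 here assumes ~100K rows).
    cur.execute("VACUUM ANALYZE documents")
    cur.execute("""
        CREATE INDEX IF NOT EXISTS documents_embedding_ivf
        ON documents USING ivfflat (embedding vector_cosine_ops)
        WITH (lists = 100)
    """)
```

At query time, recall is tuned per session with `hnsw.ef_search` (default 40) or `ivfflat.probes`; see the pooling caveat in the gotchas below.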
pgvector gotchas:
- HNSW memory use is roughly (dimensions × 4) + (M × 8) bytes per vector
- Use SET LOCAL with transaction pooling — session-level SET is lost between transactions in Supavisor
- Run VACUUM ANALYZE before creating IVFFlat indexes — IVFFlat trains on existing data
- With normalized embeddings, use inner product (<#>) for speed
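Two of these gotchas in runnable form, again reusing the hypothetical `documents` table and psycopg connection from the sketches above: a worked memory estimate, and the transaction-scoped SET LOCAL pattern that survives Supavisor's transaction pooling.

```python
# HNSW memory estimate from the formula above, for an assumed
# 1M-row table of 1536-dim vectors with M=16.
dims, m, rows = 1536, 16, 1_000_000
print(f"~{(dims * 4 + m * 8) * rows / 1e9:.1f} GB")  # ~6.3 GB of index memory

# Under transaction pooling, consecutive queries can land on different server
# sessions, so scope knobs with SET LOCAL inside an explicit transaction.
query_vec = "[" + ",".join(["0.1"] * dims) + "]"
with conn.transaction(), conn.cursor() as cur:
    cur.execute("SET LOCAL hnsw.ef_search = 100")
    cur.execute(
        "SELECT id FROM documents ORDER BY embedding <=> %s::vector LIMIT 10",
        (query_vec,),
    )
    results = cur.fetchall()
```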