Add Rag Source Support

How to support ingestion of new file types using the loader registry pattern.

hareeshbabu82ns · 0 stars · 4 Mar 2026

Categories: Framework Internals

Skill content

The ingestion pipeline uses a loader registry — a dict mapping file extensions to loader classes. Do NOT add elif chains; extend the registry.
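As a rough illustration of why the registry is preferred over branching, the sketch below resolves a loader with a single dict lookup. The names `PdfLoader`, `TextFileLoader`, `DEMO_REGISTRY`, and `get_loader` are hypothetical stand-ins invented for this example; the real loaders and lookup code in backend/app/rag.py may differ.

```python
from pathlib import Path


class PdfLoader:
    """Hypothetical stand-in for a real loader such as PyPDFLoader."""

    def __init__(self, file_path: str) -> None:
        self.file_path = file_path


class TextFileLoader:
    """Hypothetical stand-in for a real loader such as TextLoader."""

    def __init__(self, file_path: str) -> None:
        self.file_path = file_path


# Extension -> loader class. Supporting a new type means adding one entry here,
# not another elif branch in the lookup function.
DEMO_REGISTRY = {
    ".pdf": PdfLoader,
    ".txt": TextFileLoader,
}


def get_loader(file_path: str):
    """Return a loader instance for file_path, or raise for unknown extensions."""
    ext = Path(file_path).suffix.lower()  # normalise ".PDF" to ".pdf"
    loader_cls = DEMO_REGISTRY.get(ext)
    if loader_cls is None:
        supported = ", ".join(sorted(DEMO_REGISTRY))
        raise ValueError(f"Unsupported file type {ext!r}; supported: {supported}")
    return loader_cls(file_path)
```

The lookup function never changes when a format is added; only the dict grows, which keeps the change to a one-line diff.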

Steps

1. Add the dependency if needed:

uv add <package>   # e.g., docx2txt, beautifulsoup4, openpyxl

2. Import the loader at the top of backend/app/rag.py:

from langchain_community.document_loaders import Docx2txtLoader

3. Register the extension in the loader registry dict in RAGService:

LOADER_REGISTRY = {
    ".pdf":  PyPDFLoader,
    ".docx": Docx2txtLoader,
    ".csv":  CSVLoader,
    ".html": BSHTMLLoader,
    ".py":   TextLoader,   # use language-aware splitter for code
    # ... add new entry here
}
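After adding an entry, a quick sanity check can catch typos such as a missing dot or an uppercase key. The helper below is only a sketch: `validate_registry` and `FakeLoader` are hypothetical names, not part of the project, and the lowercase rule assumes the lookup code lowercases the file extension before consulting the registry.

```python
def validate_registry(registry: dict) -> list[str]:
    """Return a list of problems found in an extension -> loader mapping."""
    problems = []
    for ext, loader_cls in registry.items():
        if not isinstance(ext, str) or not ext.startswith("."):
            problems.append(f"{ext!r}: key must be a string starting with '.'")
        elif ext != ext.lower():
            # Assumption: lookups lowercase the suffix, so mixed-case keys never match.
            problems.append(f"{ext!r}: key must be lowercase")
        if not callable(loader_cls):
            problems.append(f"{ext!r}: value must be a loader class")
    return problems


class FakeLoader:
    """Hypothetical placeholder; the real registry holds LangChain loader classes."""


# Example: two malformed keys are reported, the valid ".pdf" entry is not.
for problem in validate_registry({".pdf": FakeLoader, "docx": FakeLoader, ".CSV": FakeLoader}):
    print(problem)
```

A check like this could run in a unit test against the real LOADER_REGISTRY, so a malformed entry fails fast instead of silently never matching a file.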