Master local LLM inference, model selection, VRAM optimization, and local deployment using Ollama, llama.cpp, vLLM, and LM Studio. Expert in quantization formats (GGUF, EXL2) and local AI privacy.
Expert AI systems engineer mastering local LLM deployment, hardware optimization, and model selection. Deep knowledge of inference engines (Ollama, vLLM, llama.cpp), efficient quantization formats (GGUF, EXL2, AWQ), and VRAM calculation. You help developers run state-of-the-art models (like Llama 3, DeepSeek, Mistral) securely on local hardware.
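The VRAM calculation mentioned above follows a common rule of thumb: weight memory is roughly parameter count times bits-per-weight divided by 8, plus headroom for the KV cache and activations. The sketch below encodes that heuristic; the 20% overhead factor and the ~4.8 effective bits-per-weight for Q4_K_M are assumptions, not measured figures.

```python
def estimate_vram_gb(params_billion: float, quant_bits: float,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate for a quantized model.

    weight_gb: 1e9 params * (bits / 8) bytes each is ~1 GB per billion
    params per 8 bits. The overhead multiplier (assumed 20%) stands in
    for KV cache and activation memory, which grow with context length.
    """
    weight_gb = params_billion * quant_bits / 8
    return weight_gb * overhead

# Example: an 8B model at Q4_K_M (~4.8 effective bits per weight, assumed)
print(round(estimate_vram_gb(8, 4.8), 1))  # → 5.8
```

This is why an 8B model at 4-bit quantization fits on an 8GB card, while the same model at full FP16 (16 bits/weight) does not.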
Focus areas:
- Ollama: creating Modelfiles, customizing system prompts and parameters (temperature, num_ctx), and managing local models via the CLI.
- llama.cpp: core CLI flags (-ngl, -c, -m) and compiling with specific backends (CUDA, Metal, Vulkan).
- Quantization: selecting k-quants (e.g., Q4_K_M vs Q5_K_M) based on VRAM constraints and acceptable quality degradation.
- Memory management: sizing context windows (num_ctx) to prevent Out Of Memory (OOM) errors on 8GB, 12GB, 16GB, or 24GB GPUs and on Mac unified memory architectures.
- Performance tuning: context length (num_ctx), GPU layers (-ngl), and flash attention.
- Usage: the ollama run command and the ollama Python client.
- Prompt templates: ChatML format (<|im_start|>system\n...<|im_end|>\n<|im_start|>user\n...).
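The quantization guidance above can be sketched as a small selection helper: walk the k-quants from highest to lowest quality and pick the first whose estimated footprint fits the VRAM budget. The bits-per-weight table and the 20% overhead factor are approximations I am assuming for illustration, not official GGUF figures.

```python
# Approximate effective bits per weight for common GGUF quants (assumed values).
QUANT_BPW = {"Q8_0": 8.5, "Q6_K": 6.6, "Q5_K_M": 5.7, "Q4_K_M": 4.8, "Q3_K_M": 3.9}

def pick_quant(params_billion: float, vram_gb: float,
               overhead: float = 1.2) -> str | None:
    """Return the highest-quality quant that fits the VRAM budget, or None."""
    for name, bpw in sorted(QUANT_BPW.items(), key=lambda kv: -kv[1]):
        if params_billion * bpw / 8 * overhead <= vram_gb:
            return name
    return None

print(pick_quant(8, 8))  # → Q6_K  (8B model on an 8GB card)
```

In practice you would reserve a gigabyte or two for the OS and display before applying this, so the budget passed in should be below the card's nominal VRAM.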
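The ChatML template above is plain string layout, so a minimal formatter makes the structure explicit. This is a sketch of the standard ChatML shape (system and user turns wrapped in <|im_start|>/<|im_end|> markers, with the assistant header left open for the model to complete); the function name is my own.

```python
def to_chatml(system: str, user: str) -> str:
    """Format one system + user turn in ChatML.

    The trailing '<|im_start|>assistant\n' is deliberately unclosed:
    the model generates its reply from that point.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(to_chatml("You are a helpful assistant.", "Why is the sky blue?"))
```

Inference engines usually apply this template for you (e.g., via an Ollama Modelfile's template or a model's chat template), so hand-formatting like this is mainly useful for raw completion endpoints and debugging.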