End-to-end model deployment for Jemma: Ollama local, HuggingFace Hub, Google Cloud Run, GGUF export. Use when: deploying, publishing, exporting, packaging, or releasing a trained model.
```bash
# Import GGUF into Ollama
python -u toolbox/import_gguf_to_ollama.py <path_to_gguf>

# Verify the model is registered
ollama list

# Smoke test
curl http://127.0.0.1:11434/api/chat -d '{
  "model": "gemma4-e4b-it:q8_0",
  "messages": [{"role": "user", "content": "Hello"}]
}'
```
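The curl smoke test above can also be scripted. A minimal Python sketch using only the standard library; it assumes the Ollama server is running on the default port and that the model name matches what `ollama list` reports:

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434/api/chat"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build a non-streaming Ollama /api/chat payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one JSON object instead of a JSON-lines stream
    }

def smoke_test(model: str = "gemma4-e4b-it:q8_0", prompt: str = "Hello") -> str:
    """Send one chat turn and return the assistant's reply text."""
    payload = json.dumps(build_chat_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.load(resp)["message"]["content"]
```

Setting `"stream": False` avoids having to reassemble the default streamed response chunks before checking the output.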
```bash
# Validate the HuggingFace token
python -u -W ignore demos/validate_hf_token.py

# Publish (includes model card, NOTICE, demos)
python -u -W ignore toolbox/publish_to_hf.py --demos
```
Checklist before publishing:

- Model card (`toolbox/hf_model_card.md`) has current benchmark scores
- `base_model` metadata points to `google/gemma-4-E4B-it`
- `base_model_relation` is `finetune`
- `gemma-4-good-hackathon` tag is present
- Target repo: `soumitty/jemma-safebrain-gemma-4-e4b-it`

```bash
# Generate Docker bundle from GGUF
python -u toolbox/prepare_ollama_cloud_bundle.py
```

- Review the generated artifacts (Dockerfile, Modelfile, deploy script)
- Follow docs/google-cloud-ollama-deployment.md for the cloud deploy
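After the cloud deploy, readiness can be checked programmatically instead of eyeballing the console. A sketch that shells out to the `gcloud` CLI (assumed installed and authenticated; service and region names are placeholders you supply):

```python
import json
import subprocess

def parse_ready(service_json: dict) -> bool:
    """True when the Knative 'Ready' condition reports status 'True'."""
    conditions = service_json.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )

def cloud_run_ready(service: str, region: str) -> bool:
    """Describe the Cloud Run service as JSON and check readiness."""
    out = subprocess.run(
        ["gcloud", "run", "services", "describe", service,
         "--region", region, "--format", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ready(json.loads(out))
```

This matches the `gcloud run services describe` row in the verification table: a READY service has a `Ready` condition with status `True`.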
From an Unsloth checkpoint: load with `FastLanguageModel`, then export to GGUF for `llama.cpp` or import the result into Ollama.

| Target | Verify Command | Success Criteria |
|---|---|---|
| Ollama | `ollama list` | Model name appears in list |
| Ollama | `curl .../api/chat` | Valid JSON response |
| HuggingFace | `python toolbox/publish_to_hf.py` | Exit code 0, URL printed |
| Cloud Run | `gcloud run services describe` | Service status: READY |
| GGUF | `llama.cpp/main -m <file>` | Model loads, generates text |
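The Unsloth-to-GGUF path above can be sketched as follows. `save_pretrained_gguf` and its `quantization_method` argument are Unsloth APIs; the checkpoint and output paths are placeholders, and the export itself needs `unsloth` installed with a GPU environment:

```python
# Common llama.cpp quantization method names (not an exhaustive list).
VALID_QUANTS = {"q8_0", "q4_k_m", "q5_k_m", "f16"}

def check_quant(method: str) -> str:
    """Fail fast on a typo'd quantization method name."""
    if method not in VALID_QUANTS:
        raise ValueError(f"unknown quantization method: {method}")
    return method

def export_gguf(checkpoint_dir: str, out_dir: str, quant: str = "q8_0") -> None:
    """Load an Unsloth checkpoint and write a GGUF file to out_dir."""
    from unsloth import FastLanguageModel  # deferred: heavy import, needs GPU

    model, tokenizer = FastLanguageModel.from_pretrained(checkpoint_dir)
    # Unsloth drives llama.cpp's converter under the hood to write the GGUF.
    model.save_pretrained_gguf(
        out_dir, tokenizer, quantization_method=check_quant(quant)
    )
```

The resulting file is what the `llama.cpp/main -m <file>` row verifies, and what `toolbox/import_gguf_to_ollama.py` takes as input.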