Use Ollama local inference for text generation, classification, and analysis via local GPU models.
Use this skill when FORGE needs local LLM inference without external API calls. Ollama runs on the local GPU server (RTX 3090) and provides zero-cost text generation, classification, and analysis.
Default operating pattern:
ollama run via stdin for multi-line prompts.

Ollama is not a coding agent. It does not edit files, manage sessions, or run tools. Use it for inference tasks: classification, summarisation, text generation, and structured analysis.
Choose the model based on the task:
| Routing tier | Model | Use case |
|---|---|---|
| fast | gpt-oss:latest | Short answers, quick classification, low-latency checks |
| balanced | qwen3-coder:latest | Code analysis, structured classification, moderate reasoning |
| high | qwen3.5:27b | Complex reasoning, long-form analysis, nuanced generation |
When the caller does not specify a model, default to qwen3-coder:latest.
generate(prompt, model)
General-purpose text generation.
Preferred command:
printf '%s\n' "$PROMPT" | ollama run "$MODEL"
For short single-line prompts:
ollama run "$MODEL" "$PROMPT"
Expected behavior:
The model's completion is written to stdout as plain text; a non-zero exit code indicates failure.
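The pattern above can be wrapped in a small shell function. This is a sketch, not part of Ollama or FORGE: the `generate` name, the `DRY_RUN` guard (so the sketch can run without a GPU server), and the fallback to the skill's default model are all illustrative.

```shell
# Hypothetical wrapper around `ollama run`; falls back to this skill's
# default model when no model is given. DRY_RUN is an illustrative guard
# that echoes the command instead of calling the server.
generate() {
  prompt="$1"
  model="${2:-qwen3-coder:latest}"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "would run: ollama run $model"
    return 0
  fi
  printf '%s\n' "$prompt" | ollama run "$model"
}

DRY_RUN=1 generate "Summarise the attached log in two sentences."
```

Unset DRY_RUN to perform the real stdin-piped invocation.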
classify(text, categories, model)
Classify input text into one of the provided categories.
Preferred command:
printf 'Classify the following text into exactly one of these categories: %s\n\nText: %s\n\nRespond with only the category name.' "$CATEGORIES" "$TEXT" | ollama run "$MODEL"
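As a worked example, the command above with made-up category and text values; the prompt is echoed here and the model call is left commented so the sketch runs without a server:

```shell
# Illustrative inputs for the classify prompt template above.
CATEGORIES="bug, feature, question"
TEXT="The app crashes when I click Save."
PROMPT=$(printf 'Classify the following text into exactly one of these categories: %s\n\nText: %s\n\nRespond with only the category name.' "$CATEGORIES" "$TEXT")
printf '%s\n' "$PROMPT"
# Pipe to a fast model (requires a running Ollama server):
#   printf '%s\n' "$PROMPT" | ollama run gpt-oss:latest
```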
Expected behavior:
The model responds with only the category name on stdout.
qwen3-coder:latest or gpt-oss:latest for fast classification

analyze(prompt, model)
Longer-form analysis such as code review summaries, log analysis, or document review.
Preferred command:
printf '%s\n' "$PROMPT" | ollama run "$MODEL"
Expected behavior:
A plain-text analysis is written to stdout.
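A multi-line analysis prompt is easiest to assemble with a heredoc before piping. The log lines below are fabricated sample data, and the model invocation is commented out so the sketch runs offline:

```shell
# Build a multi-line prompt with a quoted heredoc (no shell expansion inside).
PROMPT=$(cat <<'EOF'
Review the following log excerpt and summarise the likely root cause:
ERROR db: connection refused
ERROR db: retry 3/3 failed
EOF
)
printf '%s\n' "$PROMPT"
# printf '%s\n' "$PROMPT" | ollama run qwen3.5:27b
```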
qwen3.5:27b for complex analysis tasks
qwen3-coder:latest for code-specific analysis

| FORGE session field | Ollama CLI |
|---|---|
| prompt | stdin pipe or positional arg |
| agent = "ollama" | ollama run |
| model | model name arg (e.g., qwen3-coder:latest) |
| tools | N/A |
| workdir | process working directory |
| timeout | managed by FORGE wrapper |
| budget | N/A (local inference, no cost) |
| resume | N/A |
| output_mode | N/A (always plain text) |
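The mapping above can be sketched as a single invocation. This is hypothetical: the variable names mirror the table, the values are illustrative, and using coreutils timeout(1) for the timeout field is an assumption about how the FORGE wrapper might enforce it. The command is echoed, not executed, so the sketch runs without a GPU server:

```shell
MODEL="qwen3-coder:latest"                                        # model
TIMEOUT_S=120                                                     # timeout (wrapper-enforced)
PROMPT="Classify this commit message: fix null deref in parser"   # prompt -> stdin pipe
CMD="timeout ${TIMEOUT_S}s ollama run $MODEL"
printf 'would run: %s\n' "$CMD"
# Real invocation: printf '%s\n' "$PROMPT" | timeout "${TIMEOUT_S}s" ollama run "$MODEL"
```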
| Ollama output | AgentResult field |
|---|---|
| stdout (full text) | output |
| exit code 0 | success (no error) |
| non-zero exit code | error |
| N/A | session_id (not supported) |
| N/A | tokens_in (not exposed by CLI) |
| N/A | tokens_out (not exposed by CLI) |
| 0.00 | cost_usd (local, always zero) |
Limitation: ollama run does not expose token usage or timing metadata in its CLI output. If token tracking is needed, use the OpenAI-compatible API endpoint (http://<host>:11434/v1) instead of the CLI.
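A minimal sketch of that endpoint usage, assuming the default port from the limitation above; the request body is echoed here and the curl call is left commented so nothing contacts a server:

```shell
# OLLAMA_HOST is an assumption; replace with your GPU server's host:port.
OLLAMA_HOST="localhost:11434"
BODY='{"model": "qwen3-coder:latest", "messages": [{"role": "user", "content": "Say hi"}]}'
echo "$BODY"
# curl -s "http://$OLLAMA_HOST/v1/chat/completions" \
#   -H 'Content-Type: application/json' -d "$BODY" | jq '.usage'
# The response's usage object carries prompt_tokens and completion_tokens.
```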
Check which models are installed locally before use (ollama list).