Command-line interface for Ollama - local LLM inference and model management via the Ollama REST API. Designed for AI agents and power users who need to manage models, generate text, chat, and create embeddings without a GUI.
This CLI is installed as part of the cli-anything-ollama package:

```bash
pip install cli-anything-ollama
```
Prerequisites: a running Ollama server (`ollama serve`).

```bash
# Show help
cli-anything-ollama --help

# Start interactive REPL mode
cli-anything-ollama

# List available models
cli-anything-ollama model list

# Run with JSON output (for agent consumption)
cli-anything-ollama --json model list
```
When invoked without a subcommand, the CLI enters an interactive REPL session:
```bash
cli-anything-ollama
# Enter commands interactively with tab-completion and history
```
Model management commands.
| Command | Description |
|---|---|
| list | List locally available models |
| show | Show model details (parameters, template, license) |
| pull | Download a model from the Ollama library |
| rm | Delete a model from local storage |
| copy | Copy a model to a new name |
| ps | List models currently loaded in memory |
Text generation and chat commands.
| Command | Description |
|---|---|
| text | Generate text from a prompt |
| chat | Send a chat completion request |
Embedding generation commands.
| Command | Description |
|---|---|
| text | Generate embeddings for text |
Server status and info commands.
| Command | Description |
|---|---|
| status | Check if the Ollama server is running |
| version | Show the Ollama server version |
Session state commands.
| Command | Description |
|---|---|
| status | Show current session state |
| history | Show chat history for the current session |
```bash
# List available models
cli-anything-ollama model list

# Pull a model
cli-anything-ollama model pull llama3.2

# Show model details
cli-anything-ollama model show llama3.2
```
```bash
# Stream text (default)
cli-anything-ollama generate text --model llama3.2 --prompt "Explain quantum computing in one sentence"

# Non-streaming with JSON output (for agents)
cli-anything-ollama --json generate text --model llama3.2 --prompt "Hello" --no-stream
```
```bash
# Single-turn chat
cli-anything-ollama generate chat --model llama3.2 --message "user:What is Python?"

# Multi-turn chat
cli-anything-ollama generate chat --model llama3.2 \
  --message "user:What is Python?" \
  --message "user:How does it compare to JavaScript?"

# Chat from JSON file
cli-anything-ollama generate chat --model llama3.2 --file messages.json
```
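The exact schema `--file` expects is not documented here; a plausible shape for `messages.json`, assuming it mirrors the Ollama chat API's role/content message list, can be created like this:

```shell
# Hypothetical messages.json for --file; the role/content schema is an
# assumption modeled on the Ollama chat API message format
cat > messages.json <<'EOF'
[
  {"role": "user", "content": "What is Python?"},
  {"role": "assistant", "content": "Python is a general-purpose programming language."},
  {"role": "user", "content": "How does it compare to JavaScript?"}
]
EOF
```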
```bash
# Single input
cli-anything-ollama embed text --model nomic-embed-text --input "Hello world"

# Multiple inputs
cli-anything-ollama embed text --model nomic-embed-text --input "Hello" --input "World"
```
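A common next step after embedding two inputs is comparing them with cosine similarity. The vectors below are short stand-ins (real `embed text` output has hundreds of dimensions); the arithmetic is the same:

```shell
# Cosine similarity of two toy 4-dimensional vectors standing in for
# real embedding output; computed inline with python3
python3 - <<'EOF'
a = [0.1, 0.3, 0.5, 0.2]
b = [0.2, 0.1, 0.4, 0.4]
dot = sum(x * y for x, y in zip(a, b))
norm_a = sum(x * x for x in a) ** 0.5
norm_b = sum(x * x for x in b) ** 0.5
print(round(dot / (norm_a * norm_b), 3))  # prints 0.869
EOF
```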
Start an interactive session for exploratory use.
```bash
cli-anything-ollama
# Enter commands interactively
# Use 'help' to see available commands
```
Connect to a remote Ollama server with `--host`:

```bash
cli-anything-ollama --host http://192.168.1.100:11434 model list
```
The CLI maintains lightweight session state, such as chat history and the configured `--host`.

All commands support dual output modes:

- Human-readable (default)
- JSON (`--json` flag): structured JSON for agent consumption

```bash
# Human output
cli-anything-ollama model list

# JSON output for agents
cli-anything-ollama --json model list
```
When using this CLI programmatically:

- Use the `--json` flag for parseable output
- Use `--no-stream` for generate/chat to get complete responses
- Check server status before other commands

Version: 1.0.1
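When consuming `--json` output in a script, pipe it through a JSON parser. The payload below is a stand-in (the real shape of `model list` output is not documented here); the parsing pattern is what matters:

```shell
# Stand-in JSON resembling what `--json model list` might emit (assumed
# shape); extract model names with python3's json module
echo '{"models": [{"name": "llama3.2"}, {"name": "nomic-embed-text"}]}' |
  python3 -c "import json, sys; [print(m['name']) for m in json.load(sys.stdin)['models']]"
```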