Cookbook of ready-to-use PromQL queries, preset catalog, metric name dictionaries, and label references for Ceph storage, network traffic, pod statistics, and MTV migrations. Use when you need specific queries, exact metric names, or label filters.
Ready-to-use queries, preset catalog, and metric name/label references for OpenShift clusters with ODF, OVN-Kubernetes, KubeVirt, and Forklift/MTV.
All examples use the kubectl-metrics MCP server tools (metrics_read and metrics_help).
Output format guidance: use the default (markdown) when presenting results to the user. Use output: "json" only when you need to parse values programmatically. Use selector to filter results by label after the query runs.
Every preset works as both an instant (default) and range query. Pass start to get a time-series trend.
| Preset | Description |
|---|---|
| cluster_cpu_utilization | Cluster CPU utilization percentage |
| cluster_memory_utilization | Cluster memory utilization percentage |
| cluster_pod_status | Pod counts by phase (Running, Pending, Failed, Succeeded, Unknown) |
| cluster_node_readiness | Node readiness status counts |
| namespace_cpu_usage | Top 10 namespaces by CPU usage (cores) |
| namespace_memory_usage | Top 10 namespaces by memory usage (bytes) |
| namespace_network_rx | Top 10 namespaces by network receive rate |
| namespace_network_tx | Top 10 namespaces by network transmit rate |
| namespace_network_errors | Network errors + drops by namespace (top 10) |
| pod_restarts_top10 | Top 10 pods by container restart count |
| Preset | Description |
|---|---|
| mtv_migration_status | Migration counts by status (succeeded/failed/running) |
| mtv_plan_status | Plan-level status counts |
| mtv_migration_duration | Migration duration per plan (seconds) |
| mtv_avg_migration_duration | Average migration duration (seconds) |
| mtv_data_transferred | Total bytes migrated per plan |
| mtv_net_throughput | Migration network throughput |
| mtv_storage_throughput | Migration storage throughput |
| mtv_migration_pod_rx | Migration pod receive rate (bytes/sec, top 20) |
| mtv_migration_pod_tx | Migration pod transmit rate (bytes/sec, top 20) |
| mtv_forklift_traffic | Forklift operator pod network traffic (bytes/sec) |
| mtv_vmi_migrations_pending | KubeVirt VMI migrations in pending phase |
| mtv_vmi_migrations_running | KubeVirt VMI migrations in running phase |
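Because every preset can run as an instant or a range query, the arguments for both forms differ only in the start flag. The following is a minimal sketch of that difference; the helper name preset_call is hypothetical, and the relative value "-1h" is assumed from the offset style mentioned later in this document. Only the command, name, start, and output flags come from the text above.

```python
import json


def preset_call(name, start=None, output="markdown"):
    """Build metrics_read arguments for a preset (hypothetical helper)."""
    flags = {"name": name, "output": output}
    if start is not None:
        # Supplying start asks for a time-series trend instead of an instant value.
        flags["start"] = start
    return {"command": "preset", "flags": flags}


instant = preset_call("cluster_cpu_utilization")
trend = preset_call("cluster_cpu_utilization", start="-1h")  # "-1h" is an assumed format

print(json.dumps(instant))
print(json.dumps(trend))
```

The same dictionary shape applies to any preset name in the two tables above.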
metrics_read { "command": "query", "flags": { "query": "ceph_health_status", "output": "markdown" } }
Result: 0 = OK, 1 = WARN, 2 = ERR.
metrics_read { "command": "query", "flags": { "query": "ceph_cluster_total_bytes", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "ceph_cluster_total_used_bytes", "output": "markdown" } }
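The health and capacity queries above return raw numbers. As a sketch of how they might be combined into a one-line summary, the helper below decodes the health code and derives a used percentage. The function name summarize and the sample byte values are illustrative assumptions, not real cluster output.

```python
# Health codes as documented above: 0 = OK, 1 = WARN, 2 = ERR.
CEPH_HEALTH = {0: "OK", 1: "WARN", 2: "ERR"}


def summarize(health_value, total_bytes, used_bytes):
    """Combine ceph_health_status and the two capacity values (hypothetical helper)."""
    code = int(float(health_value))  # sample values often arrive as strings
    used_percent = 100.0 * float(used_bytes) / float(total_bytes)
    return {
        "health": CEPH_HEALTH.get(code, "UNKNOWN"),
        "used_percent": round(used_percent, 1),
    }


# Illustrative numbers only: 1.5 TB used out of 6 TB.
print(summarize("1", "6000000000000", "1500000000000"))
```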
metrics_read { "command": "query", "flags": { "query": "ceph_pool_percent_used * 100", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "rate(ceph_pool_rd[5m])", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "rate(ceph_pool_wr[5m])", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "rate(ceph_osd_op_latency_sum[5m]) / rate(ceph_osd_op_latency_count[5m])", "output": "markdown" } }
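The latency query divides the rate of the latency sum by the rate of the operation count, which yields the mean latency per operation over the window. The arithmetic can be sketched with two samples of each counter; the window length cancels out. The sample numbers are made up for illustration, and unlike rate() this sketch does not handle counter resets.

```python
def avg_latency(sum_start, sum_end, count_start, count_end):
    """Mean latency per operation between two counter samples.

    Mirrors rate(sum) / rate(count): both rates share the same window,
    so the result is simply delta(sum) / delta(count).
    """
    ops = count_end - count_start
    if ops <= 0:
        return None  # no operations in the window, ratio is undefined
    return (sum_end - sum_start) / ops


# Illustrative samples: the latency sum grew by 3.0 while 600 operations completed.
print(avg_latency(120.0, 123.0, 10000, 10600))
```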
metrics_read { "command": "query", "flags": { "query": "ceph_pg_total", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "ceph_pg_degraded", "output": "markdown" } }
| Label | Description | Example values |
|---|---|---|
| pool_id | Ceph pool identifier (pool-level metrics) | 1, 2, 3, 4 |
| ceph_daemon | OSD daemon name (OSD-level metrics) | osd.0, osd.1, osd.2 |
| namespace | Storage operator namespace | openshift-storage |
| managedBy | Managing resource | ocs-storagecluster |
| job | Scrape job | rook-ceph-mgr, rook-ceph-exporter |
| Metric | Description |
|---|---|
| ceph_health_status | Overall cluster health (0=OK, 1=WARN, 2=ERR) |
| ceph_cluster_total_bytes | Total cluster capacity |
| ceph_cluster_total_used_bytes | Used cluster capacity |
| ceph_pool_percent_used | Per-pool usage ratio (multiply by 100 for a percentage) |
| ceph_pool_stored | Bytes stored per pool |
| ceph_pool_max_avail | Available bytes per pool |
| ceph_pool_rd, ceph_pool_wr | Read/write operation counters per pool (wrap in rate() for IOPS) |
| ceph_pool_rd_bytes, ceph_pool_wr_bytes | Read/write byte counters per pool |
| ceph_osd_op_latency_sum/count | OSD operation latency (use as rate ratio) |
| ceph_pg_total, ceph_pg_active, ceph_pg_degraded | Placement group counts |
| node_filesystem_avail_bytes, node_filesystem_size_bytes | Node filesystem capacity |
metrics_read { "command": "preset", "flags": { "name": "namespace_network_rx", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "namespace_network_tx", "output": "markdown" } }
Replace TARGET_NAMESPACE with the actual namespace -- ASK the user if not known.
metrics_read {
"command": "query",
"flags": { "query": "topk(10, sort_desc(sum by (pod)(rate(container_network_receive_bytes_total{namespace=\"TARGET_NAMESPACE\"}[5m]))))", "output": "markdown" }
}
metrics_read {
"command": "query",
"flags": { "query": "topk(10, sort_desc(sum by (pod)(rate(container_network_transmit_bytes_total{namespace=\"TARGET_NAMESPACE\"}[5m]))))", "output": "markdown" }
}
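The TARGET_NAMESPACE placeholder has to be substituted before the call is sent, and the quotes around the label value must be escaped in the JSON. A minimal sketch of that substitution follows; the helper name per_pod_rx_call is hypothetical, and the guard against quotes and backslashes is a defensive assumption rather than a documented requirement.

```python
import json

# Same receive query as above, with the placeholder left in place.
RX_TEMPLATE = (
    "topk(10, sort_desc(sum by (pod)(rate("
    'container_network_receive_bytes_total{namespace="TARGET_NAMESPACE"}[5m]))))'
)


def per_pod_rx_call(namespace):
    """Fill in the namespace and return metrics_read arguments (hypothetical helper)."""
    if not namespace or '"' in namespace or "\\" in namespace:
        raise ValueError("namespace must be a non-empty name without quotes or backslashes")
    query = RX_TEMPLATE.replace("TARGET_NAMESPACE", namespace)
    return {"command": "query", "flags": {"query": query, "output": "markdown"}}


# json.dumps produces the \" escaping seen in the examples above.
print(json.dumps(per_pod_rx_call("openshift-storage"), indent=2))
```

Swapping receive for transmit in the template gives the matching TX call.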
metrics_read { "command": "preset", "flags": { "name": "namespace_network_errors", "output": "markdown" } }
metrics_read {
"command": "query",
"flags": { "query": "instance:node_network_receive_bytes_excluding_lo:rate1m + instance:node_network_transmit_bytes_excluding_lo:rate1m", "output": "markdown" }
}
| Label | Description | Example values |
|---|---|---|
| namespace | Pod namespace | openshift-storage, konveyor-forklift |
| pod | Pod name | forklift-controller-6df77f6bf5-jtt7q |
| interface | Network interface (per-pod metrics) | eth0 |
| instance | Node instance (node-level metrics) | 10.0.0.5:9100 |
| node | Node name (node-level metrics) | worker-0 |
| Metric | Description |
|---|---|
| container_network_receive_bytes_total | Bytes received per pod/namespace |
| container_network_transmit_bytes_total | Bytes transmitted per pod/namespace |
| container_network_receive_errors_total | Receive errors per pod/namespace |
| container_network_transmit_errors_total | Transmit errors per pod/namespace |
| container_network_receive_packets_dropped_total | Dropped receive packets |
| container_network_transmit_packets_dropped_total | Dropped transmit packets |
| node_network_receive_bytes_total | Bytes received per node/interface |
| node_network_transmit_bytes_total | Bytes transmitted per node/interface |
| instance:node_network_receive_bytes_excluding_lo:rate1m | Pre-computed node receive rate |
| instance:node_network_transmit_bytes_excluding_lo:rate1m | Pre-computed node transmit rate |
metrics_read { "command": "query", "flags": { "query": "topk(15, count by (namespace)(kube_pod_info))", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "cluster_pod_status", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "namespace_cpu_usage", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "namespace_memory_usage", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "pod_restarts_top10", "output": "markdown" } }
After finding pods with high restarts, use debug_read to get pod details and logs:
debug_read { "command": "list", "flags": { "resource": "pods", "namespace": "<NAMESPACE>", "query": "where status.containerStatuses[0].restartCount > 5", "output": "markdown" } }
debug_read { "command": "logs", "flags": { "name": "<POD_NAME>", "namespace": "<NAMESPACE>", "tail": 100, "query": "where level = 'ERROR'", "output": "markdown" } }
| Label | Description | Example values |
|---|---|---|
| namespace | Pod namespace | konveyor-forklift, openshift-cnv |
| pod | Pod name | forklift-controller-6df77f6bf5-jtt7q |
| container | Container name | main, inventory, extract |
| node | Node the pod runs on | worker-0, worker-1 |
| phase | Pod phase (on status metrics) | Running, Pending, Failed, Succeeded |
| uid | Pod UID | 793fb1cb-3e58-4eef-b95a-733f237365a3 |
| created_by_kind | Owner resource kind (on kube_pod_info) | ReplicaSet, DaemonSet, StatefulSet |
| created_by_name | Owner resource name (on kube_pod_info) | forklift-controller-6df77f6bf5 |
| host_ip | Node IP (on kube_pod_info) | 192.168.0.77 |
| pod_ip | Pod IP (on kube_pod_info) | 10.129.3.3 |
| Metric | Description |
|---|---|
| kube_pod_info | Pod metadata (node, namespace, IPs, owner) |
| kube_pod_status_phase | Pod phase (Running/Pending/Failed/Succeeded) |
| kube_pod_container_status_restarts_total | Container restart count |
| kube_pod_container_status_waiting_reason | Waiting reason (CrashLoopBackOff, ImagePullBackOff, etc.) |
| container_cpu_usage_seconds_total | Container CPU usage |
| container_memory_working_set_bytes | Container memory usage |
| namespace:container_cpu_usage:sum | Pre-aggregated CPU by namespace |
| namespace:container_memory_usage_bytes:sum | Pre-aggregated memory by namespace |
All mtv_* metrics share these labels for filtering and grouping:
| Label | Description | Example values |
|---|---|---|
| provider | Source provider type | vsphere, ovirt, openstack, ova, ec2 |
| mode | Migration mode | Cold, Warm |
| target | Target cluster | Local (host cluster) or remote cluster name |
| owner | User who owns the migration | [email protected] |
| plan | Migration plan UUID | 363ce137-dace-4fb4-b815-759c214c9fec |
| namespace | Forklift operator namespace | konveyor-forklift, openshift-mtv |
| status | Migration/plan status (on status metrics) | Succeeded, Failed, Executing |
| Metric | Description |
|---|---|
| mtv_migrations_status_total | Migration counts by status (succeeded/failed/running) |
| mtv_plans_status | Plan-level status counts |
| mtv_migration_data_transferred_bytes | Total bytes migrated per plan |
| mtv_migration_net_throughput | Migration network throughput |
| mtv_migration_storage_throughput | Migration storage throughput |
| mtv_migration_duration_seconds | Migration duration per plan |
| mtv_plan_alert_status | Alerts on migration plans |
| mtv_workload_migrations_status_total | Per-workload migration status (per plan + status) |
| kubevirt_vmi_migrations_in_pending_phase | Live VMI migrations pending |
| kubevirt_vmi_migrations_in_running_phase | Live VMI migrations in progress |
metrics_read { "command": "preset", "flags": { "name": "mtv_migration_status", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "mtv_plan_status", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "mtv_data_transferred", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "mtv_net_throughput", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "mtv_storage_throughput", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "mtv_migration_duration", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "mtv_avg_migration_duration", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "mtv_plan_alert_status", "output": "markdown" } }
Use {label="value"} in PromQL or use the selector flag:
metrics_read { "command": "query", "flags": { "query": "mtv_migration_data_transferred_bytes", "selector": "provider=vsphere", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "mtv_migration_data_transferred_bytes{mode=\"Cold\"}", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "mtv_migration_data_transferred_bytes{provider=\"ovirt\", mode=\"Warm\"}", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "mtv_migrations_status_total{status=\"Failed\"}", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "mtv_workload_migrations_status_total{plan=\"PLAN_UUID\", status=\"Failed\"}", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "sum by (provider)(mtv_migration_data_transferred_bytes)", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "sum by (mode)(mtv_migration_data_transferred_bytes)", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "sum by (provider, mode)(mtv_migration_data_transferred_bytes)", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "sum by (status, provider)(mtv_migrations_status_total)", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "avg by (provider)(mtv_migration_duration_seconds)", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "sum by (plan, status)(mtv_workload_migrations_status_total)", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "sum by (provider, status)(mtv_plans_status)", "output": "markdown" } }
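The filter and group-by queries above all follow one pattern: an optional {label="value"} matcher on the metric, wrapped in sum by (...). A small sketch of composing them is shown below. The helper names matcher and grouped_sum are hypothetical, and label values are assumed to contain no quotes or backslashes.

```python
def matcher(labels):
    """Render a dict as a PromQL label matcher, or an empty string if no labels."""
    # Assumes values need no escaping (no quotes or backslashes).
    parts = [f'{key}="{value}"' for key, value in sorted(labels.items())]
    return "{" + ",".join(parts) + "}" if parts else ""


def grouped_sum(metric, by, labels=None):
    """Build sum by (<labels>)(<metric>{<matcher>}) as a query string."""
    return f"sum by ({', '.join(by)})({metric}{matcher(labels or {})})"


print(grouped_sum("mtv_migration_data_transferred_bytes", ["provider", "mode"]))
# sum by (provider, mode)(mtv_migration_data_transferred_bytes)

print(grouped_sum("mtv_migrations_status_total", ["provider"], {"status": "Failed"}))
# sum by (provider)(mtv_migrations_status_total{status="Failed"})
```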
During active Forklift migrations, data-transfer pods run in the target namespace. Migration pod names follow the pattern {plan-name}-{vm-id}-{random} (e.g. test-vmware-metrics-vm-43-tws62).
Step 1 -- Discover migration pods:
VMware/general migration pods (carry a plan label):
debug_read { "command": "list", "flags": { "resource": "pods", "namespace": "<NAMESPACE>", "selector": "plan", "output": "markdown" } }
oVirt/OpenStack populator pods (named populate-{uuid}-...):
debug_read { "command": "list", "flags": { "resource": "pods", "namespace": "<NAMESPACE>", "query": "where name ~= '^populate-'", "output": "markdown" } }
Step 2 -- Query network traffic for discovered pods:
Use the pod names from Step 1 to build a regex filter (replace POD1|POD2 with the actual names):
metrics_read {
"command": "query",
"flags": { "query": "topk(10, sort_desc(sum by (pod)(rate(container_network_receive_bytes_total{namespace=\"TARGET_NAMESPACE\",pod=~\"POD1|POD2\"}[5m]))))", "output": "markdown" }
}
metrics_read {
"command": "query",
"flags": { "query": "topk(10, sort_desc(sum by (pod)(rate(container_network_transmit_bytes_total{namespace=\"TARGET_NAMESPACE\",pod=~\"POD1|POD2\"}[5m]))))", "output": "markdown" }
}
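Building the POD1|POD2 filter by hand is error-prone when Step 1 returns many pods. The sketch below joins discovered names into the regex and drops it into the receive query. The helper names are hypothetical, the name check assumes Kubernetes-style lowercase pod names, and the doubled backslash before a dot assumes PromQL double-quoted strings treat a single backslash as an escape character.

```python
import re

# Assumed shape of a pod name: lowercase alphanumerics, '-' and '.'.
_POD_NAME = re.compile(r"^[a-z0-9]([-a-z0-9.]*[a-z0-9])?$")


def pod_regex(pod_names):
    """Join pod names into an alternation suitable for pod=~"..."."""
    if not pod_names:
        raise ValueError("at least one pod name is required")
    for name in pod_names:
        if not _POD_NAME.match(name):
            raise ValueError(f"unexpected pod name: {name!r}")
    # '.' is the only regex metacharacter allowed by the check above;
    # escape it with a doubled backslash for the PromQL string literal.
    return "|".join(name.replace(".", "\\\\.") for name in pod_names)


def migration_rx_query(namespace, pod_names):
    """Receive-rate query restricted to the given migration pods."""
    return (
        "topk(10, sort_desc(sum by (pod)(rate(container_network_receive_bytes_total{"
        'namespace="' + namespace + '",pod=~"' + pod_regex(pod_names) + '"}[5m]))))'
    )


# The second pod name is a made-up example following the documented naming pattern.
print(migration_rx_query(
    "TARGET_NAMESPACE",
    ["test-vmware-metrics-vm-43-tws62", "test-vmware-metrics-vm-44-x9k2p"],
))
```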
Pods that run for less than about 60 seconds (e.g. oVirt/OpenStack populator pods) may have no container-level network metrics (container_network_*). cAdvisor needs 1-2 collection cycles (~10-20s) to establish network namespace tracking, and the pod may complete before tracking starts. CPU and memory metrics are unaffected.
Node-level network metrics capture the transfer at the node level. Determine which node ran the pod (spec.nodeName or kube_pod_info), then query RX and TX together:
metrics_read {
  "command": "query_range",
  "flags": {
    "query": [
      "instance:node_network_receive_bytes_excluding_lo:rate1m{instance=~\"NODE_NAME.*\"}",
      "instance:node_network_transmit_bytes_excluding_lo:rate1m{instance=~\"NODE_NAME.*\"}"
    ],
    "name": ["node_rx", "node_tx"],
    "start": "<MIGRATION_START>",
    "end": "<MIGRATION_END>",
    "step": "30s",
    "output": "markdown"
  }
}
Compare against baseline before/after the migration window to isolate transfer traffic.
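One simple way to make the baseline comparison concrete is to subtract the mean node rate outside the migration window from the mean rate inside it. This is a rough sketch under that assumption, not a method prescribed by the tooling; the helper name and the sample rates (bytes/sec) are illustrative.

```python
from statistics import mean


def transfer_rate_above_baseline(baseline_samples, window_samples):
    """Estimate migration traffic as window mean minus baseline mean (bytes/sec)."""
    excess = mean(window_samples) - mean(baseline_samples)
    return max(0.0, excess)  # clamp: a quieter window means no measurable transfer


# Illustrative node RX rates before/after the migration and during it.
baseline = [2.0e6, 2.5e6, 1.5e6]
window = [52.0e6, 48.0e6, 50.0e6]

print(transfer_rate_above_baseline(baseline, window))  # 48000000.0
```

Other workloads on the same node also contribute to node-level rates, so treat the result as an upper-bound estimate.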
CPU activity confirms the pod was active during the window:
metrics_read { "command": "query_range", "flags": { "query": "rate(container_cpu_usage_seconds_total{pod=\"<POD>\",namespace=\"<NS>\"}[1m])", "start": "<START>", "end": "<END>", "step": "30s", "output": "markdown" } }
When querying metrics for a migration that already finished, use the plan's start/completion timestamps as absolute time bounds:
mtv_read { "command": "describe plan", "flags": { "name": "<PLAN>", "namespace": "<NS>", "output": "markdown" } }
Pass those timestamps as start/end in query_range:
metrics_read {
  "command": "query_range",
  "flags": {
    "query": "sum by (pod)(rate(container_network_receive_bytes_total{namespace=\"<NS>\"}[5m]))",
    "start": "2025-06-15T10:00:00Z",
    "end": "2025-06-15T12:30:00Z",
    "step": "60s",
    "output": "markdown"
  }
}
Do not use relative offsets like -1h for completed migrations -- the data may fall outside that window.
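Plan timestamps may be reported in a local offset, while the example above uses UTC with a trailing Z. The following sketch normalizes two timezone-aware timestamps into that form. The helper names and the optional padding argument are assumptions for illustration; how the timestamps are extracted from the plan description is not covered here.

```python
from datetime import datetime, timedelta, timezone


def to_utc_string(ts):
    """Format a timezone-aware datetime like 2025-06-15T10:00:00Z."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def range_bounds(started, completed, padding=timedelta(0)):
    """Return start/end flags for query_range, optionally widened by padding."""
    if completed < started:
        raise ValueError("completion precedes start")
    return {
        "start": to_utc_string(started - padding),
        "end": to_utc_string(completed + padding),
    }


# Example input in a +02:00 offset, matching the UTC bounds used above.
tz = timezone(timedelta(hours=2))
bounds = range_bounds(
    datetime(2025, 6, 15, 12, 0, tzinfo=tz),
    datetime(2025, 6, 15, 14, 30, tzinfo=tz),
)
print(bounds)  # {'start': '2025-06-15T10:00:00Z', 'end': '2025-06-15T12:30:00Z'}
```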
To investigate migration pod issues alongside metrics, use debug_read:
debug_read { "command": "list", "flags": { "resource": "pods", "namespace": "<NAMESPACE>", "selector": "plan", "output": "markdown" } }
debug_read { "command": "logs", "flags": { "name": "<POD_NAME>", "namespace": "<NAMESPACE>", "tail": 100, "query": "where level = 'ERROR'", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "mtv_forklift_traffic", "output": "markdown" } }
These track live VM migrations (vMotion-style), not Forklift cold migrations:
metrics_read { "command": "preset", "flags": { "name": "mtv_vmi_migrations_pending", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "mtv_vmi_migrations_running", "output": "markdown" } }
Run key queries for a cluster overview:
metrics_read { "command": "preset", "flags": { "name": "cluster_cpu_utilization", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "cluster_memory_utilization", "output": "markdown" } }
metrics_read { "command": "query", "flags": { "query": "ceph_health_status", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "namespace_network_rx", "output": "markdown" } }
metrics_read { "command": "preset", "flags": { "name": "mtv_migration_status", "output": "markdown" } }