Guidance for creating standalone CLI tools that perform neural network inference by extracting PyTorch model weights and reimplementing inference in C/C++, typically for portability and independence from the Python runtime.

This skill applies when the task involves:

- Converting PyTorch models to standalone executables
- Extracting model weights to a portable format (e.g., JSON)
- Implementing neural network forward passes in C/C++
- Creating CLI tools that load images and run inference without Python dependencies
## Analyze Available Resources

Before writing any code, thoroughly analyze the available resources:

- **Identify the model architecture.** Read the model definition (e.g., `model.py`) completely.
- **Examine available libraries.** Check which helper libraries (e.g., image decoding, JSON parsing) are already provided.
- **Understand input requirements.** Determine the expected input shape, data type, and value range.
- **Verify the preprocessing pipeline.** Note any normalization, resizing, or channel-ordering steps the model applies before inference.
## Extract Model Weights

Extract the model weights from PyTorch format to a portable format (JSON), then verify the round trip:

```python
import torch
import json

# Load the state dict on CPU so no GPU is required
state_dict = torch.load('model.pth', map_location='cpu')

# Convert each tensor to a nested Python list
weights = {}
for key, tensor in state_dict.items():
    weights[key] = tensor.numpy().tolist()

# Save to JSON
with open('weights.json', 'w') as f:
    json.dump(weights, f)

# Verify extraction: re-load the JSON and confirm all keys survived
with open('weights.json') as f:
    loaded = json.load(f)
assert set(loaded) == set(state_dict)
```
## Generate a Reference Output

Before implementing in C/C++, create a reference output by running inference in PyTorch:

```python
import json

model.eval()
with torch.no_grad():
    output = model(input_tensor)
prediction = output.argmax().item()

# Save reference outputs for later comparison with the C/C++ tool
with open('reference.json', 'w') as f:
    json.dump({'prediction': prediction}, f)
```
## Implement Inference in C/C++

Implement the inference logic in C/C++. The implementation typically needs four parts:

- **Image loading and preprocessing**: decode the input image and apply the same normalization as the PyTorch pipeline
- **Weight loading**: parse `weights.json` into flat float arrays
- **Forward pass implementation**: reimplement each layer (linear, convolution, activation, etc.) with plain loops
- **Output handling**: apply argmax (or the task's output rule) and print the result
Compile with appropriate flags (e.g., `-std=c++11` for the language standard and `-lm` for the math library):

```sh
g++ -o cli_tool main.cpp lodepng.cpp cJSON.c -std=c++11 -lm
```

## Test Against the Reference

Run the CLI tool on the same input used for the PyTorch reference and confirm the outputs match.
Tip: pass `map_location='cpu'` to `torch.load` when loading on CPU-only systems; otherwise checkpoints saved on a GPU machine will fail to load.