This skill should be used when the user asks to "build llamafile", "rebuild llamafile", "run llamafile", "run llamafile tests", "debug llamafile", "set up llamafile", "update patches", "fix patch conflict", "update llama.cpp", "pull latest llama.cpp", "sync upstream llama.cpp", "reset submodules", "write a test for llamafile", "how does llamafile work", "llamafile architecture", or needs guidance on the llamafile build system, patch workflow, submodule integration, cosmocc toolchain, or development practices.
Llamafile combines llama.cpp, whisper.cpp, and stable-diffusion.cpp with Cosmopolitan Libc to create single-file executables that run LLMs locally across Windows, macOS, Linux, and BSD without installation.
main branch, used for releases >=0.10.0

This guide covers the new llamafile project.
make setup
Run this command immediately after cloning the repo (or after a reset done with make reset-repo). It initializes git submodules and applies llamafile-specific patches.
Run llamafile:build to build all targets.
Run llamafile:check to run the unit test suite.
Run llamafile:clean to remove all build outputs.
After make setup, submodules contain patches and are no longer in a clean state.
To reset them, run:
make reset-repo # Warning: removes all local changes
WARNING: this command removes all local changes. Do not run it without first generating patches from any modifications.
To build llamafile from a fresh clone:
1. make setup to initialize submodules and apply patches
2. llamafile:build

Build outputs appear in the o/$(MODE)/ directory.
For changes to llamafile's own code (not submodules):
1. Edit files in the llamafile/ directory
2. Rebuild with llamafile:build
3. Run tests with llamafile:check

Submodules (llama.cpp, whisper.cpp, stable-diffusion.cpp) require a patch-based workflow:
1. Rebuild with llamafile:build
2. Run tests with llamafile:check

NOTE: never edit patches or generate them manually. Patch generation happens only after the rebuild and the tests (including manual ones) succeed. See development.md for the detailed patch workflow.
Tests use the .runs pattern in BUILD.mk files:
o/$(MODE)/llamafile/json_test.runs
To run all tests: llamafile:check
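The .runs pattern above maps each test to a target path under o/$(MODE)/. A minimal sketch of that mapping follows. The helper name runs_target is hypothetical, dbg is only an example MODE value, and the sketch covers only tests in the llamafile package. How the resulting target is handed to cosmocc's make is not shown here; see testing.md for that.

```shell
# Hypothetical helper: print the .runs target for a test in the llamafile package.
# $1 = MODE (may be empty, which matches an unset MODE), $2 = test name.
runs_target() {
  printf 'o/%s/llamafile/%s.runs\n' "$1" "$2"
}

runs_target "" json_test     # prints o//llamafile/json_test.runs
runs_target dbg json_test    # prints o/dbg/llamafile/json_test.runs
```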
The project uses Cosmopolitan Libc (cosmocc) to create Actually Portable Executables (APE) - single files that run on multiple platforms without modification. Always use the llamafile:build, llamafile:check, and llamafile:clean commands (which use cosmocc's make), not system make.
Each submodule has a corresponding patches directory:
- llama.cpp.patches/
- whisper.cpp.patches/
- stable-diffusion.cpp.patches/

Patches include:
Outputs: o/$(MODE)/package/file.o
Binaries include both x86_64 and aarch64 code paths with runtime CPU feature detection (AVX, AVX2, AVX-512, ARM NEON).
After building, find binaries in o/$(MODE)/:
| Binary | Purpose |
|---|---|
| llamafile/llamafile | Main llamafile executable |
| third_party/zipalign/zipalign | Bundle assets into executables |
| whisperfile/whisperfile | Main whisperfile executable |
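After a build, a quick way to confirm that the binaries from the table above were produced is to check each path under the output root. This is a sketch only: check_binaries and the BINARIES variable are hypothetical names, not part of the llamafile build system.

```shell
# Binaries listed in the table above, relative to o/$(MODE)/.
BINARIES="llamafile/llamafile third_party/zipalign/zipalign whisperfile/whisperfile"

# Hypothetical helper: report which expected binaries exist under a build root.
# $1 = build output root, e.g. "o/$MODE".
# Returns 0 if every binary is present and executable, 1 otherwise.
check_binaries() {
  root=$1
  missing=0
  for bin in $BINARIES; do
    if [ -x "$root/$bin" ]; then
      echo "ok       $root/$bin"
    else
      echo "missing  $root/$bin"
      missing=1
    fi
  done
  return "$missing"
}
```

Call it as check_binaries "o/$MODE" from the repository root; a non-zero exit status means at least one binary is missing.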
Run make setup to reapply patches after any submodule changes.
To reset a single submodule:
cd <submodule> && git reset --hard && git clean -fdx
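The one-liner above leaves the shell inside the submodule directory and accepts any path. A slightly more defensive sketch wraps it in a function that only accepts the three submodules named in this guide and runs in a subshell. The function name reset_submodule is hypothetical and is not provided by the repository.

```shell
# Hypothetical wrapper around the single-submodule reset command.
# Must be run from the repository root.
reset_submodule() {
  case "$1" in
    llama.cpp|whisper.cpp|stable-diffusion.cpp) ;;
    *) echo "unknown submodule: $1" >&2; return 1 ;;
  esac
  # WARNING: discards all local changes and untracked files in the submodule.
  # The subshell keeps the caller's working directory unchanged.
  ( cd "$1" && git reset --hard && git clean -fdx )
}
```

For example, reset_submodule whisper.cpp resets only that submodule. Run make setup afterwards to reapply the patches.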
To reset all submodules:
make reset-repo
Make sure to use the llamafile:build command (which uses cosmocc's make), not system make.
For detailed information, consult:
- building.md - Complete build system documentation, toolchain details
- architecture.md - Repository structure, component overview
- development.md - Development workflow, patch management, submodule integration
- testing.md - Test patterns, running and writing tests
- update_llamacpp.md - Keeping llamafile updated with upstream llama.cpp
- --help