Get started with TinyFacade — on-device AI inference for Android apps via AIDL. Use when someone wants to integrate on-device AI, use AIDL inference, or connect to TinyFacade.
Present the following information to the user clearly.
Think of TinyFacade as a local AI API server running on your phone. It loads GGUF language models via llama.cpp and exposes them over Android's AIDL interface. Any app on the device can bind to TinyFacade and run AI inference — no cloud, no API keys, no internet required.
┌──────────────┐       AIDL (IPC)        ┌──────────────┐
│   Your App   │ <─────────────────────> │  TinyFacade  │
│              │   tokens stream back    │  (service)   │
│  "What time  │                         │              │
│   is it?"    │                         │  llama.cpp   │
└──────────────┘                         └──────────────┘
Your app sends a message. TinyFacade runs inference. Tokens stream back in real-time. Tool calling lets the model execute actions and respond with real data. Your app never sees the internal tool loop — just streamed tokens and a natural response.
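The streaming behaviour described above can be pictured with a small plain-Java sketch. It is not TinyFacade's actual callback interface: only `onToken()` is named in this document, so `TokenCallback`, `onComplete()`, `ResponseCollector`, and `fakeStream()` are hypothetical names used for illustration, and the fake stream stands in for the service so the example runs without Android.

```java
import java.util.Arrays;
import java.util.List;

public class StreamingSketch {
    /** Hypothetical plain-Java mirror of a streaming callback (only onToken() appears in this document). */
    interface TokenCallback {
        void onToken(String token);
        void onComplete(); // assumed name for an end-of-stream signal
    }

    /** Appends each streamed token so the UI can show partial text and, at the end, the full reply. */
    static final class ResponseCollector implements TokenCallback {
        private final StringBuilder buffer = new StringBuilder();
        private boolean done = false;

        @Override public void onToken(String token) { buffer.append(token); }
        @Override public void onComplete() { done = true; }

        String fullText() { return buffer.toString(); }
        boolean isDone() { return done; }
    }

    /** Stand-in for the service side: delivers tokens one at a time, then signals completion. */
    static void fakeStream(List<String> tokens, TokenCallback callback) {
        for (String token : tokens) {
            callback.onToken(token);
        }
        callback.onComplete();
    }

    public static void main(String[] args) {
        ResponseCollector collector = new ResponseCollector();
        fakeStream(Arrays.asList("It", " is", " 3", " PM", "."), collector);
        System.out.println(collector.fullText());
    }
}
```

In a real client the callback would most likely be invoked on a binder thread rather than the main thread, so any UI update would need to be posted back to the main thread.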
arm64-v8a)

Use these skills to build your integration:
| Skill | What it does |
|---|---|
| `/tinyfacade-scaffold [package] [name]` | Generate a complete client project: AIDL files, Gradle config, manifest, Activity, layout. Ready to build. |
| `/tinyfacade-connect [package]` | Generate just the service connection code for an existing Android project. |
| `/tinyfacade-tools [name] [type]` | Generate code to register custom tools with TinyFacade (HTTP, file, system actions). |
| `/tinyfacade-troubleshoot [issue]` | Debug common issues: binding failures, missing models, slow inference, tool problems. |
Your App                                    TinyFacade (host)
────────                                    ─────────────────
bindService()
    │
    ▼
ServiceConnection
    │  onServiceConnected(binder)
    ▼
IInferenceService.Stub.asInterface(binder)
    │
    ├── loadModel(path, params, callback)
    ├── sendMessage(json, params, callback) ──> tokens stream via onToken()
    ├── isModelLoaded()
    ├── stopGeneration()
    ├── releaseModel()
    ├── getAvailableModels()
    ├── getAvailableTools()
    ├── registerTool(definition, action)
    └── unregisterTool(name)
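The call order implied by the diagram (check whether a model is loaded, load one if needed, then send a message) can be sketched in plain Java. This is a rough illustration under stated assumptions, not the real AIDL interface. The method names come from the diagram, but every parameter type, the `LoadCallback` interface, the `"{}"` params placeholder, the model path, and the message payload are hypothetical. An in-memory `FakeService` replaces the bound service so the sketch runs without Android; reference.md remains the authority on the actual signatures.

```java
import java.util.ArrayList;
import java.util.List;

public class CallOrderSketch {
    // Hypothetical callbacks; only onToken() is named in this document.
    interface LoadCallback { void onLoaded(boolean success); }
    interface TokenCallback { void onToken(String token); }

    // Plain-Java mirror of a few methods from the diagram. Types are assumptions.
    interface InferenceService {
        boolean isModelLoaded();
        void loadModel(String path, String paramsJson, LoadCallback callback);
        void sendMessage(String messageJson, String paramsJson, TokenCallback callback);
        void releaseModel();
    }

    // In-memory fake that records the call order and calls back synchronously.
    static final class FakeService implements InferenceService {
        private boolean loaded = false;
        final List<String> calls = new ArrayList<>();

        public boolean isModelLoaded() { calls.add("isModelLoaded"); return loaded; }

        public void loadModel(String path, String paramsJson, LoadCallback callback) {
            calls.add("loadModel");
            loaded = true;
            callback.onLoaded(true);
        }

        public void sendMessage(String messageJson, String paramsJson, TokenCallback callback) {
            calls.add("sendMessage");
            if (!loaded) throw new IllegalStateException("model not loaded");
            for (String token : new String[] {"Hello", ",", " world"}) callback.onToken(token);
        }

        public void releaseModel() { calls.add("releaseModel"); loaded = false; }
    }

    // Loads the model only when needed, then sends one message and collects the reply.
    // Returning the text directly works only because the fake calls back synchronously;
    // a real bound service would deliver callbacks later, so real code would react inside them.
    static String ask(InferenceService service, String modelPath, String messageJson) {
        StringBuilder reply = new StringBuilder();
        Runnable send = () -> service.sendMessage(messageJson, "{}", token -> reply.append(token));
        if (service.isModelLoaded()) {
            send.run();
        } else {
            service.loadModel(modelPath, "{}", ok -> { if (ok) send.run(); });
        }
        return reply.toString();
    }

    public static void main(String[] args) {
        FakeService service = new FakeService();
        // Path and payload are placeholders, not a documented format.
        System.out.println(ask(service, "/hypothetical/model.gguf", "{\"hypothetical\":\"payload\"}"));
        service.releaseModel();
        System.out.println(service.calls);
    }
}
```

Chaining `sendMessage` inside the load callback, rather than calling it right after `loadModel`, reflects the fact that `loadModel` takes a callback in the diagram and is therefore presumably asynchronous.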
For the complete AIDL interface reference, see reference.md.