Add PDF reading to NanoClaw agents. Extracts text from PDFs via pdftotext CLI. Handles WhatsApp attachments, URLs, and local files.
Adds PDF reading capability to all container agents using poppler-utils (pdftotext/pdfinfo). PDFs sent as WhatsApp attachments are auto-downloaded to the group workspace.
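Check whether container/skills/pdf-reader/pdf-reader exists. If it does, the skill is already applied; skip to Phase 3.
This skill requires the WhatsApp channel to be installed first (skill/whatsapp merged), because it modifies WhatsApp channel files.
git remote -v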
If whatsapp is missing, add it:
git remote add whatsapp https://github.com/qwibitai/nanoclaw-whatsapp.git
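The check-then-add step above can be made safe to re-run. The following is a minimal sketch, assuming a git version that supports `git remote get-url`; `ensure_remote` is a hypothetical helper name, not something NanoClaw ships.

```shell
# Hypothetical helper: add a git remote only when it is not configured yet.
ensure_remote() {
  name=$1
  url=$2
  if git remote get-url "$name" >/dev/null 2>&1; then
    echo "remote '$name' already configured"
  else
    git remote add "$name" "$url"
    echo "remote '$name' added"
  fi
}

# Usage, from the repository root:
# ensure_remote whatsapp https://github.com/qwibitai/nanoclaw-whatsapp.git
```

Running it twice leaves a single `whatsapp` remote, so it should be harmless to keep in a setup script.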
git fetch whatsapp skill/pdf-reader
git merge whatsapp/skill/pdf-reader || {
  git checkout --theirs package-lock.json
  git add package-lock.json
  git merge --continue
}
This merges in:
container/skills/pdf-reader/SKILL.md (agent-facing documentation)
container/skills/pdf-reader/pdf-reader (CLI script)
poppler-utils in container/Dockerfile
src/channels/whatsapp.ts
src/channels/whatsapp.test.ts
If the merge reports conflicts, resolve them by reading the conflicted files and understanding the intent of both sides.
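After the merge, a quick sanity check can confirm that the files listed above are present. This is a sketch only: `check_merged_files` is a hypothetical helper, and the `grep` for `poppler-utils` assumes the package name appears literally in container/Dockerfile.

```shell
# Hypothetical helper: report which of the merged files are missing.
# Returns 0 when everything is in place, 1 otherwise.
check_merged_files() {
  root=${1:-.}
  missing=0
  for path in \
    container/skills/pdf-reader/SKILL.md \
    container/skills/pdf-reader/pdf-reader \
    container/Dockerfile \
    src/channels/whatsapp.ts \
    src/channels/whatsapp.test.ts
  do
    if [ ! -f "$root/$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  # Assumption: the Dockerfile names the package literally.
  if [ -f "$root/container/Dockerfile" ] &&
     ! grep -q 'poppler-utils' "$root/container/Dockerfile"; then
    echo "container/Dockerfile does not mention poppler-utils"
    missing=1
  fi
  return "$missing"
}

# Usage, from the repository root:
# check_merged_files . && echo "merge looks complete"
```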
npm run build
npx vitest run src/channels/whatsapp.test.ts
./container/build.sh
launchctl kickstart -k gui/$(id -u)/com.nanoclaw # macOS
# Linux: systemctl --user restart nanoclaw
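The two restart commands above can be selected by platform. A minimal sketch, assuming `uname -s` reports `Darwin` on macOS and `Linux` on Linux; `restart_command` is a hypothetical helper that only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper: print the restart command for the given platform
# (defaults to the current one). Fails on platforms this guide does not cover.
restart_command() {
  os=${1:-$(uname -s)}
  case "$os" in
    Darwin) echo "launchctl kickstart -k gui/$(id -u)/com.nanoclaw" ;;
    Linux)  echo "systemctl --user restart nanoclaw" ;;
    *)      echo "unsupported platform: $os" >&2; return 1 ;;
  esac
}

# Usage: print the command, then run it once it looks right.
# restart_command
# sh -c "$(restart_command)"
```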
Send a PDF file in any registered WhatsApp chat. The PDF should be saved under attachments/ in the group workspace.
Ask the agent to read a PDF from a URL. It should use pdf-reader fetch <url>.
tail -f logs/nanoclaw.log | grep -i pdf
Look for:
Downloaded PDF attachment: successful download
Failed to download PDF attachment: media download issue
Container needs rebuilding. Run ./container/build.sh and restart the service.
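To tell whether the container actually needs rebuilding, it may help to check that the poppler-utils binaries are on the PATH inside it. A sketch, with `missing_tools` as a hypothetical helper; how to open a shell inside the agent container depends on your setup and is not covered here.

```shell
# Hypothetical helper: list any of the named commands that are not on PATH.
# Returns 0 when all are found, 1 otherwise.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "not found: $tool"
      status=1
    fi
  done
  return "$status"
}

# Usage, inside the agent container:
# missing_tools pdftotext pdfinfo || echo "rebuild with ./container/build.sh"
```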
The PDF may be scanned (image-based). pdftotext only handles text-based PDFs. Consider using the agent-browser to open the PDF visually instead.
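One way to spot an image-only PDF is to check whether the extracted text contains anything besides whitespace. A minimal sketch, assuming `pdftotext` writes to standard output when the output file is given as `-`; `has_text` is a hypothetical helper and `document.pdf` is a placeholder name.

```shell
# Hypothetical helper: succeed if stdin contains at least one
# non-whitespace character (form feeds between pages count as whitespace).
has_text() {
  [ -n "$(tr -d '[:space:]')" ]
}

# Usage:
# pdftotext document.pdf - | has_text || echo "no text layer; likely a scanned PDF"
```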
Verify the message has documentMessage with mimetype: application/pdf. Some file-sharing apps send PDFs as generic files without the correct mimetype.