Strip clutter from web pages before ingesting into the wiki. Removes ads, navigation, headers, footers, and boilerplate, leaving clean, readable markdown that cuts token usage by 40-60%. Triggers on: defuddle, clean this page, strip this url, fetch and clean, clean web content before ingesting, strip ads, remove clutter, clean URL content, readable markdown from URL.
Defuddle extracts the meaningful content from a web page and drops everything else: ads, cookie banners, nav bars, related articles, footers, social sharing buttons. What remains is the article body as clean markdown.
Use this before any URL ingestion. It is optional but strongly recommended. It cuts token usage by 40-60% on typical web articles and produces cleaner wiki pages.
npm install -g defuddle-cli
Verify: defuddle --version
defuddle https://example.com/article
Outputs clean markdown to stdout.
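Because output goes to stdout, it composes with pipes. A quick preview before saving (the URL is a placeholder):

```shell
# Preview the first lines of the cleaned article before committing it to a file.
defuddle https://example.com/article | head -n 20
```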
defuddle https://example.com/article > .raw/articles/article-slug-$(date +%Y-%m-%d).md
After running defuddle, prepend the source URL and fetch date:
SLUG="article-slug-$(date +%Y-%m-%d)"
{ echo "---"; echo "source_url: https://example.com/article"; echo "fetched: $(date +%Y-%m-%d)"; echo "---"; echo ""; defuddle https://example.com/article; } > ".raw/articles/$SLUG.md"
Defuddle also works on a local HTML file:
defuddle page.html
Use defuddle when:
- ingesting any web article or URL into the wiki
- cleaning a saved HTML file before ingestion
Skip defuddle when:
- the source is already clean markdown or plain text
- defuddle is not installed (fall back to WebFetch)
If defuddle is not installed, check:
which defuddle 2>/dev/null || echo "not installed"
If not installed: use WebFetch directly. The content will be less clean but still workable.
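A minimal sketch of this check-then-fall-back logic in POSIX shell (the clean_url helper is hypothetical, not part of defuddle; WebFetch itself is an agent tool, so the shell side can only report the fallback):

```shell
#!/bin/sh
# Hypothetical helper: clean a URL with defuddle when installed,
# otherwise signal the caller to fall back to WebFetch.
clean_url() {
  url="$1"; out="$2"
  if command -v defuddle >/dev/null 2>&1; then
    defuddle "$url" > "$out"
  else
    echo "defuddle not installed: fetch $url with WebFetch instead" >&2
    return 1
  fi
}
```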
The /wiki-ingest skill checks for defuddle automatically when a URL is passed. You do not need to run defuddle manually before ingesting a URL. The ingest skill will call it if available.
To manually clean a page, save the output to .raw/articles/ as shown above, then ingest the saved file:
ingest .raw/articles/[slug].md
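The manual clean-and-save step can be sketched end to end as follows (the URL and slug are placeholders; the defuddle call is guarded so the script still writes the frontmatter when the tool is missing, then the saved file is ingested as above):

```shell
#!/bin/sh
# Sketch: fetch, clean, and save one article with source frontmatter.
URL="https://example.com/article"             # placeholder URL
SLUG="article-slug-$(date +%Y-%m-%d)"         # placeholder slug
OUT=".raw/articles/$SLUG.md"
mkdir -p .raw/articles
{
  echo "---"
  echo "source_url: $URL"
  echo "fetched: $(date +%Y-%m-%d)"
  echo "---"
  echo ""
  # body: cleaned markdown, appended only when defuddle is on PATH
  command -v defuddle >/dev/null 2>&1 && defuddle "$URL"
} > "$OUT"
echo "saved $OUT"
```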