Core rule

Do not promise direct access to every site. Some sites are protected by Cloudflare, login walls, or bot detection, and others are subject to legal restrictions. In those cases, switch to the next fallback instead of insisting that the first method should work.
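The rule above can be sketched as a simple fallback chain. This is a minimal illustration, not a definitive implementation: `fetch_with_fallbacks` and the `Fetcher` type are hypothetical names, and each fetcher is assumed to be a plain function that returns text or raises on failure.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical type: a fetcher takes a URL and returns page text,
# or raises an exception when the site blocks or rejects the request.
Fetcher = Callable[[str], str]


def fetch_with_fallbacks(
    url: str, fetchers: List[Tuple[str, Fetcher]]
) -> Tuple[Optional[str], Optional[str], List[str]]:
    """Try each fetcher in order and stop at the first usable result.

    Returns (method_name, content, errors). If every method fails,
    method_name and content are None and errors lists each failure.
    """
    errors: List[str] = []
    for name, fetch in fetchers:
        try:
            content = fetch(url)
        except Exception as exc:
            # Record the failure and move on instead of retrying the same method.
            errors.append(f"{name}: {exc}")
            continue
        if content and content.strip():
            return name, content, errors
        errors.append(f"{name}: empty response")
    return None, None, errors
```

The collected `errors` list makes it possible to report honestly which methods were tried and why each one failed, rather than implying that access was guaranteed.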

Preferred fetch order

1) Direct readable fetch

Try lightweight conversion services first:

r.jina.ai
```
https://r.jina.ai/http://example.com
```

markdown.new

https://markdown.new/https://example.com

defuddle
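The URL patterns shown above for r.jina.ai and markdown.new both prepend a service prefix to the target URL. A small sketch of building those candidate URLs follows. The function names are hypothetical, and defuddle is left out because no URL form is given for it here.

```python
from typing import List


def jina_reader_url(target: str) -> str:
    """Prefix the target URL with the r.jina.ai reader endpoint."""
    return "https://r.jina.ai/" + target


def markdown_new_url(target: str) -> str:
    """Prefix the target URL with the markdown.new endpoint."""
    return "https://markdown.new/" + target


def reader_candidates(target: str) -> List[str]:
    """Return the conversion-service URLs to try, in the order listed above."""
    return [jina_reader_url(target), markdown_new_url(target)]
```

For example, `reader_candidates("https://example.com")` yields one URL per service, which can then be tried in order before moving on to heavier fallbacks.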
