Discover, validate, and update RSS feed URLs for the 34 frontier AI companies in the Supabase companies table. Use this skill when checking if feeds work, finding RSSHub alternatives for locked platforms, and writing validated URLs back to Supabase.
This skill encodes the exact process for finding and validating RSS feeds for the 34 companies in the Frontier AI Intelligence Platform.
rss: Direct RSS/Atom feeds

Test these with curl from the n8n server IP (not your local machine) to detect datacenter IP blocks. Feeds that work locally may fail from DigitalOcean.
# Test from n8n server (SSH into DigitalOcean droplet):
curl -s -A "Mozilla/5.0" --max-time 10 "FEED_URL" | head -c 500
# Expected: XML starting with <?xml or <rss or <feed
# Blocked: HTML with "Cloudflare", "Vercel", "Just a moment" etc.
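The expected/blocked distinction above can be checked mechanically. The following is a minimal sketch, assuming the first bytes of the response are passed in as a string; `classify_feed_body` is a hypothetical helper name, not part of the existing pipeline.

```shell
# Classify the start of a response body as "feed", "blocked", or "unknown".
# classify_feed_body is a hypothetical helper, named here for illustration only.
classify_feed_body() {
  body=$1
  case $body in
    # Feed check comes first so that a real feed mentioning "Cloudflare"
    # in an article title is not reported as blocked.
    "<?xml"*|"<rss"*|"<feed"*)
      echo feed ;;
    *"Just a moment"*|*Cloudflare*|*Vercel*)
      echo blocked ;;
    *)
      echo unknown ;;
  esac
}

# Usage on the droplet (FEED_URL is a placeholder):
# classify_feed_body "$(curl -s -A "Mozilla/5.0" --max-time 10 "FEED_URL" | head -c 500)"
```

A body that starts with whitespace or a byte-order mark will come back as `unknown` in this sketch, so treat `unknown` as "inspect by hand" rather than as a failure.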
github: GitHub releases.atom

GitHub Atom feeds work reliably from any IP. Format:
https://github.com/ORG/REPO/releases.atom
Test:
curl -s "https://github.com/deepseek-ai/DeepSeek-V3/releases.atom" | head -c 300
Expected: <?xml version="1.0"?><feed ...
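Because the URL format is fixed, the feed address can be derived from the organization and repository names. A small sketch, where `github_releases_feed` is a hypothetical helper name:

```shell
# Build the releases.atom URL for a GitHub repository.
# github_releases_feed is a hypothetical helper name used for illustration.
github_releases_feed() {
  org=$1
  repo=$2
  printf 'https://github.com/%s/%s/releases.atom\n' "$org" "$repo"
}

github_releases_feed deepseek-ai DeepSeek-V3
# prints https://github.com/deepseek-ai/DeepSeek-V3/releases.atom
```

The output can be passed straight to the curl test shown above.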
scrape: Firecrawl web scraping

For companies with no RSS feed. Use the Firecrawl MCP scrape tool with:
{
"url": "https://company.com/blog",
"formats": ["markdown"],
"onlyMainContent": true,
"waitFor": 1000
}
A working scrape returns markdown with H2/H3 headings for article titles. An empty/blocked scrape returns empty markdown or a CAPTCHA page.
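The heading check described above can be approximated by counting H2/H3 lines in the returned markdown. This is a rough sketch that assumes the scrape output has been saved or piped as plain markdown; both function names are hypothetical.

```shell
# Count H2/H3 headings in markdown read from stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function usable under "set -e" while still printing the count.
count_article_headings() {
  grep -c -E '^#{2,3} ' || true
}

# Succeed only if the markdown on stdin contains at least one H2/H3 heading.
scrape_looks_ok() {
  n=$(count_article_headings)
  [ "$n" -gt 0 ]
}

# Usage (scrape.md is a placeholder file holding the Firecrawl markdown):
# scrape_looks_ok < scrape.md && echo "scrape ok" || echo "empty or blocked"
```

A CAPTCHA page that happens to contain headings would pass this check, so it only screens out the empty case reliably.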

Confirmed RSS feeds:

| Company | Feed URL | Last Tested |
|---|---|---|
| Google DeepMind | https://deepmind.google/blog/rss.xml | Mar 2026 |
| Microsoft | https://blogs.microsoft.com/ai/feed/ | Mar 2026 |
| Amazon AWS | https://aws.amazon.com/blogs/machine-learning/feed/ | Mar 2026 |
| NVIDIA | https://developer.nvidia.com/blog/feed/ | Mar 2026 |
| Databricks | https://www.databricks.com/feed | Mar 2026 |
| Mistral AI | https://mistral.ai/news/rss.xml | Mar 2026 |
| Cohere | https://txt.cohere.ai/rss/ | Mar 2026 |
| Runway | https://runwayml.com/blog/rss | Mar 2026 |
| Stability AI | https://stability.ai/blog?format=rss | Mar 2026 |

GitHub Atom feeds:

| Company | Atom Feed URL |
|---|---|
| DeepSeek | https://github.com/deepseek-ai/DeepSeek-V3/releases.atom |
| Alibaba/Qwen | https://github.com/QwenLM/Qwen/releases.atom |
| Baidu/Paddle | https://github.com/PaddlePaddle/Paddle/releases.atom |
| 01.AI | https://github.com/01-ai/Yi/releases.atom |
| Baichuan | https://github.com/baichuan-inc/Baichuan2/releases.atom |

Firecrawl scrape targets:

| Company | Scrape URL |
|---|---|
| Anthropic | https://www.anthropic.com/news |
| xAI | https://x.ai/news |
| Meta AI (FAIR) | https://ai.meta.com/blog/ |
| OpenAI (fallback) | https://openai.com/news |
| AI21 Labs | https://www.ai21.com/blog |
| Moonshot AI | https://www.moonshot.cn/news |
| ByteDance | https://seed.bytedance.com |
| Tencent | https://ai.tencent.com/ailab/en/news |
| Zhipu AI | https://www.zhipuai.cn/news |
| Minimax | https://www.minimaxi.com/news |
| Sakana AI | https://sakana.ai/blog/ |
| Naver | https://clova.ai/en |
| Sarvam AI | https://www.sarvam.ai/blogs |
| Perplexity | https://www.perplexity.ai/hub/blog |
| Physical Intelligence | https://www.physicalintelligence.company/blog |
| Figure AI | https://www.figure.ai/news |
| Covariant | https://covariant.ai/insights |
| Skild AI | https://www.skild.ai/blog |
| Magic | https://magic.dev/blog |
| Midjourney | https://www.midjourney.com/updates |
RSSHub can generate RSS feeds for platforms that don't provide them natively.
Use the public instance at https://rsshub.app or self-host your own.
| Platform | RSSHub Route | Notes |
|---|---|---|
| OpenAI blog | /openai/blog | Backup if official RSS drops |
| Anthropic blog | /anthropic/blog | Community maintained |
| Twitter/X | /twitter/user/perplexity_ai | Requires Twitter API |
| WeChat (Tencent) | /wechat/mp/MPID | Requires WeChat MP ID — complex |
RSSHub deployment for self-hosting:
docker run -d -p 1200:1200 diygod/rsshub
# Then use: http://YOUR_IP:1200/route
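To turn a route from the table above into a testable URL, the instance base and the route only need to be joined with a single slash. A minimal sketch, where `rsshub_url` is a hypothetical helper and `YOUR_IP` remains a placeholder:

```shell
# Join an RSSHub base URL and a route with exactly one slash between them.
# rsshub_url is a hypothetical helper name used for illustration.
rsshub_url() {
  base=${1%/}    # drop one trailing slash from the base URL, if present
  route=${2#/}   # drop one leading slash from the route, if present
  printf '%s/%s\n' "$base" "$route"
}

rsshub_url "http://YOUR_IP:1200" "/openai/blog"
# prints http://YOUR_IP:1200/openai/blog
```

The resulting URL can be checked with the same curl command used for direct feeds.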
# SSH into DigitalOcean droplet
ssh root@YOUR_DROPLET_IP
curl -s -A "Mozilla/5.0" --max-time 15 "FEED_URL" | head -c 300
If the response contains any of these → the feed is blocked from datacenter IPs:
- "Just a moment" (Cloudflare)
- "Vercel Security" or "vercel.com"
- "Access denied" or "403 Forbidden"
- <html> when you expected XML

After validating, update the feed URL and status:
UPDATE companies
SET rss_feed_url = 'VALIDATED_URL',
source_type = 'rss' -- or 'github' or 'scrape'
WHERE "Company" = 'Company Name';
Then log to rss_validation_log:
INSERT INTO rss_validation_log (company_id, feed_url, status, http_status, article_count, notes)
VALUES (1, 'https://feed.url', 'ok', 200, 15, 'Tested from DO droplet');
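When the UPDATE above is generated from a script rather than typed by hand, the values need quoting and the source type should be restricted to the three allowed values. The sketch below assumes the statement is built as text and reviewed before it is run; `sql_quote` and `build_feed_update` are hypothetical helper names.

```shell
# Render a value as a SQL string literal by doubling embedded single quotes.
sql_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# Print the UPDATE statement for one company.
# Fails if source_type is not one of rss, github, or scrape.
build_feed_update() {
  company=$1
  url=$2
  source_type=$3
  case $source_type in
    rss|github|scrape) ;;
    *) echo "invalid source_type: $source_type" >&2; return 1 ;;
  esac
  printf 'UPDATE companies SET rss_feed_url = %s, source_type = %s WHERE "Company" = %s;\n' \
    "$(sql_quote "$url")" "$(sql_quote "$source_type")" "$(sql_quote "$company")"
}

build_feed_update "Mistral AI" "https://mistral.ai/news/rss.xml" rss
```

Where a client library with parameterized queries is available, that is the safer route; string building like this is only suitable for statements a person reads before executing.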
After each validation session, update docs/rss-research.md using this format:

# RSS Feed Validation Log
Last updated: DATE
## Summary
- Confirmed RSS: 9
- GitHub Atom: 5
- Firecrawl scrape: 20
- Blocked/empty: 0
## Per-Company Status
| Company | URL | Status | Articles | Tested | Notes |
|---------|-----|--------|----------|--------|-------|
| OpenAI | openai.com/news/rss/ | ok | 10 | Mar 2026 | |
...
Create a minimal test workflow in n8n that:
- Reads the companies table
- Fetches each feed with rssFeedRead or HTTP Request
- Logs the result to rss_validation_log

This gives you accurate results from the DigitalOcean server IP where the production pipeline runs, not from your local machine.
OpenAI's RSS feed (openai.com/news/rss/) is confirmed to exist but posts infrequently (sometimes weeks between posts). During testing:
- Set freshnessHours: 720 (30 days) in the pipeline config

If the OpenAI feed goes down entirely, fall back to https://openai.com/news via Firecrawl scrape.
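To see why a 720-hour window matters for a feed that posts weeks apart, the freshness test can be written out as arithmetic. This sketch assumes `freshnessHours` means "keep items published within that many hours of now", which is an interpretation and not confirmed by the pipeline config; `is_fresh` is a hypothetical helper.

```shell
# Succeed if an item published at $1 (Unix seconds) is no older than
# $3 hours relative to the reference time $2 (Unix seconds).
# Assumption: this mirrors how freshnessHours is applied in the pipeline.
is_fresh() {
  published=$1
  now=$2
  max_hours=$3
  age=$(( now - published ))
  [ "$age" -le $(( max_hours * 3600 )) ]
}

# Usage: a post from 20 days ago passes a 720-hour window but not a 24-hour one.
# now=$(date +%s)
# is_fresh $(( now - 20 * 86400 )) "$now" 720 && echo "kept"
# is_fresh $(( now - 20 * 86400 )) "$now" 24  || echo "dropped"
```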