Extract and parse content from URLs. Triggers on: user provides a URL to extract content from, another skill needs to parse source material, "parse this URL", "extract content", "解析链接", "提取内容".
Extract and normalize content from URLs across supported platforms. Returns structured data including content body, metadata, and references. Useful as a preprocessing step for content generation skills or standalone content extraction.
shared/authentication.mdshared/common-patterns.md for polling, errors, and interaction patternsshared/config-pattern.md before any interaction~/Downloads/ or .listenhub/ — save to the current working directoryFollow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.
Follow shared/config-pattern.md Step 0.
If file doesn't exist — ask location, then create immediately:
mkdir -p ".listenhub/content-parser"
echo '{"autoDownload":true}' > ".listenhub/content-parser/config.json"
CONFIG_PATH=".listenhub/content-parser/config.json"
# (or $HOME/.listenhub/content-parser/config.json for global)
Then run Setup Flow below.
If file exists — read config, display summary, and confirm:
当前配置 (content-parser):
自动下载:{是 / 否}
Ask: "使用已保存的配置?" → 确认,直接继续 / 重新配置
autoDownload: trueautoDownload: falseSave immediately:
NEW_CONFIG=$(echo "$CONFIG" | jq --argjson dl {true/false} '. + {"autoDownload": $dl}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")
Free text input. Ask the user:
What URL would you like to extract content from?
Ask if the user wants to configure extraction options:
Question: "Do you want to configure extraction options?"
Options:
- "No, use defaults" — Extract with default settings
- "Yes, configure options" — Set summarize, maxLength, or Twitter tweet count
If "Yes", ask follow-up questions:
Summarize:
Ready to extract content:
URL: {url}
Options: {summarize: true, maxLength: 5000, twitter.count: 50} / default
Proceed?
Wait for explicit confirmation before calling the API.
Validate URL: Must be HTTP(S). Normalize if needed (see references/supported-platforms.md)
Build request body:
{
"source": {
"type": "url",
"uri": "{url}"
},
"options": {
"summarize": true/false,
"maxLength": 5000,
"twitter": {
"count": 50
}
}
}
Omit options if user chose defaults.
Submit (foreground): POST /v1/content/extract → extract taskId
Tell the user extraction is in progress
Poll (background): Run the following exact bash command with run_in_background: true and timeout: 300000. Note: status field is .data.status (not processStatus), interval is 5s, values are processing/completed/failed:
TASK_ID="<id-from-step-3>"
for i in $(seq 1 60); do
RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/content/extract/$TASK_ID" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" 2>/dev/null)
STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.status // "processing"')
case "$STATUS" in
completed) echo "$RESULT"; exit 0 ;;
failed) echo "FAILED: $RESULT" >&2; exit 1 ;;
*) sleep 5 ;;
esac
done
echo "TIMEOUT" >&2; exit 2
When notified, download and present result:
If autoDownload is true:
{taskId}-extracted.md to the current directory — full extracted content in markdown{taskId}-extracted.json to the current directory — full raw API response dataecho "$CONTENT_MD" > "${TASK_ID}-extracted.md"
echo "$RESULT" > "${TASK_ID}-extracted.json"
Present:
内容提取完成!
来源:{url}
标题:{metadata.title}
长度:~{character count} 字符
消耗积分:{credits}
已保存到当前目录:
{taskId}-extracted.md
{taskId}-extracted.json
Show a preview of the extracted content (first ~500 chars)
Offer to use content in another skill (e.g. /podcast, /tts)
Estimated time: 10-30 seconds depending on content size and platform.
shared/api-content-extract.mdreferences/supported-platforms.mdshared/common-patterns.md § Async Pollingshared/common-patterns.md § Error Handlingshared/config-pattern.mdUser: "Parse this article: https://en.wikipedia.org/wiki/Topology"
Agent workflow:
https://en.wikipedia.org/wiki/Topologycurl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"source": {
"type": "url",
"uri": "https://en.wikipedia.org/wiki/Topology"
}
}'
curl -sS "https://api.marswave.ai/openapi/v1/content/extract/69a7dac700cf95938f86d9bb" \
-H "Authorization: Bearer $LISTENHUB_API_KEY"
User: "Extract recent tweets from @elonmusk, get 50 tweets"
Agent workflow:
https://x.com/elonmusk{"twitter": {"count": 50}}curl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"source": {
"type": "url",
"uri": "https://x.com/elonmusk"
},
"options": {
"twitter": {
"count": 50
}
}
}'