Automated research subagent. Accepts a natural-language query, gathers sources from the knowledge graph (KG + BM25 + E5 + Web + Notion), performs SGR synthesis, generates validated knowledge cards, and writes them to the graph. Explicit trigger: "autosearch" — activates the subagent without confirmation. Soft triggers (require confirmation): "research", "analyze", "compare", "deep dive", "review".
Explicit trigger: user writes autosearch <query> — activate immediately, no confirmation needed.
Soft triggers (ask for confirmation before activating): "research", "analyze", "compare", "deep dive", "review".
Do NOT activate for simple questions like "what is X?" or "explain Y" — those are regular answers, not research.
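The trigger rules above can be sketched as a small dispatcher. This is an illustrative sketch only — the function name and return values are not part of the skill's API, and real routing of borderline queries remains a judgment call:

```python
# Illustrative sketch of the trigger rules; names are hypothetical.
SOFT_TRIGGERS = ("research", "analyze", "compare", "deep dive", "review")

def classify_trigger(query: str) -> str:
    """Return 'activate', 'confirm', or 'none' for a user query."""
    q = query.strip().lower()
    if q.startswith("autosearch "):
        return "activate"   # explicit trigger: run without confirmation
    if any(word in q for word in SOFT_TRIGGERS):
        return "confirm"    # soft trigger: ask the user first
    return "none"           # plain question: answer normally
```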
There are three paths to launch. The recommended one is via Python API (subagent).
The main agent calls AutoResearch as a tool via a 3-step API:
```python
from skills.autoresearch.api import AutoResearchAgent

agent = AutoResearchAgent(mode="hybrid")  # "internal" | "hybrid" | "web"

# Step 1: Plan
plan = agent.plan("<user query>")
# → Execute LLM call with plan.system_prompt + plan.user_prompt → raw_json

# Step 2: Gather
gather = agent.gather(raw_json)
# gather.sources_count — number of collected sources
# → Execute LLM call with gather.system_prompt + gather.user_prompt → synthesis_json

# Step 3: Finalize
report = agent.finalize(synthesis_json, write=True)
# report.summary() — result for the user
```
Choosing mode:
- `"internal"` — internal database only (KG + BM25), no API keys needed
- `"hybrid"` — internal + web search (requires TAVILY_API_KEY)
- `"web"` — all providers including web

Step 1 — Generate planner prompts:
```shell
python3 -m skills.autoresearch.run --request "<query>" --offline
```
Outputs planner system + user prompts. The agent executes an LLM call and saves the JSON to a file.
Step 2 — From planner response to checkpoint:
```shell
python3 -m skills.autoresearch.run \
  --request "<query>" \
  --planner-response /tmp/planner_response.json \
  --offline
```
Parses response → brief → search_plan → gather → checkpoint + synthesis prompt.
Steps 3-5 — as below (synthesis → resume → report).
Use this path for simple queries or when the brief is already prepared.
For complex queries — you can use a separate planner CLI:
```shell
python3 -m skills.autoresearch.plan_research \
  --request "<natural-language user query>" \
  --offline
```

Parsing:

```shell
python3 -m skills.autoresearch.plan_research \
  --parse /tmp/planner_response.json \
  --output-brief /tmp/autoresearch_brief.yaml \
  --output-plan /tmp/autoresearch_search_plan.yaml
```
If the planner was used — skip directly to Step 3 (brief and search_plan are already ready).
If the planner was NOT used — analyze the user query manually:
| Field | How to determine | Example |
|---|---|---|
| topic | Main topic of the query | "Notion API integration for Client Alpha" |
| scope | What exactly to research (boundaries) | "evaluate integration approaches, risks, existing tools" |
| questions | 2-7 specific questions | ["Which approaches exist?", "What are the risks?"] |
| target_clusters | Where in the graph the result belongs | ["clients", "client_alpha"] |
| target_type | Node type (usually insight) | "insight" |
| sensitivity | internal (default) or public | "internal" |
| client_key | If for a client — the client key | "client_alpha" |
If anything is unclear — ask the user before continuing.
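Assembled from the field table above, a brief might look like this. The sketch below is illustrative — the exact YAML key names are whatever make_brief emits; they are assumed here to match the field names in the table:

```yaml
# Hypothetical brief.yaml shape, keys assumed from the field table above
topic: "Notion API integration for Client Alpha"
scope: "evaluate integration approaches, risks, existing tools"
questions:
  - "Which approaches exist?"
  - "What are the risks?"
target_clusters: [clients, client_alpha]
target_type: insight
sensitivity: internal
client_key: client_alpha
```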
```shell
python3 -m skills.autoresearch.make_brief \
  --topic "<topic>" \
  --scope "<scope>" \
  --questions "<q1>" "<q2>" "<q3>" \
  --clusters <cluster1> <cluster2> \
  --type insight \
  --sensitivity internal \
  --client-key <key_or_omit> \
  --output /tmp/autoresearch_brief.yaml
```
```shell
python3 -m skills.autoresearch.run --brief /tmp/autoresearch_brief.yaml --offline
```
If there is a search_plan from the planner (Step 0):
```shell
python3 -m skills.autoresearch.run \
  --brief /tmp/autoresearch_brief.yaml \
  --search-plan /tmp/autoresearch_search_plan.yaml \
  --offline
```
For full mode (with web search, if TAVILY_API_KEY is available) — remove --offline.
This creates a checkpoint_*.json and outputs a synthesis prompt.
You (the agent) are the LLM — perform the synthesis based on the prompt from Step 3.
Read the checkpoint file, take synth_request.system_prompt and synth_request.user_prompt.
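Reading the checkpoint can be sketched with the stdlib only. This is an assumption-laden sketch: the checkpoint layout beyond the `synth_request.system_prompt` / `synth_request.user_prompt` fields named above, and the helper's name, are illustrative:

```python
import glob
import json
import os

def load_synth_prompts(checkpoint_glob: str = "checkpoint_*.json") -> tuple[str, str]:
    """Load the newest matching checkpoint and return (system_prompt, user_prompt)."""
    paths = glob.glob(checkpoint_glob)
    if not paths:
        raise FileNotFoundError(f"no checkpoint matches {checkpoint_glob!r}")
    latest = max(paths, key=os.path.getmtime)  # pick the most recent run
    with open(latest, encoding="utf-8") as fh:
        checkpoint = json.load(fh)
    synth = checkpoint["synth_request"]
    return synth["system_prompt"], synth["user_prompt"]
```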
Generate a response in JSON format with these fields:
```json
{
  "reasoning_steps": ["step 1...", "step 2...", "step 3..."],
  "summary": "10-500 character research summary",
  "findings": [
    {"claim": "...", "evidence": "source", "confidence": 0.0-1.0, "source_ids": [0]}
  ],
  "suggested_relations": [
    {"target": "existing_node_id", "type": "RELATION_TYPE"}
  ],
  "suggested_tags": ["tag1", "tag2"],
  "retrieval_hints": ["search phrase 1", "search phrase 2"],
  "when_to_read": ["when to read this card"],
  "card_sections": [
    {"heading": "Section title", "content": "Markdown content..."}
  ],
  "follow_up_tasks": [
    {"title": "Task", "description": "What to do", "priority": "high/medium/low"}
  ]
}
```
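Before saving, the synthesis JSON can be sanity-checked against the schema above. A minimal sketch — this validator and its name are illustrative, not part of the skill; only the required keys, the 10-500 character summary bound, and the 0.0-1.0 confidence range come from the schema:

```python
# Required top-level keys, taken from the synthesis schema above.
REQUIRED_KEYS = {
    "reasoning_steps", "summary", "findings", "suggested_relations",
    "suggested_tags", "retrieval_hints", "when_to_read",
    "card_sections", "follow_up_tasks",
}

def validate_synthesis(doc: dict) -> list[str]:
    """Return a list of schema problems (empty list means valid)."""
    errors = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - doc.keys())]
    summary = doc.get("summary", "")
    if not 10 <= len(summary) <= 500:
        errors.append("summary must be 10-500 characters")
    for i, finding in enumerate(doc.get("findings", [])):
        conf = finding.get("confidence", -1)
        if not 0.0 <= conf <= 1.0:
            errors.append(f"findings[{i}].confidence must be in [0.0, 1.0]")
    return errors
```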
Save the result to /tmp/autoresearch_synthesis.json.
```shell
python3 -m skills.autoresearch.run \
  --resume checkpoint_*.json \
  --synthesis /tmp/autoresearch_synthesis.json \
  --write
```
Add --obsidian-sync if Obsidian vault synchronization is needed.
Show the user:
Query: "Compare our Knowledge Layer with market competitors"
Agent executes:
1. Determines: topic="Knowledge Layer competitive analysis", scope="compare with Notion AI, Mem, Obsidian plugins", questions=["What are the main competitors?", "How do they handle knowledge graphs?", "What is our differentiator?"], clusters=["clients", "client_alpha"]
2. Runs `python3 -m skills.autoresearch.make_brief --topic "Knowledge Layer competitive analysis" ...`
3. Runs `python3 -m skills.autoresearch.run --brief /tmp/autoresearch_brief.yaml --offline`
4. Reads the checkpoint and generates the synthesis JSON itself
5. Runs `python3 -m skills.autoresearch.run --resume checkpoint_*.json --synthesis /tmp/autoresearch_synthesis.json --write`
6. Reports: "Created 2 cards: insight + task. Written to 02_distill/clients/..."
Synthesis MUST adhere to the closed vocabularies from config/ontology.yaml.
Online (recommended) — uses all available providers:
```shell
python3 -m skills.autoresearch.run --request "..."
```
Automatically enables web search if TAVILY_API_KEY is present in the environment.
On VPS the key is already configured in /docker/openclaw/.env.
Offline — internal database only (KG + BM25):
```shell
python3 -m skills.autoresearch.run --request "..." --offline
```
| Provider | What | Env var | Status |
|---|---|---|---|
| kg | Keyword search across the graph | — | always |
| rag | BM25 full-text across markdown | — | always |
| web | Tavily Search API + content extraction | TAVILY_API_KEY | ready, key on VPS |
| e5 | Semantic vector search | EMBEDDER_URL | requires endpoint |
| notion | Notion databases/pages | NOTION_API_KEY | requires key + client_key |
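The environment-variable gating in the table can be checked up front. A minimal sketch — the provider/env-var pairs come from the table, while the function name is illustrative:

```python
import os

# Env var required per optional provider; kg and rag need none ("—" in the table).
PROVIDER_ENV = {
    "web": "TAVILY_API_KEY",
    "e5": "EMBEDDER_URL",
    "notion": "NOTION_API_KEY",
}

def available_providers(env=None) -> list[str]:
    """Return providers usable in this environment; kg and rag always are."""
    env = os.environ if env is None else env
    providers = ["kg", "rag"]
    providers += [p for p, var in PROVIDER_ENV.items() if env.get(var)]
    return providers
```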
A separate planning layer before gather — converts a natural-language query into brief.yaml + search_plan.yaml.
Usage protocol:
```python
from skills.autoresearch.plan_research import ResearchPlanner  # import path assumed from the CLI module name

# Step 0 — Prompt generation (automatic)
planner = ResearchPlanner()
req = planner.prepare("Compare KL with competitors", offline=True)

# Step 1 — Agent executes an LLM call with req.system_prompt + req.user_prompt
#          → receives raw_json with brief + search_plan

# Step 2 — Parsing and validation
result = planner.parse_response(raw_json)
assert result.is_valid

# Step 3 — Write artifacts
planner.write_artifacts(result, "/tmp/brief.yaml", "/tmp/search_plan.yaml")
```

Step 4 — Continue the standard flow:

```shell
python3 -m skills.autoresearch.run --brief /tmp/brief.yaml --offline
```
CLI (for debugging):
```shell
# Generate prompts
python3 -m skills.autoresearch.plan_research \
  --request "Compare Knowledge Layer with competitors" \
  --offline

# Parse LLM response and write artifacts
python3 -m skills.autoresearch.plan_research \
  --parse /tmp/planner_response.json \
  --output-brief /tmp/autoresearch_brief.yaml \
  --output-plan /tmp/autoresearch_search_plan.yaml
```
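When debugging, it can help to confirm the saved planner response actually carries both artifacts before handing it to `--parse`. A stdlib-only sketch — the top-level `brief` / `search_plan` key names are assumed from the flow above, and the helper is illustrative:

```python
import json

def check_planner_response(path: str) -> dict:
    """Load a planner LLM response and verify it has brief + search_plan."""
    with open(path, encoding="utf-8") as fh:
        doc = json.load(fh)
    missing = [key for key in ("brief", "search_plan") if key not in doc]
    if missing:
        raise ValueError(f"planner response missing: {', '.join(missing)}")
    return doc
```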
Designing new autoresearch components follows a contract-first approach.
Full protocol: references/DESIGN_FLOW.md
Order: design draft → schema → eval criteria → code → integrate → cleanup references/
Active designs: see references/DESIGN_FLOW.md → Active Documents
After successful write, remove temporary files:
```shell
rm -f /tmp/autoresearch_brief.yaml /tmp/autoresearch_synthesis.json checkpoint_*.json
```
Automated research subagent. Accepts a natural-language query, gathers sources from the knowledge graph (KG + BM25 + Web + Notion), performs SGR synthesis, and generates validated knowledge cards. Three launch paths: Python API (recommended), CLI planner-driven, or manual brief-first.