Guide users through creating effective research questions for automated paper discovery. Use when the user wants to set up automated paper discovery or create new search queries.
This skill guides you through helping users create research questions for automated paper discovery in the Thoth system.
Research questions in Thoth define:
Goal: Understand what the user wants to research.
Questions to Ask:
Actions:
Example Conversation:
User: I want to find papers about AI memory systems.
Agent: Great! Let's set up automated discovery for AI memory systems.
To help create an effective research question, could you tell me more about what
specific aspects of AI memory systems you're interested in? For example:
- Long-term memory in language models?
- Memory architectures for agents?
- Episodic vs semantic memory?
- Or all of the above?
Goal: Identify effective search keywords.
Process:
Guidance:
Example:
Agent: Based on your interest in AI memory systems, here are some keyword suggestions:
- Core terms: "memory", "retrieval", "agent memory"
- Related: "context management", "long-term memory", "episodic memory"
- Technical: "memory architectures", "stateful agents", "memory augmentation"
Which of these resonate with your research focus? Any others you'd like to add?
Goal: Choose appropriate discovery sources.
Action: Call list_available_sources to show options.
Available Sources:
Guidance for Selection:
Example:
Agent: Now let's choose which sources to query. I'll show you what's available.
[calls list_available_sources]
For AI memory systems research, I'd recommend:
- arxiv: For the latest CS preprints
- semantic_scholar: For AI-focused search
- openalex: For comprehensive coverage
Would you like to use these three, or would you prefer to query all sources?
Goal: Identify relevant academic categories/topics.
For ArXiv: Suggest specific categories
For Other Sources: General topic areas work
Example:
Agent: For ArXiv searches, I can filter by specific categories. For your research,
I'd suggest:
- cs.AI (Artificial Intelligence)
- cs.LG (Machine Learning)
Should we include both of these?
Goal: Set up automated discovery schedule.
Options:
Default Time: 3:00 AM (can be customized)
Example:
Agent: How often would you like to check for new papers?
- Daily: Best for fast-moving fields like AI/ML
- Weekly: Good for most research areas
- Monthly: For slower-paced or niche topics
I'd recommend daily checks for AI memory systems since it's an active area.
Goal: Set relevance and filtering criteria.
Parameters:
min_relevance_score: 0.0-1.0 (default: 0.7)
max_articles_per_run: 1-500 (default: 50)
Example:
Agent: Let's set quality thresholds. I recommend:
- Minimum relevance: 0.7 (captures relevant papers without too much noise)
- Max articles per run: 50 (manageable daily volume)
Would you like to adjust these?
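If the user asks for different values, the documented ranges can be checked before the values are passed on. This is a minimal sketch that assumes both ranges are inclusive; `validate_thresholds` is a hypothetical helper for illustration, not a Thoth tool.

```python
def validate_thresholds(min_relevance_score=0.7, max_articles_per_run=50):
    """Return a list of problems; an empty list means both values are in range.

    Assumption: the documented ranges (0.0-1.0 and 1-500) are inclusive.
    """
    problems = []
    if not 0.0 <= min_relevance_score <= 1.0:
        problems.append("min_relevance_score must be between 0.0 and 1.0")
    if not isinstance(max_articles_per_run, int) or not 1 <= max_articles_per_run <= 500:
        problems.append("max_articles_per_run must be a whole number from 1 to 500")
    return problems
```

An empty result means the values can go into the summary as they are; otherwise the messages can be relayed to the user in plain language.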
Goal: Confirm all parameters before creation.
Actions:
Example:
Agent: Let me summarize your research question:
**Name**: AI Memory Systems Research
**Keywords**: memory, retrieval, agent memory, stateful agents
**Topics**: cs.AI, cs.LG
**Sources**: arxiv, semantic_scholar, openalex
**Schedule**: Daily at 3:00 AM
**Min Relevance**: 0.7
**Max Articles**: 50 per run
Does this look good, or would you like to refine anything?
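The summary above can be produced from the same parameter dictionary that is later sent to create_research_question, which keeps the confirmation text and the actual call in sync. The sketch below is illustrative only: `to_12_hour` and `summarize_question` are hypothetical helper names, and the field names are taken from the creation example in this guide.

```python
def to_12_hour(hhmm):
    """Convert a 24-hour 'HH:MM' string such as '03:00' to '3:00 AM'."""
    hour, minute = (int(part) for part in hhmm.split(":"))
    suffix = "AM" if hour < 12 else "PM"
    return f"{hour % 12 or 12}:{minute:02d} {suffix}"


def summarize_question(params):
    """Render the confirmation summary, one '**Label**: value' line per field."""
    schedule = (
        f"{params['schedule_frequency'].capitalize()} "
        f"at {to_12_hour(params['schedule_time'])}"
    )
    rows = [
        ("Name", params["name"]),
        ("Keywords", ", ".join(params["keywords"])),
        ("Topics", ", ".join(params["topics"])),
        ("Sources", ", ".join(params["selected_sources"])),
        ("Schedule", schedule),
        ("Min Relevance", params["min_relevance_score"]),
        ("Max Articles", f"{params['max_articles_per_run']} per run"),
    ]
    return "\n".join(f"**{label}**: {value}" for label, value in rows)
```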
Goal: Execute the creation.
Action: Call create_research_question with finalized parameters.
Parameters to Pass:
{
    'name': 'AI Memory Systems Research',
    'keywords': ['memory', 'retrieval', 'agent memory', 'stateful agents'],
    'topics': ['cs.AI', 'cs.LG'],
    'authors': [],  # Optional, if user mentioned specific researchers
    'selected_sources': ['arxiv', 'semantic_scholar', 'openalex'],
    'schedule_frequency': 'daily',
    'schedule_time': '03:00',
    'min_relevance_score': 0.7,
    'max_articles_per_run': 50,
    'auto_download_pdfs': True
}
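One way to assemble these parameters from the conversation is a small builder that fills in the defaults stated in this guide (3:00 AM, 0.7, 50). This is a sketch, not Thoth's implementation: `build_question_params` is a hypothetical name, and defaulting `auto_download_pdfs` to `True` is an assumption that simply mirrors the example above.

```python
def build_question_params(
    name,
    keywords,
    topics,
    selected_sources,
    schedule_frequency,
    authors=None,
    schedule_time="03:00",        # default time stated in this guide
    min_relevance_score=0.7,      # default stated in this guide
    max_articles_per_run=50,      # default stated in this guide
    auto_download_pdfs=True,      # assumption: mirrors the example payload
):
    """Assemble the dictionary passed to create_research_question."""
    return {
        "name": name,
        # dict.fromkeys drops repeated keywords while keeping their order
        "keywords": list(dict.fromkeys(keywords)),
        "topics": list(topics),
        "authors": list(authors or []),
        "selected_sources": list(selected_sources),
        "schedule_frequency": schedule_frequency,
        "schedule_time": schedule_time,
        "min_relevance_score": min_relevance_score,
        "max_articles_per_run": max_articles_per_run,
        "auto_download_pdfs": auto_download_pdfs,
    }
```

Keeping `schedule_frequency` as a required argument reflects that the guide asks the user for it explicitly rather than assuming a value.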
Success Response:
Agent: ✓ Research question created successfully!
Your automated discovery is now active. The system will:
- Search arxiv, semantic_scholar, and openalex daily at 3:00 AM
- Look for papers matching your keywords and topics
- Download PDFs for articles scoring above 0.7 relevance
- Process up to 50 articles per run
You can manually trigger discovery anytime with the run_discovery_for_question tool.
Would you like to run it now to see initial results?
User: I'm not sure which sources to use.
Agent: No worries! For computer science and AI research, I'd typically recommend
starting with arxiv and semantic_scholar - they're the most comprehensive for
this field. We can always adjust later if you're not getting what you need.
Or if you'd like, we can use "*" to search all available sources. What sounds better?
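The "*" option can be thought of as a wildcard over whatever list_available_sources returns. How Thoth expands it internally is not described here, so the following is only a sketch of the idea; `resolve_sources` is a hypothetical helper and `available` stands in for the tool's result.

```python
def resolve_sources(selected, available):
    """Expand the "*" wildcard and reject source names that are not available.

    Assumption: "*" anywhere in the selection means "use every available source".
    """
    if "*" in selected:
        return list(available)
    unknown = [name for name in selected if name not in available]
    if unknown:
        raise ValueError(f"Unknown sources: {', '.join(unknown)}")
    return list(selected)
```

Rejecting unknown names early gives the agent a chance to offer the available options again instead of creating a question that silently skips a source.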
User: I want daily ArXiv alerts for "transformer architectures" in cs.AI and cs.LG.
Agent: Perfect! Let me set that up for you.
[calls list_available_sources to show options]
I'll create a research question that searches ArXiv daily for transformer
architectures in cs.AI and cs.LG. Should I include any related terms like
"attention mechanism" or "self-attention" to catch more relevant papers?
User: I want to learn about quantum computing.
Agent: Great topic! Let's set up automated discovery. To help narrow it down,
what aspect of quantum computing interests you most? For example:
- Quantum algorithms?
- Quantum error correction?
- Quantum hardware/qubits?
- Applications to cryptography or optimization?
- General quantum computing research?
User: I'm interested in both quantum computing and AI safety.
Agent: Those are both fascinating areas! I'd recommend creating separate research
questions for each, since they'd use different keywords and potentially different
sources. This gives you better-targeted results.
Would you like to start with quantum computing or AI safety?
Problem: User says "just use AI"
Solution:
Agent: "AI" is pretty broad - we'd get thousands of papers! Let's be more specific.
What area of AI? For example:
- Machine learning?
- Natural language processing?
- Computer vision?
- Robotics?
Problem: User says "I want papers about AI, ML, NLP, computer vision..."
Solution:
Agent: That covers a lot of ground! You might get overwhelmed with results.
I'd suggest either:
1. Create focused research questions for each area (recommended)
2. Or pick your top 2-3 priorities to start with
Which approach sounds better?
Problem: User doesn't know how often to check
Solution:
Agent: Here's a rule of thumb:
- Fast-moving fields (AI/ML, COVID research): Daily
- Steady fields (most science): Weekly
- Niche or historical topics: Monthly
For your topic, I'd suggest [recommendation]. You can always adjust this later!
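The rule of thumb above amounts to a small lookup. In this sketch the pace labels ("fast", "steady", "niche") and the function name are hypothetical; falling back to weekly follows the guide's note that weekly suits most research areas.

```python
# Hypothetical pace labels mapped to the schedule values used in this guide.
FREQUENCY_BY_PACE = {
    "fast": "daily",      # fast-moving fields such as AI/ML
    "steady": "weekly",   # most research areas
    "niche": "monthly",   # niche or historical topics
}


def suggest_frequency(pace):
    """Suggest a schedule_frequency; unrecognized paces fall back to weekly."""
    return FREQUENCY_BY_PACE.get(pace, "weekly")
```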
Once the research question is created, explain next steps:
Remember: Your role is to be a helpful guide, not just a data collector. Help users:
Be friendly, patient, and focused on their research goals!