Lightweight automated research pipeline for searching, filtering, and extracting skills from academic papers without knowledge graph infrastructure. Use for: quick research synthesis, paper-to-skill conversion, domain literature surveys. Activation: automated research, paper to skill, research pipeline, quick literature review, skill extraction from papers.
Lightweight automated research pipeline for searching, filtering, and extracting reusable skills from academic papers. Provides a middle ground between basic search and full knowledge graph workflows: no KG infrastructure is required.
Search → Filter → Score → Select → Extract → Create Skill → Sync
Search arXiv with multiple related keywords to maximize coverage:

```python
keywords = [
    "systems engineering",
    "system design",
    "distributed systems",
    "control systems",
    "complex systems",
]

all_papers = []
for kw in keywords:
    papers = await search_arxiv(kw, max_results=15, days=30)
    all_papers.extend(papers)
```
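`search_arxiv` is assumed to be provided by the environment. If you need a concrete fallback, the public arXiv Atom API can back it; a sketch (the URL builder and the offline Atom parser below are illustrative, with field names chosen to match the paper dicts used in later steps):

```python
import urllib.parse
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

def build_arxiv_query_url(keyword, max_results=15):
    # The arXiv API accepts search_query, start, max_results, and sort options.
    params = {
        "search_query": f'all:"{keyword}"',
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return "http://export.arxiv.org/api/query?" + urllib.parse.urlencode(params)

def parse_arxiv_atom(xml_text):
    """Parse an arXiv Atom feed into the paper dicts used by this pipeline."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall(f"{ATOM_NS}entry"):
        papers.append({
            # e.g. "http://arxiv.org/abs/2401.00001v1" -> "2401.00001v1"
            "id": entry.findtext(f"{ATOM_NS}id", "").rsplit("/", 1)[-1],
            # Collapse the whitespace arXiv wraps long fields with.
            "title": " ".join(entry.findtext(f"{ATOM_NS}title", "").split()),
            "abstract": " ".join(entry.findtext(f"{ATOM_NS}summary", "").split()),
            "authors": [a.findtext(f"{ATOM_NS}name", "")
                        for a in entry.findall(f"{ATOM_NS}author")],
            "categories": [c.get("term")
                           for c in entry.findall(f"{ATOM_NS}category")],
        })
    return papers
```

Fetching `build_arxiv_query_url(...)` with any HTTP client and feeding the response body to `parse_arxiv_atom` yields the same shape of paper dicts the rest of the pipeline expects.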
Remove duplicates by arXiv ID:

```python
seen_ids = set()
unique_papers = []
for p in all_papers:
    if p["id"] not in seen_ids:
        seen_ids.add(p["id"])
        unique_papers.append(p)
```
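The same first-wins dedup can also be written with `dict.setdefault`, since dicts preserve insertion order (Python 3.7+); a small sketch:

```python
def dedupe_first(papers):
    """First-wins dedup by arXiv ID, relying on insertion-ordered dicts."""
    by_id = {}
    for p in papers:
        by_id.setdefault(p["id"], p)  # keep only the first paper seen per ID
    return list(by_id.values())
```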
Score papers based on domain-specific criteria:

```python
def score_paper(paper, domain_keywords, category_bonuses):
    """
    Score paper relevance.

    Args:
        paper: Paper dict with title, abstract, categories
        domain_keywords: Dict of {keyword: weight}
        category_bonuses: Dict of {category: bonus}

    Returns:
        relevance_score: Integer score
    """
    score = 0
    title_lower = paper["title"].lower()
    abstract_lower = paper["abstract"].lower()
    categories = [c.lower() for c in paper.get("categories", [])]

    # Keyword scoring
    for kw, weight in domain_keywords.items():
        if kw in title_lower:
            score += weight * 2  # Title match = higher weight
        elif kw in abstract_lower:
            score += weight

    # Category bonuses (substring match against lowercased categories)
    for cat, bonus in category_bonuses.items():
        if any(cat in c for c in categories):
            score += bonus

    return score
```
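A self-contained run of the scorer (the sample paper, weights, and bonuses below are made-up examples, not values the skill prescribes):

```python
def score_paper(paper, domain_keywords, category_bonuses):
    # Same logic as above: title matches count double, and category
    # bonuses use substring matching on lowercased arXiv categories.
    score = 0
    title_lower = paper["title"].lower()
    abstract_lower = paper["abstract"].lower()
    categories = [c.lower() for c in paper.get("categories", [])]
    for kw, weight in domain_keywords.items():
        if kw in title_lower:
            score += weight * 2
        elif kw in abstract_lower:
            score += weight
    for cat, bonus in category_bonuses.items():
        if any(cat in c for c in categories):
            score += bonus
    return score

sample = {
    "title": "Distributed Systems Design for Control",
    "abstract": "We apply a control systems approach to large deployments.",
    "categories": ["cs.DC", "eess.SY"],
}
weights = {"distributed systems": 3, "control systems": 2}
bonuses = {"cs.dc": 2, "eess.sy": 1}

# "distributed systems" in title (3*2) + "control systems" in abstract (2)
# + both category bonuses (2 + 1) = 11
print(score_paper(sample, weights, bonuses))  # → 11
```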
Sort by score and select the top N:

```python
# Add scores to the deduplicated papers
for p in unique_papers:
    p["relevance_score"] = score_paper(p, domain_keywords, category_bonuses)

# Sort and select
unique_papers.sort(key=lambda x: x["relevance_score"], reverse=True)
top_papers = unique_papers[:3]  # Select top 3
```
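When only a handful of papers are kept from a large candidate pool, `heapq.nlargest` selects the top N without sorting the whole list (O(n log k) instead of O(n log n)); a sketch with toy scores:

```python
import heapq

papers = [
    {"title": f"paper-{i}", "relevance_score": s}
    for i, s in enumerate([4, 9, 1, 7, 3])
]

# Returns the 3 highest-scoring papers, already in descending order.
top_papers = heapq.nlargest(3, papers, key=lambda p: p["relevance_score"])
print([p["relevance_score"] for p in top_papers])  # → [9, 7, 4]
```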
Extract skill patterns from selected papers:

```python
def extract_skill_pattern(paper):
    """
    Extract skill components from a paper.

    Returns dict with:
    - name: Skill name (derived from title)
    - description: Core contribution
    - activation_keywords: Domain terms
    - core_concepts: Key ideas
    - arxiv_id, authors, category: Provenance metadata
    """
    return {
        "name": derive_skill_name(paper["title"]),
        "description": extract_contribution(paper["abstract"]),
        "activation_keywords": extract_keywords(paper),
        "core_concepts": extract_concepts(paper),
        "arxiv_id": paper["id"],
        "authors": paper["authors"],
        "category": paper["category"],
    }
```
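The helpers (`derive_skill_name`, `extract_contribution`, `extract_keywords`, `extract_concepts`) are not defined by this skill. A naive sketch of the first three; these heuristics are placeholder assumptions, not the skill's actual extraction method:

```python
import re

def derive_skill_name(title):
    # Naive slug: lowercase alphanumeric words joined by hyphens, capped at 6.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words[:6])

def extract_contribution(abstract):
    # Placeholder: take the first sentence of the abstract.
    return abstract.split(". ")[0].strip().rstrip(".") + "."

def extract_keywords(paper, top_n=5):
    # Placeholder: most frequent non-stopword terms in title + abstract.
    stop = {"the", "and", "for", "with", "this", "that", "our", "are"}
    text = (paper["title"] + " " + paper["abstract"]).lower()
    counts = {}
    for w in re.findall(r"[a-z]{3,}", text):
        if w not in stop:
            counts[w] = counts.get(w, 0) + 1
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:top_n]]
```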
Generate SKILL.md from template:
---