Distill accepted Google Summer of Code and open-source proposal corpora into reusable drafting strategy, scoring heuristics, project-research workflows, and revision guidance. Use when Codex needs to write, critique, outline, score, or strengthen a GSoC proposal; compare a draft against prior accepted proposals; turn project notes into a week-by-week plan; scan a target codebase; inspect GitHub issues and pull requests from the applicant, maintainers, and other contributors; or refresh proposal guidance from a local corpus stored on disk.
Use this skill to convert a local proposal archive into practical drafting guidance, then combine that wisdom with codebase and GitHub research for a specific target project. Prefer the bundled scripts for corpus analysis, repo research, and scoring, and load the reference files only after the generated artifacts exist.
Be explicit that this is corpus distillation, not model fine-tuning. Treat the output as evidence-backed heuristics gathered from examples on disk.
Use `/Users/nodesagar/Documents/Experiments/fos-proposals` as the default corpus root unless the user provides a different archive.

Run:
```bash
scripts/run_pipeline.py --corpus-root /Users/nodesagar/Documents/Experiments/fos-proposals
```
This generates:
- `references/generated/manifest.json`
- `references/generated/corpus_data.json`
- `references/generated/patterns.json`
- `references/generated/scores.json`
- `references/generated/analysis_prompts.md`

Run:
```bash
scripts/06_research_project.py \
  --repo-path /path/to/local/repo \
  --github owner/repo \
  --user applicant-github-login
```
This generates a repo-aware brief under:
- `references/generated/project_research/<slug>.json`
- `references/generated/project_research/<slug>.md`

What this research should cover:
- `references/gsoc_wisdom.md`.
- `references/project_intelligence.md` when you need repo/GitHub research guidance.
- `references/generated/patterns.json` when you need prevalence, baselines, or scoring weights.
- `references/generated/analysis_prompts.md` when you need scaffolding prompts for drafting, critique, or timeline generation.
- `references/generated/scores.json` when you want examples of stronger and weaker structures inside the corpus.
- `references/generated/corpus_data.json` only when you need document-level features or section inventories.
- The `references/generated/project_research/<slug>.md` brief for the target project before writing motivation, scope, deliverables, or timeline.

Default blueprint:
If the organization has its own template, preserve the required headings but still inject the corpus-derived signals above.
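That merge can be sketched in a few lines. This is a minimal illustration, not part of the bundled scripts; the template headings and baseline sections below are hypothetical:

```python
# Hypothetical sketch: keep an org template's required headings in order,
# then append any corpus-baseline sections the template lacks.
ORG_TEMPLATE = ["Synopsis", "Benefits to Community", "Timeline"]          # from the org
CORPUS_BASELINE = ["Motivation", "Deliverables", "Timeline", "About Me"]  # mined signals

def merge_outline(template, baseline):
    """Preserve template order; add missing baseline sections at the end."""
    merged = list(template)
    seen = {h.lower() for h in template}
    merged += [h for h in baseline if h.lower() not in seen]
    return merged

print(merge_outline(ORG_TEMPLATE, CORPUS_BASELINE))
# ['Synopsis', 'Benefits to Community', 'Timeline', 'Motivation', 'Deliverables', 'About Me']
```

The point is that required headings always win on order and naming; corpus-derived sections only fill gaps.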
Use project research to strengthen:
Run:
```bash
scripts/04_score_proposals.py \
  --patterns references/generated/patterns.json \
  --input /path/to/draft.md
```
Use the output to patch missing sections, weak evidence, vague deliverables, or an under-specified timeline.
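To make the idea concrete, here is a minimal sketch of the kind of section-coverage heuristic such a scorer can apply. The section names and weights are hypothetical; the real weights come from `references/generated/patterns.json`:

```python
# Hypothetical section-coverage heuristic; illustrative only.
import re

EXPECTED_SECTIONS = {   # hypothetical weights, not the mined ones
    "motivation": 2.0,
    "deliverables": 3.0,
    "timeline": 3.0,
    "about me": 1.0,
}

def score_draft(markdown: str) -> float:
    """Return the weighted fraction of expected sections found as headings."""
    headings = {h.strip().lower()
                for h in re.findall(r"^#+\s*(.+)$", markdown, re.MULTILINE)}
    earned = sum(w for name, w in EXPECTED_SECTIONS.items()
                 if any(name in h for h in headings))
    return earned / sum(EXPECTED_SECTIONS.values())

draft = "# Motivation\n...\n# Timeline\n..."
print(round(score_draft(draft), 2))  # 0.56 -> missing deliverables drags the score
```

A low score pinpoints which high-weight sections are absent, which is exactly the patch list the real scorer produces.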
- `scripts/01_walk_corpus.py`: index markdown, text, and PDF proposals in a corpus directory.
- `scripts/02_extract_structure.py`: extract headings, section signals, timelines, links, and proposal features.
- `scripts/03_mine_patterns.py`: aggregate corpus-wide baselines and scoring guidance.
- `scripts/04_score_proposals.py`: score a corpus or a single draft against the distilled heuristics.
- `scripts/05_prepare_prompts.py`: generate reusable prompt scaffolds from the mined patterns.
- `scripts/06_research_project.py`: inspect a local repo plus GitHub activity and generate a proposal-ready project brief.
- `scripts/run_pipeline.py`: run the full pipeline in order.