Use when a quest has enough evidence to draft or refine a paper, report, or research summary without inventing missing support.
Use this skill to turn accepted evidence into a faithful draft, report, or paper bundle. The goal is to produce the lightest honest writing artifact the current evidence can really support, not to polish prose past the evidence boundary. This skill intentionally absorbs the strongest old DeepScientist writing discipline, including:
- bash_exec; do not use any other terminal path for LaTeX builds, figure generation, scripted export, Git, Python, package-manager, or file-inspection commands.
- bash_exec for durable document-build commands such as LaTeX compilation, figure regeneration, and scripted export steps so logs remain quest-local and reviewable.
- artifact.interact(kind='milestone', reply_mode='threaded', ...) report instead of another short progress update.
- mist-stone as the paper-default palette: #F3EEE8, #D8D1C7, #8A9199
- sage-clay when the method-vs-baseline contrast needs one stronger but still muted accent: #E7E1D6, #B7A99A, #7F8F84
- dust-rose sparingly for secondary ablations or auxiliary comparisons: #F2E9E6, #D8C3BC, #B88C8C
- plan.md as the research map for the whole loop, including whether writing is closing the loop or preparing the next one
- PLAN.md as the active writing-node contract when drafting or revision is multi-step
- CHECKLIST.md as the current writing frontier with one real in-progress item and a short Next list
- decision
- write turns the current loop state into a faithful output, but it does not automatically mean the quest is complete
- plan.md with the actual next edge: finalize, return to experiment/analysis, or open the next loop from a new incumbent
When writing work is multi-step, create or refresh:
- PLAN.md as the active writing-node contract
- CHECKLIST.md as the writing execution frontier
At minimum the writing-node contract should state:
- Publication-grade figure refinement is recommended with AutoFigure-Edit (open-source: https://github.com/ResearAI/AutoFigure-Edit; online service: https://deepscientist).
- pdf or svg, plus one png preview when helpful.
- figure-polish/SKILL.md and complete its render-inspect-revise pass before treating the figure as final.
The write stage does not exist to make the quest sound finished. It exists to test whether the current evidence can support a stable narrative.
Writing should happen on a dedicated paper/* branch/worktree derived from the source main-experiment run/* branch.
Treat that paper branch as the writing surface, and treat the parent run branch as the evidence source that writing must faithfully reflect.
Do not run new main experiments from the paper branch; if writing exposes a missing evidence requirement, route back through decision, activate_branch, experiment, or analysis-campaign.
Once an outline is selected, treat that branch/worktree as an active paper line with its own contract, not just as a late draft folder.
If the evidence is incomplete, contradictory, or too weak, the correct output is:
- experiment, analysis-campaign, or scout
not a polished fiction.
For paper-like deliverables, the durable contract is outline-first, not prose-first. The approved outline should be a real structured object, typically containing:
- story
- ten_questions
- detailed_outline
  - title
  - abstract
  - 3 concrete research_questions
  - methodology
  - experimental_designs
  - contributions
Treat the approved outline as the paper contract, not just a narrative sketch. It should decide:
If the selected outline is missing those links, repair the outline and matrix before further drafting.
Prefer an author-facing outline folder under paper/outline/ with section-level files, and treat paper/selected_outline.json as the compiled compatibility view of that contract.
paper/evidence_ledger.json remains the runtime truth of what evidence actually exists and where it maps.
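The outline contract described above can be checked mechanically before drafting continues. The following is a minimal sketch, assuming the compiled outline is a JSON object whose top-level keys are story, ten_questions, and detailed_outline; the helper names check_outline and check_outline_file and the exact nesting are illustrative assumptions, not the skill's actual schema.

```python
import json

# Assumed key layout, taken from the field names listed in this skill;
# the real compiled schema may nest or name things differently.
REQUIRED_TOP_LEVEL = ("story", "ten_questions", "detailed_outline")
REQUIRED_DETAILED = (
    "title", "abstract", "research_questions",
    "methodology", "experimental_designs", "contributions",
)


def check_outline(outline: dict) -> list[str]:
    """Return human-readable problems; an empty list means the shape looks complete."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in outline:
            problems.append(f"missing top-level key: {key}")
    detailed = outline.get("detailed_outline", {})
    for key in REQUIRED_DETAILED:
        if not detailed.get(key):
            problems.append(f"detailed_outline.{key} is missing or empty")
    # The skill describes 3 concrete research questions as the typical shape.
    questions = detailed.get("research_questions") or []
    if questions and len(questions) != 3:
        problems.append(f"expected 3 research_questions, found {len(questions)}")
    return problems


def check_outline_file(path: str = "paper/selected_outline.json") -> list[str]:
    """Load the compiled outline from disk and run the same shape check."""
    with open(path, encoding="utf-8") as handle:
        return check_outline(json.load(handle))
```

A non-empty result is a signal to repair the outline before further drafting; it does not replace the evidence ledger as the record of what evidence actually exists.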
Before paper-ready or submission-facing writing, confirm:
- paper/evidence_ledger.json or paper/evidence_ledger.md should reflect the current mapped paper evidence set
- paper/paper_experiment_matrix.md should reflect the current paper-facing experiment and analysis frontier
For lighter draft-building work, a selected outline plus the core supporting evidence can be enough; do not block ordinary drafting on the full submission-hardening surface unless the current goal is actually paper-ready or submission-ready writing.
If major claims lack evidence, surface the gap first.
If the selected outline, outline folder, evidence ledger, or matrix feels underspecified, read references/outline-evidence-contract-example.md before drafting further.
For paper-facing work, use this hard order instead of drifting between surfaces:
- paper/selected_outline.json
- paper/evidence_ledger.json reflects the same mapped evidence set
Do not draft first and promise to repair the paper contract later.
If the current blocker set is not obvious from files, call artifact.get_paper_contract_health(detail='full') before deciding whether to keep writing or to return to contract repair / supplementary work.
If the active quest status, current workspace, recent durable runs, or pending interaction state is unclear after a restart, call artifact.get_quest_state(detail='summary') first.
If the exact current brief/plan/status/summary wording matters for the current drafting decision, call artifact.read_quest_documents(...) instead of relying on prompt-injected excerpts.
If you need earlier user/assistant continuity to interpret the current writing request, call artifact.get_conversation_context(...) before changing the route.
Use these as the canonical evidence base:
- artifact.arxiv(...) when arXiv papers had to be read closely
Do not rely on memory alone for numbers. Always prefer direct artifact paths for claims. Do not keep drafting from remembered storyline summaries if the active paper line already has a stricter durable contract in its outline folder, selected outline, evidence ledger, experiment matrix, or paper-facing analysis mirrors.
The write stage should usually produce most of the following:
- paper/outline/manifest.json
- paper/outline/sections/<section_id>/section.md
- paper/outline/sections/<section_id>/result_table.json
- paper/outline/sections/<section_id>/experiment_setup.md
- paper/outline/sections/<section_id>/findings.md
- paper/outline/sections/<section_id>/impact.md
- paper/outline.md or equivalent outline view
- paper/selected_outline.json
- paper/paper_experiment_matrix.md
- paper/paper_experiment_matrix.json
- paper/outline_selection.md
- paper/reviewer_first_pass.md
- paper/section_contracts.md
- paper/draft.md or equivalent draft
- paper/writing_plan.md or equivalent working plan
- paper/figure_storyboard.md
- paper/related_work_map.md
- paper/references.bib when citation management is needed
- paper/claim_evidence_map.json
- paper/latex/ with the selected venue template and active paper sources
- paper/paper_bundle_manifest.json or equivalent bundle manifest
- paper/figures/figure_catalog.json if figures exist
- paper/tables/table_catalog.json if tables exist
- paper/build/compile_report.json when a compiled paper bundle exists
- paper/proofing/proofing_report.md
- paper/proofing/page_images_manifest.json when rendered pages exist
- paper/proofing/language_issues.md
- paper/review/review.md or equivalent harsh self-review output
- paper/review/revision_log.md or equivalent revision ledger
- paper/review/submission_checklist.json
The exact paths may vary, but the structure and meaning should remain clear.
Treat the author-facing outline folder and compiled selected outline together as the authoritative blueprint for the draft.
If both exist, update the outline folder first and then keep paper/selected_outline.json synchronized as the compiled compatibility output.
Treat paper/draft.md or the equivalent working note as the running evidence ledger where useful findings, citation notes, and writing decisions are accumulated as work proceeds.
After every significant search, plot, paragraph, revision pass, or claim downgrade, update the working note and writing plan immediately so important writing state is not trapped in transient chat output.
For any substantial paper-writing line, keep paper/writing_plan.md or an equivalent durable plan detailed enough that another agent could resume from it without reconstructing the full logic from chat alone.
Also externalize the major writing reasoning into durable notes instead of leaving it only in transient chat. At minimum, keep these up to date when they are relevant:
- paper/outline_selection.md
- paper/claim_evidence_map.json
- paper/related_work_map.md
- paper/figure_storyboard.md
- paper/reviewer_first_pass.md
Prefer the same compact reasoning-note shape for those files when possible:
Also keep a compact authenticity checklist visible throughout the writing line. At minimum, repeatedly verify:
For any paper-like writing line that has more than a trivial single-result story, create and maintain:
- paper/paper_experiment_matrix.md
- paper/paper_experiment_matrix.json
Use references/paper-experiment-matrix-template.md when helpful.
Use references/outline-evidence-contract-example.md when the paper line needs a concrete example of section binding, required_items, and result_table updates.
The paper experiment matrix is the planning and reporting surface for the paper line.
It is not the master truth when it disagrees with the selected outline contract or paper/evidence_ledger.json.
It exists to prevent two common failures:
The matrix is not just an “analysis list”. It should cover the full paper-facing experiment program beyond the already-finished main run, including:
The matrix should also act as the ingestion gate for completed follow-up analysis:
- main_required, appendix, reference_only, or be excluded with a written reason
The outline should be revised in lockstep with that matrix:
- result_table
- experiment_setup.md, findings.md, and impact.md instead of leaving those changes only in prose notes
Case study is usually optional. Do not let it displace stronger quantitative evidence. Efficiency or cost experiments are not mandatory in every paper, but they should be added whenever:
Highlight-validation rule:
- highlight hypotheses
Typical highlight hypotheses include:
Each matrix row should normally record at least:
- exp_id
- title
- tier
  - main_required
  - main_optional
  - appendix
  - optional
  - dropped
- experiment_type
  - main_comparison
  - component_ablation
  - sensitivity
  - robustness
  - efficiency_cost
  - highlight_validation
  - failure_boundary
  - case_study_optional
- status
  - proposed
  - planned
  - ready
  - running
  - completed
  - analyzed
  - written
  - excluded
  - blocked
- feasibility_now
- claim_ids
- highlight_ids
- research_question
- hypothesis
- why_this_matters
- comparators
- fixed_conditions
- changed_variables
- metrics
- cost_budget
- minimal_success_criterion
- promotion_rule
- paper_placement
  - main_text
  - appendix
  - maybe
  - omit
- result_artifacts
- next_action
The matrix should also contain:
Main-text drafting gate:
- do not treat the main-text experimental story as settled while any currently feasible row that is not optional or dropped remains unaddressed
- each such row should be one of:
  - completed
  - analyzed
  - excluded with a real reason
  - blocked with a real reason
This does not forbid drafting the introduction, method, or placeholders early. It does forbid pretending the paper's experimental story is settled while the feasible experiment frontier is still open.
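The main-text drafting gate can be expressed as a small check over the matrix rows. This is a hedged sketch only: it assumes each row is a dict using the field names listed for the matrix, that feasibility_now is a boolean, and that a hypothetical reason field carries the written justification for excluded or blocked rows. Treating written as an addressed status is also an assumption of this sketch.

```python
# Statuses that count as "addressed". The first four come from the gate;
# treating "written" as addressed is an assumption of this sketch.
RESOLVED_STATUSES = {"completed", "analyzed", "excluded", "blocked", "written"}
# Tiers that the gate ignores.
EXEMPT_TIERS = {"optional", "dropped"}


def open_frontier(rows: list[dict]) -> list[str]:
    """Return exp_ids of feasible, non-exempt rows that are still unaddressed."""
    unresolved = []
    for row in rows:
        if row.get("tier") in EXEMPT_TIERS:
            continue
        # feasibility_now is assumed to be a boolean here.
        if not row.get("feasibility_now", False):
            continue
        status = row.get("status")
        if status not in RESOLVED_STATUSES:
            unresolved.append(row["exp_id"])
        elif status in {"excluded", "blocked"} and not row.get("reason"):
            # "with a real reason": an empty reason does not close the row.
            # The "reason" field name is hypothetical.
            unresolved.append(row["exp_id"])
    return unresolved
```

An empty result means the feasible frontier is closed under these assumptions; any returned exp_id is a row to finish, exclude, or block with a written reason before the experiments section is treated as settled.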
After every meaningful experiment outcome, even a null result or exclusion:
- paper_placement
Do not decide the next supplementary experiment from memory alone when the matrix exists.
The matrix should be the authoritative experiment-routing surface for the paper line, and the selected outline's experimental_designs should stay consistent with that matrix rather than drifting away from it.
Before drafting any section, verify all of the following:
- paper/paper_experiment_matrix.*
If any of those checks fails, stop drafting and repair the paper contract first.
For paper-like writing, use a real venue template rather than improvising a blank LaTeX tree.
Bundled templates live under templates/ inside this skill and are mirrored into each quest skill bundle.
Available starting points currently include:
- templates/iclr2026/
- templates/icml2026/
- templates/neurips2025/
- templates/colm2025/
- templates/aaai2026/
- templates/acl/
- templates/asplos2027/
- templates/nsdi2027/
- templates/osdi2026/
- templates/sosp2026/
Selection rules:
- templates/iclr2026/ as the default for general ML work unless a stronger venue target exists
- templates/icml2026/, templates/neurips2025/, templates/colm2025/, or templates/aaai2026/ when those venues better match the actual target
- templates/acl/ for ACL-style NLP / CL papers
- templates/asplos2027/, templates/nsdi2027/, templates/osdi2026/, or templates/sosp2026/ for systems papers
Before durable drafting, copy the chosen template directory into the active paper workspace's paper/latex/ and keep the template's main entry file as the build root.
Then draft inside that paper/latex/ tree instead of inventing a fresh scaffold.
Preserve upstream venue files unless a real compile fix or venue-specific adaptation requires a change.
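The copy step can be scripted so that it never clobbers an existing paper tree. The sketch below is illustrative only: install_template, skill_root, and paper_root are hypothetical names, and in a real quest the script would itself be run through bash_exec so the log stays quest-local.

```python
import shutil
from pathlib import Path


def install_template(skill_root: Path, venue: str, paper_root: Path) -> Path:
    """Copy templates/<venue>/ into <paper_root>/latex/ without overwriting an existing tree."""
    source = skill_root / "templates" / venue
    target = paper_root / "latex"
    if not source.is_dir():
        raise FileNotFoundError(f"no bundled template at {source}")
    if target.exists() and any(target.iterdir()):
        # Refuse to clobber active paper sources; the caller decides how to merge.
        raise FileExistsError(f"{target} already contains files")
    # dirs_exist_ok lets the copy proceed when paper/latex/ exists but is empty.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

Refusing to overwrite a non-empty paper/latex/ is a design choice of this sketch: it keeps upstream venue files and active sources from being silently replaced.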
These vendored templates were imported from Orchestra-Research/AI-Research-SKILLs/20-ml-paper-writing under the MIT license for local-first use.
Read templates/DEEPSCIENTIST_NOTES.md for the local selection guide and templates/README.md for the upstream template notes.
For paper-like deliverables, the safest default order is:
- paper/* branch/worktree derived from the source run branch before durable outline selection or drafting
- templates/, copy it into paper/latex/, and default general ML work to templates/iclr2026/ unless a stronger venue target exists
- artifact.submit_paper_outline(mode='candidate', ...)
- artifact.submit_paper_outline(mode='select'|'revise', ...); that selection should be treated as opening or refreshing the active paper line
- paper/outline/manifest.json and the relevant section files before stabilizing the experiments section
- paper/paper_experiment_matrix.md and paper/paper_experiment_matrix.json before stabilizing the experiments section
- artifact.create_analysis_campaign(...) before drafting the experiments section as if it were settled
- result_table row now reflects the real result rather than a placeholder
- paper/evidence_ledger.json and the paper line summary still agree before continuing prose work
- artifact.submit_paper_bundle(...) when the bundle is ready, and then pass to finalize
Before real drafting, force one explicit planning pass that stabilizes at least:
If these are still unstable, continue planning or route back for evidence instead of polishing prose early.
Do not rush into polished prose before evidence assembly, figure planning, and citation verification are far enough along to keep the draft honest.
If writing uncovers missing information, it is acceptable to return to focused literature search or artifact reading, but persist the findings immediately before resuming drafting.
If DeepXiv is declared available by the system prompt, prefer the DeepXiv route for paper-centric reference discovery and shortlist paper triage before broader open-web search.
If DeepXiv is declared unavailable, do not try to force it; stay on the legacy route.
Use web search to discover missing papers or references, and use artifact.arxiv(paper_id=..., full_text=False) when you need to actually read an arXiv paper rather than just locate it.
Only set full_text=True when the shorter view is insufficient for the needed detail.
Before treating related work coverage as adequate, run broad literature discovery and reading passes; for a normal paper-like deliverable, aim for roughly 30 to 50 verified references unless the scope clearly justifies fewer.
For substantial paper-like writing, the durable writing plan should usually include:
Treat that plan as an execution contract. Do not let drafting quietly outrun the current evidence inventory.
For reviewer-facing structure and section-level drafting contracts, read these references when the line needs sharper paper craft:
- references/paper-experiment-matrix-template.md
- references/reviewer-first-writing.md
- references/section-contracts.md
- references/sentence-level-proofing.md
Before drafting, assemble the current evidence base:
Also build an experiment inventory before outlining:
When building the matrix, do not reduce the candidate pool to “analysis experiments”. The inventory should explicitly consider:
If the method appears to have a likely practical or deployment-facing strength, test it directly instead of burying that possibility in prose.
If the method appears to have a likely conceptual highlight, write the corresponding highlight hypothesis and treat it as something that still needs evidence rather than something to assume.
If an experiment is too weak, too tiny, or poorly comparable, do not let it silently anchor a main claim.
As a strong default, experiments with very small evaluation support, such as <=10 effective examples or similarly fragile sample counts, should not carry a main-text claim unless the user explicitly accepts that limitation and the caveat is written next to the claim.
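The small-sample default can be written down as a one-line rule. This is a minimal sketch under stated assumptions: the function name and its parameters are hypothetical, the threshold of 10 comes from the text above, and "similarly fragile sample counts" still needs human judgment that the code does not capture.

```python
def may_anchor_main_claim(effective_examples: int,
                          user_accepted: bool = False,
                          caveat_written: bool = False,
                          threshold: int = 10) -> bool:
    """Apply the small-sample default: fragile evaluations need explicit acceptance plus a caveat."""
    if effective_examples > threshold:
        return True
    # At or below the threshold, both conditions must hold.
    return user_accepted and caveat_written
```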
If the draft will describe the method as a coherent proposal rather than a bag of edits:
Write down the intended claims first.
For each claim, ask:
When baseline numbers are used, also ask:
If evidence is missing, weak, or contradictory:
- experiment, analysis-campaign, or scout as needed
Do not scatter many tiny gap requests unless the quest truly needs that structure.
The storyline should be evidence-led:
For substantial lines, keep three layers explicit:
idea layer
information layer
section layer
A strong outline often benefits from a five-part story arc:
Keep the narrative discipline explicit:
- What: what exactly is claimed
- Why: what evidence supports it
- So What: why the reader or community should care
Useful near-source craft heuristics from strong ML writing guidance:
- title -> abstract -> introduction -> figures before reading methods carefully
Recommended writing-guide style suggestions for this stage:
- topic sentence -> evidence/detail -> implication -> bridge
When useful, reverse-engineer the story explicitly as:
And a three-part contribution frame:
Do not optimize for rhetorical drama over factual support.
Outline-construction rules:
- story, ten_questions, and detailed_outline
- story structure:
  - motivation
  - challenge
  - resolution
  - validation
  - impact
- ten_questions block instead of loose outline notes
- detailed_outline should usually preserve:
  - title
  - abstract
  - research_questions
  - methodology
  - experimental_designs
  - contributions
- 3 concrete research_questions
If the deliverable is a paper or paper-like report, pressure-test the outline against a compact question set before drafting:
Also pressure-test it with a reviewer-first scan:
- problem, what we do, how at a high level, and main result without jargon overload?
The outline should already imply what belongs in:
If a planned section has no credible evidence payload, shrink it before drafting instead of padding it with generic prose.
If the selected outline still requires uncollected evidence, route to an outline-bound analysis-campaign instead of drafting around the gap.
When several outline drafts exist, choose the winner explicitly rather than by vibe.
Prefer the outline that best satisfies the following paperagent-like rubric:
When recording the selection, explain:
Do not leave this reasoning only in transient chat.
Record it in paper/outline_selection.md or a durable report/decision artifact.
Draft the sections that the evidence can currently support, typically:
Method fidelity rules:
Paper-oriented drafting defaults:
- problem and stakes -> concrete gap/bottleneck -> remedy / core idea -> evidence preview -> contributions
- 1 to 1.5 pages and include 2 to 4 specific contribution bullets
- 2 to 3 sentences on the problem and why it matters now
- problem -> why it matters -> current bottleneck -> our remedy -> evidence preview
Sentence- and paragraph-level clarity suggestions:
- this result or this modification
Word-choice suggestions:
- very, really, basically, or essentially
- combining, modifying, or extending prior work unless that is honestly the best description
After the experiments section stabilizes, revisit the introduction and contribution framing. If the experimental outcome changed the real story, rewrite the introduction so that motivation, claimed contributions, and significance match the actual results rather than the earlier hope.
Never generate references from memory.
A thin bibliography created from convenience searches is not acceptable.
For a normal paper-like deliverable, the default target is roughly 30 to 50 verified references unless the scope clearly justifies fewer.
Every final citation must correspond to a real paper you verified from an actual source; do not cite from memory, model recall, or unverified secondary summaries.
Use one consistent citation workflow: SEARCH -> VERIFY -> RETRIEVE -> VALIDATE -> ADD.
For discovery, use Semantic Scholar by default or Google Scholar through normal manual search / export only.
Google Scholar has no official API, so do not treat Scholar scraping as a normal automated backend.
Use Crossref / DOI, arXiv, OpenAlex, and publisher metadata as verification or metadata backfill sources around that same workflow.
Store actual bibliography entries in paper/references.bib as valid BibTeX copied or exported from Google Scholar, Semantic Scholar-linked metadata, DOI/Crossref, publisher pages, or another legitimate metadata source.
Do not hand-write BibTeX entries from scratch.
For each important citation:
- paper/references.bib
Do not hide citation uncertainty.
Do not leave search findings only in transient chat state; persist them in the working draft or writing notes immediately.
If you must touch a BibTeX entry manually, limit it to mechanical cleanup of an already exported entry rather than authoring the citation metadata yourself.
Before artifact.submit_paper_bundle(...), do one explicit reference audit for count, existence, and claim-level spot checks.
If verification remains incomplete, do not present the draft or bundle as final.
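The count portion of the reference audit is easy to automate. The sketch below is a rough illustration, assuming paper/references.bib uses ordinary `@type{key,` entry headers; audit_bibtex is a hypothetical helper, and it covers only counting and duplicate keys. Existence and claim-level spot checks still require looking each paper up at a real source.

```python
import re
from collections import Counter

# Matches entry headers such as "@article{smith2020," and captures type and key.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")


def audit_bibtex(text: str, min_refs: int = 30) -> dict:
    """Count entries and flag duplicate keys in BibTeX source text."""
    keys = [key for kind, key in ENTRY_RE.findall(text)
            if kind.lower() not in {"comment", "string", "preamble"}]
    counts = Counter(keys)
    return {
        "count": len(keys),
        "duplicates": sorted(k for k, n in counts.items() if n > 1),
        # min_refs defaults to the lower end of the 30 to 50 target.
        "below_target": len(keys) < min_refs,
    }
```

A below_target result is a prompt to check whether the scope really justifies fewer references, not an automatic failure.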
Use these as the normal citation-resource stack for the workflow above:
- semanticscholar
- arxiv
- habanero for Crossref
Practical role split:
If the deliverable includes figures or tables:
- paper-plot and its bundled templates / scripts before improvising a fresh plotting stack
Selection rules:
When generating visuals:
- Publication-grade figure refinement is recommended with AutoFigure-Edit (open-source: https://github.com/ResearAI/AutoFigure-Edit; online service: https://deepscientist).
Each figure or table should be traceable to source artifacts.
Treat paper-plot as the default first-pass generator for standard bar, line, scatter, and radar figures built from measured data.
Treat figure-polish as the follow-up skill when that figure becomes durable, paper-facing, appendix-facing, or milestone-facing enough to require render-inspect-revise discipline.
Before paper-ready or submission-facing completion, a quick reviewer-first pass in paper/reviewer_first_pass.md is usually helpful.
That pass should answer:
Before declaring paper-ready or submission-facing writing complete, build a claim-evidence map.
For each key claim, record:
Also keep the related-work and figure reasoning explicit:
- in paper/related_work_map.md, record the closest competing methods, the comparison axes, and the exact claimed distinction
- in paper/figure_storyboard.md, record what question each figure/table answers, why it belongs in the main text or appendix, and the intended caption takeaway
Then run a harsh self-review when the current goal is paper-ready or submission-facing writing:
Also check:
The review should be section-aware. For each serious issue, record:
- finalize
The self-review output should also make the verification logic externally legible:
When useful, add explicit “questions for the author” style prompts to expose what still needs proof or clarification. If the draft is targeting publication quality, compare against a few strong nearby papers or templates only to raise quality, never to copy unsupported claims.
Run that review with an adversarial mindset:
When the draft is substantial enough to judge rather than merely sketch, consider opening review/SKILL.md for an independent skeptical audit before you call the paper task done.
Use that review pass to decide whether the next route is further writing, a claim downgrade, a literature audit, a baseline recovery step, or a reviewer-linked follow-up experiment campaign.
Do not treat a single self-review pass as the only reasonable stopping condition. For paper-style deliverables, one strong default is a multi-pass revision loop such as five passes:
For each pass:
If the draft still fails a critical pass, do not pretend the revision loop is complete.
If the output is paper-style:
- bash_exec session ids or exported bash_exec logs
For markdown-only deliverables, perform an equivalent rendered read-through rather than checking only source text. During that rendered read-through, explicitly inspect the first page for title clarity, abstract readability, contribution visibility, and early figure/table effectiveness.
Before marking the writing line complete, verify:
If a critical packaging issue remains, mark the stage as blocked or warn explicitly.
claim_evidence_map.json minimum shape
{
  "claims": [
    {
      "claim_id": "C1",
      "claim_text": "The method improves F1 on the target benchmark.",
      "support_status": "supported",
      "evidence_paths": [
        "artifacts/runs/run-main-001.json",
        "experiments/main/run-main-001/metrics.json"
      ],
      "caveats": ["Gain is strongest on split A."]
    }
  ]
}
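A claim-evidence map in this shape can be checked against the workspace so that a supported claim never points at missing files. This is a hedged sketch: unsupported_claims is a hypothetical helper, root is assumed to be the quest workspace directory, and the check only confirms that paths exist, not that their contents actually support the claim.

```python
from pathlib import Path


def unsupported_claims(claim_map: dict, root: Path) -> list[str]:
    """Return claim_ids marked supported whose evidence paths are empty or missing on disk."""
    flagged = []
    for claim in claim_map.get("claims", []):
        if claim.get("support_status") != "supported":
            continue
        paths = claim.get("evidence_paths", [])
        # A supported claim needs at least one path, and every listed path must exist.
        if not paths or not all((root / p).exists() for p in paths):
            flagged.append(claim["claim_id"])
    return flagged
```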
figure_catalog.json minimum shape
{
  "figures": [
    {
      "id": "F1",
      "path": "paper/figures/fig1.pdf",
      "script_path": "paper/figures/generate_figures.py",
      "source_artifacts": ["artifacts/runs/run-main-001.json"],
      "claim_ids": ["C1"],
      "style_notes": {
        "grayscale_safe": true
      }
    }
  ]
}
table_catalog.json minimum shape
{
  "tables": [
    {
      "id": "T1",
      "path": "paper/tables/table1.tex",
      "source_artifacts": ["artifacts/runs/run-main-001.json"],
      "claim_ids": ["C1"],
      "layout_notes": {
        "overflow_checked": true
      }
    }
  ]
}
compile_report.json minimum shape
{
  "success": true,
  "status": "passed",
  "entry_path": "paper/main.tex",
  "pdf_path": "paper/build/paper.pdf",
  "log_path": "paper/build/latexmk.log",
  "page_images_manifest_path": "paper/proofing/page_images_manifest.json",
  "visual_recheck_completed": true
}
page_images_manifest.json minimum shape
{
  "pages": [
    {
      "page": 1,
      "image_path": "paper/proofing/page-001.png",
      "audit_notes": ["Main figure readable", "No visible overflow"]
    }
  ]
}
submission_checklist.json minimum shape
{
  "overall_status": "ready",
  "checks": [
    {
      "key": "references_integrity",
      "status": "pass",
      "notes": "Verified citations recorded."
    }
  ],
  "blocking_items": [],
  "handoff_ready": true
}
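A checklist in this shape can be sanity-checked so that a ready status is never recorded alongside failing checks or open blockers. The sketch below is an assumption about what ready should imply, not a documented rule of the skill; checklist_is_consistent is a hypothetical helper.

```python
def checklist_is_consistent(checklist: dict) -> bool:
    """A 'ready' checklist should have no blocking items and no non-passing checks."""
    if checklist.get("overall_status") != "ready":
        # Only the 'ready' claim is validated here; other statuses pass through.
        return True
    all_pass = all(c.get("status") == "pass" for c in checklist.get("checks", []))
    no_blockers = not checklist.get("blocking_items")
    return all_pass and no_blockers and bool(checklist.get("handoff_ready"))
```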
Stage-start requirement:
- memory.list_recent(scope='quest', limit=5)
- memory.search(...) before drafting, major revision, or claim restructuring
Use memory for reusable lessons only, such as:
Do not use memory as the only record of the draft state.
Preferred memory usage:
papers:
decisions:
knowledge:
knowledge:
templates:
Use tags to refine meaning when helpful, for example:
- stage:write
- type:writing-playbook
- type:evidence-ledger
- type:citation-check
- type:proofing-lesson
When calling memory.write(...), pass tags as an array like ["stage:write", "type:writing-playbook", "type:evidence-ledger"], not as one comma-joined string.
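The array-versus-string rule is easy to get wrong when tags are assembled from text. A minimal sketch of a guard is shown below; normalize_tags is a hypothetical helper, and it deliberately does not call memory.write(...) because that tool's full signature is not shown here.

```python
def normalize_tags(tags) -> list[str]:
    """Accept a list or a comma-joined string and return a clean list of tags."""
    if isinstance(tags, str):
        # Split an accidental comma-joined string back into separate tags.
        tags = tags.split(",")
    return [t.strip() for t in tags if t and t.strip()]
```

The resulting list can then be passed as the tags array, for example the normalized form of "stage:write, type:citation-check".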
Recommended read timing:
- papers, decisions, and knowledge
- references/reviewer-first-writing.md and references/section-contracts.md when the narrative shape is still unstable
- decisions and writing-related knowledge
- references/sentence-level-proofing.md when the failure is mainly about readability, wording, or sentence quality
Write quest memory when:
Stage-end requirement:
- memory.write(...) before leaving the stage
Promote to global memory only when the lesson is clearly reusable beyond this quest.
Typical artifact sequence:
Preferred artifact choices:
report for:
decision for:
milestone for:
- approval when the user explicitly confirms a submission-critical choice
- artifact.submit_paper_outline(mode='candidate'|'select'|'revise', ...) for the real outline lifecycle instead of leaving outline choice only in prose
- mode='select', treat the selected outline as the activation point of the active paper line and keep its folder/json contract synchronized
- artifact.submit_paper_bundle(...) before leaving the writing stage when the draft, plan, references, and packaging evidence are durable enough
- paper/* branch/worktree after analysis slices finish; treat the parent run or idea branch as the evidence source, not the drafting surface
Keep each writing artifact tightly linked to evidence paths.
Common blocked states:
Record blocked writing clearly and route the quest to the correct next step.
Use these references when the deliverable is paper-like and you need a denser operating checklist:
- references/revision-checklist.md
- references/paper-section-playbook.md
Exit the write stage only when one of the following is durably true:
- finalize, including an active paper line, a selected outline, synchronized outline contract files, and a durable paper bundle manifest when the deliverable is paper-like
For paper-like writing, do not treat the draft as evidence-complete enough for finalize while paper/paper_experiment_matrix.* still contains currently feasible non-optional rows that remain unresolved.
A good writing pass leaves a clearer draft, a clearer gap, or a clearer route-back decision, not an endless polishing loop.
running example -> intuition -> formalism