Create and maintain scientific/empirical experiment or investigation reports in Notion via MCP from freeform artifacts. Default to direct Notion editing with a Claude-assisted wording/structure pass when available; support local-first draft QA before publishing when explicitly requested. Use the report shape that best communicates the evidence instead of forcing one narrative format.
Encourage use of multiple agents/subagents when it is likely to improve speed, quality, or confidence.
Split work into clear packets with owners, inputs, acceptance checks, and a synthesis step when parallelizing.
Use single-agent execution when scope is small or coordination overhead outweighs gains.
Proactive autonomy and knowledge compounding
Be proactive: immediately take the next highest-value in-scope action when it is clear.
Default to autonomous execution: do not pause for confirmation between normal in-scope steps.
Request user input only when absolutely necessary: ambiguous requirements, material risk tradeoffs, missing required data/access, or destructive/irreversible actions outside policy.
If blocked by command/tool/env failures, attempt high-confidence fallbacks autonomously before escalating (for example rg -> find/grep, python -> python3, alternate repo-native scripts).
When the workflow uses plan/, ensure required plan directories exist before reading/writing them (create when edits are allowed; otherwise use an in-memory fallback and call it out).
Related Skills
plan/
Treat transient external failures (network/SSH/remote APIs/timeouts) as retryable by default: run bounded retries with backoff and capture failure evidence before concluding blocked.
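The bounded-retry rule above can be sketched as a small helper. The attempt count, backoff base, and retryable exception types are illustrative defaults, not a fixed policy:

```python
import time

def with_bounded_retries(fn, attempts=3, base_delay=1.0,
                         retryable=(TimeoutError, ConnectionError)):
    """Run fn with bounded retries and exponential backoff.

    Collects each failure message so evidence is available if every
    attempt fails and the work must be reported as blocked.
    """
    failures = []
    for attempt in range(attempts):
        try:
            return fn()
        except retryable as exc:
            failures.append(f"attempt {attempt + 1}: {exc}")
            if attempt < attempts - 1:
                # Exponential backoff: base, 2x base, 4x base, ...
                time.sleep(base_delay * (2 ** attempt))
    # All attempts failed: surface the captured evidence, then conclude blocked.
    raise RuntimeError("blocked after retries; evidence: " + "; ".join(failures))
```

Capturing the per-attempt messages is what lets the final "blocked" report carry concrete failure evidence rather than a bare "it failed".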
On repeated invocations for the same objective, resume from prior findings/artifacts and prioritize net-new progress over rerunning identical work unless verification requires reruns.
Drive work to complete outcomes with verification, not partial handoffs.
Treat iterative execution as the default for non-trivial work; run adaptive loop passes. Example loops (adapt as needed, not rigid): issue-resolution investigate -> plan -> fix -> verify -> battletest -> organise-docs -> git-commit -> re-review; cleanup scan -> prioritize -> clean -> verify -> re-scan; docs audit -> update -> verify -> re-audit.
Keep looping until actual completion criteria are met: no actionable in-scope items remain, verification is green, and confidence is high.
Run organise-docs frequently during execution to capture durable decisions and learnings, not only at the end.
Create small checkpoint commits frequently with git-commit when changes are commit-eligible, checks are green, and repo policy permits commits.
Never squash commits; always use merge commits when integrating branches.
Prefer simplification over added complexity: aggressively remove bloat, redundancy, and over-engineering while preserving correctness.
When you touch code, leave the touched area in a better state than you found it: clearer, simpler, tidier, and at least as performant unless the task requires an explicit trade-off.
Use simple, plain English in user messages, docs, notes, reports, code comments, and other explanatory writing. Avoid jargon, fancy wording, and complex phrasing. When a technical term is needed for correctness, explain it in simple words the first time.
Default to short user-facing responses. Think about what the user most wants to know, and lead with that. Do not dump every detail by default.
Always include important changes, blockers, verification gaps, and any important assumptions, nuances, principles, or decisions that shaped the work. Add more detail only when the user asks for it or when uncertainty or risk makes it necessary.
Compound knowledge continuously: keep docs/ accurate and up to date, and promote durable learnings and decisions from work into docs.
Long-task checkpoint cadence
For any non-trivial task (including long efforts), run recurring checkpoint cycles instead of waiting for a single end-of-task wrap-up.
At each meaningful milestone with commit-eligible changes, and at least once per major phase, invoke git-commit to create a small logical checkpoint commit once relevant checks are green and repo policy permits commits.
At the same cadence, invoke organise-docs whenever durable learnings/decisions appear, and prune stale plan/ scratch artifacts.
If either checkpoint is blocked (for example failing checks or low-confidence documentation), resolve or record the blocker immediately and retry before expanding scope.
Interruption handoff (required for multi-pass report work)
Before stopping, switching sessions, or yielding after tool/session interruption, write a concise handoff in plan/handoffs/ or plan/current/notes.md.
Include the report/page identity, latest evidence bundle reviewed, mutations already published, remaining open edits, and the exact next step.
On resume, load the handoff first so the report is continued in place instead of re-audited from scratch.
Create and refine a single canonical report page in Notion from evidence artifacts.
This skill is cross-project by default. It is designed for scientific/empirical reports in any domain. Domain overlays (for example trading/market experiments) are optional and only apply when relevant.
Terminal state contract (must follow)
The skill is complete only when all of the following are true:
Objective completion: the user-requested outcome is achieved, or explicitly marked blocked with concrete blocker evidence.
Workflow completion: every required workflow step is resolved as done, blocked, or not-applicable, with brief evidence or rationale.
Step-level terminal completion: each numbered subtask must have explicit completion evidence (artifact, command output, or written rationale) before advancing.
Verification completion: required checks/validations for this skill are executed, or any unavailable checks are explicitly called out with impact.
Findings completion (where applicable): report only evidence-backed findings; if no high-confidence critical findings are present, explicitly state that.
Loop completion: no actionable in-scope next step remains under the current objective.
Stop only after this terminal contract is satisfied; otherwise continue iterating.
Terminal state examples (adapt to skill)
done: canonical report page is created/updated with required section order, explicit opener answer, evidence-backed claims, and post-update verification checks.
blocked: Notion parent resolution, page ownership constraints, or upload/write capability prevents compliant publishing after bounded retries; blocker evidence and exact unblock steps are reported.
not-applicable: optional visuals/upload steps are explicitly skipped with rationale when the report has no visual requirements.
Core principles (must follow)
Notion ownership gate is mandatory for existing-object mutations:
when available, verify ownership with stable created_by user/bot IDs (not display names alone)
creating new pages/entries in approved shared destinations (for example Quant Blog) is allowed when explicit in-task user intent is present
before mutating any existing object (update, move, delete, comment, append), verify ownership from created_by and resolved user identity
after creation, enforce the same ownership checks on subsequent mutations of that created entry/page
if ownership is ambiguous or cannot be verified, fail closed and stop with a blocker
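A minimal sketch of the fail-closed ownership gate, assuming metadata shaped like a Notion API page object (a created_by object carrying a stable id). The allowlist entries are placeholders for the real principals:

```python
# Hypothetical allowlist of stable principal IDs (bot/user), not display names.
ALLOWED_PRINCIPAL_IDS = {"bot-id-placeholder", "user-id-placeholder"}

def ownership_gate(page_metadata: dict) -> bool:
    """Fail-closed check before mutating an existing Notion object.

    Verifies the stable created_by id against the allowlist; display
    names are treated as labels only. Any ambiguity returns False,
    which means: stop and report a blocker instead of mutating.
    """
    created_by = page_metadata.get("created_by") or {}
    principal_id = created_by.get("id")
    if not principal_id:
        return False  # ownership cannot be verified -> fail closed
    return principal_id in ALLOWED_PRINCIPAL_IDS
```

Note the gate never consults the name field, so a renamed or impersonated display name cannot pass the check.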
no fixed input schema required
accept freeform evidence inputs: notes, logs, tables, plots, artifacts, and run outputs
descriptive-first reporting by default: factual, evidence-backed, and non-prescriptive unless user explicitly requests guidance
choose the report shape that best communicates the evidence: question/answer, objective/outcome, comparison-first, narrative walkthrough, or another structure when that is the clearest fit
use explicit Question -> Answer flow when the real reader need is a concrete question and answer, but do not force Q/A scaffolding when another structure is clearer
if a report is not naturally question-driven (for example operational status, incident summary, design comparison, or evidence ledger), state the objective and outcome plainly instead of forcing Q/A labels
avoid narration that adds mood but not evidence (for example "this gave a clear picture" or similar framing)
remove time-window framing such as overnight unless that timing matters to interpretation
unless failure behavior is the explicit experiment question, do not make failure-rate metrics the primary headline; treat them as reliability context supporting core outcomes
one canonical page per report identity; refine in place (no v2/final forks)
default publish destination for generated reports is Quant Blog (database row/page entries), unless the user explicitly asks for a different destination
consolidation-first: keep the fewest report pages that still map cleanly to distinct report identities/questions
plot-first for quantitative reporting; tables are supporting precision
in Quant Blog, keep date in the Date property and keep Title as <clear human title> (no leading date token in title text)
use rerun in titles only when the report explicitly links to the prior report/experiment being rerun
avoid opaque shorthand/acronym-heavy naming in titles/headings/captions (for example coded labels like RB1200 or B500); prefer plain-language wording
keep titles human-facing: do not include hash-like search/run IDs in titles (for example short hex tokens)
when same-topic reports represent distinct runs, keep separate pages and differentiate titles with plain-language descriptors (scope/design/coverage), not internal IDs
for experiment/search reports, record machine-facing IDs (search ID, run ID, cluster job IDs, artifact pointers) in a final Provenance Appendix, not the title
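The naming rules above can be approximated with a heuristic title linter. The regexes are illustrative and will need per-project tuning; they flag hex-like run/search IDs, leading date tokens, and coded shorthand such as RB1200 or B500:

```python
import re

def title_hygiene_issues(title: str) -> list[str]:
    """Heuristic checks for human-facing report titles."""
    issues = []
    # Hash-like IDs: 7+ hex characters containing at least one digit.
    if re.search(r"\b(?=[0-9a-f]*\d)[0-9a-f]{7,}\b", title, re.IGNORECASE):
        issues.append("hash-like ID in title")
    # Leading ISO date token: belongs in the Date property, not the title text.
    if re.match(r"^\d{4}-\d{2}-\d{2}", title):
        issues.append("leading date token")
    # Coded shorthand like RB1200 or B500: prefer plain-language descriptors.
    if re.search(r"\b[A-Z]{1,3}\d{2,}\b", title):
        issues.append("opaque coded label")
    return issues
```

Flagged titles should be renamed with plain-language scope/design descriptors, and the machine-facing IDs moved to the Provenance Appendix.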
explicit opener emphasis hierarchy for scanability:
bold label-first opener
concise high-signal callouts
bold key metrics/status tokens only (no blanket bolding)
opener clarity is mandatory:
first Top Takeaways line must contain the clearest high-level framing for the report: either the explicit report question and direct answer, or the explicit objective and outcome
opener wording must be self-contained plain language; a new reader should not need prior thread context
when the opener is question-driven, use direct answer tokens (yes, no, inconclusive, or equivalent), not vague status-only wording
if outcome is mixed, map it to explicit direct-answer or objective/outcome wording in the opener (for example runtime yes, performance no or objective met for runtime, not for performance)
avoid opener text that names status without concrete content
when editing existing reports, normalize legacy opener labels to the clearest opener framing for the actual report instead of forcing one fixed label pattern
if more than one key question is covered, use numbered Question n / Answer n pairs with strict adjacency (each answer immediately follows its question)
for multi-question or mixed-certainty investigations, add a dedicated Question Decomposition section immediately after Top Takeaways using strict Question n -> Answer n -> Status n triples, with one concrete empirical evidence line in each answer
keep the opener focused on the true experiment objective; do not let reliability/failure-rate stats become the lead claim unless failure analysis is the stated question
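A rough validator for the opener rules, assuming the first Top Takeaways line is available as plain text. The token lists are illustrative and deliberately incomplete:

```python
import re

# Status words that are non-compliant when they appear alone.
VAGUE_STATUS_ONLY = {"mixed", "answered", "improved", "done", "complete"}

def opener_is_explicit(first_line: str) -> bool:
    """Heuristic: the first Top Takeaways line should carry either a
    direct answer token (yes/no/inconclusive) or explicit
    objective/outcome wording, never status-only wording."""
    line = first_line.strip().lower()
    if line.rstrip(".") in VAGUE_STATUS_ONLY:
        return False  # status named without concrete content
    has_answer_token = re.search(r"\b(yes|no|inconclusive)\b", line) is not None
    has_objective_framing = (
        ("objective" in line and "outcome" in line) or "objective met" in line
    )
    return has_answer_token or has_objective_framing
```

A mixed outcome still passes, because the rule maps it to explicit tokens ("yes for runtime, no for performance") rather than the bare word "mixed".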
if user scope is a report collection/hub, audit every in-scope report page one-by-one (no sampling)
legacy-summary fallback rule: if the first summary section is not Top Takeaways (for example Key Takeaways or Executive Summary), enforce the same explicit opener rule in that section's first line
for collection-wide cosmetic requests, restrict edits to presentation and readability (formatting/emphasis/label normalization) and preserve reported numbers, conclusions, and claim text unless the user explicitly asks for content changes
during readability/cosmetic passes, remove duplicate headings and repeated near-identical intro paragraphs when they are clearly redundant and the change does not alter meaning
during refinement passes, aggressively remove duplicate sections, repeated claims, and repeated captions unless each repetition adds new evidence
for comparative experiment reports, include a dedicated Eval Setup section directly after Experiment Definition
Eval Setup must describe eval-time behavior in plain English (what checkpoint is evaluated, what is fixed vs randomized, and why the comparison is or is not apples-to-apples)
never use the heading Eval Setup Clarified; use Eval Setup
reports and report-generation helpers are ephemeral artifacts:
keep under plan/
never commit under tracked code paths (tools/, src/, experiments/, docs/)
traceability must be shared-reader friendly:
when relevant, include repository, branch, PR, and commit references so another reader can retrace the work
never include personal machine paths or other user-specific local filesystem references in the report body
when file/path references are useful, write them relative to the repository root (for example src/train.py, not repo-root/src/train.py)
when a GitHub remote exists, prefer clickable GitHub links for repo, PR, commit, and file references
if no GitHub link can be derived safely, fall back to repo name + branch + commit hash + repo-root-relative paths
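The link-derivation fallback above can be sketched as follows; only the standard GitHub SSH and HTTPS remote shapes are assumed:

```python
import re

def github_base_url(remote_url: str):
    """Derive a clickable GitHub base URL from a git remote, or None when
    the remote is not GitHub (then fall back to repo + branch + commit text)."""
    ssh = re.match(r"git@github\.com:(.+?)(?:\.git)?$", remote_url)
    https = re.match(r"https://github\.com/(.+?)(?:\.git)?/?$", remote_url)
    match = ssh or https
    return f"https://github.com/{match.group(1)}" if match else None

def commit_link(base: str, commit: str) -> str:
    return f"{base}/commit/{commit}"

def file_link(base: str, commit: str, repo_relative_path: str) -> str:
    # Pin file links to a commit so the reference stays stable for readers.
    return f"{base}/blob/{commit}/{repo_relative_path}"
```

Pinning file links to a commit (blob/<hash>/<path>) keeps them valid even after the branch moves.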
keep report presentation human-facing:
reports should read as human-authored and human-consumable
do not add Codex/Claude labels in report titles or main body sections
keep exactly one small attribution footer note at the end containing Prepared with support from Codex and Claude. codex-managed: true
if any extra Codex/Claude references are found during refinement, remove them before publish
Notion-native formatting is mandatory:
do not leave literal markdown control tokens in published Notion text blocks (for example `code` or **bold**)
convert emphasis/code intent to Notion rich-text annotations before finalizing
after publish, run a token-hygiene verification pass and normalize any residual markdown markers in place
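The token-hygiene pass can be sketched as a scan over fetched blocks, assuming the Notion API block shape ({"id": ..., "type": ..., "<type>": {"rich_text": [...]}}). The patterns are heuristics and may need tightening:

```python
import re

MD_TOKEN_PATTERNS = (
    re.compile(r"\*\*[^*]+\*\*"),         # literal **bold** markers
    re.compile(r"`[^`]+`"),               # literal `code` markers
    re.compile(r"(?<!\w)_[^_]+_(?!\w)"),  # literal _italic_ (ignores snake_case)
)

def blocks_with_literal_markdown(blocks: list) -> list:
    """Return IDs of blocks whose plain text still contains literal
    markdown control tokens that should have become rich-text annotations."""
    flagged = []
    for block in blocks:
        payload = block.get(block.get("type", ""), {})
        text = "".join(rt.get("plain_text", "")
                       for rt in payload.get("rich_text", []))
        if any(p.search(text) for p in MD_TOKEN_PATTERNS):
            flagged.append(block["id"])
    return flagged
```

Flagged blocks are then normalized in place by re-writing them with proper bold/code annotations instead of the literal markers.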
keep primary evidence expanded by default:
do not hide core results, primary plots, or key findings inside collapsible/toggle blocks
use toggle/collapsible blocks only when the user explicitly requests that layout
if original numeric artifacts still exist, regenerate publication-grade plots from source data; do not use lightweight Notion-native chart specs (for example Mermaid xychart-beta) for quantitative findings
do not upload report artifacts to third-party public/temporary file hosts (for example tmpfiles.org, catbox, imgur, file.io) unless the user explicitly approves
default to Notion-managed file/image uploads for report visuals (prefer notion-upload-local when available); if upload is unavailable in the current toolchain, add an Artifacts to attach section with placeholders rather than using unapproved external hosts
hide local paths, hostnames, tokens, and secrets in report body text
define acronyms on first use
write for first-time readers with no project background; do not assume internal context
minimize project-specific jargon in headings, captions, and body text; prefer plain everyday wording
when technical terms are required for scientific precision, define them in simple language at first use
keep section headings and plot labels self-explanatory so an outside reader can infer the purpose immediately
define short encoded labels at first use and near the plot that uses them (for example n4, r8, or similar run-scale labels); do not assume the reader knows local shorthand
define environment family names in plain English near first use; if the report uses internal environment labels, add a short reader-facing explanation beside them or in an adjacent definitions block
Local-first mode (only when user explicitly requests)
build a deterministic local bundle before Notion writes:
report body (.html or .md)
summary payload (.json)
high-resolution plots (>=220 dpi)
use full artifact rows for core aggregate/distribution visuals unless explicitly labeled sampled
do not downsample plotted data by default; this is especially required for time-series plots
preserve native time cadence for time-series plots by default (no implicit temporal aggregation/resampling, e.g. day->month)
prefer high-fidelity rendered images (PNG/SVG) generated from source artifacts over inline chart DSL blocks
use Mermaid/Notion-native charts only as an explicit fallback when source data is irretrievably unavailable
if downsampling or temporal aggregation is unavoidable for tooling/render limits, disclose it explicitly in both caption and methodology (method + factor + why)
validate local quality before publishing:
explicit axis labels + units
readable legends
no overlap/clipping
decision-useful without zooming
store local bundles in plan/artifacts/
keep temporary helper scripts in plan/experiments/ (or similar ephemeral plan/ path)
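One way to lay out the deterministic bundle, assuming the plan/ root is writable. The file names (report.md, summary.json) are illustrative conventions; plot files are written separately into the same directory by the plotting helper at >=220 dpi:

```python
import json
from pathlib import Path

def build_local_bundle(plan_root: str, report_id: str,
                       body_md: str, summary: dict) -> Path:
    """Write the local-first bundle under plan/artifacts/<report_id>/
    before any Notion write."""
    bundle = Path(plan_root) / "artifacts" / report_id
    bundle.mkdir(parents=True, exist_ok=True)
    (bundle / "report.md").write_text(body_md, encoding="utf-8")
    # sort_keys keeps the summary payload byte-stable across reruns,
    # which is what makes the bundle deterministic and diffable.
    (bundle / "summary.json").write_text(
        json.dumps(summary, indent=2, sort_keys=True), encoding="utf-8")
    return bundle
```

Because the payload is byte-stable, a rerun that produces an identical bundle is cheap to detect and skip.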
Allowed-owner existing pages only (must follow)
Codex may create new pages/database entries in approved destinations when explicit user intent exists. Codex may mutate existing pages/database entries only when they are owned by the allowlisted principals:
1) Top Takeaways
disallowed opener pattern: status-only wording with no concrete question/objective text
use question/answer lines as the main driver only when that is the clearest fit for the report
if another structure explains the evidence better, use it instead and keep the opener explicit
legacy heading fallback: if first summary heading is Key Takeaways/Executive Summary, apply the same first-line opener rule there
include the most important outcome immediately
if there is a clear before/after baseline relation, include explicit delta line (before -> after, absolute + relative where available)
apply emphasis hierarchy:
bold label-first opener
concise callouts for major win/loss tradeoff when both exist
selective bolding of key metrics/status only
if multiple questions are answered, list numbered Question n then Answer n lines directly under the opener, keeping each answer immediately after its paired question
when the report includes several sub-questions or mixed answer certainty, include a dedicated Question Decomposition section immediately after Top Takeaways with Question n -> Answer n -> Status n ordering and one empirical evidence line per answer
answer wording that directly maps to the question terms (no implicit inference required)
explicit answer status for this batch (yes/no/inconclusive or equivalent) with brief evidence-backed why
3) Eval Setup (for experiment/search reports)
place immediately after Experiment Definition and before any results/plots section
explicit eval-time setup details in plain English:
which checkpoint/policy snapshot is evaluated
exact eval composition and overrides (for example opponent setup or learned-agent count)
what is fixed at eval vs re-sampled/randomized
domain-randomization behavior at eval for key variables (for example fees and starting capital): fixed value(s) vs sampled range(s), and sampling cadence
explicit apples-to-apples statement across compared runs/batches
if any detail is unknown, mark it unknown and include it in Limitations / Reliability
4) Early Results Section
place early, after Eval Setup
include compact high-signal visuals first
prioritize visuals that answer the report question directly
use a plain-language heading that fits the report, such as Quick Aggregate Snapshot or another direct results label; avoid corporate/executive wording unless the page already uses it consistently
order visuals from simple to more complex, or from broad context to the more detailed evidence the reader needs next
if broad aggregate plots are too coarse or hide too much, demote them to a quick scan only and make run-level or matched-comparison plots the main evidence section
when aggregate plots remain, state clearly that they are a first scan and not the main evidence
5) Definitions and Methodology
define domain terms used in the report context
define formulas for derived metrics and signs/directions
include unit unavailable placeholders for unknown units; do not guess
6) Limitations / Reliability
use this heading text exactly: Limitations / Reliability
include missing data, incomplete runs, comparability gaps, and caveats
keep reliability emphasis proportional to impact
do not make completion/failure the hero narrative unless it materially changes conclusions
when failure rates are not the explicit research target, report them as reliability support (not the primary report focal point)
7) Run Spotlight (when run-level time-series artifacts exist)
include 1-3 explicitly selected runs
when run-level traces are the most decision-useful evidence, make this section the main evidence section rather than a side note
when a project already has a standard single-run plot set, include that standard set unless there is a strong reason not to
if both standard single-run plots and fuller all-points plots exist, keep both when they answer different reader needs and explain the role of each section
when a project has true one-episode or one-run evaluation charts, include them explicitly; do not treat training-path or checkpoint-path plots as a substitute
if both actual episode charts and training/checkpoint plots exist, label the sections so the reader can tell the difference immediately
show normalized trajectory views when possible (for example growth from 1.00x)
for return trajectories, prefer relative capital units and state the scale explicitly in captions (for example 1.0 = +100% net return)
for turnover/exposure trajectories, prefer relative units (x capital/day, turns/day, x equity) and define directional interpretation for signed exposure
when trace artifacts exist, do not stop at a single compressed overview image
include at least one dedicated performance-path figure for each spotlighted run with:
capital multiple or equivalent PnL trajectory
underwater or max-drawdown path
daily turnover multiple or equivalent daily activity path
include at least one dedicated inventory/book-shape figure for each spotlighted run with:
per-symbol position or target-position time series
gross and net exposure time series
if both target and realized positions are available, prefer showing both somewhere across the spotlight package; if only one can be shown clearly, prefer the more decision-relevant one and state that choice
if unavailable, explicitly call out artifact gap in limitations
prefer plain-English titles and labels over team shorthand; define any unavoidable technical term in simple words nearby
keep captions direct and objective; do not use story-like scene-setting or hype language
order plot groups so the reader can follow them easily from basic to harder, and from broad scan to detailed proof
if broad aggregate visuals hide too much variation, do not let them dominate the page; prefer per-run, matched, or progression plots instead
when a timing plot shows raw duration rather than normalized speed, say that plainly in the caption or immediately next to the plot
when a timing plot is not apples-to-apples on its own, point the reader to the normalized speed plot or the fair comparison plot
if a run-level plot uses training-log rows that mix outer iterations and inner training steps, say that plainly so repeated or rising timing lines are not misread
For tables:
clear title
explicit column headers + units
clear derived-field names
short caption with Takeaway: lead-in
General:
plot-first for quantitative interpretation
prefer fewer high-signal visuals over many redundant ones
prefer relative/normalized units in primary visuals when both raw and relative forms exist (keep raw units as secondary context)
when full-resolution full-detail run plots exist, prefer them as the core evidence section instead of relying mainly on broad summaries
no downsampling by default, especially for time-series visuals
no implicit temporal aggregation for time-series visuals; keep native cadence unless explicitly justified
avoid Mermaid xychart-beta (or equivalent low-fidelity DSL charts) for empirical metric plots when source artifacts exist
if downsampling or temporal aggregation is unavoidable, disclose exact method/factor and expected visual impact
Cross-project refinement loop (must follow)
keep the skill project-agnostic: domain specifics are overlays, not defaults
benchmark draft quality against the strongest prior report style available for the current project/domain
keep what improved readability/interpretability and remove low-signal sections/visuals
use selective emphasis only (bold key conclusions/metrics/status), not blanket formatting
ensure the final report remains descriptive-first unless recommendations were explicitly requested
Optional domain overlays
Use only when relevant to the current project/domain.
Trading/market experiments (optional overlay)
include trader/quant lens and ML/DL lens in Top Takeaways
common metrics may include PnL, Sharpe, drawdown, turnover, execution intensity, and robustness
prefer interpretable intensity metrics (for example turns/day) over raw fill counts unless fill mechanics are the focus
Cluster-backed artifacts (optional overlay)
verify run-level artifact availability directly (read-only) before declaring missing
when available, use retained forensics/root-cause fields for failure attribution rather than wrapper labels only
Claude-assisted drafting
Default for report wording/structure refinement when Claude is available.
command: claude -p --model opus --effort high
Codex remains source-of-truth for data prep, validation, and Notion writes
when available, verify ownership using stable created_by user/bot IDs and treat display names as labels only
creating new entries/pages in approved shared destinations is allowed when explicit user intent is present
for existing target objects, read metadata and verify created_by principal is allowlisted before mutating
for create actions in shared databases/workspaces, require explicit user intent for that destination; then enforce ownership checks on the created/updated entry itself
if ownership cannot be verified with high confidence, fail closed and report blocked
If visuals are required, verify notion-upload-local upload capability before publishing image blocks.
Run at least one Claude wording/structure pass when available, including prompt context for:
reader-first plain-English writing with minimal jargon
explicit eval-time setup clarity and apples-to-apples statement for comparative reports
Resolve parent:
normalize/verify user-supplied URL/ID with mcp__notion-local__API-retrieve-a-page
otherwise enumerate/search candidate parents and validate
Create/update canonical page:
derive report identity from evidence (experiment/search/run IDs, batch label/date, variant)
when publishing to Quant Blog, search existing rows by Project + Title (+ Date when available)
create only if no matching canonical row exists
update in place otherwise
avoid creating duplicates for title wording changes
ensure Quant Blog properties are correct (Project, Date, Title without leading date prefix)
include one small footer attribution note at end with Prepared with support from Codex and Claude. codex-managed: true
before finalizing, confirm no other Codex/Claude references remain in title/body
when relevant, add a compact traceability section or Provenance Appendix entries that include:
repository identity
branch name
PR link/number when applicable
commit hash references for the analyzed or reported state
repo-root-relative file paths for important touched files/artifacts
clickable GitHub links when a GitHub remote exists
for experiment/comparison reports, verify eval-time configuration details against available experiment definitions/search spaces/results artifacts before claiming comparability
when performing cosmetic-only passes, avoid rewriting analytical content; normalize rich-text formatting and opener labels in place
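The canonical-row matching step above (search Quant Blog by Project + Title, plus Date when available; create only when no match exists) can be sketched as a Notion database query filter. The property names and types (Project as select, Title as the title property, Date as date) are assumptions about the Quant Blog schema and should be checked against the real database:

```python
def quant_blog_match_filter(project: str, title: str, date_iso: str = None) -> dict:
    """Build a Notion database query filter for locating an existing
    canonical row before creating a new one."""
    clauses = [
        {"property": "Project", "select": {"equals": project}},
        {"property": "Title", "title": {"equals": title}},
    ]
    if date_iso:
        clauses.append({"property": "Date", "date": {"equals": date_iso}})
    return {"filter": {"and": clauses}}
```

If the query returns a row, update it in place; only an empty result justifies creating a new page, which prevents duplicates from mere title-wording changes.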
Upload visuals:
prefer notion-upload-local tool path for direct Notion-managed uploads
if upload tooling is unavailable, add an Artifacts to attach section with placeholders rather than external-host workarounds
place visuals in the main narrative section (Executive Visual Snapshot) rather than appending them after footers/appendices
after upload, verify visuals are anchored at intended locations and remove misplaced duplicate image blocks
after upload, replace default raw file-name captions with plain-English Takeaway: captions or a short human-readable description
when updating an existing visual, archive/delete the superseded block so only one current plot version is visible per plot identity
verify replacement integrity by re-fetching both superseded and new block IDs (old = archived, new = active)
Implementation note for Notion MCP block updates:
API-update-a-block payload must use the concrete block-type key directly (for example paragraph, heading_2, bulleted_list_item) rather than a nested type wrapper
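For example, a compliant update payload puts the concrete block-type key at the top level, with Notion rich-text objects inside it:

```python
def block_update_payload(block_type: str, rich_text: list) -> dict:
    """Shape the API-update-a-block payload: the block-type key
    (paragraph, heading_2, bulleted_list_item, ...) is the top-level
    key, not a value inside a nested {"type": ...} wrapper."""
    return {block_type: {"rich_text": rich_text}}

# correct: {"paragraph": {"rich_text": [...]}}
# wrong:   wrapping the same content under a nested "type" key
```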
Deduplication behavior:
keep exactly one canonical page per report identity in target reports location
prefer consolidation into fewer pages when multiple pages represent the same identity/question
when duplicates exist for same data identity, retain one canonical page; move/archive others only with user approval
when pages are same topic but distinct run identities, retain both and rename for clear human differentiation
Output contract:
return created/updated Notion page URL (or ID)
Image embedding constraints (Notion reality)
preferred rendering path is Notion-managed file/image uploads
do not use third-party public file hosting by default
external hosting is allowed only with explicit user approval and should prefer first-party/org-controlled infrastructure
data: URIs and local-only schemes are not durable in Notion
after write, re-fetch and verify image/file blocks still have non-empty URLs and non-zero payloads at verification time
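The URL half of that verification can be sketched as a pure check over re-fetched blocks, assuming the Notion API image/file block shape (Notion-managed files carry {"file": {"url": ...}}, external ones {"external": {"url": ...}}); confirming non-zero payloads additionally requires fetching each URL, which is outside this sketch:

```python
def broken_image_blocks(blocks: list) -> list:
    """Return IDs of image/file blocks whose URL is empty or missing
    after publish, so they can be re-uploaded or flagged."""
    broken = []
    for block in blocks:
        if block.get("type") not in ("image", "file"):
            continue
        payload = block.get(block["type"], {})
        source = payload.get("file") or payload.get("external") or {}
        if not source.get("url"):
            broken.append(block["id"])
    return broken
```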
Embedding sequence:
notion-upload-local direct file/image upload (Notion-managed)
browser upload fallback (if available) on Codex-managed page (Notion-managed)
reuse existing Notion-managed file/image blocks when valid
if all fail, add Artifacts to attach section with neutral labels + meaning (do not use unapproved third-party hosts)
Iteration loop and finish criteria
run multi-pass reader-centric refinement:
pass 1: structure and narrative arc
pass 2: skeptical-reader questions and caveats
pass 3: clarity and emphasis hierarchy
in every pass, check for and remove duplicate information, stale wording that no longer matches the page structure, and headings that are less clear than a simpler plain-English alternative
after each pass: update same canonical page and re-fetch to verify landing
done when:
required sections present
opener clearly states the main question and answer, or the main objective and outcome, whichever better matches the report
opener framing is unambiguous to a cold reader
top-of-page readback is clean: no duplicate summary headings, no repeated intro blocks, and the opening screen reads naturally in plain English
when multiple questions are present, Q/A ordering is explicit and adjacent (Question n then Answer n)
when Question Decomposition is present, each Question n has adjacent Answer n and Status n lines, and each Answer n includes one concrete empirical evidence line
experiment/comparison reports include a dedicated Eval Setup section immediately after Experiment Definition
Eval Setup includes explicit fixed-vs-randomized behavior and an apples-to-apples statement (or explicit mismatch caveat)
empirical/comparison report section order is explicit and preserved:
Top Takeaways -> Experiment Definition -> Eval Setup -> Early Results Section -> Definitions and Methodology -> Limitations / Reliability (+ Run Spotlight when applicable)
when both broad aggregate plots and full-detail run plots are present, the page should make clear which section is the quick scan and which section is the main evidence
short run-scale labels, environment names, timing-metric meanings, and the compute shape behind the compared points are explained at first use and near the relevant plots
when standard single-run plots exist for the project, the report includes them or explicitly says why they were skipped
experiment/search reports include a final Provenance Appendix with machine-facing identifiers
where repo context matters, traceability references are present and shareable: repo, branch, PR, commit, and repo-root-relative path references with GitHub links when available
claims are evidence-grounded
formatting hygiene checks pass (no residual literal markdown markers like `...` or **...** in published block text)
visuals are in their intended section context (usually Executive Visual Snapshot), not detached at page end
visual/table labeling standards met
no privacy leakage
no actionable quality gaps remain
for collection-scope tasks, every in-scope report page was individually checked and opener clarity is verified
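The formatting-hygiene check above is mechanical enough to script. A minimal sketch that flags residual literal markdown markers in published block text (collecting the block texts from the page is assumed to be handled elsewhere):

```python
import re

# Literal markdown markers that should never survive in published
# Notion block text: **bold** and `code` spans left as raw characters.
MARKER_PATTERNS = [
    re.compile(r"\*\*[^*]+\*\*"),  # literal **bold** markers
    re.compile(r"`[^`]+`"),        # literal `code` markers
]

def hygiene_violations(block_texts):
    """Return (block_index, matched_text) pairs for blocks carrying markers."""
    violations = []
    for i, text in enumerate(block_texts):
        for pattern in MARKER_PATTERNS:
            match = pattern.search(text)
            if match:
                violations.append((i, match.group(0)))
    return violations
```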
Quality checklist (core)
Top Takeaways is first
opener uses the clearest framing for the report: explicit Question + direct answer when question-driven, otherwise explicit objective/outcome framing
opener includes explicit question text when question-driven, or explicit objective text when objective-driven
when Question Decomposition exists, it uses strict Question n -> Answer n -> Status n ordering with one empirical evidence line per answer
vague status-only opener answers (for example mixed, answered, improved) are non-compliant unless mapped to explicit direct-answer tokens
opener framing is self-contained plain language (no shorthand requiring external context)
report naming is human-readable (<clear human title>) and avoids opaque shorthand/acronym-heavy labels
report date is stored in the Date property when using Quant Blog
report titles do not include hash-like run/search IDs
machine-facing traceability IDs are captured in a final Provenance Appendix (for experiment/search reports)
when repo context matters, reports include shareable repository references: repo, branch, PR, commit, and repo-root-relative paths; clickable GitHub links are preferred when available
"rerun" appears in a title only when the report links to the prior report/experiment it reruns
ownership is verified for any existing mutation target ([email protected] or Ollie (MCP))
failure-rate metrics are present for reliability context but are not headline focal points unless failure behavior is the explicit research question
no vague status-only opener wording (for example answered) without explicit question text
legacy first-summary sections (Key Takeaways/Executive Summary) apply the same explicit opener rule when Top Takeaways is absent
opener emphasis hierarchy is correct
no duplicate summary headings or repeated intro paragraphs remain after refinement
Experiment Definition includes what varied/fixed/how/why/answer
Eval Setup appears immediately after Experiment Definition and before results/plots sections
the main results section uses plain-language headings and orders plots from simple to more complex so they are easy to follow
broad aggregate plots do not dominate the report when they are too coarse to support the main conclusions
duplicate sections, repeated intros, and stale wording are removed
short plot labels like n, r, or local environment names are explained clearly next to or on the relevant plots
timing plots clearly distinguish raw seconds from normalized rates
timing plots state the compute shape behind each point when that shape matters for interpretation (for example CPU-only vs GPU, CPUs, memory, wall-clock limit, or same-job-shape note)
standard single-run plots are present when available and useful
true individual episode plots are present when available and are clearly separated from training-path or checkpoint-path plots
experiment/comparison reports include explicit eval-time setup details:
checkpoint selection at eval
eval composition/overrides (for example learned-agent count/opponent setup)
domain-randomization behavior for key variables (for example fees/starting capital) at eval
explicit apples-to-apples comparability statement
Executive Visual Snapshot is present for quantitative reports and appears after Eval Setup
captions use Takeaway: lead-ins
captions are human-readable and do not default to raw uploaded file names
axis labels/units/legends are explicit and readable
plotted series are not downsampled (or any unavoidable downsampling is explicitly disclosed with method/factor/why)
time-series plots keep native artifact cadence (or any aggregation is explicitly disclosed with method/factor/why)
empirical charts are regenerated from original artifacts (no low-fidelity Notion-native chart DSL when source data exists)
limitations/reliability caveats are present and proportional
report is descriptive-only unless user explicitly requested recommendations
no table-of-contents block
no local path/hostname leakage
no personal machine paths in title/body/appendix; use repo-root-relative paths instead
exactly one footer attribution marker is present (Prepared with support from Codex and Claude. codex-managed: true) and there are no other Codex/Claude references in title/body
no literal markdown emphasis/code marker text remains in published Notion blocks (**...**, `...`)
visuals are positioned in the main report flow (typically under Executive Visual Snapshot) and not appended after appendix/footer
primary evidence sections are expanded in-place and not hidden behind toggles/collapsible sections unless explicitly requested
no third-party public file host usage unless explicitly approved by the user
image/file blocks are Notion-managed by default and resolve to non-empty payloads at verification time
if visuals are expected, upload capability is verified before publish (or placeholders are used when unavailable)
when run-level traces exist, Run Spotlight includes separate performance-path and inventory/book-shape visuals rather than only a single overview image
for collection-scope tasks, all in-scope report pages were audited one-by-one (no sampling)
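The non-empty-payload check for Notion-managed image/file blocks can be automated. A minimal sketch, assuming the file URLs have already been collected from the page's blocks (gathering them via the Notion API is out of scope here):

```python
import urllib.request

def verify_image_payloads(file_urls, timeout=10):
    """Return the subset of file_urls that do not resolve to a non-empty payload.

    A URL is treated as broken when the request errors out or the body is
    empty -- the signature of an expired or failed Notion-managed upload.
    """
    broken = []
    for url in file_urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if not resp.read(1):  # empty body => broken/expired upload
                    broken.append(url)
        except OSError:
            broken.append(url)
    return broken
```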