Use when reviewing prose in markdown files and Jupyter notebook markdown cells. Checks formatting, style, clarity, and pedagogical effectiveness.
Review prose in markdown files (.md) and Jupyter notebook markdown cells for formatting, style, clarity, and pedagogical effectiveness.
Exclude: the `_external/` directory.
Use backticks with parentheses for function names:
Good: `simulate()`
Good: `load_job_results()`
Bad: simulate()
Bad: simulate
Use backticks for variable names:
Good: The `job_info` object contains...
Good: Pass the `results` to the next function...
Bad: The job_info object contains...
Use backticks for parameter names:
Good: The `effect_size` parameter controls...
Good: Set `num_products` to 100...
Bad: The effect_size parameter controls...
Use backticks for types:
Good: Returns a `JobInfo` object
Good: Returns a `DataFrame`
Bad: Returns a JobInfo object
Use backticks with quotes for configuration file names:
Good: `"config_simulation.yaml"`
Bad: config_simulation.yaml
Bad: "config_simulation.yaml"
Use backticks for file, module, and package names:
Good: `products_rule_based.py`
Good: `online_retail_simulator`
Bad: products_rule_based.py
Use backticks with a trailing slash for directories:
Good: `output/`
Good: `src/simulate/`
Bad: output
Use backticks for column/field names:
Good: Each product has a unique `product_identifier`, a `category`, and a `price`.
Good: The `impressions` column tracks how many times a product was shown.
Bad: Each product has a unique product_identifier (missing formatting)
Bad: The **impressions** column (bold alone - use backticks instead)
Use bold for key terms on first introduction:
Good: The **conversion funnel** tracks customer behavior. The conversion funnel includes...
Bad: The conversion funnel tracks... (not bold on first use)
Use italics sparingly for emphasis and meta-comments:
Good: Describe *what* you want in natural language
Good: The goal is not perfectly polished code—it's *rapid insight generation*
Format important questions in bold:
Good: **Does improving product content quality increase sales?**
Good: **How do customers move through the purchase journey?**
Use plain text for numbers in prose:
Good: Simulate 100 products
Good: A 50% increase
Bad: Simulate `100` products (over-formatted)
Use code formatting for configuration values:
Good: Set `effect_size: 0.5`
Good: The default `num_products: 100`
Bad: Set effect_size: 0.5 (not formatted)
Format package/tool names with link on first mention:
Good: The [**Online Retail Simulator**](https://github.com/eisenhauerIO/tools-catalog-generator)
Good: We use [GitHub Copilot](https://github.com/features/copilot) to generate code.
On subsequent mentions, use plain bold or plain text:
Good: The **Online Retail Simulator** generates...
Good: The simulator generates...
Bad: The [Online Retail Simulator] generates... (over-linked)
Use parentheses for inline examples:
Good: `**category**` (such as Electronics, Clothing, or Books)
Good: Brand names get premium suffixes ("Elite", "Pro")
Bad: **category** like Electronics, Clothing, or Books
Use quotes for strings:
Good: `"2024-11-01"`
Good: `seed: 42`
Bad: 2024-11-01 (not quoted)
# Main Title (once per document)
## Major Section
### Subsection
#### Rare: only for deeply nested content
All headers use sentence case — capitalize the first word and proper nouns only:
Good: ## Deterministic scoring
Good: ### How is revenue distributed across categories?
Good: ## The evaluation harness
Bad: ## Deterministic Scoring (title case)
Bad: ## The Evaluation Harness (title case)
Exception: acronyms and proper nouns retain standard capitalization (e.g., "YAML", "Ollama", "Part I").
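The sentence-case rule can be checked mechanically, at least roughly. The sketch below is one possible approach, not part of the guidelines themselves: the function name `find_title_case_headers` and the `ALLOWED_CAPITALIZED` set are hypothetical, and the allow-list would need to be extended with each project's own proper nouns.

```python
import re

# Assumption: words that legitimately keep their capitalization.
# Extend this set with the project's own acronyms and proper nouns.
ALLOWED_CAPITALIZED = {"YAML", "Ollama", "Part", "I"}

HEADER_PATTERN = re.compile(r"^(#{1,6})\s+(.*)$")


def find_title_case_headers(markdown_text, allowed=ALLOWED_CAPITALIZED):
    """Return (line_number, header) pairs whose later words look title-cased.

    Limitation: lines starting with '#' inside code fences are also inspected.
    """
    findings = []
    for line_number, line in enumerate(markdown_text.splitlines(), start=1):
        match = HEADER_PATTERN.match(line)
        if not match:
            continue
        words = match.group(2).split()
        # The first word may be capitalized; only the remaining words are checked.
        offenders = [
            word
            for word in words[1:]
            if word[0].isupper() and word.strip("?.,:;!") not in allowed
        ]
        if offenders:
            findings.append((line_number, line.strip()))
    return findings
```

A flagged header is a candidate for review rather than a definite error, since a proper noun missing from the allow-list will also be reported.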
Good: ## Exploring the generated data
Good: ### How is revenue distributed across categories?
Bad: ### Revenue Distribution (less engaging)
Do not add structured summary sections at the end of documents. Avoid:
Let the material speak for itself.
Good: The simulator generates a product catalog
Bad: A product catalog is generated by the simulator
Good: The function writes the DataFrames to disk
Bad: The function will write the DataFrames to disk
Good: Let's start by simulating 100 products
Bad: Now we're going to simulate some products (too casual)
Bad: It is necessary to simulate products (too formal)
| Element | Format | Example |
|---|---|---|
| Column name | `name` | `product_identifier` |
| Function | `function()` | `simulate()` |
| Variable | `variable` | `job_info` |
| Parameter | `parameter` | `effect_size` |
| Config file | `"file.yaml"` | `"config_simulation.yaml"` |
| Object type | `Type` | `DataFrame` |
| Directory | `dir/` | `output/` |
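
Some of the rules in this table lend themselves to a first automated pass. The following is a minimal sketch, assuming that function calls end in `()` and that variables and parameters are written in snake_case; the function name `find_unformatted_identifiers` is hypothetical. Single-word names such as `results` cannot be told apart from ordinary prose this way and still need a manual read.

```python
import re

INLINE_CODE = re.compile(r"`[^`]*`")
BARE_CALL = re.compile(r"\b[A-Za-z_]\w*\(\)")
BARE_SNAKE_CASE = re.compile(r"\b[a-z]+(?:_[a-z0-9]+)+\b")


def find_unformatted_identifiers(line):
    """Return identifiers in a prose line that appear outside backticks."""
    # Blank out inline code spans so correctly formatted names are not reported.
    prose_only = INLINE_CODE.sub(" ", line)
    calls = BARE_CALL.findall(prose_only)
    # Remove the calls first so their names are not counted a second time.
    remainder = BARE_CALL.sub(" ", prose_only)
    names = BARE_SNAKE_CASE.findall(remainder)
    return calls + names
```

For example, a line that mentions simulate() without backticks is reported, while the same call inside backticks is not.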
Locate the project's documentation guidelines file. Look for:
- `documentation/GUIDELINES.md`
- `docs/GUIDELINES.md`
- `docs/source/GUIDELINES.md`

If no guidelines file exists, report this as the first finding and suggest creating one.
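The lookup over the three candidate paths can be sketched as follows. This is an illustration only: the function name `find_guidelines` is hypothetical, and the paths are assumed to be relative to the repository root.

```python
from pathlib import Path

# Candidate locations from the list above, assumed relative to the repository root.
GUIDELINE_CANDIDATES = (
    "documentation/GUIDELINES.md",
    "docs/GUIDELINES.md",
    "docs/source/GUIDELINES.md",
)


def find_guidelines(repo_root):
    """Return the first existing guidelines file, or None if there is none."""
    root = Path(repo_root)
    for candidate in GUIDELINE_CANDIDATES:
        path = root / candidate
        if path.is_file():
            return path
    return None
```

A return value of `None` corresponds to the case where the missing guidelines file becomes the first finding.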
Read the guidelines and verify the actual docs match:
If the guidelines include a text formatting table, audit every doc page against it:
If docs use Sphinx or another build system, inspect the built HTML for rendering problems:
- `href="#` patterns that look like failed relative links.

Verify standard docs tooling is in place:
| Check | What to look for |
|---|---|
| Sphinx builds clean | Run the docs build command, check for warnings |
| nbstripout | Pre-commit hook configured to strip notebook outputs |
| nbmake | Notebooks tested via pytest in pre-commit or CI |
| Matplotlib config | Consistent plot rendering config (`matplotlibrc`) |
| CI/CD | Docs build workflow in `.github/workflows/` |
| Pre-commit | `.pre-commit-config.yaml` includes docs-related hooks |
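
Part of this table can be pre-screened with a short script before the manual review. The sketch below rests on several assumptions: the function name `check_docs_tooling` is hypothetical, `matplotlibrc` is assumed to sit in the repository root, and the hook check is a plain substring search, so it gives a hint rather than proof. It does not run the Sphinx build or verify the nbmake setup.

```python
from pathlib import Path


def check_docs_tooling(repo_root):
    """Return a dict mapping each file-based check to True or False."""
    root = Path(repo_root)

    precommit = root / ".pre-commit-config.yaml"
    precommit_text = (
        precommit.read_text(encoding="utf-8") if precommit.is_file() else ""
    )

    workflows = root / ".github" / "workflows"
    workflow_files = []
    if workflows.is_dir():
        workflow_files = [
            path for path in workflows.iterdir() if path.suffix in {".yml", ".yaml"}
        ]

    return {
        "pre-commit config present": precommit.is_file(),
        # Substring search only: the hook may be mentioned but misconfigured.
        "nbstripout mentioned": "nbstripout" in precommit_text,
        "workflow files present": bool(workflow_files),
        # Assumption: the Matplotlib config lives in the repository root.
        "matplotlibrc present": (root / "matplotlibrc").is_file(),
    }
```

Any `False` entry points to a row of the table that deserves a closer look; a `True` entry still needs a human to confirm that the configuration does what it should.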
- Column names in `column_name` format
- Functions in `function_name()` format
- File names in `"filename.ext"` format
- Types in backticks (`DataFrame`, `JobInfo`)

When reviewing, ask yourself:
Would a reader understand this? Not just follow along, but actually grasp the concept.
Is the "why" explained? Not just what the method does, but why it works and when to use it.
Are assumptions made explicit? Readers should know what conditions must hold.
What would confuse someone? Identify potential stumbling blocks before readers hit them.
For each issue found:
Use bold uppercase for YAML sections in prose:
Good: The **PRODUCTS** section generates...
Good: The **PARAMS** subsection controls...
Bad: The PRODUCTS section generates...
Bad: The products section generates...
Use backticks in technical context:
Good: The `effect_size` parameter controls...
Good: Set `enrichment_fraction` to 1.0...
Bad: The effect_size parameter controls...
Use bold lowercase for phase names:
Good: "the **products** phase"
Good: "the **product_details** phase"
Bad: "the products phase" (not bold)
Bad: "the PRODUCTS phase" (wrong case)
Use bold uppercase for YAML sections:
Good: "The **PRODUCTS** section in the YAML config..."
Bad: "The products section in the YAML..." (lowercase)
Rule: Phase names (concepts) = lowercase bold. YAML sections = uppercase bold.
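The YAML-section half of this rule can be checked with a small pattern match. The sketch below assumes a known list of section names (here only the two that appear in the examples, `PRODUCTS` and `PARAMS`) and looks for them directly before the word "section" or "subsection"; the function name `find_section_format_issues` is hypothetical.

```python
import re

# Assumption: the section names to check. Only the two names used in the
# examples are listed; a real project would supply its own.
KNOWN_SECTIONS = ("PRODUCTS", "PARAMS")


def find_section_format_issues(line, sections=KNOWN_SECTIONS):
    """Return messages for section names that are not bold uppercase."""
    issues = []
    for name in sections:
        pattern = re.compile(
            rf"(\*\*)?\b({name})\b(\*\*)?\s+(?:sub)?section", re.IGNORECASE
        )
        for match in pattern.finditer(line):
            opening, word, closing = match.groups()
            if not (opening and closing):
                issues.append(f"{word}: not bold")
            elif word != name:
                issues.append(f"{word}: bold but not uppercase")
    return issues
```

Phase names are left alone because they are followed by "phase" rather than "section", which matches the split in the rule: lowercase bold for concepts, uppercase bold for YAML sections.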
For Measure Impact lectures:
Every lecture ends with an `## Additional resources` section (lowercase "r"). This is the final section of the notebook; nothing follows it. Format: bullet points with Author (Year). Title. Journal, volume(issue), pages.
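Whether a notebook closes with the required section can be checked on the notebook's JSON. This is a rough sketch under stated assumptions: the function name `ends_with_additional_resources` is hypothetical, the notebook is passed as an already loaded dict (for example from `json.load`), and only headers are compared, so trailing non-header content after the section is not detected.

```python
def ends_with_additional_resources(notebook):
    """Check that the last markdown header in a notebook dict is the resources section.

    Limitation: lines starting with '#' inside fenced code in a markdown cell
    are treated as headers too.
    """
    last_header = None
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # Notebook JSON stores source either as one string or as a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        for line in text.splitlines():
            if line.startswith("#"):
                last_header = line.strip()
    return last_header == "## Additional resources"
```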
| Element | Format | Example |
|---|---|---|
| Phase (concept) | **phase** | **products** |
| YAML section | **SECTION** | **PRODUCTS** |
After making changes, always run:

    git status
    hatch run ruff format .
    hatch run ruff check .
    hatch run build

This checks for untracked files, formats code, checks for linting issues, builds the documentation and executes all notebooks, confirming that changes don't break anything.
Success criteria:
- `git status` shows no untracked files that should be committed
- `hatch run ruff format .` completes without changes
- `hatch run ruff check .` passes with no errors
- `hatch run build` completes successfully