Cleanup | Skills Pool

<!-- from: file1.md -->
<contents of file1>

<!-- from: file2.md -->
<contents of file2>
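The merged layout above, where each input file is preceded by a `<!-- from: ... -->` marker, could be produced along these lines. This is a minimal sketch: the function name `merge_inputs` is hypothetical and not part of the skill's scripts, and the trailing-newline handling is an assumption.

```python
from pathlib import Path


def merge_inputs(paths: list[str]) -> str:
    """Concatenate files, prefixing each with a '<!-- from: name -->' marker."""
    parts = []
    for p in paths:
        path = Path(p)
        # Assumption: trailing newlines are trimmed so blocks are separated
        # by exactly one blank line.
        body = path.read_text(encoding="utf-8").rstrip("\n")
        parts.append(f"<!-- from: {path.name} -->\n{body}\n")
    return "\n".join(parts)
```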

python3 scripts/verify-sort.py <file>.bak <file>
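The source of `verify-sort.py` is not shown here. Based on its description as a superset check (all original lines preserved, new lines allowed), the core idea might look like the sketch below. The function name `missing_after_sort` is hypothetical, and ignoring blank lines and surrounding whitespace is an assumption.

```python
from collections import Counter


def missing_after_sort(backup_lines, sorted_lines):
    """Return backup lines (with counts) that are absent from the sorted file.

    Extra lines in the sorted file, such as added ## headers, are allowed.
    """
    # Assumption: blank lines and surrounding whitespace are not significant.
    need = Counter(line.strip() for line in backup_lines if line.strip())
    have = Counter(line.strip() for line in sorted_lines if line.strip())
    # Counter subtraction keeps only lines whose required count is not met.
    return need - have
```

An empty result means the sorted file is a superset of the backup. Using a multiset rather than a set means a line that appeared twice in the backup must also appear twice after sorting.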

python3 scripts/verify-rewrite.py <sorted> <rewritten>
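`verify-rewrite.py` is described as a URL presence check with normalization. The exact normalization rules are not given, so the sketch below assumes a few common ones (ignore scheme, a leading `www.`, a trailing slash, and trailing punctuation). The names `extract_urls`, `normalize`, and `missing_urls` are illustrative only.

```python
import re

# Assumption: a URL ends at whitespace, quotes, or closing brackets.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text: str) -> list[str]:
    """Find URLs and drop sentence punctuation stuck to their end."""
    return [u.rstrip(".,;:!?") for u in URL_RE.findall(text)]


def normalize(url: str) -> str:
    """Reduce a URL to a comparison key (assumed rules, see above)."""
    key = re.sub(r"^https?://(www\.)?", "", url, flags=re.IGNORECASE)
    return key.rstrip("/")


def missing_urls(source_text: str, target_text: str) -> list[str]:
    """Return source URLs that have no normalized match in the target."""
    present = {normalize(u) for u in extract_urls(target_text)}
    return sorted({u for u in extract_urls(source_text) if normalize(u) not in present})
```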

You are a gap detector. Compare SORTED file vs REWRITTEN file for these sections: [SECTIONS].

Files:
- Sorted: [SORTED_PATH]
- Rewritten: [REWRITTEN_PATH]

CRITICAL: Use the rg tool to search for key phrases from each sorted line. Do NOT rely on manual reading for large files. For each line, grep 3-5 unique words in the rewritten file.

IMPORTANT: The rewritten file may contain HTML blocks (<details>, <summary>, <table>).
Content inside these tags IS valid — search INSIDE them with rg. A line found inside
<details>...</details> counts as present.

For EACH non-empty line in your assigned sections of the SORTED file:
1. Search for a semantic equivalent in the REWRITTEN file (search the ENTIRE file, not just the same section)
2. If found with same meaning → SKIP
3. If found but details lost → PARTIAL (quote both lines + what was lost)
4. If meaning changed/reversed → REVERSED (quote both lines)
5. If NOT found anywhere → MISSING (quote the sorted line)

RULES:
- Grammar/formatting changes are NOT gaps. "setup nginx" → "Set up Nginx" is fine.
- PARTIAL only when a SPECIFIC IDEA, DETAIL, or CONTEXT is lost — not formatting.
- These are NOT gaps: bullet→checkbox, case changes, typo fixes, punctuation, link text changes.
- If the core idea and all details are preserved, it's a SKIP regardless of formatting.
- You MUST quote exact text from both files. If you cannot quote the rewritten equivalent — it IS missing.
- Output format per finding:
  SECTION: <header>
  TYPE: MISSING|PARTIAL|REVERSED
  SORTED_LINE: "<exact quote>"
  REWRITTEN_LINE: "<exact quote or NOT_FOUND>"
  LOST_DETAIL: "<what was lost>" (PARTIAL only)
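Because the agent output format above is line-oriented, it can be collected into records before the gaps file is written. The parser below is a sketch under the assumption that every finding starts with a `SECTION:` line. The name `parse_findings` is hypothetical.

```python
import re

FIELD_RE = re.compile(
    r"^\s*(SECTION|TYPE|SORTED_LINE|REWRITTEN_LINE|LOST_DETAIL):\s*(.*)$"
)


def parse_findings(report: str) -> list[dict]:
    """Turn an agent report into a list of dicts, one per finding."""
    findings, current = [], None
    for line in report.splitlines():
        match = FIELD_RE.match(line)
        if not match:
            continue  # ignore any free text between findings
        key, value = match.group(1), match.group(2).strip()
        # Strip the surrounding double quotes; NOT_FOUND is unquoted.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        if key == "SECTION":
            # Assumption: a SECTION line opens a new finding.
            current = {"SECTION": value}
            findings.append(current)
        elif current is not None:
            current[key] = value
    return findings
```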

python3 scripts/verify-coverage.py <sorted> <rewritten> <gaps>
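The coverage script is described as fuzzy matching. Its actual algorithm and threshold are not shown, so the following is only one plausible sketch built on the standard library's `difflib`. The function name `uncovered_lines` and the 0.8 threshold are assumptions, and the handling of the `<gaps>` argument is left out.

```python
from difflib import SequenceMatcher


def uncovered_lines(sorted_lines, rewritten_lines, threshold=0.8):
    """Return sorted-file lines with no close match in the rewritten file."""
    # Assumption: comparison is case-insensitive and ignores blank lines.
    targets = [t.strip().lower() for t in rewritten_lines if t.strip()]
    uncovered = []
    for line in sorted_lines:
        probe = line.strip().lower()
        if not probe:
            continue
        # Best similarity of this line against any rewritten line.
        best = max(
            (SequenceMatcher(None, probe, t).ratio() for t in targets),
            default=0.0,
        )
        if best < threshold:
            uncovered.append(line.strip())
    return uncovered
```

A character-level ratio like this cannot recognize content that was summarized or heavily rephrased, which is why the candidates it reports still go through agent verification.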

You are a coverage verifier. You have a list of lines that a fuzzy-matching script
could not find in the rewritten file. Many of these are FALSE POSITIVES — the content
IS in the rewritten file but was rephrased, reformatted, or had typos fixed.

IMPORTANT: The rewritten file may use <details><summary>...</summary>...</details> blocks.
Content inside these blocks IS present — search the raw file text, not rendered output.
Many false positives come from content moved into <details> blocks.

CHAT SUMMARIZATION RULE: The rewritten file intentionally summarizes raw chat logs
(timestamped messages like "☀️, [date]") into structured "Key takeaways" sections.
If a chat message's substantive facts (numbers, prices, names, conclusions) appear
in summarized form — it is COVERED, not MISSING. Specifically:
- Timestamps, emoji markers, informal greetings → always FALSE POSITIVE
- Conversational fragments ("Ну хз", "Ага", "Потом конечная") → FALSE POSITIVE
- Back-and-forth debate condensed to conclusion → COVERED
- Specific numbers/facts preserved in summary → COVERED
Only report TRUE_MISSING if the substantive IDEA has no equivalent anywhere in the file.

Files:
- Uncovered candidates: [UNCOVERED_TMP_PATH]
- Rewritten: [REWRITTEN_PATH]

For EACH line in the uncovered file:
1. Search the ENTIRE rewritten file for content with the same meaning
2. If found (even rephrased, reformatted, typo-fixed, summarized) → FALSE POSITIVE, skip it
3. If truly NOT found anywhere → TRUE MISSING, report it

Output ONLY the TRUE MISSING lines, one per line, with format:
TRUE_MISSING: "<exact line from uncovered file>"

If all lines are false positives, output: "ALL COVERED — no true gaps found."

# Gaps: <filename>
<!-- Delete lines you don't need. Keep lines to apply to rewritten. -->
<!-- Summary: N MISSING, M PARTIAL, K REVERSED, L UNCOVERED -->

## <Section Name>

- [MISSING] `<exact sorted line>`
- [PARTIAL] `<sorted line>` → rewritten: `<rewritten line>` | Lost: <detail>
- [REVERSED] `<sorted line>` → rewritten: `<rewritten line>`
- [UNCOVERED] `<sorted line>`
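As an illustration of the template above, findings grouped by section could be rendered as follows. This is a sketch, not the skill's own code: `render_gaps` is a hypothetical name, and each finding is assumed to be a dict with the keys `SECTION`, `TYPE`, and `SORTED_LINE`, plus `REWRITTEN_LINE` and `LOST_DETAIL` where the type needs them.

```python
def render_gaps(filename: str, findings: list[dict]) -> str:
    """Render findings in the gaps-file layout shown above."""
    kinds = ("MISSING", "PARTIAL", "REVERSED", "UNCOVERED")
    counts = {k: sum(item["TYPE"] == k for item in findings) for k in kinds}
    out = [
        f"# Gaps: {filename}",
        "<!-- Delete lines you don't need. Keep lines to apply to rewritten. -->",
        "<!-- Summary: {MISSING} MISSING, {PARTIAL} PARTIAL, "
        "{REVERSED} REVERSED, {UNCOVERED} UNCOVERED -->".format(**counts),
    ]
    # Group by section while keeping first-seen order.
    sections: dict[str, list[dict]] = {}
    for item in findings:
        sections.setdefault(item["SECTION"], []).append(item)
    for name, items in sections.items():
        out += ["", f"## {name}", ""]
        for item in items:
            line = f"- [{item['TYPE']}] `{item['SORTED_LINE']}`"
            if item["TYPE"] in ("PARTIAL", "REVERSED"):
                line += f" → rewritten: `{item['REWRITTEN_LINE']}`"
            if item["TYPE"] == "PARTIAL":
                line += f" | Lost: {item['LOST_DETAIL']}"
            out.append(line)
    return "\n".join(out) + "\n"
```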

=== CLEANUP COMPLETE (Phases 1-5) ===

Files:
  Backup:    <path>.bak          (<N> lines)
  Sorted:    <path>              (<N> lines, verified)
  Rewritten: <path>.rewritten    (<N> lines)
  Gaps:      <path>.gaps.md      (<N> items: X missing, Y partial, Z reversed, W uncovered)

Next: edit .gaps.md, delete what you don't need. Message me when you're ready to continue.

# All URLs from original present in final?
python3 scripts/verify-rewrite.py <file>.bak <basename>.rewritten.<ext>

# Every original line covered in final?
python3 scripts/verify-coverage.py <file>.bak <basename>.rewritten.<ext> /dev/null

If uncovered candidates are found → MANDATORY: spawn NEW agents (batches of 100). Do NOT reuse Phase 4c results: Phase 4c compared sorted → rewritten, while Phase 8 compares backup (original) → rewritten. Different source = different gaps. Optimization: if Phase 4c found no TRUE_MISSING items AND the sorted file differs from the backup only by added ## headers (as verified in Phase 2), it is acceptable to run 1 agent on the ENTIRE uncovered list (instead of batches of 100), with the instruction "expect mostly false positives, report only truly unique content". Use this prompt:

You are a final coverage verifier. Lines from the ORIGINAL BACKUP were not fuzzy-matched
in the FINAL rewritten file. Many are FALSE POSITIVES (rephrased, reformatted, reorganized).

IMPORTANT: The rewritten file may contain <details>, <summary> and other HTML elements.
Content inside these tags IS present — search inside them.

CHAT SUMMARIZATION RULE: The rewritten file intentionally summarizes raw chat logs
(timestamped messages like "☀️, [date]") into structured "Key takeaways" sections.
If a chat message's substantive facts (numbers, prices, names, conclusions) appear
in summarized form — it is COVERED, not MISSING. Specifically:
- Timestamps, emoji markers, informal greetings → always FALSE POSITIVE
- Conversational fragments → FALSE POSITIVE
- Back-and-forth debate condensed to conclusion → COVERED
- Specific numbers/facts preserved in summary → COVERED
Only report TRUE_MISSING if the substantive IDEA has no equivalent anywhere.

Files:
- Uncovered candidates: [UNCOVERED_TMP_PATH]
- Final rewritten: [REWRITTEN_PATH]

For EACH line:
1. Search ENTIRE rewritten file for same meaning
2. Found (even rephrased, inside <details>, summarized) → FALSE POSITIVE, skip
3. Truly not found → TRUE MISSING

Output: TRUE_MISSING: "<exact line>" or "ALL COVERED — no true gaps found."
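The surrounding orchestration, splitting uncovered candidates into batches of 100 and collecting the `TRUE_MISSING` lines from agent replies, might be sketched as below. How agents are actually spawned is outside this sketch. `batches` and `collect_true_missing` are hypothetical helper names.

```python
import re

TRUE_MISSING_RE = re.compile(r'^TRUE_MISSING:\s*"(.*)"\s*$')


def batches(lines: list[str], size: int = 100) -> list[list[str]]:
    """Split the uncovered candidates into chunks of at most `size` lines."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def collect_true_missing(replies: list[str]) -> list[str]:
    """Gather quoted TRUE_MISSING lines from all agent replies.

    Replies that only say everything is covered contribute nothing.
    """
    found = []
    for reply in replies:
        for line in reply.splitlines():
            match = TRUE_MISSING_RE.match(line.strip())
            if match:
                found.append(match.group(1))
    return found
```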

If TRUE MISSING found → report them, DO NOT replace original, DO NOT delete backup.
If all clear:
1. Auto-replace: mv <file>.rewritten <file> — original is replaced, .bak stays as backup.
2. Delete the gaps file if it exists.
3. Output: "✅ Final verification passed. Original replaced. Backup: <file>.bak"
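The decision above can be expressed as a small function. This is a sketch under stated assumptions: the name `finalize` is hypothetical, the file paths follow the `<path>.bak` / `<path>.rewritten` / `<path>.gaps.md` pattern from the Phase 5 summary, and the check that the backup exists before replacing is an added safeguard, not something the source specifies.

```python
from pathlib import Path


def finalize(original: Path, rewritten: Path, gaps: Path,
             true_missing: list[str]) -> str:
    """Replace the original with the rewritten file only if nothing is missing."""
    backup = original.with_name(original.name + ".bak")
    if true_missing:
        # Leave the original, the backup, and the gaps file untouched.
        return f"{len(true_missing)} TRUE MISSING line(s) found. Original not replaced."
    if not backup.exists():
        # Assumption: never overwrite the original without a backup on disk.
        raise FileNotFoundError(f"backup not found: {backup}")
    rewritten.replace(original)      # same effect as: mv <file>.rewritten <file>
    gaps.unlink(missing_ok=True)     # delete the gaps file if it exists
    return f"✅ Final verification passed. Original replaced. Backup: {backup}"
```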

=== CLEANUP REPORT ===

Metrics:
  Original:  <N> lines
  Rewritten: <N> lines (<ratio>% compression)
  Gaps found: <N> (X applied, Y dismissed by user)
  URLs: <N> original, <M> preserved, <K> missing (user-approved)
  Original replaced: yes/no
  Backup: <path>.bak

Issues encountered:
- <any verification failures, skipped steps, agent errors, or anomalies>

Fixes (brief, only if issues found):
- <concrete action to resolve each issue>

Recommendations:
- <suggestions for the file or future rewrites>

Analyze the clean file for distinct topics/projects.

IF single topic OR file has <100 lines → skip split, suggest clarify:

"File is focused on a single topic. Recommend: /compact then /clarify <file>"

IF multiple distinct topics found → enter plan mode (/plan):
- Write a split plan to the plan file with:
  - List of output files with names and descriptions
  - For each file: which sections/line ranges go there
  - Cross-reference strategy (which specs link to which references)
  - Estimated line counts per file
- Plan format:
```
# Split Plan: <filename>

## Output files

### spec-<topic-A>.md (~N lines)
Sections: <list of ## headers going here>
Content: <brief description>

### references-<topic-A>.md (~M lines)
Sections: <list of ## headers going here>
Content: links, research, external refs related to topic A

### spec-<topic-B>.md (~K lines)
...

## Cross-references
- spec-<A>.md → references-<A>.md
- spec-<B>.md → references-<B>.md
```
- Exit plan mode — user reviews and approves.
- If user rejects → ask what to change, revise plan, re-submit.
- If user approves → proceed to Phase 11.

Create output directory: <basename>/ (sibling to input file).
Follow the approved plan — for each file:
- spec-<topic-slug>.md — main content (tasks, goals, requirements, decisions).
- references-<topic-slug>.md — links, research notes, external refs, raw data.

Add cross-references at top of each spec:

> References: [references-<topic>.md](references-<topic>.md)

Preserve every original line exactly as-is in one of the output files.

python3 scripts/verify-split.py <original-clean-file> <output-dir>
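`verify-split.py` is described as checking that all lines are present across the output files. Its implementation is not shown. A minimal sketch of that check, with the hypothetical name `lines_lost_in_split` and the assumption that only `*.md` files in the output directory count, could be:

```python
from collections import Counter
from pathlib import Path


def lines_lost_in_split(original: Path, output_dir: Path) -> Counter:
    """Return original lines (with counts) missing from the split output."""

    def content_lines(path: Path) -> list[str]:
        # Assumption: blank lines and trailing whitespace are not significant.
        text = path.read_text(encoding="utf-8")
        return [line.rstrip() for line in text.splitlines() if line.strip()]

    need = Counter(content_lines(original))
    have = Counter()
    for part in sorted(output_dir.glob("*.md")):
        have.update(content_lines(part))
    # Added lines, such as the cross-reference at the top of a spec, are fine.
    return need - have
```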

=== SPLIT COMPLETE ===

Files:
  <list of created files with line counts>

Recommend: /clear then /clarify <spec-file.md>

| Script | Purpose | Args |
|---|---|---|
| `verify-sort.py` | Superset check: all original lines preserved, new lines OK | `<backup> <sorted>` |
| `verify-rewrite.py` | URL presence check (with normalization) | `<source> <target>` |
| `verify-coverage.py` | Safety net: every line accounted for | `<sorted> <rewritten> <gaps>` |
| `verify-split.py` | Split verification: all lines present across output files | `<original> <output-dir>` |

Cleanup

Cleanup Skill

Usage

Algorithm

Phase 0: Input Handling

Phase 1: Sort

Phase 2: Verify Sort

Phase 3: Rewrite

Phase 4: Gap Detection (three levels)

4a: Deterministic check (script)

4b: Semantic check (pre-filter + background agents)

4c: Coverage safety net (script + agent verification)

Phase 5: Gaps File

Phase 5.5: Auto-continue check

Phase 6: PAUSE

Phase 7: Apply

Phase 8: Final Verification

Phase 9: Report

Phase B: Split (optional)

Phase 10: Split Analysis (plan mode)

Phase 11: Execute Split

Phase 12: Verify Split

Phase 13: Handoff

Scaling

Rules

Output Contract

Scripts
