Name: Extraction Compare
Author: michaelosbornegit

Extraction Compare | Skills Pool

nix-shell -p python312Packages.openpyxl python312Packages.pyxlsb --run "python3 << 'PYEOF'
import os, sys
from pathlib import Path

OUTPUT_DIR = '/tmp/compare'
os.makedirs(f'{OUTPUT_DIR}/ours', exist_ok=True)
os.makedirs(f'{OUTPUT_DIR}/answer', exist_ok=True)

OUR_FILE = '{{OUR_FILE}}'
ANSWER_FILE = '{{ANSWER_FILE}}'

TABS = ['IS Input', 'BS Input', 'FFS Input', 'Payor Input', 'Detail', 'NRP Adj.']

def dump_xlsx(filepath, out_dir, tabs):
    from openpyxl import load_workbook
    wb = load_workbook(filepath, data_only=True, read_only=True)
    available = wb.sheetnames
    for tab in tabs:
        if tab not in available:
            print(f'  SKIP: {tab} (not in file)')
            continue
        ws = wb[tab]
        lines = []
        for row in ws.iter_rows(values_only=True):
            cells = [str(c) if c is not None else '' for c in row]
            lines.append('|'.join(cells))
        safe_name = tab.replace(' ', '_').replace('.', '_').replace('/', '_')
        out_path = f'{out_dir}/{safe_name}.txt'
        with open(out_path, 'w') as f:
            f.write('\n'.join(lines))
        print(f'  OK: {tab} -> {out_path} ({len(lines)} rows)')
    wb.close()

def dump_xlsb(filepath, out_dir, tabs):
    from pyxlsb import open_workbook
    with open_workbook(filepath) as wb:
        available = wb.sheets
        for tab in tabs:
            if tab not in available:
                print(f'  SKIP: {tab} (not in file)')
                continue
            lines = []
            with wb.get_sheet(tab) as ws:
                for row in ws.rows():
                    cells = [str(c.v) if c.v is not None else '' for c in row]
                    lines.append('|'.join(cells))
            safe_name = tab.replace(' ', '_').replace('.', '_').replace('/', '_')
            out_path = f'{out_dir}/{safe_name}.txt'
            with open(out_path, 'w') as f:
                f.write('\n'.join(lines))
            print(f'  OK: {tab} -> {out_path} ({len(lines)} rows)')

def dump_file(filepath, out_dir, tabs):
    ext = Path(filepath).suffix.lower()
    print(f'Parsing {filepath} ({ext}):')
    if ext in ('.xlsx', '.xlsm'):
        dump_xlsx(filepath, out_dir, tabs)
    elif ext == '.xlsb':
        dump_xlsb(filepath, out_dir, tabs)
    else:
        print(f'ERROR: Unsupported extension {ext}')
        sys.exit(1)

dump_file(OUR_FILE, f'{OUTPUT_DIR}/ours', TABS)
dump_file(ANSWER_FILE, f'{OUTPUT_DIR}/answer', TABS)
print('Done.')
PYEOF"

Compare these two Excel tab dumps and report differences. This is tab "{{TAB_NAME}}" from a financial model.

**Our output (system-generated):**
File: /tmp/compare/ours/{{SAFE_TAB_NAME}}.txt

**Answer key (analyst reference):**
File: /tmp/compare/answer/{{SAFE_TAB_NAME}}.txt

Read both files, then analyze:

1. **Missing Sections** — sections/groups in the answer key but absent from our output
2. **Missing Rows** — rows with data in the answer key but empty/missing in our output
3. **Value Differences** — cells where both files have data but values differ. Report as: Row identifier | Our value | Expected value. Ignore rounding differences ≤$1. Ignore blank vs zero differences.
4. **Extra Data** — data in our output that doesn't appear in the answer key
5. **Summary** — overall accuracy assessment (percentage match estimate)

IMPORTANT context:
- Column positions may differ between files (our output may have different column offsets than the answer key)
- Use row labels/identifiers to match rows, not position
- Financial values: treat 1234 and 1,234 and $1,234 and 1234.00 as equivalent
- "Open" slots in specialty sections are expected for unused groups — don't report these as differences
- Focus on MEANINGFUL differences that affect the financial model accuracy

Keep your response concise — bullet points, not paragraphs. Only report actual differences, not things that match.

Tab Name	Safe Filename
IS Input	IS_Input.txt
BS Input	BS_Input.txt
FFS Input	FFS_Input.txt
Payor Input	Payor_Input.txt
Detail	Detail.txt
NRP Adj.	NRP_Adj_.txt

Extraction Compare

Extraction Compare — Parallel Subagent Excel Comparison

Prerequisites

Step 1: Dump both Excel files to text

Extraction Compare

Extraction Compare — Parallel Subagent Excel Comparison

Prerequisites

Step 1: Dump both Excel files to text

Step 2: Launch parallel subagents

Step 3: Present results

Tab name → safe filename mapping

Cost

Session Logs

OpenClaw Test Heap Leaks

Node Connect

Openclaw Qa Testing

Openclaw Secret Scanning Maintainer

Flags