Audit generated Excel reports by enumerating populated output rows, exposing formulas, and checking that each subtotal range includes all visible detail rows.
Use this workflow after generating or modifying an Excel report to catch silent spreadsheet defects that simple file-exists or sheet-exists checks will miss.
The core idea is:
This is especially useful for generated financial statements, operational summaries, rollups, grouped exports, and any worksheet where formulas summarize nearby detail rows.
Use this skill when:
Do not stop at "the report was written successfully."
Instead, prove that:
Always audit the workbook after writing it, using a fresh read from disk if possible. This verifies the actual persisted artifact rather than in-memory assumptions.
Checklist:
Why: a writer may have inserted the wrong range, overwritten cells, or shifted rows during formatting.
Print a row-by-row audit log of all populated rows in the report area.
For each row, print:
This creates a human- and machine-inspectable trace of what the generator actually produced.
A useful printed line looks like:
The exact format does not matter as much as consistency and inclusion of formulas.
A row is populated if at least one relevant output cell contains:
Prefer auditing a defined report column range rather than the entire sheet.
Identify rows that represent subtotals, totals, or section summaries.
Common signals:
For each subtotal row, capture:
Do not assume the formula is correct just because a subtotal row exists.
This is the key step.
For each section, determine which visible detail rows should contribute to the subtotal without using the subtotal formula itself as the source of truth.
Possible ways to derive the expected rows:
The important rule:
The expected detail set must be computed independently from the formula being checked.
Otherwise, the audit only repeats the original mistake.
If the report hides rows, collapses groups, or applies filters, inspect whether the subtotal is supposed to summarize:
This skill specifically targets the case where you want to ensure the subtotal includes all visible detail rows in the section.
If your report semantics differ, state them explicitly and audit against those rules.
Parse each subtotal formula and compare its referenced range against the independently derived detail rows.
Look for defects such as:
A generated subtotal like:
=SUM(B8:B11)
may look valid, but if the section's visible detail rows are actually 8 through 12, row 12 has been silently omitted.
This is exactly the kind of defect this audit should catch.
If the independently derived detail rows and formula-covered rows do not match, treat it as a generation failure.
Report:
Example failure message:
Subtotal audit failed on sheet "P&L", row 24 ("Total Operating Expenses"): formula =SUM(C18:C22) but visible detail rows are 18-23; missing row 23.
Do not soften these findings into warnings if correctness matters.
Preserve the printed populated-row trace and subtotal comparison results in logs or artifacts.
This helps with:
For the target columns, collect for each row:
If classification is ambiguous, use explicit report rules rather than guessing.
Even with automated checks, print the full row trace so a reviewer can spot:
You can implement this in any language with an Excel reader. The audit logic matters more than the library.
Useful capabilities:
If formulas are complex, start by auditing common subtotal patterns first, such as:
Then extend as needed.
Pseudocode:
function audit_sheet(sheet): rows = collect_rows(sheet) print_populated_rows(rows)
subtotals = [r for r in rows if r.type == "subtotal"]
for subtotal in subtotals: expected_rows = derive_visible_detail_rows(rows, subtotal.section) actual_rows = rows_referenced_by_formula(subtotal.formula, subtotal.value_column)
if expected_rows != actual_rows:
raise Error(
"Subtotal mismatch at row "
+ subtotal.row_number
+ ": expected "
+ repr(expected_rows)
+ " but formula covers "
+ repr(actual_rows)
)
This example is intentionally generic and should be adapted to your workbook structure.
from openpyxl import load_workbook import re
def is_populated(cells): for cell in cells: if cell.value not in (None, ""): return True return False
def formula_text(cell): return cell.value if isinstance(cell.value, str) and cell.value.startswith("=") else None
def parse_simple_sum_rows(formula, target_col_letter): if not formula: return [] m = re.fullmatch(rf"=SUM({target_col_letter}(\d+):{target_col_letter}(\d+))", formula.replace("$", "")) if not m: return None start, end = map(int, m.groups()) return list(range(start, end + 1))
def audit_sheet(path, sheet_name, start_row, end_row, cols, label_col_idx, value_col_letter): wb = load_workbook(path, data_only=False) ws = wb[sheet_name]
inventory = []
for r in range(start_row, end_row + 1):
cells = [ws[f"{col}{r}"] for col in cols]
if not is_populated(cells):
continue
hidden = ws.row_dimensions[r].hidden is True
values = [c.value for c in cells]
formulas = [formula_text(c) for c in cells]
print(f"Row {r} hidden={hidden} values={values} formulas={formulas}")
label = ws[f"{cols[label_col_idx]}{r}"].value
value_formula = formula_text(ws[f"{value_col_letter}{r}"])
row_type = "detail"
label_text = str(label).strip().lower() if label is not None else ""
if "total" in label_text or "subtotal" in label_text:
row_type = "subtotal"
inventory.append({
"row": r,
"hidden": hidden,
"label": label,
"row_type": row_type,
"value_formula": value_formula,
})
current_section = []
for i, row in enumerate(inventory):
if row["row_type"] != "subtotal":
current_section.append(row)
continue
expected_rows = [x["row"] for x in current_section if x["row_type"] == "detail" and not x["hidden"]]
actual_rows = parse_simple_sum_rows(row["value_formula"], value_col_letter)
if actual_rows is None:
raise ValueError(f"Unsupported formula at row {row['row']}: {row['value_formula']}")
if expected_rows != actual_rows:
raise AssertionError(
f"Subtotal mismatch at row {row['row']} label={row['label']!r}: "
f"expected visible detail rows {expected_rows}, formula covers {actual_rows}"
)
current_section = []
This sketch is deliberately simple. In real use, strengthen row classification and section detection.
A formula can be syntactically valid and still semantically wrong.
A subtotal value might look plausible even when the range is wrong.
Checking "there is a total row" is insufficient.
The audit should derive expected detail membership from report structure or source mapping, not from the formula under test.
Manual workbook opening can miss subtle omissions.
Run it immediately after writing the workbook so defects are caught before delivery.
A generated report passes this skill only if:
Adjust the workflow for:
Even when full formula parsing is hard, the row-by-row printed audit remains valuable and should still be kept.
The audit is complete when you can answer, for every subtotal row:
If not, the report is not verified.38:["$","$L3f",null,{"content":"$40","frontMatter":{"name":"excel-subtotal-audit","description":"Audit generated Excel reports by enumerating populated output rows, exposing formulas, and checking that each subtotal range includes all visible detail rows."}}]