Analyse an Excel (.xlsx/.xls) or CSV file — sheets, column types, stats, and data quality
Analyse the spreadsheet file at {args}.
Check that {args} is not empty and the file exists using the glob or shell tool.
If the file does not exist or no path was given, stop immediately and tell the user:
"Please provide a valid file path. Usage:
/xlsx-analyser <path/to/file.xlsx>"
Supported formats: .xlsx, .xls, .xlsm, .csv
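The pre-flight check above can be sketched in Python as follows. This is a minimal illustration, not part of the skill's scripts: `validate_input`, `SUPPORTED_EXTENSIONS` and `USAGE_MESSAGE` are hypothetical names, and the message returned for an unsupported extension is an assumption, since the text above only specifies the reply for a missing or empty path.

```python
from pathlib import Path

# Extensions listed under "Supported formats" above.
SUPPORTED_EXTENSIONS = {".xlsx", ".xls", ".xlsm", ".csv"}

# The usage text the instructions ask to show when the path is missing or invalid.
USAGE_MESSAGE = (
    "Please provide a valid file path. Usage:\n"
    "/xlsx-analyser <path/to/file.xlsx>"
)


def validate_input(raw_path: str):
    """Return (ok, message); message is the path on success, an error text otherwise."""
    path_text = raw_path.strip()
    if not path_text:
        # No path was given.
        return False, USAGE_MESSAGE
    path = Path(path_text)
    if not path.is_file():
        # The path does not point at an existing file.
        return False, USAGE_MESSAGE
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        # Assumption: the source does not define this reply.
        supported = ", ".join(sorted(SUPPORTED_EXTENSIONS))
        return False, f"Unsupported format '{path.suffix}'. Supported formats: {supported}"
    return True, str(path)
```

The extension comparison is case-insensitive, so a file named `REPORT.XLSX` would pass the check.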
Use the shell tool to execute:
python {skill_dir}/scripts/analyse.py "{args}"
The script outputs a JSON report to stdout. Capture this output — it is the source of truth for everything below.
If the script exits with "openpyxl is not installed", install it first:
pip install openpyxl
Then re-run the analysis script.
The JSON will contain a sheet_names array listing all sheets. Analyse all sheets by default.
If the user specified a particular sheet, re-run with the --sheet flag:
python {skill_dir}/scripts/analyse.py "{args}" --sheet "SheetName"
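For reference, the two invocations above (with and without --sheet) could be driven from Python roughly as shown below. This is a sketch under assumptions: `build_command` and `run_analysis` are hypothetical helpers, `sys.executable` stands in for the bare `python` command, and the openpyxl message is looked for in both stdout and stderr because the text does not say which stream carries it.

```python
import json
import subprocess
import sys


def build_command(script_path, file_path, sheet=None):
    """Build the argument list for analyse.py, adding --sheet only when requested."""
    command = [sys.executable, script_path, file_path]
    if sheet:
        command += ["--sheet", sheet]
    return command


def run_analysis(script_path, file_path, sheet=None):
    """Run the script and return its JSON report as a dict."""
    result = subprocess.run(
        build_command(script_path, file_path, sheet),
        capture_output=True,
        text=True,
    )
    if "openpyxl is not installed" in (result.stdout + result.stderr):
        # The instructions say to install openpyxl and then re-run.
        raise RuntimeError("openpyxl is not installed; run `pip install openpyxl` and retry")
    # Raises json.JSONDecodeError if stdout is not valid JSON.
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems with paths or sheet names that contain spaces.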
Parse the JSON and present the analysis in the following markdown structure. Do not show raw JSON to the user.
{args}

| Metric | Value |
|---|---|
| File | {args} |
| Format | (from JSON: format) |
| Sheets | (count and names) |
| Total rows | (sum across all sheets) |
| Total columns | (per sheet if multiple) |
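
As an illustration of how the overview table might be filled in, here is a small Python sketch. Only `format` and `sheet_names` are named in these instructions; the `sheets` mapping with per-sheet `rows` and `columns` counts is a hypothetical shape, so check the real report before relying on those keys.

```python
def render_overview(file_path, report):
    """Render the overview table from a parsed report dict."""
    names = report["sheet_names"]
    # Hypothetical key: per-sheet dimensions keyed by sheet name.
    sheets = report.get("sheets", {})
    total_rows = sum(sheets[name]["rows"] for name in names if name in sheets)
    if len(names) == 1 and names[0] in sheets:
        columns = str(sheets[names[0]]["columns"])
    else:
        # With several sheets, report the column count per sheet.
        columns = ", ".join(
            f"{name}: {sheets[name]['columns']}" for name in names if name in sheets
        )
    lines = [
        "| Metric | Value |",
        "|---|---|",
        f"| File | {file_path} |",
        f"| Format | {report['format']} |",
        f"| Sheets | {len(names)} ({', '.join(names)}) |",
        f"| Total rows | {total_rows} |",
        f"| Total columns | {columns} |",
    ]
    return "\n".join(lines)
```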
For each sheet in the workbook, render the following sections:
<sheet_name>
<dimensions from JSON>
For every column, render:
<column_name> · <inferred_type> <flags>

| Property | Value |
|---|---|
| Non-empty | X / Y (Z%) |
| Nulls | X (Z%) |
| Unique values | N |
| (type-specific stats) | (see rules below) |
Type-specific stat rows to add:
- integer / float: Min, Max, Mean, Median, Std Dev
- boolean: True count, False count
- text: Avg length, Min length, Max length, Top values
- date: Earliest, Latest

If unique_values is not null (≤ 5 unique values), add after the table:
Distinct values:
val1,val2,val3
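
The per-column rules above can be sketched as a small renderer. This is an illustration only: apart from `unique_values`, every key used here (`type`, `total`, `non_empty`, `unique_count`, `stats`) is a hypothetical name for the column entry, and stats are looked up by their display label purely to keep the sketch short.

```python
# Stat rows to add per inferred type, as listed above.
TYPE_STATS = {
    "integer": ["Min", "Max", "Mean", "Median", "Std Dev"],
    "float": ["Min", "Max", "Mean", "Median", "Std Dev"],
    "boolean": ["True count", "False count"],
    "text": ["Avg length", "Min length", "Max length", "Top values"],
    "date": ["Earliest", "Latest"],
}


def render_column(column):
    """Render the property table for one column, plus distinct values if present."""
    total = column["total"]
    non_empty = column["non_empty"]
    nulls = total - non_empty

    def pct(count):
        # Guard against empty sheets to avoid dividing by zero.
        return f"{100 * count / total:.0f}%" if total else "0%"

    lines = [
        "| Property | Value |",
        "|---|---|",
        f"| Non-empty | {non_empty} / {total} ({pct(non_empty)}) |",
        f"| Nulls | {nulls} ({pct(nulls)}) |",
        f"| Unique values | {column['unique_count']} |",
    ]
    stats = column.get("stats", {})
    for label in TYPE_STATS.get(column["type"], []):
        lines.append(f"| {label} | {stats.get(label, 'n/a')} |")
    text = "\n".join(lines)
    distinct = column.get("unique_values")
    if distinct is not None:
        # Only set when the column has at most 5 unique values.
        text += "\n\nDistinct values:\n" + ",".join(str(value) for value in distinct)
    return text
```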
Flag rendering:
- high_nulls → append ⚠ high nulls after the type
- constant_column → append ⚠ constant after the type
- unnamed_column → append ⚠ unnamed after the type

If the sheet has entries in issues, list them as bullet points under this heading.
If no issues, write: No issues detected.
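
A minimal sketch of the flag and issue rendering described above, assuming flags arrive as a list of strings and issues as a list of messages (both shapes are assumptions about the JSON; `render_column_heading` and `render_issues` are hypothetical helper names):

```python
# Flag names from the report mapped to the text appended after the type.
FLAG_LABELS = {
    "high_nulls": "⚠ high nulls",
    "constant_column": "⚠ constant",
    "unnamed_column": "⚠ unnamed",
}


def render_column_heading(name, inferred_type, flags):
    """Build '<column_name> · <inferred_type> <flags>' for one column."""
    parts = [f"{name} · {inferred_type}"]
    # Unknown flags are skipped rather than shown raw.
    parts += [FLAG_LABELS[flag] for flag in flags if flag in FLAG_LABELS]
    return " ".join(parts)


def render_issues(issues):
    """Render a sheet's issues as bullet points, or the fallback sentence."""
    if not issues:
        return "No issues detected."
    return "\n".join(f"- {issue}" for issue in issues)
```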
After all sheets are profiled:
identifier, categorical, continuous, boolean, datetime, or text
If error is set in the JSON, report the error clearly and suggest a fix.