This skill should be used when working with CSV files to create interactive data visualizations, generate statistical plots, analyze data distributions, create dashboards, or perform automatic data profiling. It provides comprehensive tools for exploratory data analysis using Plotly for interactive visualizations.
This skill enables comprehensive data visualization and analysis for CSV files. It provides three main capabilities: (1) creating individual interactive visualizations using Plotly, (2) automatic data profiling with statistical summaries, and (3) generating multi-plot dashboards. The skill is optimized for exploratory data analysis, statistical reporting, and creating presentation-ready visualizations.
Invoke this skill when users request:
Create specific chart types for detailed analysis using the visualize_csv.py script.
Available Chart Types:
Statistical Plots:
# Histogram - distribution of numeric data
python3 scripts/visualize_csv.py data.csv --histogram column_name --bins 30
# Box plot - show quartiles and outliers
python3 scripts/visualize_csv.py data.csv --boxplot column_name
# Box plot grouped by category
python3 scripts/visualize_csv.py data.csv --boxplot salary --group-by department
# Violin plot - distribution with probability density
python3 scripts/visualize_csv.py data.csv --violin column_name --group-by category
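To make "quartiles and outliers" concrete, here is a minimal standard-library sketch of the numbers a box plot summarizes. It assumes the common Tukey convention (points beyond 1.5 × IQR from the quartiles count as outliers) and the "inclusive" quantile method; the helper name `box_plot_summary` is hypothetical, and the script's own Plotly output may compute quartiles slightly differently.

```python
import statistics


def box_plot_summary(values):
    """Return the quartiles and the points outside the 1.5 * IQR fences."""
    data = sorted(values)
    # "inclusive" treats the smallest and largest values as the 0th and
    # 100th percentiles and interpolates linearly between data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(summary)
```

A value flagged here would typically be drawn as an individual point beyond the whiskers.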
Relationship Analysis:
# Scatter plot with automatic trend line
python3 scripts/visualize_csv.py data.csv --scatter height weight
# Scatter plot with color and size encoding
python3 scripts/visualize_csv.py data.csv --scatter x y --color category --size value
# Correlation heatmap for all numeric columns
python3 scripts/visualize_csv.py data.csv --correlation
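Each cell of a correlation heatmap is a pairwise correlation coefficient. The sketch below computes Pearson's r by hand, on the assumption that the script uses Pearson correlation (the pandas default); the function name `pearson_r` is illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

Values near 1 or -1 indicate a strong linear relationship; values near 0 indicate little linear relationship.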
Time Series:
# Line chart for single variable
python3 scripts/visualize_csv.py data.csv --line date sales
# Multiple variables on same chart
python3 scripts/visualize_csv.py data.csv --line date "sales,revenue,profit"
Categorical Data:
# Bar chart (counts categories automatically)
python3 scripts/visualize_csv.py data.csv --bar category
# Pie chart for composition
python3 scripts/visualize_csv.py data.csv --pie region
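"Counts categories automatically" amounts to tallying how often each value appears in the chosen column. A rough standard-library equivalent, with a hypothetical helper name and inline sample data, might look like this:

```python
import csv
import io
from collections import Counter


def count_categories(csv_text, column):
    """Tally how many rows carry each distinct value of `column`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Illustrative data only; not taken from any real dataset.
sample = "region,sales\nNorth,10\nSouth,5\nNorth,7\n"
counts = count_categories(sample, "region")
print(counts.most_common())
```

The resulting counts are what the bar heights (or pie slice sizes) would represent.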
Output Formats: Choose the format by giving the output file the matching extension:
# Interactive HTML (default)
python3 scripts/visualize_csv.py data.csv --histogram age -o output.html
# Static image formats
python3 scripts/visualize_csv.py data.csv --scatter x y -o plot.png
python3 scripts/visualize_csv.py data.csv --correlation -o heatmap.pdf
python3 scripts/visualize_csv.py data.csv --bar category -o chart.svg
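The extension-to-format rule can be sketched as a small lookup. This is an assumption about the behavior, not the script's actual code: `describe_output` is a hypothetical helper that also reports whether the format is one of the static ones (PNG, PDF, SVG) that need the extra kaleido install.

```python
from pathlib import Path

# Static image formats named in this document; they need kaleido installed.
STATIC_FORMATS = {".png", ".pdf", ".svg"}


def describe_output(path):
    """Return (format_name, needs_kaleido) for an output file path."""
    suffix = Path(path).suffix.lower()
    if suffix == ".html":
        return ("html", False)
    if suffix in STATIC_FORMATS:
        return (suffix[1:], True)
    raise ValueError(f"unsupported output extension: {suffix!r}")


print(describe_output("plot.png"))
```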
Generate comprehensive data quality and statistical reports using the data_profile.py script.
Text Report (default):
python3 scripts/data_profile.py data.csv
HTML Report:
python3 scripts/data_profile.py data.csv -f html -o report.html
JSON Report:
python3 scripts/data_profile.py data.csv -f json -o profile.json
What the Profiler Provides:
When to Use Profiling: Always recommend running data profiling BEFORE creating visualizations when:
Create comprehensive dashboards with multiple visualizations using the create_dashboard.py script.
Automatic Dashboard: Analyzes data types and automatically creates appropriate visualizations:
python3 scripts/create_dashboard.py data.csv
Custom output location:
python3 scripts/create_dashboard.py data.csv -o my_dashboard.html
Control number of plots:
python3 scripts/create_dashboard.py data.csv --max-plots 9
Custom Dashboard from Config: Create a JSON configuration file specifying exact plots:
python3 scripts/create_dashboard.py data.csv --config config.json
Dashboard Config Format:
{
"title": "Sales Analysis Dashboard",
"plots": [
{"type": "histogram", "column": "revenue"},
{"type": "box", "column": "revenue", "group_by": "region"},
{"type": "scatter", "column": "advertising", "group_by": "revenue"},
{"type": "bar", "column": "product_category"},
{"type": "correlation"}
]
}
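Before passing a config to create_dashboard.py, it can help to sanity-check it. The following sketch is not part of the skill; it assumes, based only on the example above, that every plot needs a "type" from the documented set and that every type except "correlation" needs a "column". The function name `validate_dashboard_config` is hypothetical.

```python
import json

# Plot types documented for dashboard configs.
ALLOWED_TYPES = {"histogram", "box", "scatter", "bar", "correlation"}


def validate_dashboard_config(config):
    """Return a list of human-readable problems; an empty list means OK."""
    plots = config.get("plots")
    if not isinstance(plots, list) or not plots:
        return ['"plots" must be a non-empty list']
    problems = []
    for index, plot in enumerate(plots):
        kind = plot.get("type")
        if kind not in ALLOWED_TYPES:
            problems.append(f"plot {index}: unknown type {kind!r}")
        elif kind != "correlation" and "column" not in plot:
            # Assumption: only the correlation heatmap works without a column.
            problems.append(f"plot {index}: {kind} needs a 'column'")
    return problems


config = json.loads("""
{
  "title": "Sales Analysis Dashboard",
  "plots": [
    {"type": "histogram", "column": "revenue"},
    {"type": "correlation"}
  ]
}
""")
print(validate_dashboard_config(config))
```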
Dashboard Plot Types:
histogram: Distribution of numeric column
box: Box plot, optionally grouped by category
scatter: Relationship between two numeric columns
bar: Count of categorical values
correlation: Heatmap of numeric correlations
Use this decision tree to determine the appropriate approach:
User provides CSV file
│
├─ "Profile this data" / "Analyze this data" / Unfamiliar dataset
│ └─> Run data_profile.py first
│ Then offer visualization options based on findings
│
├─ "Create dashboard" / "Overview of the data" / Multiple visualizations needed
│ ├─ User knows exact plots wanted
│ │ └─> Create JSON config → run create_dashboard.py with config
│ └─ User wants automatic dashboard
│ └─> Run create_dashboard.py (auto mode)
│
└─ Specific visualization requested ("histogram", "scatter plot", etc.)
└─> Use visualize_csv.py with appropriate flag
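The decision tree above can be approximated in code. This is a deliberately naive keyword-matching sketch, offered only to show the branching order; `choose_script` and its `knows_exact_plots` parameter are hypothetical and real requests need more careful interpretation.

```python
def choose_script(request, knows_exact_plots=False):
    """Map a user request to the script the decision tree points at."""
    text = request.lower()
    # Branch 1: profiling requests or unfamiliar data come first.
    if "profile" in text or "analyze" in text:
        return "data_profile.py"
    # Branch 2: dashboards, with or without a hand-written config.
    if "dashboard" in text or "overview" in text:
        if knows_exact_plots:
            return "create_dashboard.py --config config.json"
        return "create_dashboard.py"
    # Branch 3: anything else is treated as a specific chart request.
    return "visualize_csv.py"


print(choose_script("Give me an overview of the data"))
```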
python3 scripts/data_profile.py data.csv
Consult references/visualization_guide.md for detailed guidance. Quick reference:
The scripts require these Python packages:
pip install pandas plotly numpy
For static image export (PNG, PDF, SVG), also install:
pip install kaleido
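A quick way to confirm the packages are importable before running the scripts is sketched below. It uses only the standard library; the helper name `missing_packages` is an assumption, and the package lists simply mirror the pip commands above.

```python
import importlib.util

REQUIRED = ("pandas", "plotly", "numpy")
OPTIONAL = ("kaleido",)  # only needed for PNG, PDF, and SVG export


def missing_packages(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Missing required:", missing_packages(REQUIRED))
    print("Missing optional:", missing_packages(OPTIONAL))
```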
# 1. Profile the data
python3 scripts/data_profile.py sales_data.csv -f html -o profile.html
# 2. Create automatic dashboard
python3 scripts/create_dashboard.py sales_data.csv -o dashboard.html
# 3. Dive deeper with specific plots
python3 scripts/visualize_csv.py sales_data.csv --scatter price sales --color region
python3 scripts/visualize_csv.py sales_data.csv --boxplot revenue --group-by product
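If the first two steps are run repeatedly on different files, they could be wrapped in a small driver. The sketch below only reuses the flags shown above; the function names are hypothetical, and `run_all` is defined but not called so that nothing executes on import.

```python
import subprocess


def exploration_commands(csv_path):
    """Build the profile and dashboard commands for one CSV file."""
    return [
        ["python3", "scripts/data_profile.py", csv_path,
         "-f", "html", "-o", "profile.html"],
        ["python3", "scripts/create_dashboard.py", csv_path,
         "-o", "dashboard.html"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


commands = exploration_commands("sales_data.csv")
for argv in commands:
    print(" ".join(argv))
```

Calling `run_all(commands)` from the skill's root directory would then execute both steps in sequence.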
# Create specific visualizations for report
python3 scripts/visualize_csv.py data.csv --histogram age -o fig1_distribution.png
python3 scripts/visualize_csv.py data.csv --scatter income age -o fig2_correlation.png
python3 scripts/visualize_csv.py data.csv --bar category -o fig3_categories.png
# Generate data summary
python3 scripts/data_profile.py data.csv -f html -o data_summary.html
# Create custom dashboard for presentation
# 1. First, create config.json with desired plots
# 2. Generate dashboard
python3 scripts/create_dashboard.py data.csv --config config.json -o presentation_dashboard.html
"Column not found" errors:
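One way to diagnose this, assuming column names must match the CSV header exactly, is to compare the requested name against the header and suggest near matches. The helper `check_column` below is a hypothetical standard-library sketch, not part of the scripts.

```python
import csv
import difflib
import io


def check_column(csv_text, column):
    """Return (column, []) if present, else (None, close_header_matches)."""
    header = next(csv.reader(io.StringIO(csv_text)))
    if column in header:
        return column, []
    # Suggest header names that look similar to the requested one.
    suggestions = difflib.get_close_matches(column, header, n=3, cutoff=0.6)
    return None, suggestions


# Illustrative data only.
sample = "Age,Salary\n30,100\n"
print(check_column(sample, "salary"))
```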
Empty or incorrect visualizations:
Script execution errors:
pip list | grep plotly
pip install kaleido
visualize_csv.py: Main visualization script with all chart types
data_profile.py: Automatic data profiling and quality analysis
create_dashboard.py: Multi-plot dashboard generator
visualization_guide.md: Comprehensive guide for choosing appropriate chart types, best practices, and common patterns