Name: Storytelling with Data
Author: niits

[ ] I can name my primary audience
[ ] I have written a one-sentence Big Idea
[ ] I know what action I want them to take
[ ] I know if this is a live presentation, document, or dashboard

Relationship	Best Chart	Avoid
Change over time	Line plot	Bar chart for many time points
Comparison (few categories)	Bar chart (vertical)	3D bar, exploded pie
Comparison (many categories)	Horizontal bar	Vertical bar with rotated labels
Part of a whole	Stacked bar, 100% bar	Pie chart, donut chart
Distribution	Histogram, box plot
Correlation / relationship	Scatter plot
Two variables over time	Connected scatter	Dual-axis line
Single number that matters	Big number + context	Table
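
The table above can be kept next to plotting code as a small lookup. This is only a sketch: `CHART_GUIDE` and `suggest_chart` are hypothetical names, not part of the skill, and the two rows with no "Avoid" entry are stored as `None` rather than filled in.

```python
# Hypothetical lookup transcribed from the chart-selection table above.
# Each value is (best_chart, avoid); avoid is None where the table lists nothing.
CHART_GUIDE = {
    "change over time": ("Line plot", "Bar chart for many time points"),
    "comparison (few categories)": ("Bar chart (vertical)", "3D bar, exploded pie"),
    "comparison (many categories)": ("Horizontal bar", "Vertical bar with rotated labels"),
    "part of a whole": ("Stacked bar, 100% bar", "Pie chart, donut chart"),
    "distribution": ("Histogram, box plot", None),
    "correlation / relationship": ("Scatter plot", None),
    "two variables over time": ("Connected scatter", "Dual-axis line"),
    "single number that matters": ("Big number + context", "Table"),
}


def suggest_chart(relationship):
    """Return (best_chart, avoid) for a relationship named in the table."""
    key = relationship.strip().lower()
    if key not in CHART_GUIDE:
        raise KeyError(f"No guidance for relationship: {relationship!r}")
    return CHART_GUIDE[key]
```

For example, `suggest_chart("Two variables over time")` returns the connected scatter recommendation together with the dual-axis warning.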

# Databricks: standard clutter elimination
import matplotlib.pyplot as plt

def declutter(ax):
    """Remove visual noise from a matplotlib axes."""
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.spines['left'].set_color('#CCCCCC')
    ax.spines['bottom'].set_color('#CCCCCC')
    ax.tick_params(colors='#666666')
    ax.yaxis.grid(True, color='#EEEEEE', linewidth=0.8, zorder=0)
    ax.set_axisbelow(True)
    return ax

Attribute	Best for	How to apply
Color (hue)	Categorical difference	1 accent color; rest in gray
Color (intensity)	Magnitude / importance	Darker = more important
Size	Magnitude	Larger = more important
Position	Comparison	Align baselines
Bold / weight	Text emphasis	Bold the key number
Enclosure	Grouping	Box / shading around a region

# Storytelling with Data color pattern
GRAY_LIGHT  = '#CCCCCC'
GRAY_MED    = '#888888'
GRAY_DARK   = '#444444'
ACCENT      = '#E8664A'   # warm coral — stands out, not alarming
ACCENT_BLUE = '#1A77B5'   # for positive / neutral emphasis

def apply_swd_palette(ax, highlight_index, bars):
    """Color one bar accent, all others gray."""
    for i, bar in enumerate(bars):
        bar.set_color(ACCENT if i == highlight_index else GRAY_LIGHT)

Principle	Application
Proximity	Group related series / labels close together
Similarity	Same color = same category across charts
Enclosure	Shaded region = this area is different / important
Continuity	Line implies connection; use only when data is continuous
Figure-ground	Accent color pops; gray recedes

Setup (tension)            → Conflict (complication)          → Resolution (call to action)
"Here is what we expected"   "Here is what actually happened"   "Here is what we should do"

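# Annotating the key moment in a line chart
# (fragment: assumes ax, highlight_x and highlight_y already exist;
#  GRAY_DARK comes from the palette constants)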
ax.annotate(
    'Policy change\ncaused spike',
    xy=(highlight_x, highlight_y),
    xytext=(highlight_x + 2, highlight_y + 5),
    arrowprops=dict(arrowstyle='->', color=GRAY_DARK, lw=1.2),
    fontsize=9, color=GRAY_DARK,
    ha='left'
)

import matplotlib.pyplot as plt

# SWD color constants
GRAY_LIGHT  = '#CCCCCC'
GRAY_MED    = '#888888'
ACCENT      = '#E8664A'

# --- Data ---
categories = ['Product A', 'Product B', 'Product C', 'Product D', 'Product E']
values     = [42, 78, 55, 91, 63]
highlight  = 3   # index of the bar we want the audience to focus on

# --- Build chart ---
fig, ax = plt.subplots(figsize=(6, 3.5))

colors = [ACCENT if i == highlight else GRAY_LIGHT for i in range(len(categories))]
bars = ax.barh(categories, values, color=colors, height=0.55)

# Eliminate clutter
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines['left'].set_visible(False)
ax.spines['bottom'].set_visible(False)
ax.xaxis.set_visible(False)
ax.tick_params(left=False)

# Direct labels — no axis needed
for i, (bar, val) in enumerate(zip(bars, values)):
    # Compare by index, not value, so a tied value is never highlighted by accident
    is_highlight = (i == highlight)
    ax.text(bar.get_width() + 1, bar.get_y() + bar.get_height() / 2,
            f'{val}%', va='center', fontsize=9,
            color=ACCENT if is_highlight else GRAY_MED,
            fontweight='bold' if is_highlight else 'normal')

# Title IS the insight
ax.set_title('Product D leads with 91% satisfaction, 13 pts ahead of the runner-up',
             fontsize=11, fontweight='bold', color='#222222', loc='left', pad=12)

display(fig)  # Databricks notebook built-in; elsewhere, call plt.show() instead
plt.close(fig)

Tier	Who	What They Need	Chart Style
Executive	Head of division, business owner	Business impact in $ or %, single KPI trend	Bullet chart, big number + sparkline, 1-chart slides
Risk / Model Committee	Risk managers, validators	Model performance, stability, concentration	KS curve, vintage, migration matrix, PSI
Regulator / Auditor	External / internal audit	Methodology transparency, non-misleading scales	Conservative styling, full annotation, axis at 0
Fellow Practitioner	Data scientists, MLEs	Diagnostic depth, calibration, attribution	ROC, calibration plot, SHAP, confusion matrix

Task	Primary Chart	Reference
Scorecard discrimination	KS curve + score distribution overlay	`credit-risk-charts.md`
Portfolio cohort health	Vintage curve	`credit-risk-charts.md`
Credit state transitions	Migration matrix	`credit-risk-charts.md`
Risk concentration	Risk heatmap (segment × metric)	`credit-risk-charts.md`
KPI vs target	Bullet chart	`credit-risk-charts.md`
Model drift monitoring	PSI bar chart	`credit-risk-charts.md`
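
A PSI bar chart needs per-bin contributions to plot. The sketch below uses the commonly cited definition, the sum over bins of (actual share − expected share) × ln(actual share / expected share). It assumes counts are already binned on the same edges. The function name `psi` and the `eps` floor for empty bins are illustrative choices, not taken from `credit-risk-charts.md`.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Return (per_bin_contributions, total_psi) for two pre-binned count lists.

    eps floors each share so an empty bin does not produce log(0);
    this floor is an assumption of the sketch, not a standard value.
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("Both inputs need the same number of bins")
    e_total = sum(expected_counts)
    a_total = sum(actual_counts)
    per_bin = []
    for e, a in zip(expected_counts, actual_counts):
        e_share = max(e / e_total, eps)
        a_share = max(a / a_total, eps)
        per_bin.append((a_share - e_share) * math.log(a_share / e_share))
    return per_bin, sum(per_bin)
```

The per-bin list maps directly onto bar heights, so the bin driving the drift can take the accent color while the others stay gray.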

Task	Primary Chart	Reference
Discrimination	ROC + KS	`model-evaluation-viz.md`
Class imbalance performance	Precision-recall curve	`model-evaluation-viz.md`
Probability trustworthiness	Calibration / reliability diagram	`model-evaluation-viz.md`
Global feature attribution	Feature importance bar	`model-evaluation-viz.md`
Individual decision explanation	SHAP waterfall	`model-evaluation-viz.md`
Business value of model	Lift / gain chart	`model-evaluation-viz.md`
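
For the discrimination row, the KS value annotated on a ROC or KS chart is the largest gap between the cumulative positive rate and the cumulative negative rate across score cutoffs. A minimal stdlib sketch, assuming binary labels coded 0/1 and higher scores meaning "more likely positive" (`ks_statistic` is a hypothetical helper name):

```python
def ks_statistic(y_true, scores):
    """Largest |TPR - FPR| over all score cutoffs (labels must be 0 or 1)."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("Need at least one positive and one negative label")

    # Walk from the highest score down, accumulating hits on each side.
    pairs = sorted(zip(scores, y_true), key=lambda p: p[0], reverse=True)
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        cutoff = pairs[i][0]
        # Consume every row tied at this score so a tie counts as one cutoff.
        while i < len(pairs) and pairs[i][0] == cutoff:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        best = max(best, abs(tp / n_pos - fp / n_neg))
    return best
```

A perfectly separating score gives 1.0 and an uninformative constant score gives 0.0, which makes the number easy to sanity-check before it goes into a chart title.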

Task	Primary Chart	Reference
Volume + anomaly window	Time series + enclosure	`fraud-detection-charts.md`
Volume vs rate trend	Dual panel (no dual axis)	`fraud-detection-charts.md`
Suspicious entity pattern	Scatter: amount × frequency	`fraud-detection-charts.md`
Time-of-day concentration	Calendar heatmap	`fraud-detection-charts.md`
Alert context	Rolling z-score with ±σ band	`fraud-detection-charts.md`
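
The rolling z-score in the last row can be computed before any plotting. The sketch below scores each point against the trailing window that precedes it, so an anomaly does not inflate its own baseline. That windowing choice, and the name `rolling_zscores`, are assumptions of this example rather than something specified in `fraud-detection-charts.md`.

```python
import statistics


def rolling_zscores(values, window):
    """z-score of each point versus the `window` points immediately before it.

    Returns None where there is not enough history or the history is constant.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to estimate a standard deviation")
    out = []
    for i, v in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            out.append(None)
            continue
        mean = statistics.mean(history)
        sd = statistics.stdev(history)  # sample standard deviation
        out.append(None if sd == 0 else (v - mean) / sd)
    return out
```

Plotting these values with a shaded ±σ band then follows the enclosure principle: the band marks the normal region and the point outside it gets the accent color.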

Task	Primary Chart	Reference
Retention by cohort	Cohort retention heatmap	`customer-analytics-charts.md`
Onboarding / product funnel	Center-aligned funnel chart	`customer-analytics-charts.md`
Segment comparison	Small multiples bar (not radar)	`customer-analytics-charts.md`
Churn risk sizing	Probability distribution + bands	`customer-analytics-charts.md`
CLV comparison	Horizontal box plot by segment	`customer-analytics-charts.md`
A/B test / campaign result	Effect size with CI	`customer-analytics-charts.md`
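
For the A/B row, an "effect size with CI" chart needs a point estimate and an interval. One simple option, shown here as a sketch, is the difference in conversion rates with a normal-approximation (Wald) interval. Both the interval method and the name `diff_in_rates_ci` are assumptions of this example; small samples or rates near 0 or 1 would call for a different interval.

```python
import math


def diff_in_rates_ci(conv_a, n_a, conv_b, n_b, z=1.96):
    """Return (diff, lower, upper) for rate_b - rate_a.

    z=1.96 corresponds to an approximate 95% normal interval.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("Sample sizes must be positive")
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    diff = p_b - p_a
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se
```

With 100 of 1,000 conversions in control and 130 of 1,000 in treatment, the estimate is a 3-point lift and the interval stays above zero, which is the single fact the chart title should state.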

Task	Primary Chart	Reference
Treatment effect estimate	Coefficient plot with CI	`causal-inference-charts.md`
Parallel trends assumption	Pre/post trend overlay	`causal-inference-charts.md`
DiD heterogeneity	Subgroup effect plot	`causal-inference-charts.md`
RDD discontinuity	Binned scatter + regression	`causal-inference-charts.md`
Propensity overlap	Overlapping histogram	`causal-inference-charts.md`
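
The DiD rows assume a difference-in-differences estimate is available to plot. In the basic two-group, two-period case it is the change in the treated group minus the change in the control group. The sketch below computes only that point estimate from raw observations. It does not test parallel trends or produce a confidence interval, and `did_estimate` is a hypothetical name.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Basic 2x2 difference-in-differences on group means."""
    def mean(xs):
        if not xs:
            raise ValueError("Each group needs at least one observation")
        return sum(xs) / len(xs)

    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change
```

Running the same function per subgroup gives the values for a subgroup effect plot.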

Anti-Pattern	Problem	Fix
Rainbow colors for categories	Eye jumps everywhere, no focal point	1 accent color + gray for the rest
Pie chart with 6+ slices	Cannot compare angles accurately	Horizontal bar chart
Dual y-axis	Misleading — scales are arbitrary	Two separate charts, or index to 100
Title describes the chart	Audience must figure out the point themselves	Title states the insight
Legend instead of direct labels	Forces back-and-forth eye movement	Label lines/bars directly
Y-axis not starting at 0 (bar charts)	Exaggerates differences	Always start at 0 for bars
Sorted alphabetically	Ranking is invisible	Sort by value (descending)
All data equally visible	No signal, only noise	Gray everything except the key series
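
Two of the fixes above are data transformations rather than styling, so they can be applied before the chart is built. The helpers below are a sketch with hypothetical names: `index_to_100` rescales a series to its first value so two measures can share one axis, and `sort_by_value` orders categories by descending value.

```python
def index_to_100(series):
    """Rescale a series so its first value equals 100."""
    if not series:
        raise ValueError("series is empty")
    base = series[0]
    if base == 0:
        raise ValueError("Cannot index to a zero base value")
    return [100 * v / base for v in series]


def sort_by_value(labels, values):
    """Return labels and values reordered by value, largest first."""
    pairs = sorted(zip(labels, values), key=lambda p: p[1], reverse=True)
    return [label for label, _ in pairs], [value for _, value in pairs]
```

With matplotlib's `ax.barh`, the first category is drawn at the bottom, so calling `ax.invert_yaxis()` after plotting the sorted data puts the largest bar on top.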

Storytelling with Data

Overview

The Six-Step Framework

Step 1 — Understand the Context

Step 2 — Choose an Appropriate Display

Step 3 — Eliminate Clutter

Step 4 — Focus Attention

Step 5 — Think Like a Designer

Step 6 — Tell a Story

Workflow

Databricks Quick Start

Domain Context: Credit and Risk Analytics

Audience Tiers

Trustworthiness in Regulated Environments

Task Map by Domain

Common Anti-Patterns and Fixes

Resources

references/principles/

references/domain/

scripts/
