Analyze A/B test results for statistical significance with ship, extend, or kill recommendations
Analyzes A/B test results for statistical significance. Takes variant data (visitors and conversions), calculates conversion rates, lift, p-value, confidence interval, statistical power, and required sample size. Provides a clear recommendation to ship, extend, or kill the test based on rigorous statistical analysis.
agency.config.json at repo root (for context on business goals and CRO service framing)

agency.config.json from the project root.
services[] -- CRO context for framing recommendations
case_studies[] -- benchmark conversion data if relevant

test_name -- descriptive name for the test
control -- control variant data: { "name": "Control", "visitors": N, "conversions": N }
variants -- array of variant data: [{ "name": "Variant B", "visitors": N, "conversions": N }]
test_type -- what is being tested: headline | cta | layout | pricing | image | copy | color | other (default: other)
page -- which page the test runs on (default: unknown)
confidence_level -- desired confidence threshold (default: 0.95 for 95%)
minimum_detectable_effect -- smallest meaningful lift % (default: 5%)
daily_traffic -- average daily visitors to the test page (for duration estimates)
revenue_per_conversion -- average value per conversion (for revenue impact)
test_duration_days -- how long the test has been running (for novelty effect check)

conversion_rate = conversions / visitors
standard_error = sqrt(conversion_rate * (1 - conversion_rate) / visitors)
confidence_interval_95 = [conversion_rate - 1.96 * standard_error, conversion_rate + 1.96 * standard_error]
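
The per-variant formulas above can be sketched in Python as follows. This is a minimal illustration, not the tool's actual implementation; the function name `rate_with_ci` and the sample numbers are hypothetical, and the interval is the normal-approximation (Wald) interval the formulas describe.

```python
import math

def rate_with_ci(conversions: int, visitors: int, z: float = 1.96):
    """Return (conversion_rate, standard_error, (ci_low, ci_high)).

    z = 1.96 corresponds to a 95% interval, as in the formula above.
    """
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    conversion_rate = conversions / visitors
    standard_error = math.sqrt(conversion_rate * (1 - conversion_rate) / visitors)
    interval = (conversion_rate - z * standard_error,
                conversion_rate + z * standard_error)
    return conversion_rate, standard_error, interval

# Hypothetical data: 50 conversions out of 1,000 visitors.
rate, se, (low, high) = rate_with_ci(conversions=50, visitors=1000)
print(f"rate={rate:.4f} se={se:.4f} ci=[{low:.4f}, {high:.4f}]")
```

The normal approximation tends to be unreliable when a variant has very few conversions, so intervals from small samples are best read with caution.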
absolute_lift = variant_rate - control_rate
relative_lift = (variant_rate - control_rate) / control_rate * 100
pooled_rate = (control_conversions + variant_conversions) / (control_visitors + variant_visitors)
pooled_SE = sqrt(pooled_rate * (1 - pooled_rate) * (1/control_visitors + 1/variant_visitors))
z_score = (variant_rate - control_rate) / pooled_SE
p_value = 2 * (1 - normal_cdf(abs(z_score))) // two-tailed test
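
As a rough sketch, the lift and pooled two-proportion z-test above might look like this in Python. The function names and sample data are hypothetical; `normal_cdf` is built from the standard library's `math.erf`, which is one of several ways to compute it.

```python
import math

def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def two_proportion_z_test(control_conversions, control_visitors,
                          variant_conversions, variant_visitors):
    """Pooled two-proportion z-test with a two-tailed p-value."""
    control_rate = control_conversions / control_visitors
    variant_rate = variant_conversions / variant_visitors
    pooled_rate = ((control_conversions + variant_conversions)
                   / (control_visitors + variant_visitors))
    pooled_se = math.sqrt(pooled_rate * (1 - pooled_rate)
                          * (1 / control_visitors + 1 / variant_visitors))
    if pooled_se == 0:
        raise ValueError("no variation in the data; z-score is undefined")
    z_score = (variant_rate - control_rate) / pooled_se
    p_value = 2 * (1 - normal_cdf(abs(z_score)))
    return {
        "absolute_lift": variant_rate - control_rate,
        "relative_lift": (variant_rate - control_rate) / control_rate * 100,
        "z_score": z_score,
        "p_value": p_value,
    }

# Hypothetical data: control 100/1000, variant 130/1000.
result = two_proportion_z_test(100, 1000, 130, 1000)
print(result)
```

Note that `relative_lift` divides by the control rate, so a control with zero conversions needs separate handling.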
significant if p_value < (1 - confidence_level)
effect_size = abs(variant_rate - control_rate)
power = probability of detecting this effect size given sample sizes
Estimate power using:
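
The text does not spell out its power formula, so the following is only one common normal-approximation sketch, assuming a two-sided test and an unpooled standard error for the observed rates. The function name `estimate_power` and the sample numbers are hypothetical.

```python
import math
from statistics import NormalDist

def estimate_power(control_rate, variant_rate,
                   control_visitors, variant_visitors,
                   confidence_level=0.95):
    """Approximate power to detect the observed effect size.

    Assumption: power ~= Phi(effect_size / SE - z_alpha), which ignores
    the negligible chance of rejecting in the wrong direction.
    """
    alpha = 1 - confidence_level
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 at 95%
    se = math.sqrt(control_rate * (1 - control_rate) / control_visitors
                   + variant_rate * (1 - variant_rate) / variant_visitors)
    if se == 0:
        raise ValueError("no variation in the data; power is undefined")
    effect_size = abs(variant_rate - control_rate)
    return NormalDist().cdf(effect_size / se - z_alpha)

# Hypothetical data: 10% vs 13% conversion rate.
power_small = estimate_power(0.10, 0.13, 1000, 1000)
power_large = estimate_power(0.10, 0.13, 5000, 5000)
print(f"n=1000: {power_small:.2f}, n=5000: {power_large:.2f}")
```

Power computed from the observed effect is closely tied to the p-value, so it is probably more informative alongside the required sample size than on its own.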
For the specified minimum_detectable_effect:
required_per_variant = (z_alpha + z_beta)^2 * 2 * p * (1-p) / delta^2
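
A hedged sketch of the sample-size formula above. The source does not define `p` or `delta`, so this assumes `p` is the baseline (control) conversion rate, `delta` is the absolute difference implied by a relative `minimum_detectable_effect`, and the target power is 80%. All three are assumptions, not values stated in the text.

```python
import math
from statistics import NormalDist

def required_per_variant(baseline_rate, minimum_detectable_effect_pct=5.0,
                         confidence_level=0.95, target_power=0.80):
    """Visitors needed per variant, following the formula above.

    Assumptions: p = baseline_rate, delta = baseline_rate * MDE% / 100,
    two-sided z_alpha, and target_power = 0.80.
    """
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    z_beta = NormalDist().inv_cdf(target_power)
    p = baseline_rate
    delta = baseline_rate * minimum_detectable_effect_pct / 100
    if delta <= 0:
        raise ValueError("minimum detectable effect must be positive")
    n = (z_alpha + z_beta) ** 2 * 2 * p * (1 - p) / delta ** 2
    return math.ceil(n)

# Hypothetical baseline: 10% conversion rate, 5% relative lift.
n_needed = required_per_variant(baseline_rate=0.10)
print(n_needed)
```

Because `delta` is squared in the denominator, halving the minimum detectable effect roughly quadruples the required sample, which is why small lifts on low-traffic pages can take a long time to confirm.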