Your product manager asks you to explain what a p-value actually means, and why you keep talking about "power" and "Type I/II errors." Give a clear, intuitive explanation — no jargon.
Now a harder question: your A/B test gives p = 0.04. Your PM says "great, the feature is proven to work, let's ship." What's wrong with that interpretation, and what would you say?
tldr
- p-value = probability of seeing data at least this extreme IF the null is true. It is not the probability that the null is true, and 1 - p is not the probability the feature works.
- Type I error = false positive: no real effect, but the test fires (you ship a useless feature). Your significance threshold α caps this rate.
- Type II error = false negative: a real effect exists, but the test misses it (you kill a working feature).
- Power = probability of detecting a real effect. It rises with sample size and effect size, and falls as you tighten α.
- Statistical significance ≠ practical significance: always report effect sizes and confidence intervals, not just p-values.
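A quick way to internalize those definitions is to simulate them. In the sketch below, the 10% baseline conversion rate, the 1-point lift, and the per-arm sample sizes are all assumed for illustration. Under a true null, the rate of p < 0.05 is just the Type I rate α; under a real effect, that same rate is the power, and it climbs with sample size.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def ab_test_pvalue(p_a, p_b, n, rng):
    """Simulate one A/B test with n users per arm; return the two-sided
    p-value from a pooled two-proportion z-test."""
    conv_a = rng.binomial(n, p_a)
    conv_b = rng.binomial(n, p_b)
    p_pool = (conv_a + conv_b) / (2 * n)
    se = np.sqrt(p_pool * (1 - p_pool) * 2 / n)
    z = (conv_b / n - conv_a / n) / se
    return 2 * stats.norm.sf(abs(z))

runs = 10_000

# (a) Null is true: both arms convert at 10%. The share of runs with
# p < 0.05 is the Type I error rate -- it lands near alpha = 0.05.
null_ps = np.array([ab_test_pvalue(0.10, 0.10, 5_000, rng) for _ in range(runs)])
print("Type I rate:", (null_ps < 0.05).mean())         # ~0.05

# (b) Real 1-point lift (10% -> 11%). The share of runs with p < 0.05
# is the power. At 5,000 users per arm the test is badly underpowered...
alt_small = np.array([ab_test_pvalue(0.10, 0.11, 5_000, rng) for _ in range(runs)])
print("power, n=5k/arm:", (alt_small < 0.05).mean())   # ~0.35-0.40

# ...and quadrupling the sample size pushes power to roughly 0.9.
alt_large = np.array([ab_test_pvalue(0.10, 0.11, 20_000, rng) for _ in range(runs)])
print("power, n=20k/arm:", (alt_large < 0.05).mean())  # ~0.9
```

Note the asymmetry: α is fixed by your threshold no matter how much data you have, but power is something you have to buy with sample size.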
follow-up
- If you could only report one thing from an A/B test to a PM, would you report the p-value or the confidence interval? Why? (See the sketch after this list.)
- Your experiment has p = 0.06. Some stakeholders want to ship anyway. How do you advise them?
- Explain the difference between "no evidence of effect" and "evidence of no effect." When does each conclusion apply?
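For the first follow-up, it helps to see what a confidence interval gives a PM that a bare p-value doesn't. A minimal sketch, with counts invented (not real data) so the pooled z-test lands near the scenario's p = 0.04:

```python
import numpy as np
from scipy import stats

# Invented counts (assumptions, not real data).
n_a, conv_a = 10_000, 1_000   # control:  10.0% conversion
n_b, conv_b = 10_000, 1_090   # variant:  10.9% conversion

p_a, p_b = conv_a / n_a, conv_b / n_b
diff = p_b - p_a

# p-value from the pooled two-proportion z-test
p_pool = (conv_a + conv_b) / (n_a + n_b)
se_pool = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
p_value = 2 * stats.norm.sf(abs(diff) / se_pool)

# 95% Wald interval for the difference in conversion rates
se = np.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
z = stats.norm.ppf(0.975)  # ~1.96
lo, hi = diff - z * se, diff + z * se

print(f"p = {p_value:.3f}")                          # ~0.04: "significant"
print(f"lift = {diff:+.2%}, 95% CI [{lo:+.2%}, {hi:+.2%}]")
# -> lift = +0.90%, 95% CI [+0.05%, +1.75%]
# The interval says the true lift could be anywhere from negligible to
# meaningful -- exactly the nuance "p = 0.04, significant!" hides.
```

The same frame works for the p = 0.06 follow-up: the interval, not the threshold, tells stakeholders whether the plausible range of effects justifies shipping.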