Product Analytics - A/B Testing
How to answer A/B Testing questions?
#1 Clearly define the null and alternative hypotheses
- Null Hypothesis: In A/B testing, the null hypothesis states that there is no difference between the control and variant groups.
- Alternative Hypothesis: The alternative hypothesis states that there is a measurable difference between the control and variant groups.
The goal of the test is to determine whether to reject the null in favor of the alternative with statistical significance.
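To make the framing concrete, here is a minimal sketch of a standard two-proportion z-test in Python. The visitor counts, conversion numbers, and the 0.05 threshold are illustrative assumptions, not data from a real test:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical results: conversions out of visitors for each group.
control_conv, control_n = 480, 10_000   # control: 4.8% conversion
variant_conv, variant_n = 540, 10_000   # variant: 5.4% conversion

# H0: p_variant == p_control; H1: p_variant != p_control (two-sided).
p_control = control_conv / control_n
p_variant = variant_conv / variant_n
p_pooled = (control_conv + variant_conv) / (control_n + variant_n)

# Two-proportion z-test with a pooled variance estimate.
se = np.sqrt(p_pooled * (1 - p_pooled) * (1 / control_n + 1 / variant_n))
z = (p_variant - p_control) / se
p_value = 2 * norm.sf(abs(z))  # two-sided p-value

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the difference is statistically significant.")
else:
    print("Fail to reject H0: no significant difference detected.")
```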
#2 State the methodology: At a scaled product, dozens of A/B tests may be running at any given time, so it's critical to keep a clear record of each one. You can use the PICOT framework, commonly used in healthcare research (a sketch of such a record follows this list):
- Population: The target audience for the test (e.g., website visitors)
- Intervention: The change or variant being introduced and tested relative to the control. This should be a measurable product change, such as altering the color of a call-to-action button or modifying the checkout process.
- Comparison: Existing product without any change (the control)
- Outcome: The key metrics that define the impact of the intervention. These become the main results to evaluate and are often the most critical part of the interview.
- Time: The duration over which the experiment will run before analyzing the data and reaching a conclusion
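As referenced above, here is a minimal sketch of a PICOT-style test record as a Python dataclass. The class name ABTestRecord and the checkout experiment it describes are hypothetical, purely to show the shape such a record might take:

```python
from dataclasses import dataclass

@dataclass
class ABTestRecord:
    """One PICOT-style record per experiment, so tests stay auditable."""
    population: str    # who is eligible for the test
    intervention: str  # the product change being tested
    comparison: str    # the control experience
    outcome: str       # primary metric(s) used to judge impact
    time_days: int     # planned duration before analysis

checkout_test = ABTestRecord(
    population="Mobile web visitors who reach the cart page",
    intervention="One-page checkout flow",
    comparison="Existing three-step checkout",
    outcome="Checkout completion rate",
    time_days=14,
)
print(checkout_test)
```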
#3 Watch for biases and statistical significance: Keep the following in mind when analyzing A/B tests:
- Novelty Effect: Users engage more with new features out of curiosity, but this spike may not last once the novelty wears off. Look for lasting changes over time, not just short spikes (see the sketch after this list).
- Primacy Effect: Users prefer and stick to the original version, resisting change at first because they are used to the old way. Long-time users in particular may react negatively for a while.
- Interference: Make sure users in the test group (trying the new experience) can't influence users in the control group (using the old one), since spillover between groups biases the measured effect.
- Statistical Significance: Check whether results are truly significant, not just different. The p-value is the probability of observing results at least as extreme as those measured, assuming the null hypothesis is true. By convention, a p-value under 0.05 indicates a statistically significant difference.
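To illustrate the novelty-effect check from the list above, here is a small pandas sketch that tracks weekly conversion lift between groups. All figures are made up for illustration:

```python
import pandas as pd

# Hypothetical weekly rollup: one row per (week, group) with users and conversions.
events = pd.DataFrame({
    "week":      [1, 1, 2, 2, 3, 3, 4, 4],
    "group":     ["control", "variant"] * 4,
    "users":     [2500] * 8,
    "converted": [120, 155, 118, 140, 121, 132, 119, 126],
})

# Conversion rate per week for each group, then the week-by-week lift.
conversions = events.pivot(index="week", columns="group", values="converted")
users = events.pivot(index="week", columns="group", values="users")
rates = conversions / users
rates["lift"] = rates["variant"] - rates["control"]
print(rates)

# In this made-up data the lift shrinks from ~1.4pp in week 1 to ~0.3pp in
# week 4: a fading spike like this is the signature of a novelty effect.
```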
#4 Get hands-on practice: Products such as Mixpanel, Amplitude, and Optimizely make it easy to conduct A/B testing. If you haven't done A/B testing before, try one of these tools in a personal project or watch a demo. You can also speak to friends in the PM and data science world who regularly run A/B tests as part of their day-to-day job. Practical experience will make you more confident in answering in-depth questions, and you'll learn key considerations that come up when running real tests.