Many A/B testing problems come from using statistical methods without checking if they fit the situation.

Three A/B Testing Mistakes I Keep Seeing (And How to Avoid Them)

2025/12/24 12:33

Over the past few years, I have observed many common errors people make when designing A/B tests and performing post-analysis. In this article, I want to highlight three of these mistakes and explain how they can be avoided.

Using Mann–Whitney to compare medians

The first mistake is the incorrect use of the Mann–Whitney test. This method is widely misunderstood and frequently misused, as many people treat it as a non-parametric “t-test” for medians. In fact, the Mann–Whitney test is designed to determine whether there is a shift between two distributions.

When applying the Mann–Whitney test, the hypotheses are defined as follows:
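
Assuming the two groups differ only by a location shift, one common way to write these hypotheses is the following, where $F$ and $G$ (the two groups' cumulative distribution functions) and $\Delta$ (the shift) are notation introduced here for illustration:

$$
H_0: G(x) = F(x) \ \text{for all } x
\qquad
H_1: G(x) = F(x - \Delta), \ \Delta \neq 0
$$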

We must always consider the assumptions of the test. There are only two:

  • Observations are i.i.d.
  • The distributions have the same shape

How to compute the Mann–Whitney statistic:

  1. Sort all observations by magnitude.
  2. Assign ranks to all observations.
  3. Compute the U statistics for both samples.
  4. Choose the minimum of these two values.
  5. Use statistical tables for the Mann–Whitney U test to find the probability of observing this value of U or lower.
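
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: it assumes the convention U1 = R1 - n1(n1 + 1)/2 (where R1 is the rank sum of the first sample), and it replaces the table lookup with a normal approximation that has no tie or continuity correction. The function names and the sample data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U1, U2, min(U1, U2), approximate two-sided p-value)."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))  # steps 1-2: sort and rank jointly
    r1 = sum(ranks[:n1])                      # rank sum of the first sample
    u1 = r1 - n1 * (n1 + 1) / 2               # step 3: U for each sample
    u2 = n1 * n2 - u1
    u = min(u1, u2)                           # step 4: take the smaller U
    # Step 5 (assumption): normal approximation instead of a table lookup.
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = min(1.0, 2 * NormalDist().cdf((u - mu) / sigma))
    return u1, u2, u, p


# Hypothetical data for illustration only.
control = [1.1, 2.3, 3.0, 4.8]
treatment = [2.9, 5.1, 6.2, 7.7, 8.4]
u1, u2, u, p = mann_whitney_u(control, treatment)
print(u1, u2, u, p)
```

For small samples, exact tables (or a library routine such as SciPy's `mannwhitneyu`) are preferable to the approximation used here.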

Since we now know that this test should not be used to compare medians, what should we use instead?

Fortunately, in 1945 the statistician Frank Wilcoxon introduced the signed-rank test for paired observations, now known as the Wilcoxon Signed Rank Test.

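The hypotheses for this test match what we originally expected: the null hypothesis is that the median of the paired differences is zero, and the alternative is that it is not.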

How to calculate the Wilcoxon Signed Rank test statistic:

  1. For each paired observation, calculate the difference, keeping both its absolute value and sign.

  2. Sort the absolute differences from smallest to largest and assign ranks.

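  3. Compute the test statistic W from the signed ranks.

  4. The statistic W follows a known distribution. When n is larger than roughly 20, it is approximately normally distributed. This allows us to compute the probability of observing W under the null hypothesis and determine statistical significance.

Some intuition behind the statistic: if the median difference is zero, positive and negative differences should carry similar total rank, so neither sign dominates. A large imbalance between the two suggests a systematic shift.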
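
The signed-rank steps can be sketched as follows. This is a hedged illustration that assumes the signed-sum form of the statistic, W = Σ sgn(d_i) · R_i, which is one common convention (other texts and libraries report the sum of positive ranks or the smaller of the two rank sums instead). Zero differences are dropped, and the p-value uses the normal approximation, which is rough for the small hypothetical sample shown.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_signed_rank(before, after):
    """Return (W, z, approximate two-sided p-value) for paired samples."""
    # Step 1: paired differences, keeping the sign; zero differences are dropped.
    diffs = [y - x for x, y in zip(before, after) if y != x]
    n = len(diffs)
    # Step 2: rank the absolute differences from smallest to largest.
    abs_ranks = average_ranks([abs(d) for d in diffs])
    # Step 3 (assumed convention): sum the ranks with their signs attached.
    w = sum(r if d > 0 else -r for d, r in zip(diffs, abs_ranks))
    # Step 4: under the null, W has mean 0 and variance n(n+1)(2n+1)/6.
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 6)
    z = w / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p


# Hypothetical paired measurements (same units before and after a change).
before = [10, 12, 9, 14, 11, 13]
after = [12, 15, 8, 18, 16, 19]
w, z, p = wilcoxon_signed_rank(before, after)
print(w, z, p)
```

With only six pairs, an exact table would be more appropriate than the normal approximation; the example keeps the data small so the ranks are easy to check by hand.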

Using bootstrapping everywhere and for every dataset

The second mistake is applying bootstrapping all the time. I’ve often seen people bootstrap every dataset without first verifying whether bootstrapping is appropriate in that context.

The key assumption behind bootstrapping is:

==The sample must be representative of the population from which it was drawn.==

If the sample is biased and poorly represents the population, the bootstrapped statistics will also be biased. That’s why it’s crucial to examine proportions across different cohorts and segments.

For example, if your sample contains only women, while your overall customer base has an equal gender split, bootstrapping is not appropriate.
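
A small simulation can make this concrete. The sketch below uses a percentile bootstrap for the mean on two samples drawn from a synthetic population: one drawn from the whole population and one drawn from a single segment only. The population, the segment names, and the `bootstrap_ci` helper are all hypothetical and exist only for illustration.

```python
import random
from statistics import mean


def bootstrap_ci(sample, stat=mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    stats = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic population with two equally sized segments (assumed values).
rng = random.Random(42)
segment_a = [rng.gauss(10, 2) for _ in range(5000)]
segment_b = [rng.gauss(20, 2) for _ in range(5000)]
population = segment_a + segment_b
true_mean = mean(population)

representative = rng.sample(population, 400)  # drawn from both segments
biased = rng.sample(segment_a, 400)           # drawn from one segment only

ci_rep = bootstrap_ci(representative)
ci_biased = bootstrap_ci(biased)
print("population mean:", true_mean)
print("representative sample CI:", ci_rep)
print("biased sample CI:", ci_biased)
```

The interval from the biased sample is narrow but sits far from the population mean: resampling reproduces the bias of the sample, it does not correct it.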

Always using default Type I and Type II error values

Last but not least is the habit of blindly using default experiment parameters. In about 95% of cases, 99% of analysts and data scientists at 95% of companies stick with defaults: a 5% Type I error rate and a 20% Type II error rate (or 80% test power).

Let’s start with a question: why don’t we just set both Type I and Type II error rates to 0%?

==Because doing so would require an infinite sample size, meaning the experiment would never end.==

Clearly, that’s not practical. We must strike a balance between the number of samples we can collect and acceptable error rates.
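
One way to see the trade-off is the standard normal-approximation sample size per group for comparing two means. The symbols $\sigma$ (common standard deviation), $\delta$ (minimum detectable effect), $\alpha$ (Type I error rate), and $\beta$ (Type II error rate) are notation assumed here:

$$
n \approx \frac{2\,\sigma^2\,\left(z_{1-\alpha/2} + z_{1-\beta}\right)^2}{\delta^2}
$$

As $\alpha \to 0$ or $\beta \to 0$, the corresponding quantile grows without bound, and so does $n$.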

I encourage people to consider all relevant product constraints.

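The most convenient way to do this is to create a table of the trade-offs (for example, Type I and Type II error rates, the smallest effect worth detecting, and the sample size they require) and discuss it with product managers and the people responsible for the product.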
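
Such a table can be generated with a few lines of code. The sketch below assumes a conversion-rate metric with a hypothetical 10% baseline and uses the usual normal-approximation formula for comparing two proportions; the function name, the baseline, and the grid of values are illustrative choices, not figures from any real experiment.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate users per group for a two-sided test of two proportions."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)             # rate if the effect is real
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # Type I error quantile
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - Type II error
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Grid of candidate settings to discuss with product owners (assumed values).
rows = []
for alpha in (0.05, 0.01, 0.001):
    for power in (0.80, 0.90):
        for mde in (0.01, 0.05, 0.10):
            n = sample_size_per_group(0.10, mde, alpha, power)
            rows.append((alpha, power, mde, n))

print(f"{'alpha':>7} {'power':>6} {'MDE':>6} {'n per group':>12}")
for alpha, power, mde, n in rows:
    print(f"{alpha:>7.3f} {power:>6.2f} {mde:>6.0%} {n:>12,}")
```

Reading across the rows shows how quickly the required sample grows as the error rates or the detectable effect shrink, which is the conversation to have before fixing the defaults.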

For a company like Netflix, even a 1% minimum detectable effect (MDE) can translate into substantial profit. For a small startup, that’s not true. Google, on the other hand, can easily run experiments involving tens of millions of users, making it reasonable to set the Type I error rate as low as 0.1% to gain higher confidence in the results.

Our path to excellence is paved with mistakes. Let’s make them!
