Mann-Whitney U Test

The Mann-Whitney U test is a non-parametric test that checks continuous or ordinal data for a significant difference between two independent groups. The test merges the data from the two groups and then sorts the merged data by value.
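The merge-and-sort step can be sketched as below. This is a minimal illustration, not the calculator's code: the function names `midranks` and `mann_whitney_u` are hypothetical, and the rank-sum formula U1 = R1 - n1(n1 + 1)/2 is the usual textbook one rather than something stated on this page. Tied values are assumed to share their average rank, and which statistic the tool labels U1 or U2 is not specified here.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Merge both groups, rank the merged data, and return (U1, U2)."""
    n1, n2 = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    r1 = sum(ranks[:n1])            # rank sum of the first group
    u1 = r1 - n1 * (n1 + 1) / 2     # textbook rank-sum formula (assumption)
    u2 = n1 * n2 - u1               # U1 + U2 always equals n1 * n2
    return u1, u2
```

For instance, `mann_whitney_u([1, 2], [2, 3])` ranks the merged data as 1, 2.5, 2.5, 4, so the first group's rank sum is 3.5.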

When to use?

The test usually serves as a fallback for a Two Sample T-test or Two Sample Z-test when the data don't meet the normality assumption or contain many outliers.
A t-test compares the means of the two groups, while a Mann-Whitney U test compares the entire distributions. If the two groups' distributions have a similar shape, the test also compares the medians of the two groups.
A Two Sample T-test is slightly more powerful than a Mann-Whitney U test when its assumptions hold: for normally distributed data, a Mann-Whitney U test has about 95% efficiency in comparison to a Two Sample T-test. If the population is close to a normal distribution and reasonably symmetric, it is better to use a Two Sample T-test.

Assumptions

Calculate U

Critical Value

When n is small, the tool uses exact values from tables. The exact critical value is accurate for the common significance levels, while the p-value is usually interpolated between two values in the table. The tool uses a log interpolation, which is more accurate with small p-values and less accurate with large p-values (the common significance levels are small: 0.05, 0.01).
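One plausible reading of "log interpolation" is linear interpolation on the logarithm of the p-value between two neighbouring table entries. The sketch below assumes exactly that; the function name `log_interpolated_p`, its parameters, and the numbers in the usage note are hypothetical and are not taken from the tool or from any real table.

```python
import math


def log_interpolated_p(u, u_lo, p_lo, u_hi, p_hi):
    """Interpolate a p-value for u between two tabulated (U, p) pairs.

    The interpolation is linear in log(p), which is an assumption about
    what the text means by "log interpolation".
    """
    t = (u - u_lo) / (u_hi - u_lo)          # position of u between the entries
    log_p = math.log(p_lo) + t * (math.log(p_hi) - math.log(p_lo))
    return math.exp(log_p)
```

With made-up entries (U=13, p=0.01) and (U=17, p=0.05), a U of 15 sits halfway between them, so the result is the geometric mean of the two p-values rather than their arithmetic mean.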
There is no consensus about what counts as a small n. When Method="automatic", the tool uses the exact method when n1 ≤ 25 and n2 ≤ 25; otherwise, it uses the normal approximation.
When Method="z approximation", the tool uses only the normal approximation.
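The selection rule above can be written as a small function. This is a sketch only: `choose_method` and `exact_limit` are hypothetical names, and only the two Method values mentioned in the text are handled.

```python
def choose_method(n1, n2, method="automatic", exact_limit=25):
    """Return which calculation the rule in the text would pick."""
    if method == "z approximation":
        return "normal approximation"
    if method == "automatic":
        # Exact tables are used only when both samples are small enough.
        if n1 <= exact_limit and n2 <= exact_limit:
            return "exact"
        return "normal approximation"
    # Any other Method value is not described in the text.
    raise ValueError(f"unknown method: {method!r}")
```

With the example's sample sizes, `choose_method(8, 10)` returns `"exact"`, while `choose_method(25, 26)` falls back to the normal approximation.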

Statistical tables

The critical U is taken from a statistical table.

Corrected normal approximation

To get more accurate results, the tool uses a continuity correction and a ties correction (a tie is a group of observations with the same value).
$$ z = \frac {U_2 - \mu_u + C_{continuity}} {\sigma_u}$$
$$ \mu_u = \frac {n_1 n_2} {2}$$
$$ \sigma_u^2 = \frac {n_1 n_2(n_1 + n_2 + 1)} {12} (1 - C_{ties})$$

Ties correction

$$n = n_1 + n_2
\\ C_{ties} = \sum_{i=1}^{t}{\frac{f_i^3-f_i}{n^3-n}}$$
t - number of groups of ties.
f_i - number of values in the i-th group of ties.
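As a rough sketch of the ties correction, the function below counts how often each value occurs in the merged sample and applies the formula above. The name `ties_correction` is hypothetical, and the input is assumed to be the merged data from both groups with at least two observations.

```python
from collections import Counter


def ties_correction(merged_values):
    """Return C_ties for the merged sample of both groups."""
    n = len(merged_values)
    counts = Counter(merged_values)
    # Only values that occur more than once form a group of ties;
    # a value that occurs once would contribute 1**3 - 1 = 0 anyway.
    tie_sum = sum(f ** 3 - f for f in counts.values() if f > 1)
    return tie_sum / (n ** 3 - n)
```

For the merged sample `[1, 2, 2, 3]` there is one group of ties with two values, so the correction is (8 - 2) / (64 - 4) = 0.1. Without ties the correction is 0 and the variance is unchanged.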

Continuity correction

When approximating a discrete distribution with a continuous distribution, it is better to use a continuity correction of 0.5.
If U > μ_u, C_continuity = -0.5
If U < μ_u, C_continuity = 0.5

When using continuous data, C_continuity = 0.
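Putting the formulas of this section together, a minimal sketch of the corrected z statistic could look as follows. The function name `corrected_z` and the `discrete` flag are hypothetical, and treating U equal to the mean as needing no correction is an assumption, since the text does not cover that case.

```python
import math


def corrected_z(u, n1, n2, c_ties=0.0, discrete=True):
    """Return z with the continuity and ties corrections applied."""
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12 * (1 - c_ties))
    if not discrete or u == mu:
        c = 0.0      # continuous data; the u == mu case is an assumption
    elif u > mu:
        c = -0.5     # pull U half a unit down towards the mean
    else:
        c = 0.5      # pull U half a unit up towards the mean
    return (u - mu + c) / sigma
```

With n1 = 8 and n2 = 10, the mean is 40 and the uncorrected variance is 8 · 10 · 19 / 12 ≈ 126.67. For a purely illustrative U of 20 (not the result of the example on this page), the numerator is 20 - 40 + 0.5 = -19.5, so z ≈ -1.73.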

Example

In the following example, we check the number of questions answered correctly in two independent groups. One group completed the training before taking the test, while the other group did not complete the training.
The significance level (α) is 0.05.

The test results are as follows. The sample sizes are n1 = 8 and n2 = 10.

[Image: example data and calculation]

Indirect method

I prefer the indirect method, which I find easier with big samples and not much more complex with small samples.

Direct method

Statistical tables

Two tailed (H0: Group a = Group b)

Left tail (H0: Group a ≥ Group b)

Right tail (H0: Group a ≤ Group b)

Corrected normal approximation

Two tailed (H0: Group a = Group b)

Left tail (H0: Group a ≥ Group b)

Right tail (H0: Group a ≤ Group b)