Proportion Test


The proportion test compares the sample's proportion to the population's proportion or compares the sample's proportion to the proportion of another sample.

One sample proportion test

We use this test to check whether a known proportion is consistent with the data, based on the sample proportion and the sample size.
The null hypothesis assumes that the known proportion is correct. The statistical decision is based on the difference between the known proportion and the sample proportion.

You may choose between the binomial test, which is more accurate, especially for small sample sizes, and the normal approximation.
We recommend using only the binomial test. If the tool cannot calculate the binomial distribution, it automatically falls back to the normal approximation; whether this happens depends on the sample size and on how close x is to np. For a sample size smaller than 1000, every combination is calculated from the binomial distribution (when the binomial test is chosen).

Example: It is known that the proportion of newborn males in the human race is 0.5122. The residents of Brobdingnag claim that in their country the proportion is smaller.

Assumptions

Required sample data

Calculated based on a random sample from the entire population.

Test statistic

Normal approximation
The distribution of x is binomial.
The binomial mean is μ = np, and the binomial standard deviation is:
$$\sigma_x=\sqrt{np(1-p)}$$
Under H0, the sample proportion p̂ is distributed with a mean of p0 and the following standard deviation:
$$\sigma_{\hat{p}}=\sqrt{\frac{p_0(1-p_0)}{n}}$$
This gives the normal statistic:
$$z=\frac{(\hat{p}-p_0)+c}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$
Since the normal distribution is continuous and the binomial distribution is discrete, we may use the continuity correction c to slightly improve the result. The correction moves p̂ toward p0:
$$ \hat{p}>p_0:\qquad\quad c=-\frac{1}{2n}\\ \hat{p}\lt p_0:\qquad\quad c=+\frac{1}{2n}\\ |\hat{p}-p_0|\lt \frac{1}{2n}:\;c=0 $$
Exact test - binomial distribution
When using the binomial distribution the test statistic is the number of successes: X.

Since the distribution is discrete, there is a big difference between the strict inequalities (lower/greater) and the non-strict ones (lower or equal/greater or equal), unlike in a continuous distribution. The two-tailed p-value is also more complicated to calculate: the distribution is not symmetrical, so you generally can't find exactly the same value in the opposite tail.
The following uses the example of n = 8 and p = 0.25.

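| x | p(X=x) | p(X≤x) | p(X≥x) |
|---|-------------|-------------|-------------|
| 0 | 0.100112915 | 0.100112915 | 1 |
| 1 | 0.266967773 | 0.367080688 | 0.899887085 |
| 2 | 0.311462402 | 0.678543091 | 0.632919312 |
| 3 | 0.207641602 | 0.886184692 | 0.321456909 |
| 4 | 0.086517334 | 0.972702026 | 0.113815308 |
| 5 | 0.023071289 | 0.995773315 | 0.027297974 |
| 6 | 0.003845215 | 0.99961853 | 0.004226685 |
| 7 | 0.000366211 | 0.999984741 | 0.00038147 |
| 8 | 1.52588E-05 | 1 | 1.52588E-05 |
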
When x = np, x equals exactly the mean, x is located in the middle of the distribution, and the two-tailed p-value equals 1.
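The table for n = 8 and p = 0.25 can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the helper names `binom_pmf` and `binom_table` are made up for illustration and are not part of the tool.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_table(n: int, p: float):
    """Rows of (x, P(X=x), P(X<=x), P(X>=x)) for x = 0..n."""
    pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
    rows = []
    for x in range(n + 1):
        lower = sum(pmf[: x + 1])  # P(X <= x)
        upper = sum(pmf[x:])       # P(X >= x)
        rows.append((x, pmf[x], lower, upper))
    return rows


# Print the table for the example used in the text.
for x, px, lower, upper in binom_table(8, 0.25):
    print(f"{x}  {px:.9f}  {lower:.9f}  {upper:.9f}")
```

Note that p(X≤x) and p(X≥x) both include p(X=x), which is why the two columns add up to more than 1 in every row.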
Left-tailed
$$p\text{-}value=p(X\le x)=\sum_{i=0}^{x}\binom{n}{i}p^iq^{n-i}$$
where q = 1 - p.
Example: x=1.
Since x < np, x is located on the left side of the distribution. (1<8*0.25)
p-value=p(X≤1)=0.367081.
binomial left tail
Right-tailed
$$p\text{-}value=p(X\ge x)=\sum_{i=x}^{n}\binom{n}{i}p^iq^{n-i}$$
Example: x=4.
Since x > np, x is located on the right side of the distribution. (4>8*0.25)
p-value=p(X ≥ 4)=1-p(X ≤ (4-1))=0.113815.

binomial right tail
Two-tailed
Find the tail on one side, based on x.
Find x' on the opposite tail: the value with the greatest density that is less than or equal to the density of x.
For example, if x is on the left tail:
$$p\text{-}value=p(X\le x) + p(X\ge x')\\ p\text{-}value=\sum_{i=0}^{x}\binom{n}{i}p^iq^{n-i} + \sum_{i=x'}^{n}\binom{n}{i}p^iq^{n-i}$$
Example: x=1.
On the left side: p(X=1)=0.266968.
On the right side: p(X=3)=0.207642 is not greater than 0.266968, while p(X=2)=0.311462 is greater, so x'=3.
p-value = p(X≤1) + p(X ≥ 3) = 0.367081 + 0.321457 = 0.688538.
binomial two-tailed
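The three exact p-values can be sketched in Python and compared with the normal approximation from earlier in this section. This is an illustrative sketch, not the tool's implementation: all function names are hypothetical, and the two-tailed value is computed by summing every outcome whose density is not greater than the density of x. Because the binomial distribution has a single peak, that sum is the same as p(X≤x) + p(X≥x'). The small relative tolerance is an assumption added to guard against floating-point ties.

```python
from math import comb, erf, sqrt


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_left(x, n, p):
    """Left-tailed exact p-value: P(X <= x)."""
    return sum(binom_pmf(i, n, p) for i in range(x + 1))


def p_right(x, n, p):
    """Right-tailed exact p-value: P(X >= x)."""
    return sum(binom_pmf(i, n, p) for i in range(x, n + 1))


def p_two_tailed(x, n, p, rel_tol=1e-7):
    """Two-tailed exact p-value: total probability of outcomes no more likely than x."""
    limit = binom_pmf(x, n, p) * (1 + rel_tol)
    densities = (binom_pmf(i, n, p) for i in range(n + 1))
    return min(1.0, sum(d for d in densities if d <= limit))


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def z_one_sample(x, n, p0):
    """Normal-approximation z with the continuity correction c."""
    diff = x / n - p0
    half_step = 1 / (2 * n)
    if abs(diff) < half_step:
        c = 0.0          # difference too small to correct
    elif diff > 0:
        c = -half_step   # p_hat > p0: shrink the difference
    else:
        c = half_step    # p_hat < p0: shrink the difference
    return (diff + c) / sqrt(p0 * (1 - p0) / n)


n, p0 = 8, 0.25
print("exact left,  x=1:", round(p_left(1, n, p0), 6))
print("exact right, x=4:", round(p_right(4, n, p0), 6))
print("exact two,   x=1:", round(p_two_tailed(1, n, p0), 6))
print("normal left, x=1:", round(normal_cdf(z_one_sample(1, n, p0)), 6))
```

With a sample as small as n = 8, the normal-approximation p-value differs noticeably from the exact one, which is in line with the recommendation to prefer the binomial test.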

Effect size

The tool calculates the h effect size.
$$\varphi(p)=2\arcsin(\sqrt{p})\\ h=\varphi(\hat{p})-\varphi(p_0)$$
Cohen's interpretation for the h effect size:
Small effect - 0.2.
Medium effect - 0.5.
Large effect - 0.8.
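The h effect size is straightforward to compute. The sketch below assumes the helper names `phi`, `cohen_h` and `label_h`, which are made up for illustration. Treating Cohen's three reference points as lower bounds of ranges is also an assumption of this sketch, since the text only lists the reference values.

```python
from math import asin, sqrt


def phi(p):
    """Arcsine transformation: 2 * arcsin(sqrt(p))."""
    return 2 * asin(sqrt(p))


def cohen_h(p_a, p_b):
    """Effect size h = phi(p_a) - phi(p_b)."""
    return phi(p_a) - phi(p_b)


def label_h(h):
    """Rough label based on |h| (assumed cut-offs at 0.2, 0.5, 0.8)."""
    size = abs(h)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


# One sample example from the text: x = 1 of n = 8, so p_hat = 0.125, p0 = 0.25.
h = cohen_h(1 / 8, 0.25)
print(round(h, 4), label_h(h))
```

The sign of h only shows the direction of the difference; the interpretation uses its absolute value.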

Two Sample proportion test

We use this test to check whether the proportion of group 1 is the same as the proportion of group 2.
The tool's null hypothesis assumes that the difference between the groups' proportions is zero (so it uses only the pooled variance).
Example: compare the proportion of good oranges between two fields, based on a sample from each field. H0 assumes the proportions are identical.

Assumptions

Required sample data

Calculated based on a random sample from the entire population

Test statistic

Normal approximation
The distributions of X1 and X2 are binomial.
Under H0, the difference between p̂1 and p̂2 is assumed to be distributed with a mean of 0, which gives the normal statistic:
$$z=\frac{(\hat{p}_1-\hat{p}_2)-0}{\sigma}$$
There are two ways to calculate the standard deviation σ, depending on the null assumption.

Pooled variance
When H0 assumes p1 - p2 = 0, since the standard deviation of the binomial distribution is based on p, this assumption also implies that the standard deviation is identical for the two samples. Hence we should calculate the pooled variance, based on the two samples together.
$$ \hat{p}=\frac{x_1+x_2}{n_1+n_2}=\frac{n_1\hat{p}_1+n_2\hat{p}_2}{n_1+n_2}\\ Var_{pooled}=Var_1+Var_2\\ Var_{pooled}=\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}\\ $$
Since we assume that p1=p2, we estimate both by the pooled proportion p̂:
$$ p_1=p_2\approx\hat{p}\\ Var_{pooled}=\frac{\hat{p}(1-\hat{p})}{n_1}+\frac{\hat{p}(1-\hat{p})}{n_2}\\ \sigma_{pooled}=\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}\\ z_{pooled}=\frac{(\hat{p}_1-\hat{p}_2)+c}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}\\ $$
Unpooled variance
The tool doesn't calculate the unpooled variance.
When H0 assumes p1 - p2 = d with d ≠ 0, since the standard deviation of the binomial distribution is based on p, this assumption also implies that the standard deviation is not identical for the two samples, so we need to calculate the combined variance of two independent random variables:
$$ Var_{unpooled}=Var_1+Var_2\\ \sigma_{unpooled}=\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\\ z_{unpooled}=\frac{(\hat{p}_1-\hat{p}_2)-d+c}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}\\ $$
Since the normal distribution is continuous and the binomial distribution is discrete, we may use the continuity correction c to slightly improve the result. The correction moves the observed difference toward the hypothesized one:
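$$ \hat{p}_1>\hat{p}_2\;(right\text{-}tailed\;or\;two\text{-}tailed):\; c=-\left(\frac{1}{2n_1}+\frac{1}{2n_2}\right)\\ \hat{p}_1\lt\hat{p}_2\;(left\text{-}tailed\;or\;two\text{-}tailed):\; c=+\left(\frac{1}{2n_1}+\frac{1}{2n_2}\right)\\ $$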
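The pooled two sample statistic can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's code: the function names are hypothetical, the orange counts are invented, and the correction is applied so that it shrinks the observed difference (negative c when p̂1 > p̂2, positive c when p̂1 < p̂2). Setting c = 0 when the difference is smaller than the correction itself mirrors the one sample rule and is an assumption here.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def two_sample_pooled_z(x1, n1, x2, n2, use_correction=True):
    """z statistic for H0: p1 - p2 = 0, using the pooled variance."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    sigma = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    diff = p1_hat - p2_hat

    c = 0.0
    if use_correction:
        size = 1 / (2 * n1) + 1 / (2 * n2)
        if abs(diff) >= size:  # assumed: skip the correction for tiny differences
            c = -size if diff > 0 else size
    return (diff + c) / sigma


def two_tailed_p(z):
    """Two-tailed p-value of a z statistic."""
    return 2 * (1 - normal_cdf(abs(z)))


# Hypothetical data: 45 good oranges of 60 in field 1, 30 of 50 in field 2.
z = two_sample_pooled_z(45, 60, 30, 50)
print("z =", round(z, 4), " two-tailed p =", round(two_tailed_p(z), 4))
```

Passing `use_correction=False` gives the uncorrected statistic, which is always at least as far from zero as the corrected one.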

Effect size

The tool calculates the h effect size.
$$\varphi(p)=2\arcsin(\sqrt{p})\\ h=\varphi(\hat{p}_1)-\varphi(\hat{p}_2)$$
Cohen's interpretation for the h effect size:
Small effect - 0.2.
Medium effect - 0.5.
Large effect - 0.8.