# Confidence interval

When working with sample data, we know the sample's statistic, but we don't know the true value of the corresponding population measure.
Instead, we may treat the population's measure as a random variable.
The confidence interval is the range that is likely to contain the true value, with a probability equal to the confidence level (1-α).

## Mean confidence interval

The mean's confidence interval is based on the sample average.
When you know the population's standard deviation (σ), use the normal distribution:
$$\bar{x}\pm Z_{1-\frac{\alpha}{2}}\frac{\sigma}{\sqrt{n}}$$
When you don't know the population's standard deviation, use the sample standard deviation (S) and the t distribution with n-1 degrees of freedom:
$$\bar{x}\pm T_{(1-\frac{\alpha}{2},n-1)}\frac{S}{\sqrt{n}}$$
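
As a rough illustration of the known-σ case, here is a minimal Python sketch that uses only the standard library. The function name `mean_ci_known_sigma` and the sample numbers are made up for this example. The t-based interval has the same shape, but it needs a t-distribution percentile, which the standard library does not provide, so it is not shown here.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_known_sigma(x_bar, sigma, n, confidence=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    # Z_{1 - alpha/2}: percentile of the standard normal distribution
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample: average 50, known sigma 10, 25 observations
low, high = mean_ci_known_sigma(50, 10, 25)
print(f"95% CI: ({low:.2f}, {high:.2f})")  # 95% CI: (46.08, 53.92)
```

Raising the confidence level or lowering n widens the interval, since both increase the margin term.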

## Standard deviation confidence interval

The confidence interval for the population's standard deviation is based on the sample standard deviation (S).
The following statistic follows the chi-squared distribution with n-1 degrees of freedom:
$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{(n-1)}$$
You may bound σ² using the α/2 and 1-α/2 percentiles of χ², then take the square root to extract σ:
$$\sqrt{\frac{(n-1)S^2}{\chi^2_{(1-\frac{\alpha}{2},n-1)}}}\le \sigma \le \sqrt{\frac{(n-1)S^2}{\chi^2_{(\frac{\alpha}{2},n-1)}}}$$
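
The arithmetic can be sketched in Python as follows. This is only an illustration: the function name `sd_ci` is hypothetical, and the two χ² percentiles are passed in as arguments because the standard library has no chi-squared percentile function. The values used below are the usual table entries for 9 degrees of freedom. The sketch first bounds σ² and then takes square roots to get bounds for σ.

```python
from math import sqrt


def sd_ci(s, n, chi2_lower, chi2_upper):
    """Confidence interval for sigma.

    chi2_lower: the alpha/2 percentile of chi-squared with n-1 df
    chi2_upper: the 1-alpha/2 percentile of chi-squared with n-1 df
    """
    scaled = (n - 1) * s ** 2
    # Dividing by the larger percentile gives the lower bound for sigma^2;
    # the square root converts each variance bound to a bound for sigma.
    return sqrt(scaled / chi2_upper), sqrt(scaled / chi2_lower)


# Hypothetical sample: S = 2, n = 10, 95% confidence level.
# Table percentiles for 9 degrees of freedom: 2.7004 (2.5%) and 19.0228 (97.5%)
low, high = sd_ci(2, 10, 2.7004, 19.0228)
print(f"95% CI for sigma: ({low:.3f}, {high:.3f})")  # (1.376, 3.651)
```

Note that the interval is not symmetric around S, because the chi-squared distribution is skewed.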

## Proportion confidence interval

The confidence interval for the population's proportion is based on the sample proportion.

Normal approximation
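For a large enough sample size, the sample proportion is approximately normally distributed around the true proportion p:
$$\hat{p}=\frac{successes}{n} \sim N(p,\sqrt{\frac{p(1-p)}{n}})$$
Since p is unknown, the standard deviation is estimated from the sample proportion, which gives the confidence interval below, where Z is the 1-α/2 percentile of the standard normal distribution:
$$\hat{p}\pm Z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
The normal approximation doesn't give good results for edge proportions, near 0 or 1.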
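
A minimal Python sketch of the normal approximation, using only the standard library, may make the formula and its weakness more concrete. The function name `proportion_ci_normal` and the sample counts are invented for this example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci_normal(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    # Z: the 1 - alpha/2 percentile of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical sample: 30 successes out of 100 trials
low, high = proportion_ci_normal(30, 100)
print(f"({low:.4f}, {high:.4f})")  # (0.2102, 0.3898)

# Edge case: 0 successes out of 10 collapses to a zero-width interval
print(proportion_ci_normal(0, 10))  # (0.0, 0.0)
```

The second call shows the edge problem: with no observed successes the estimated standard deviation is 0, so the interval claims certainty that the data cannot justify.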

Wilson score interval
The Wilson score interval gives better results than the normal approximation, especially for small sample sizes and for edge proportions, near 0 or 1.
$$\frac{\hat{p}+\frac{Z^2}{2n}}{1+\frac{Z^2}{n}}\pm \frac{Z}{1+\frac{Z^2}{n}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}+\frac{Z^2}{4n^2}}$$
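
The Wilson formula can be sketched in Python with the standard library alone. The function name `wilson_ci` and the sample counts are made up for this illustration, and Z is taken to be the 1-α/2 percentile of the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist


def wilson_ci(successes, n, confidence=0.95):
    """Wilson score confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    denom = 1 + z ** 2 / n
    # The center is pulled from p_hat toward 0.5
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = (z / denom) * sqrt(
        p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)
    )
    return center - half_width, center + half_width


# Hypothetical sample: 30 successes out of 100 trials
low, high = wilson_ci(30, 100)
print(f"({low:.4f}, {high:.4f})")

# Edge case: 0 successes out of 10 still gives a usable upper bound
low0, high0 = wilson_ci(0, 10)
print(f"({low0:.4f}, {high0:.4f})")
```

For 0 successes out of 10 the interval runs from about 0 to about 0.28, rather than collapsing to a single point, and the bounds stay inside the 0 to 1 range.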