The only way to describe a data set completely is to supply every value in it, which is rarely practical. Instead, we can supply a few summary measures that describe the data well. Usually we don't have access to the entire population, so we collect a random sample that should represent it.
The population parameters are fixed values that are usually unknown to the sampler. They are the quantities that random sampling aims to approximate.
Name | Description | Formula |
---|---|---|
N | The size of the entire population. | |
Mean (μ) | The average of the population: the sum of all the values in the population divided by the number of values, N. | $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$ |
Standard deviation (σ) | A standard measure of the spread of the data. It is the square root of the average squared distance between each value in the population and the population mean. | $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$ |
Variance (σ²) | Represents the amount of variation in the data. Equal to the standard deviation squared. | $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$ |
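
As a rough illustration of the population parameters in the table above, the sketch below computes N, μ, σ, and σ² for a small made-up population. The values in `population` and the helper name `population_parameters` are assumptions chosen for the example; in practice the whole population is rarely available.

```python
import math

# Hypothetical population (assumed for illustration): every value is known.
population = [2, 4, 4, 4, 5, 5, 7, 9]


def population_parameters(values):
    """Return (N, mu, sigma, variance) for a fully observed population."""
    N = len(values)                                      # population size
    mu = sum(values) / N                                 # population mean
    variance = sum((x - mu) ** 2 for x in values) / N    # divide by N, not N - 1
    sigma = math.sqrt(variance)                          # population standard deviation
    return N, mu, sigma, variance


N, mu, sigma, variance = population_parameters(population)
print(f"N={N}, mu={mu}, sigma={sigma}, variance={variance}")
```

For this particular list the mean is 5 and the variance is 4, so the standard deviation is 2.
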
In most cases we are unable to collect the measures of the entire population. Instead, we take samples and calculate statistics based on these samples.
Each sample statistic is a random variable which attempts to estimate one of the population's parameters.
Name | Description | Formula |
---|---|---|
n | The size of the sample. | |
x̄ | The average of the sample, an estimate of the population mean (μ). The larger the sample size n, the more accurate the estimate tends to be. | $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ |
S | The sample standard deviation, an estimate of the population standard deviation (σ). The larger the sample size n, the better the estimate. Unlike the population standard deviation, the calculation divides by n-1 instead of n. Dividing by n-1 makes S² an unbiased estimator of σ², that is, the average of S² over many samples equals σ². S itself still slightly underestimates σ on average, but the bias shrinks as n grows. | $S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$ |
S² | The sample variance: the spread of the values in the specific sample, and an estimate of the population variance (σ²). Equal to S squared. | $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ |
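
To make the idea of a sample statistic as a random variable more concrete, here is a minimal simulation sketch. It assumes a small made-up population (with μ = 5 and σ² = 4), draws many random samples of size n with replacement, and averages the resulting statistics. The population values, the seed, the sample size, and the number of trials are all illustrative assumptions. It uses `statistics.variance` (divides by n-1) and `statistics.pvariance` (divides by n) from the Python standard library.

```python
import random
import statistics

# Hypothetical population (assumed for illustration); computed with the
# population formulas, its mean is 5 and its variance is 4.
population = [2, 4, 4, 4, 5, 5, 7, 9]

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 4                    # sample size
trials = 20_000          # number of repeated samples

means, var_n_minus_1, var_n = [], [], []
for _ in range(trials):
    # Draw n values at random, with replacement, so the draws are independent.
    sample = rng.choices(population, k=n)
    means.append(statistics.fmean(sample))               # sample mean
    var_n_minus_1.append(statistics.variance(sample))    # divides by n - 1
    var_n.append(statistics.pvariance(sample))           # divides by n

avg_mean = statistics.fmean(means)
avg_s2 = statistics.fmean(var_n_minus_1)
avg_biased = statistics.fmean(var_n)

print(f"average sample mean:           {avg_mean:.3f}")
print(f"average variance using n - 1:  {avg_s2:.3f}")
print(f"average variance using n:      {avg_biased:.3f}")
```

Each individual sample gives a different mean and variance, but over many samples the average of the n-1 version should land close to the population variance, while the version that divides by n should come out noticeably smaller (by a factor of roughly (n-1)/n).
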