Linear Regression Calculator

Linear regression calculator and prediction interval calculator with step-by-step solution.


You may change the X and Y labels. Separate values with Enter, a comma, or a space. The tool ignores non-numeric cells.

Linear regression calculator

The linear regression calculator generates the linear regression equation. It also draws a linear regression line, a histogram, a residuals QQ-plot, a residuals x-plot, and a distribution chart.
It calculates R-squared, R, and the outliers, then tests the fit of the linear model to the data and checks the residuals' normality assumption and the a priori power.

What is linear regression?

Linear regression is the linear equation that best fits the points.
There is no single way to choose the best-fitting line; the most common method is ordinary least squares (OLS). Linear regression describes the relationship between the dependent variable (Y) and the independent variables (X).
The linear regression model calculates the dependent variable (DV) based on the independent variables (IV, predictors).

What is "ordinary least squares"?

The ordinary least squares method chooses the line parameters that minimize the sum of the squared differences between the observed values of the dependent variable (Y) and the values estimated by the linear regression (Ŷ).

Why do you need linear regression?

We may use linear regression when we want to do one of the following:
  • Predict the dependent variable (Ŷ).
  • Estimate the effect of each independent variable (X) on the dependent variable (Y).
  • Calculate the correlation between the dependent variable and the independent variables.
  • Test the linear model significance level.

How to calculate linear regression?

The linear regression equation is:
$$\hat{Y} = b_0 + b_1 x$$
b0 - the y-intercept, where the line crosses the y-axis.
b1 - the slope, which describes the line's direction and incline.
$$b_1 = \frac{SP_{xy}}{SS_x} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
$$b_0 = \bar{y} - b_1 \bar{x}$$
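The slope and intercept calculation can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the calculator's own code; the function name `fit_line` and the sample data are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line Y-hat = b0 + b1*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # SPxy: sum of products of the deviations from the means
    sp_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # SSx: sum of squared deviations of x from its mean
    ss_x = sum((x - x_mean) ** 2 for x in xs)
    b1 = sp_xy / ss_x
    b0 = y_mean - b1 * x_mean
    return b0, b1


# Hypothetical sample data, chosen only for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = fit_line(xs, ys)
print(f"Y-hat = {b0:.2f} + {b1:.2f}x")  # prints: Y-hat = 2.20 + 0.60x
```

Here SPxy = 6 and SSx = 10, so the slope is 0.6, and the intercept is 4 - 0.6 * 3 = 2.2.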

Linear regression prediction

The prediction calculator uses the linear regression to predict the dependent variable from a given value of the independent variable. The calculator also creates the confidence interval and the prediction interval.

Confidence interval of the prediction

The confidence interval for the mean value of the dependent variable.
This is the interval for the regression line: at the chosen confidence level, the true regression line is expected to fall inside it. If we knew the true equation, the width of this interval would be zero.
If you calculated the confidence interval over an infinite number of regressions with the same sample size, 95% (the confidence level) of the calculated confidence intervals would contain the true value of the mean.
Since this interval is for the mean, the standard error is smaller and the range is narrower than the range of the prediction interval.

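$$MS_{residual} = S^2_{residual} = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2}$$
$$SE^2_{ci} = S^2_{residual}\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_x}\right)$$

$$\hat{Y} \pm T_{1-\alpha/2}(n-2) \cdot SE_{ci}$$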
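As a rough sketch of the confidence interval for the mean, the following Python function applies the formulas for the residual variance and the standard error. It assumes the caller supplies the critical t value for n - 2 degrees of freedom (for example from a t table), because the standard library has no t distribution. The function name, the sample data, and the point x0 = 4 are all illustrative assumptions.

```python
import math


def mean_confidence_interval(xs, ys, x0, t_crit):
    """Confidence interval for the mean of Y at x0.

    t_crit is the critical value T(1 - alpha/2) with n - 2 degrees of
    freedom, supplied by the caller (assumption: taken from a t table).
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    ss_x = sum((x - x_mean) ** 2 for x in xs)
    b1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / ss_x
    b0 = y_mean - b1 * x_mean
    # Residual variance: sum of squared residuals over n - 2
    ms_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    # Standard error of the estimated mean at x0
    se_ci = math.sqrt(ms_res * (1 / n + (x0 - x_mean) ** 2 / ss_x))
    y_hat = b0 + b1 * x0
    return y_hat - t_crit * se_ci, y_hat + t_crit * se_ci


# Hypothetical data; 3.182 is the tabulated t value for 3 degrees of
# freedom at a 95% confidence level
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
lower, upper = mean_confidence_interval(xs, ys, x0=4, t_crit=3.182)
print(f"{lower:.3f} to {upper:.3f}")
```

The interval is narrowest at x0 = x̄, where the second term of the standard error vanishes, and widens as x0 moves away from the mean of x.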

Prediction Interval

The prediction interval for a particular observation of the dependent variable.
This is the interval for any single value.
The prediction interval takes into consideration the fact that you don't know the true equation, and the fact that the linear regression explains only part of the variance (that part is R-squared). Even if we knew the true equation, the width of this interval would be greater than zero.
Since this interval is for a single observation, the standard error is larger and the range is wider than the range of the confidence interval.

$$SE^2_{prediction} = S^2_{residual}\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_x}\right)$$

$$\hat{Y} \pm T_{1-\alpha/2}(n-2) \cdot SE_{prediction}$$
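A comparable sketch for the prediction interval differs only in the extra 1 inside the standard error, which accounts for the scatter of a single observation around the line. As before, the function name and data are hypothetical, and the critical t value is assumed to be supplied by the caller.

```python
import math


def prediction_interval(xs, ys, x0, t_crit):
    """Prediction interval for a single new observation of Y at x0.

    t_crit is the critical value T(1 - alpha/2) with n - 2 degrees of
    freedom, supplied by the caller (assumption: taken from a t table).
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    ss_x = sum((x - x_mean) ** 2 for x in xs)
    b1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / ss_x
    b0 = y_mean - b1 * x_mean
    ms_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    # The leading 1 is the variance of a single observation around the line
    se_pred = math.sqrt(ms_res * (1 + 1 / n + (x0 - x_mean) ** 2 / ss_x))
    y_hat = b0 + b1 * x0
    return y_hat - t_crit * se_pred, y_hat + t_crit * se_pred


# Hypothetical data; 3.182 is the tabulated t value for 3 degrees of
# freedom at a 95% confidence level
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
pi_lower, pi_upper = prediction_interval(xs, ys, x0=4, t_crit=3.182)
print(f"{pi_lower:.3f} to {pi_upper:.3f}")
```

For this data the squared standard error is 0.8 * 1.3 = 1.04, compared with 0.8 * 0.3 = 0.24 for the mean at the same x0, which is why the prediction interval is the wider of the two.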

How to calculate R-squared?

R-squared is the proportion of the variance explained by the regression (SSRegression) out of the overall variance (SSTotal).
$$R^2 = \frac{SS_{Regression}}{SS_{Total}}$$
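The ratio can be illustrated with a short Python sketch. The function name `r_squared` and the data are assumptions for the example; SSRegression is computed here as the sum of squared deviations of the fitted values from the mean of Y.

```python
def r_squared(xs, ys):
    """Return SSRegression / SSTotal for a simple linear regression."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    ss_x = sum((x - x_mean) ** 2 for x in xs)
    b1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / ss_x
    b0 = y_mean - b1 * x_mean
    # Variance explained by the fitted line
    ss_regression = sum((b0 + b1 * x - y_mean) ** 2 for x in xs)
    # Overall variance of Y around its mean
    ss_total = sum((y - y_mean) ** 2 for y in ys)
    return ss_regression / ss_total


# Hypothetical sample data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r2 = r_squared(xs, ys)
print(f"R-squared = {r2:.2f}")  # prints: R-squared = 0.60
```

For this data SSRegression = 3.6 and SSTotal = 6, so the line explains 60% of the variance.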

Linear regression in calculator

This online calculator supports all the basic functionality and more.

The right-tailed F test checks whether the entire regression model is statistically significant.
For multiple regression with the stepwise method and assumption validation, see the multiple regression calculator.

The following statistic checks whether the linear regression model gives better results than the average of Y.
Hypotheses
H0: Y = b0
H1: Y = b0+b1X
Test statistic
$$F = \frac{MS_{regression}}{MS_{residual}}$$
Figure: F distribution, right-tailed.
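The test statistic can be sketched in Python as follows. This is an illustrative example with a hypothetical function name and data. For a regression with one predictor, the regression mean square has 1 degree of freedom and the residual mean square has n - 2. The sketch stops at the statistic itself, since a p-value needs the F distribution, which the standard library does not provide.

```python
def f_statistic(xs, ys):
    """Return F = MS(regression) / MS(residual) for one predictor."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    ss_x = sum((x - x_mean) ** 2 for x in xs)
    b1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / ss_x
    b0 = y_mean - b1 * x_mean
    ss_regression = sum((b0 + b1 * x - y_mean) ** 2 for x in xs)
    ss_residual = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ms_regression = ss_regression / 1      # 1 degree of freedom (one predictor)
    ms_residual = ss_residual / (n - 2)    # n - 2 degrees of freedom
    return ms_regression / ms_residual


# Hypothetical sample data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
f_value = f_statistic(xs, ys)
print(f"F = {f_value:.2f}")  # prints: F = 4.50
```

A large F value means the line explains much more variance than it leaves unexplained, which is why only the right tail of the distribution counts as evidence against H0.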