Assumptions behind Regression

The five assumptions of linear regression are:

1. A Line Describes the Data: the relationship really is linear (or, for practical purposes, approximately linear over the range of the population being studied).
2. Homoscedasticity: the standard deviation of the residuals does not vary with the values of the explanatory variables. In other words, the dispersion of the data around the regression line must be the same along the entire line.
3. Normally Distributed Residuals at a Given X: often difficult to ascertain because there usually aren't enough observations in MMA datasets with the same value of the explanatory variable to get a good look at the distribution of the residuals. The assumption typically holds, by the Central Limit Theorem, because the residual term is the sum of a myriad of other, unidentified explanatory variables. In practice, this assumption is assessed by examining a histogram of all of the residuals. It must be remembered though that this is not an assessment of the act