Multiple Regression Model Error
... is used to predict a single dependent variable (Y). The predicted value of Y is a linear transformation of the X variables such that the sum of squared deviations between the observed and predicted Y is a minimum. The computations are more complex, however, because the interrelationships among all the variables must be taken into account in the weights assigned to the variables. The interpretation of the results of a multiple regression analysis is also more complex for the same reason.

With two independent variables, the prediction of Y is expressed by the following equation:

$Y'_i = b_0 + b_1 X_{1i} + b_2 X_{2i}$

Note that this transformation is similar to the linear transformation of two variables discussed in the previous chapter, except that the w's have been replaced with b's and the X'i has been replaced with a Y'i. The "b" values are called regression weights and are computed in a way that minimizes the sum of squared deviations, in the same manner as in simple linear regression. The difference is that in simple linear regression only two weights, the intercept ($b_0$) and slope ($b_1$), were estimated, while in this case three weights ($b_0$, $b_1$, and $b_2$) are estimated.

EXAMPLE DATA

The data used to illustrate the inner workings of multiple regression will be generated from the "Example Student." The data are presented below:

Homework Assignment 21 Example Student PSY645 Dr. Stockburger Due Date
 Y1   Y2   X1   X2   X3   X4
125  113   13   18   25   11
158  115   39   18   59   30
207  126   52   50   62   53
182  119   29   43   50   29
196  107   50   37   65   56
175  135   64   19   79   49
145  111   11   27   17   14
144  130   22   23   31   17
160  122   30   18   34   22
175  114   51   11   58   40
151  121   27   15   29   31
161  105   41   22   53   39
200  131   51   52   75   36
173  123   37   36   44   27
175  121   23   48   27   20
162  120   43   15   65   36
155  109   38   19   62   37
230  130   62   56   75   50
162  134   28   30   36   20
153  124   30   25   41   33

The example data can be obtained as a text file and as an SPSS/WIN file from this web page. If a student desires a more concrete description of this data file, meaning could be given to the variables as follows:

Y1 - A measure of success in graduate school.
X1 - A measure of intellectual ability.
X2 - A measure of "work ethic."
X3 - A second measure of intellectual ability.
X4 - A measure of spatial abili...

Source for the excerpt above: http://www.psychstat.missouristate.edu/multibook/mlt06m.html

Source for the excerpt below: http://www.stat.yale.edu/Courses/1997-98/101/linmult.htm

... x is associated with a value of the dependent variable y. The population regression line for p explanatory variables $x_1, x_2, \dots, x_p$ is defined to be

$\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$.

This line describes how the mean response $\mu_y$ changes with the explanatory variables. The observed values for y vary about their means $\mu_y$ and are assumed to have the same standard deviation $\sigma$. The fitted values $b_0, b_1, \dots, b_p$ estimate the parameters $\beta_0, \beta_1, \dots, \beta_p$ of the population regression line.

Since the observed values for y vary about their means $\mu_y$, the multiple regression model includes a term for this variation. In words, the model is expressed as DATA = FIT + RESIDUAL, where the "FIT" term represents the expression $\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$. The "RESIDUAL" term represents the deviations of the observed values y from their means $\mu_y$, which are normally distributed with mean 0 and variance $\sigma^2$. The notation for the model deviations is $\varepsilon$.

Formally, the model for multiple linear regression, given n observations, is

$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i$ for $i = 1, 2, \dots, n$.
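As a rough illustration of the DATA = FIT + RESIDUAL decomposition, the sketch below fits Y1 on X1 and X2 from the Example Student data given earlier. It is only one possible way to do the computation: it assumes NumPy is available, and the helper name fit_ols is hypothetical and does not come from any of the excerpted sources. The variance estimate divides the sum of squared residuals by n - p - 1, the usual degrees of freedom for a model with p predictors plus an intercept.

```python
import numpy as np

# Columns Y1, X1 and X2 of the Example Student data (20 observations).
y1 = np.array([125, 158, 207, 182, 196, 175, 145, 144, 160, 175,
               151, 161, 200, 173, 175, 162, 155, 230, 162, 153], dtype=float)
x1 = np.array([13, 39, 52, 29, 50, 64, 11, 22, 30, 51,
               27, 41, 51, 37, 23, 43, 38, 62, 28, 30], dtype=float)
x2 = np.array([18, 18, 50, 43, 37, 19, 27, 23, 18, 11,
               15, 22, 52, 36, 48, 15, 19, 56, 30, 25], dtype=float)


def fit_ols(y, *predictors):
    """Hypothetical helper: least-squares fit of y on the predictors plus an intercept."""
    # Design matrix: a column of ones (for b0) followed by one column per predictor.
    X = np.column_stack([np.ones_like(y), *predictors])
    # lstsq returns the weights that minimize the sum of squared residuals.
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ b                 # the FIT part
    residuals = y - fitted         # the RESIDUAL part
    n, k = X.shape                 # k = p + 1 estimated weights
    mse = residuals @ residuals / (n - k)   # s^2 with n - p - 1 degrees of freedom
    return b, fitted, residuals, mse


b, fitted, residuals, mse = fit_ols(y1, x1, x2)

print("weights b0, b1, b2:", np.round(b, 3))
print("sum of residuals:", residuals.sum())
print("MSE (s^2):", mse, " s:", np.sqrt(mse))
```

Because the design matrix includes a column of ones, the printed sum of residuals should be zero up to floating-point rounding, and adding the fitted values to the residuals reproduces the observed Y1 values exactly.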
In the least-squares model, the best-fitting line for the observed data is calculated by minimizing the sum of the squares of the vertical deviations from each data point to the line (if a point lies on the fitted line exactly, then its vertical deviation is 0). Because the deviations are first squared, then summed, there are no cancellations between positive and negative values. The least-squares estimates $b_0, b_1, \dots, b_p$ are usually computed by statistical software.

The values fit by the equation $b_0 + b_1 x_{i1} + \dots + b_p x_{ip}$ are denoted $\hat{y}_i$, and the residuals $e_i$ are equal to $y_i - \hat{y}_i$, the difference between the observed and fitted values. The sum of the residuals is equal to zero. The variance $\sigma^2$ may be estimated by $s^2 = \frac{\sum e_i^2}{n - p - 1}$, also known as the mean-squared error (or MSE). The estimate of the standard error s is the square root of the MSE.

Example

The dataset "Healthy Breakfast" contains, among other variables, the Consumer Reports ratings of 77 cereals and the number of grams of sugar contained in each serving. (Data source: Free publication available in many grocery stores. Dataset available through the Statlib Data and Story Library (DASL).) A simple linear regression model considering "Sugars" as the explanatory variable and "Rating" as the response variable produced the regression line Rating = 59.3 - 2.40 Sugars, with the square of the corre
Source for the excerpt below: http://people.duke.edu/~rnau/regnotes.htm

To include or not to include the CONSTANT?

Most multiple regression models include a constant term (i.e., an "intercept"), since this ensures that the model will be unbiased--i.e., the mean of the residuals will be exactly zero. (The coefficients in a regression model are estimated by least squares--i.e., minimizing the mean squared error. Now, the mean squared error is equal to the variance of the errors plus the square of their mean: this is a mathematical identity. Changing the value of the constant in the model changes the mean of the errors but doesn't affect the variance.
Hence, if the sum of squared errors is to be minimized, the constant must be chosen such that the mean of the errors is zero.) In a simple regression model, the constant represents the Y-intercept of the regression line, in unstandardized form. In a multiple regression model, the constant represents the value that would be predicted for the dependent variable if all the independent variables were simultaneously equal to zero--a situation