Partitioning of Sums of Squares in Simple Linear Regression

George H. Olson, Ph.D.
Leadership and Educational Studies
Appalachian State University

The parametric model for the regression of Y on X is given by

\[ Y_i = \alpha + \beta X_i + \varepsilon_i . \tag{1} \]

The model for the regression of Y on X in a sample is

\[ Y_i = a + b X_i + e_i . \tag{2} \]

Calculation of the constants in the model: the slope is given by

\[ b = \frac{\sum_i x_i y_i}{\sum_i x_i^2}, \quad \text{where } x_i = X_i - \bar{X} \text{ and } y_i = Y_i - \bar{Y}, \tag{3} \]

and the intercept by

\[ a = \bar{Y} - b\bar{X} . \tag{4} \]

A closer look at the regression equation, \( \hat{Y}_i = a + bX_i \), leads us to

\[ \hat{Y}_i = a + bX_i = (\bar{Y} - b\bar{X}) + bX_i = \bar{Y} + b(X_i - \bar{X}) = \bar{Y} + b x_i . \tag{5} \]

Partitioning the Sum of Squares, \( \sum_i (Y_i - \bar{Y})^2 \)

First, consider the following identity:

\[ Y_i = \bar{Y} + (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) . \tag{6} \]

If we subtract \( \bar{Y} \) from each side of the equation, we obtain

\[ Y_i - \bar{Y} = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) . \tag{7} \]

After squaring and summing, we have

\[ \sum_i (Y_i - \bar{Y})^2 = \sum_i \big[ (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) \big]^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i (Y_i - \hat{Y}_i)^2 + 2\sum_i (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i), \tag{8} \]

or, after simplifying[1],

\[ \sum_i y_i^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i (Y_i - \hat{Y}_i)^2 = SS_{reg} + SS_{res}, \tag{9} \]

where SS_reg = regression sum of squares and SS_res = residual sum of squares.

Dividing (9) through by the total sum of squares, \( SS_{tot} \; (= \sum_i y_i^2) \), gives

\[ \frac{\sum_i y_i^2}{\sum_i y_i^2} = \frac{SS_{reg}}{\sum_i y_i^2} + \frac{SS_{res}}{\sum_i y_i^2}, \tag{10} \]

or

\[ 1 = \frac{SS_{reg}}{\sum_i y_i^2} + \frac{SS_{res}}{\sum_i y_i^2} . \tag{11} \]

[1] See the appendix to see how the equation simplifies.

A computational example

It is often useful to devise simple computational examples, such as the following:

Y:  3   1   0   4   5
X:  1   0  -1   1   2

The means of the two variables, Y and X, are

\[ \bar{X} = \frac{\sum_i X_i}{n} = \frac{3}{5} = .6, \qquad \bar{Y} = \frac{\sum_i Y_i}{n} = \frac{13}{5} = 2.6 . \]

Having computed the means, we now compute the deviations, squares of deviations, and cross-products of deviations:

  Y      y      y^2     X      x      x^2     xy
  3     .4     .16      1     .4     .16     .16
  1   -1.6    2.56      0    -.6     .36     .96
  0   -2.6    6.76     -1   -1.6    2.56    4.16
  4    1.4    1.96      1     .4     .16     .56
  5    2.4    5.76      2    1.4    1.96    3.36

The sums of squares and cross-products are computed as

\[ \sum_i x_i^2 = 5.2, \qquad \sum_i y_i^2 = 17.2, \qquad \sum_i x_i y_i = 9.2, \]

and the regression coefficients as

\[ b = \frac{\sum_i x_i y_i}{\sum_i x_i^2} = \frac{9.2}{5.2} = 1.769 \]

and

\[ a = \bar{Y} - b\bar{X} = 2.6 - (1.769)(.6) = 2.6 - 1.061 = 1.539 . \]

The regression equation can now be written as

\[ \hat{Y}_i = 1.539 + 1.769 X_i . \]

From an earlier equation (9) we obtain

\[ SS_{reg} = \sum_i (\hat{Y}_i - \bar{Y})^2 = \sum_i \big[ (\bar{Y} + b x_i) - \bar{Y} \big]^2 = \sum_i (b x_i)^2 = b^2 \sum_i x_i^2 = \left( \frac{\sum_i x_i y_i}{\sum_i x_i^2} \right)^2 \sum_i x_i^2 = \frac{\left( \sum_i x_i y_i \right)^2}{\sum_i x_i^2} = \frac{84.64}{5.2} = 16.28 . \tag{12} \]

Note that we could also have computed

\[ SS_{reg} = b \sum_i x_i y_i = 1.769 \times 9.2 = 16.28 . \tag{13} \]

An alternative calculation is given by

\[ SS_{reg} = b^2 \sum_i x_i^2 = (1.769)^2 \times 5.2 = 3.1294 \times 5.2 = 16.28 . \tag{14} \]

Hence,

\[ SS_{res} = \sum_i y_i^2 - SS_{reg} = 17.2 - 16.28 = .92 . \tag{15} \]

The equation for the Pearson correlation is

\[ r_{xy}^2 = \frac{\left( \sum_i x_i y_i \right)^2}{\sum_i x_i^2 \sum_i y_i^2} . \tag{16} \]

Therefore, SS_reg can also be computed as

\[ r_{xy}^2 \sum_i y_i^2 = \frac{\left( \sum_i x_i y_i \right)^2}{\sum_i x_i^2 \sum_i y_i^2} \sum_i y_i^2 = \frac{\left( \sum_i x_i y_i \right)^2}{\sum_i x_i^2} = SS_{reg} . \tag{17} \]

Appendix

Beginning with

\[ \sum_i (Y_i - \bar{Y})^2 = \sum_i \big[ (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) \big]^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i (Y_i - \hat{Y}_i)^2 + 2\sum_i (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i), \]

we need to show that

\[ 2\sum_i (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i) = 0 . \]

Recalling (3), (4), and (5), we can write

\begin{align*}
2\sum_i (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i)
&= 2\sum_i \big[ (\bar{Y} - b\bar{X} + bX_i) - \bar{Y} \big](Y_i - \hat{Y}_i) \\
&= 2\sum_i \big[ \bar{Y} + b(X_i - \bar{X}) - \bar{Y} \big](Y_i - \hat{Y}_i) \\
&= 2\sum_i b(X_i - \bar{X})(Y_i - \hat{Y}_i) \\
&= 2b \sum_i (X_i - \bar{X})\big[ Y_i - (\bar{Y} - b\bar{X} + bX_i) \big] \\
&= 2b \sum_i (X_i - \bar{X})\big[ (Y_i - \bar{Y}) - b(X_i - \bar{X}) \big] \\
&= 2b \sum_i \big[ (X_i - \bar{X})(Y_i - \bar{Y}) - b(X_i - \bar{X})^2 \big] \\
&= 2b \Big( \sum_i x_i y_i - b \sum_i x_i^2 \Big) \\
&= 2b \Big( \sum_i x_i y_i - \frac{\sum_i x_i y_i}{\sum_i x_i^2} \sum_i x_i^2 \Big) \\
&= 2b \Big( \sum_i x_i y_i - \sum_i x_i y_i \Big) \\
&= 0 .
\end{align*}
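
A numerical check

As a check on the arithmetic in the computational example, the short Python script below (an illustrative sketch, not part of the original handout; the variable names such as ss_reg and ss_res are invented for this check) recomputes the slope and intercept from equations (3) and (4), the sums of squares from equations (12) and (15), and verifies numerically both the partition in equation (9) and the vanishing cross-product term established in the appendix.

    # Numerical check of the worked example and the sum-of-squares partition.
    ys = [3, 1, 0, 4, 5]          # Y values from the example
    xs = [1, 0, -1, 1, 2]         # X values from the example

    n = len(ys)
    x_bar = sum(xs) / n           # .6
    y_bar = sum(ys) / n           # 2.6

    # Deviation scores: x_i = X_i - X_bar, y_i = Y_i - Y_bar
    xd = [x - x_bar for x in xs]
    yd = [y - y_bar for y in ys]

    sxx = sum(x * x for x in xd)              # sum of x_i^2   = 5.2
    syy = sum(y * y for y in yd)              # sum of y_i^2   = 17.2 (SS_tot)
    sxy = sum(x * y for x, y in zip(xd, yd))  # sum of x_i y_i = 9.2

    b = sxy / sxx                 # slope, eq. (3):     1.769...
    a = y_bar - b * x_bar         # intercept, eq. (4): 1.538...

    # Fitted values and the two sums of squares
    y_hat = [a + b * x for x in xs]
    ss_reg = sum((yh - y_bar) ** 2 for yh in y_hat)          # eq. (12): 16.28
    ss_res = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # eq. (15): .92

    # The partition in eq. (9): SS_tot = SS_reg + SS_res
    assert abs(syy - (ss_reg + ss_res)) < 1e-9

    # The cross-product term shown to vanish in the appendix
    cross = 2 * sum((yh - y_bar) * (y - yh) for y, yh in zip(ys, y_hat))
    assert abs(cross) < 1e-9

    # r^2 from eq. (16); by eq. (17), r^2 * SS_tot = SS_reg
    r2 = sxy ** 2 / (sxx * syy)
    assert abs(r2 * syy - ss_reg) < 1e-9

    print(b, a, ss_reg, ss_res, r2)

Run as-is, the script prints b = 1.769..., a = 1.538..., SS_reg = 16.277..., SS_res = .923..., and r² = .946..., which match the hand calculations above (the handout rounds to two or three decimals).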