Eat and drink from the provision of Allah, and do not commit abuse on the earth, spreading corruption.: More on Venn Diagrams for Regression (Summary)

More on Venn Diagrams for Regression

Volume 10 | Number 1 | May 2002 p.1-10 Peter E. Kennedy

Journal of Statistics Education

What (abstract)

The main contribution of the paper is a set of suggestions for how Venn diagrams can be used effectively to explain results about the bias and variance of coefficient estimates in multiple regression analysis. Previous work (Ip 2001) was limited to R², partial correlation, and sums of squares, an interpretation that runs into difficulty in the presence of suppressor variables. This article presents a different interpretation of the Venn diagram, one that illustrates bias and variance.

Methodology/Model/Data

Regression with single explanatory variable X

Figure from the main article.

Y = variation in Y

X = variation in X

Purple = variation that Y and X have in common, used to estimate β_x

Black = error term (σ²), the variation in Y that X cannot explain. The magnitude of this area represents the magnitude of the OLS estimate of σ², the variance of the error term.
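The single-regressor picture can be checked numerically. The sketch below is only an illustration on simulated data (the sample size, true slope of 2.0 and error standard deviation of 0.5 are assumptions, not values from the article): the slope is computed from the variation X and Y share, and σ² is estimated from the variation in Y left unexplained.

```python
import numpy as np

def simple_ols(x, y):
    """Regress y on an intercept and x; return the slope and the OLS estimate of sigma^2."""
    n = len(y)
    A = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef                  # the "Black" area: Y not explained by X
    sigma2_hat = resid @ resid / (n - 2)  # residual sum of squares over degrees of freedom
    return coef[1], sigma2_hat

# Simulated data (assumed values, for illustration only)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=200)

slope, sigma2_hat = simple_ols(x, y)

# The slope depends only on the common variation of X and Y (the "Purple" area)
slope_from_cov = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
print(slope, slope_from_cov, sigma2_hat)
```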

Regression with more than one explanatory variable

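Figure from the main article.

If Y is regressed on X alone, β_x is estimated from Blue+Red.

If Y is regressed on W alone, β_w is estimated from Green+Red.

If Y is regressed on X and W together, there are several candidate ways to use the information:

1- β_x from Blue+Red and β_w from Green+Red, or

2- β_x from Blue and β_w from Green, or

3- divide Red into two parts, or allocate it in some other way, to calculate β_x and β_w.

OLS takes the second option: it discards the Red area and uses only Blue to estimate β_x and Green to estimate β_w. Equivalently, Y and X with the influence of W removed are represented by Blue+Yellow and Orange+Blue respectively. The Red area is variation in Y that X and W share jointly; it cannot be attributed to either variable alone, so using it may result in biased estimates.

Here the Yellow area represents the magnitude of σ², the variance of the error term. OLS estimates σ² from the magnitude of the variation in Y that cannot be explained.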
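The claim that OLS uses only the Blue area for β_x can be illustrated with a small simulation. This is a sketch on made-up data (all coefficients and the helper name `ols` are assumptions for illustration): the coefficient on X in the regression on X and W together is reproduced by using only the part of X that W cannot explain, while the regression on X alone, which also uses Red, gives a different number.

```python
import numpy as np

def ols(A, y):
    """Least-squares coefficients of y on the columns of A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

# Simulated data (assumed values): X and W overlap because X depends on W
rng = np.random.default_rng(1)
n = 300
w = rng.normal(size=n)
x = 0.6 * w + rng.normal(size=n)
y = 1.0 + 2.0 * x - 1.0 * w + rng.normal(size=n)
ones = np.ones(n)

# Regression on X and W together: coefficients are [constant, b_x, b_w]
b_full = ols(np.column_stack([ones, x, w]), y)

# Regression on X alone: uses Blue + Red
b_x_alone = ols(np.column_stack([ones, x]), y)[1]

# Part of X that W cannot explain (Orange + Blue), then its overlap with Y (Blue)
Zw = np.column_stack([ones, w])
x_resid = x - Zw @ ols(Zw, x)
b_x_unique = (x_resid @ y) / (x_resid @ x_resid)

print(b_full[1], b_x_unique, b_x_alone)
```

The first two printed values agree to rounding error; the third differs because the Red area has been credited to X.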

Multicollinearity

Figure from the main article.

Collinearity is captured by increasing the overlap between the X and W circles.

The coefficient estimates remain unbiased, because in both figures only the Blue and Green areas are used.

However, the variance of the estimates increases, because the Blue and Green areas shrink and less information goes into each estimate.
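To see how the shrinking Blue area translates into variance, the sketch below computes the variance of the coefficient on X from σ²(A'A)⁻¹ for regressors built to have an exact sample correlation rho. The construction, the function names and the choice σ² = 1 are assumptions made for illustration, not part of the article.

```python
import numpy as np

def slope_variance(x, w, sigma2=1.0):
    """Variance of the OLS coefficient on x when y is regressed on a constant, x and w."""
    A = np.column_stack([np.ones(len(x)), x, w])
    return sigma2 * np.linalg.inv(A.T @ A)[1, 1]

def make_regressors(rho, n=100, seed=2):
    """Return centred x and w with unit sum of squares and sample correlation exactly rho."""
    rng = np.random.default_rng(seed)
    z1 = rng.normal(size=n)
    z2 = rng.normal(size=n)
    z1 = z1 - z1.mean()
    z2 = z2 - z2.mean()
    z2 = z2 - (z2 @ z1) / (z1 @ z1) * z1   # make z2 orthogonal to z1
    z1 = z1 / np.sqrt(z1 @ z1)
    z2 = z2 / np.sqrt(z2 @ z2)
    return z1, rho * z1 + np.sqrt(1 - rho**2) * z2

for rho in (0.0, 0.5, 0.9):
    x, w = make_regressors(rho)
    # With this construction the variance equals sigma2 / (1 - rho^2)
    print(rho, slope_variance(x, w))
```

Nothing about Y enters the calculation, which matches the diagram: more overlap between X and W raises the variance without introducing bias.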

Omitting a Relevant Explanatory Variable

Figure generated by the author.

Suppose W is omitted. The estimate of β_x is now biased, because both the Blue and Red areas are used, but its variance decreases, because more information is used.

If X and W are orthogonal, that is, the X and W circles do not overlap, the estimate remains unbiased and its variance is unaffected. W may also be dropped if it is highly collinear with X, accepting some bias in exchange for lower variance.
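The effect of omitting W can be sketched with simulated data (the coefficients below and the helper `ols` are assumptions for illustration). In any sample, the coefficient on X from the short regression equals the coefficient from the full regression plus the coefficient on W times the slope from regressing W on X, so the bias vanishes only when X and W do not overlap.

```python
import numpy as np

def ols(A, y):
    """Least-squares coefficients of y on the columns of A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

# Simulated data (assumed values): W is relevant and correlated with X
rng = np.random.default_rng(3)
n = 500
w = rng.normal(size=n)
x = 0.7 * w + rng.normal(size=n)
y = 0.5 + 1.0 * x + 2.0 * w + rng.normal(size=n)
ones = np.ones(n)

# Full regression: Blue is used for b_x, Green for b_w
const, b_x, b_w = ols(np.column_stack([ones, x, w]), y)

# Short regression with W omitted: Blue + Red is used for the coefficient on X
b_x_short = ols(np.column_stack([ones, x]), y)[1]

# Slope from regressing W on X measures how much X and W overlap
delta = ols(np.column_stack([ones, x]), w)[1]

print(b_x, b_x_short, b_x + b_w * delta)
```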

Detrending Data

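W is a time trend. How does removing it affect the results, that is, what happens if detrended Y is regressed on detrended X? Note that X and W are not orthogonal. According to the data used…

Regress y on X and W to obtain b*, the estimated coefficient on X, and its estimated variance vb*.

Regress X on W and save the residuals r; regress y on r to obtain c*, the estimated coefficient on r, and its estimated variance vc*.

Regress y on W and save the residuals s; regress s on r to obtain d*, the estimated coefficient on r, and its estimated variance vd*.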

Coeff.	Est.	Est. var
b*	1.129427	vb* = 0.00210754
c*	1.129427	vc* = 0.00987857
d*	1.129427	vd* = 0.00208904

From the main article

b* = the usual OLS estimate

r = Orange+Blue (the part of X that W cannot explain)

s = Blue+Yellow (the part of y that W cannot explain)

Overlap of s and r = Blue

Regressing s on r (Blue+Yellow on Orange+Blue) uses the same information, the Blue area, as the estimates b* and c*, which is why the three coefficient estimates are identical.

But the estimated variances vb*, vc* and vd* are different.

Why?

The true variances are equal, but the estimated variances are not. vb* and vd* are nearly equal: in both cases the unexplained variation is measured by the magnitude of the Yellow area. vc* is comparatively high because, when y is regressed on r, the variation in y left unexplained is the Yellow+Red+Green area, so σ² is overestimated. That is why vc* is greater than vb* and vd*.
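The three regressions can be reproduced on simulated data. The sketch below does not use the article's data; the trend, coefficients, sample size and the helper `ols_fit` are assumptions. Here b* is taken from the regression of y on X and W together, which is the reading under which the three estimates coincide. In this sketch vb* and vd* differ only through the degrees of freedom used to estimate σ², while vc* is inflated because its residuals also contain the Red and Green areas.

```python
import numpy as np

def ols_fit(A, y):
    """Return coefficients, their estimated variances, and the residuals."""
    n, k = A.shape
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    s2 = resid @ resid / (n - k)                 # OLS estimate of sigma^2
    var = s2 * np.diag(np.linalg.inv(A.T @ A))   # estimated coefficient variances
    return coef, var, resid

# Simulated data (assumed values): W is a time trend that affects both X and y
rng = np.random.default_rng(4)
n = 100
w = np.arange(n, dtype=float)
x = 0.05 * w + rng.normal(size=n)
y = 1.0 + 1.1 * x + 0.03 * w + rng.normal(scale=0.5, size=n)
ones = np.ones(n)
Zw = np.column_stack([ones, w])

# b*: regress y on X and W together
coef_b, var_b, _ = ols_fit(np.column_stack([ones, x, w]), y)
b_star, vb = coef_b[1], var_b[1]

# c*: regress X on W, keep residuals r (Orange + Blue), then regress y on r
_, _, r = ols_fit(Zw, x)
coef_c, var_c, _ = ols_fit(np.column_stack([ones, r]), y)
c_star, vc = coef_c[1], var_c[1]

# d*: regress y on W, keep residuals s (Blue + Yellow), then regress s on r
_, _, s = ols_fit(Zw, y)
coef_d, var_d, _ = ols_fit(np.column_stack([ones, r]), s)
d_star, vd = coef_d[1], var_d[1]

print(b_star, c_star, d_star)
print(vb, vc, vd)
```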

Conclusion

The main contribution of this paper is to set out some effective ways of using the Venn diagram when teaching regression analysis. There are cases where the Ballentine (Venn diagram) can mislead about OLS results, but for standard analysis Kennedy himself highly recommends it.


Friday, 27 March 2020
