Why use OLS when it is assumed there is heteroscedasticity?
So I'm slowly going through the Stock and Watson book and I'm a bit confused about how to deal with the issue of homoscedasticity/heteroscedasticity. Specifically, it is mentioned that economic theory gives us no reason to assume that errors will be homoscedastic, so their advice is to assume heteroscedasticity and always use heteroscedasticity-robust standard errors when performing our regression analysis. The way I'm being taught this material, in Stata for example, is that we just run the reg command, always making sure to include the r option for robust standard errors.
My questions are these: if our default position is to assume heteroscedasticity, is it also correct that OLS is no longer the best linear unbiased estimator, since one of the Gauss-Markov assumptions is violated? And if so, is it correct that GLS would be the BLUE estimator? Lastly, if both of these are correct, why would we not just run GLS regressions as our default instead of OLS?
Thanks.
least-squares heteroscedasticity generalized-least-squares blue
asked Nov 26 at 15:35
anguyen1210
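For concreteness, here is a minimal sketch of the workflow the question describes, using Stata's built-in auto dataset (the outcome and regressors are purely illustrative):

* Load a built-in example dataset; any continuous outcome and regressors will do
sysuse auto, clear

* OLS with conventional standard errors (which assume homoskedasticity)
regress price mpg weight

* The same OLS fit with heteroskedasticity-robust (Huber-White) standard errors
regress price mpg weight, vce(robust)   // "reg price mpg weight, r" is the short form

The coefficients are identical in both runs; only the standard errors, t statistics, and confidence intervals change.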
It would be good to get some clarification about your meaning of "GLS." My understanding of GLS is that you have to provide specific information about the error variances. What do you have in mind doing in the general (and by far most common) case when there is no such information directly available?
– whuber♦
Nov 26 at 15:42
Hi, I'm not sure what to clarify. I'm not very familiar with the Generalized Least Squares method, but in your comment, is that the answer? Namely, that even assuming that our errors are heteroscedastic, we do an OLS regression anyways, because to do a GLS we need information on the error terms that we don't have? Sorry, this is all quite new to me, and I'm sure I've not phrased the question well. Thanks to all for their comments.
– anguyen1210
Nov 26 at 15:49
GLS as a method has more pedagogic value than practical. One almost never sees GLS used in papers because researchers typically don't know $\operatorname{Cov}[\epsilon \mid X] = \Omega$. You could assume some structure on $\Omega$ and estimate the rest, but such a procedure (i.e. FGLS) can have big problems with robustness! Use a poor $\Omega$ and your estimates will be worse than OLS. To me, GLS is interesting mostly from the standpoint of developing a deeper understanding of linear algebra and OLS.
– Matthew Gunn
Nov 27 at 5:50
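To make the FGLS procedure sketched in the previous comment concrete, here is a hand-rolled Stata example; the exponential variance model and the auto-dataset variables are assumptions chosen purely for illustration, and the result is only as good as that assumed model:

sysuse auto, clear

* Step 1: ordinary OLS, keeping the residuals
regress price mpg weight
predict double ehat, residuals

* Step 2: estimate the skedastic function by regressing the log squared
* residuals on the regressors (this is the assumed variance model)
gen double loge2 = ln(ehat^2)
regress loge2 mpg weight
predict double logvarhat, xb

* Step 3: feasible GLS = weighted least squares, with weights equal to the
* inverse of the estimated error variances
gen double w = 1 / exp(logvarhat)
regress price mpg weight [aweight = w]

If the assumed variance model is badly wrong, these weighted estimates can be noisier than plain OLS, which is exactly the robustness concern raised above.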
@MatthewGunn but GLS with a compound symmetry correlation structure (assuming exchangeability of responses within cases) is identical to a random intercept multilevel model. So in this sense, an identical model is regularly used. Also, when you are analyzing reasonably normal data from a randomized controlled trial with a not-too-small n, I see no reason not to use it, since you can simultaneously model heterogeneous variances across the groups. I think the reason it is not common has a lot to do with researchers ignoring substantive questions about variances.
– Heteroskedastic Jim
Nov 28 at 1:02
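A rough Stata sketch of the equivalence mentioned in the last comment, using a standard panel example dataset (the variables are illustrative; the two commands use different fitting methods, so expect very similar rather than numerically identical output):

webuse nlswork, clear

* Random-intercept multilevel model: one intercept per person (idcode)
mixed ln_wage age tenure || idcode:

* GLS random-effects estimator, which imposes an exchangeable
* (compound-symmetry) correlation structure within each person
xtset idcode
xtreg ln_wage age tenure, re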
2 Answers
Accepted answer (score 5)
Because GLS is BLUE only if you know the form of the heteroskedasticity (and of any error correlation). If you misspecify that form, GLS estimates lose their nice properties.
Under heteroskedasticity, OLS remains unbiased and consistent, but you lose efficiency.
So unless you're certain of the form of the heteroskedasticity, it makes sense to stick with the unbiased and consistent estimates from OLS, and then adjust inference using robust standard errors, which are asymptotically valid even when you don't know the form of the heteroskedasticity.
A hybrid approach is to do your best at specifying the form of the heteroskedasticity but still apply robust standard errors for inference. See Resurrecting weighted least squares (PDF).
Modeling is all about tradeoffs and resources. If you are convinced there is nothing to be learned from modeling the form of the heteroskedasticity, then specifying its form is a waste of time. I would argue that there is usually something to be learned in empirical applications. But tradition/convention nudges us away from studying heteroskedasticity and away from looking at the variances, since all they are is "error".
answered Nov 26 at 15:52 (edited Nov 26 at 17:08)
Heteroskedastic Jim
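A minimal sketch of the hybrid approach from this answer, again in Stata with illustrative variables and an assumed weighting scheme: estimate by weighted least squares under your best guess at the variance structure, but report heteroskedasticity-robust standard errors so that inference remains valid even if that guess is wrong.

sysuse auto, clear

* Assumption (illustration only): error variance proportional to mpg^2,
* so each observation gets analytic weight 1/mpg^2
gen double w = 1 / mpg^2

* WLS point estimates with robust (sandwich) standard errors
regress price mpg weight [aweight = w], vce(robust)

* For comparison: plain OLS with robust standard errors
regress price mpg weight, vce(robust)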
Answer (score 0)
OLS is still unbiased when the errors are heteroscedastic (or even correlated), provided the mean model is true. The net effect of heteroscedasticity is that the usual standard errors are thrown off, so that the 95% CI for the regression is at times too tight and at other times too wide. Even so, you can correct the standard errors by using the sandwich, or heteroscedasticity-consistent (HC), standard error estimator. Technically this is no longer plain "ordinary least squares" inference, but it yields the same summary measures: a slope, an intercept, and 95% CIs for their values; you just lose the global F-tests and the validity of prediction intervals.
answered Nov 26 at 16:45
AdamO
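As a hedged illustration of this point (variables again from Stata's built-in auto data): the sandwich/HC correction leaves the fitted slope and intercept untouched and only changes the standard errors used for inference.

sysuse auto, clear

* The same OLS fit with three sets of standard errors
quietly regress price mpg weight
estimates store classical

quietly regress price mpg weight, vce(robust)   // Stata's default robust SEs (HC1)
estimates store hc1

quietly regress price mpg weight, vce(hc3)      // a more conservative small-sample variant
estimates store hc3

* Coefficients agree across the columns; only the standard errors differ
estimates table classical hc1 hc3, b se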