Do test scores really follow a normal distribution?

I've been trying to learn which distributions to use in GLMs, and I'm a little fuzzled on when to use the normal distribution. In one part of my textbook, it says that a normal distribution could be good for modeling exam scores. In the next part, it asks what distribution would be appropriate to model a car insurance claim. This time, it said that the appropriate distributions would be Gamma or Inverse Gaussian because they're continuous with only positive values. Well, I believe that exam scores would also be continuous with only positive values, so why would we use a normal distribution there? Doesn't the normal distribution allow for negative values?

asked 23 hours ago

mistersunnyd

If you're worried about the bounds on scores, you could try en.wikipedia.org/wiki/Truncated_normal_distribution
– J.G.
17 hours ago

In the real world, of course, exam score distributions often don't look anything like a normal distribution anyway. As an example from my math undergrad days, I remember the Topology I class as having been notorious for its highly bimodal "dumbbell curve" grade distribution: you either understood the key concepts and got a nearly perfect score, or you didn't and were lucky to get any points at all. Very few people ended up scoring anywhere in the middle between those two extremes.
– Ilmari Karonen
13 hours ago

No. Next question.
– Carl Witthoft
8 hours ago

normal-distribution generalized-linear-model gamma-distribution inverse-gaussian-distrib

2 Answers

accepted

Height, for instance, is often modelled as being normal. Maybe the height of men is something like 5 foot 10 with a standard deviation of 2 inches. We know negative height is unphysical, but under this model, the probability of observing a negative height is essentially zero. We use the model anyway because it is a good enough approximation.

All models are wrong. The question is "can this model still be useful", and in instances where we are modelling things like height and test scores, modelling the phenomenon as normal is useful despite it technically allowing for unphysical things.
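To put rough numbers on "essentially zero", here is a minimal sketch that evaluates the normal tail probabilities directly. The height figures (70 inches, standard deviation 2) come from the answer above; the exam figures (mean 65, standard deviation 12, marks on a 0 to 100 scale) are made-up values for illustration only, and the helper names are hypothetical.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ Normal(mean, sd), via the complementary error function."""
    return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2.0)))

def normal_upper_tail(x, mean, sd):
    """P(X > x), computed directly so that tiny tails are not lost to rounding."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))

# Height model from the answer: mean 5 ft 10 in = 70 inches, sd = 2 inches.
p_negative_height = normal_cdf(0.0, 70.0, 2.0)

# Hypothetical exam: marks out of 100, mean 65, sd 12 (assumed values).
p_below_zero = normal_cdf(0.0, 65.0, 12.0)
p_above_hundred = normal_upper_tail(100.0, 65.0, 12.0)
p_out_of_range = p_below_zero + p_above_hundred

print(f"P(height < 0)       = {p_negative_height:.3e}")
print(f"P(score < 0)        = {p_below_zero:.3e}")
print(f"P(score > 100)      = {p_above_hundred:.3e}")
print(f"P(score impossible) = {p_out_of_range:.3e}")
```

With these assumed exam parameters, most of the impossible mass lies above the maximum mark rather than below zero, which fits the other answer's point that the normal has no upper bound either.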

edited 22 hours ago

answered 22 hours ago

Demetri P

Doesn't the normal distribution allow for negative values?

Correct. It also has no upper bound.

In one part of my textbook, it says that a normal distribution could be good for modeling exam scores.

Despite the previous statements, this is sometimes the case. If you have many components to the test, not too strongly related (e.g. so you're not essentially asking the same question a dozen times, nor having each part require a correct answer to the previous part), and not very easy or very hard (so that most marks are somewhere near the middle), then marks may often be reasonably well approximated by a normal distribution; often well enough that typical analyses should cause little concern.

We know for sure that they aren't normal, but that's not automatically a problem -- as long as the behaviour of the procedures we use is close enough to what it should be for our purposes (e.g. standard errors, confidence intervals, significance levels and power - whichever are needed - behave close to how we expect them to).
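As a rough illustration of those conditions, the sketch below simulates a test made of many independent items of moderate difficulty and checks how close the total marks are to symmetric and bell-shaped. Everything here is an assumption for illustration: 50 items, each answered correctly with probability 0.6, independently, by 5000 simulated students. Real exams will not satisfy these assumptions exactly.

```python
import random
import statistics

def simulate_scores(n_students, n_items, p_correct, rng):
    """Total marks per student when each item is an independent pass/fail."""
    return [
        sum(1 for _ in range(n_items) if rng.random() < p_correct)
        for _ in range(n_students)
    ]

def skewness(xs):
    """Sample skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

rng = random.Random(42)  # fixed seed so the run is repeatable
scores = simulate_scores(n_students=5000, n_items=50, p_correct=0.6, rng=rng)

mean = statistics.fmean(scores)
sd = statistics.pstdev(scores)
# For a normal distribution roughly 68% of values fall within one sd of the mean.
within_one_sd = sum(1 for x in scores if abs(x - mean) <= sd) / len(scores)

print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"skewness = {skewness(scores):.3f}")
print(f"fraction within one sd = {within_one_sd:.3f}")
```

Making the items very easy or very hard (p_correct near 1 or 0), or making them strongly dependent, should push the skewness away from zero, which is the situation the answer warns about.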

In the next part, it asks what distribution would be appropriate to model a car insurance claim. This time, it said that the appropriate distributions would be Gamma or Inverse Gaussian because they're continuous with only positive values.

Yes, but more than that -- they tend to be heavily right skew and the variability tends to increase when the mean gets larger.
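Both properties can be seen in a small simulation. This is only a sketch: the shape parameter (2.0) and the two mean claim sizes (1000 and 4000) are arbitrary assumed values, not taken from any data set. For a gamma distribution with fixed shape, the standard deviation is proportional to the mean and the skewness is 2 divided by the square root of the shape, so the simulated figures should land near those values.

```python
import random
import statistics

def gamma_sample(mean, shape, n, rng):
    """Draw n gamma variates with the given mean and shape (scale = mean / shape)."""
    scale = mean / shape
    return [rng.gammavariate(shape, scale) for _ in range(n)]

def skewness(xs):
    """Sample skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

rng = random.Random(0)  # fixed seed so the run is repeatable
shape = 2.0             # assumed value, for illustration only

summary = {}
for target_mean in (1000.0, 4000.0):
    claims = gamma_sample(target_mean, shape, 20000, rng)
    summary[target_mean] = {
        "mean": statistics.fmean(claims),
        "sd": statistics.pstdev(claims),
        "skew": skewness(claims),
    }
    row = summary[target_mean]
    print(f"target mean {target_mean:7.0f}: mean = {row['mean']:8.1f}, "
          f"sd = {row['sd']:8.1f}, skewness = {row['skew']:.2f}")

sd_ratio = summary[4000.0]["sd"] / summary[1000.0]["sd"]
print(f"sd ratio (mean 4000 vs mean 1000) = {sd_ratio:.2f}")
```

A Gaussian model with constant variance would keep the spread fixed as the mean changes and would have zero skewness, which is why it tends to fit claim sizes poorly.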

Here's an example of a claim-size distribution for vehicle claims:

https://ars.els-cdn.com/content/image/1-s2.0-S0167668715303358-gr5.jpg

(Fig 5 from Garrido, Genest & Schulz (2016) "Generalized linear models for dependent frequency and severity of insurance claims", Insurance: Mathematics and Economics, Vol 70, Sept., p205-215. https://www.sciencedirect.com/science/article/pii/S0167668715303358)

This shows a typical right-skew and heavy right tail. However we must be very careful because this is a marginal distribution, and we are writing a model for the conditional distribution, which will typically be much less skew (the marginal distribution we look at if we just do a histogram of claim sizes being a mixture of these conditional distributions). Nevertheless it is typically the case that if we look at the claim size in subgroups of the predictors (perhaps categorizing continuous variables) that the distribution is still strongly right skew and quite heavy tailed on the right, suggesting that something like a gamma model* is likely to be much more suitable than a Gaussian model.

* there may be any number of other distributions which would be more suitable than a Gaussian - the inverse Gaussian is another choice - though less common; lognormal or Weibull models, while not GLMs as they stand, may be quite useful also.

[It's rarely the case that any of these distributions are near-perfect descriptions; they're inexact approximations, but in many cases sufficiently good that the analysis is useful and has close to the desired properties.]
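The marginal-versus-conditional point can be sketched with a two-group mixture. The groups, their sizes and their parameters are entirely hypothetical: 80% of policies in a "low" group with mean claim 500 and 20% in a "high" group with mean claim 5000, both gamma with shape 5. Within each group the claims are right skew; pooling the groups, as a plain histogram of all claims would, gives a noticeably more skewed distribution.

```python
import random
import statistics

def gamma_sample(mean, shape, n, rng):
    """Draw n gamma variates with the given mean and shape (scale = mean / shape)."""
    scale = mean / shape
    return [rng.gammavariate(shape, scale) for _ in range(n)]

def skewness(xs):
    """Sample skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

rng = random.Random(7)  # fixed seed so the run is repeatable
shape = 5.0             # assumed common shape within each group

# Conditional distributions: claim size given the (hypothetical) risk group.
low_group = gamma_sample(mean=500.0, shape=shape, n=16000, rng=rng)
high_group = gamma_sample(mean=5000.0, shape=shape, n=4000, rng=rng)

# Marginal distribution: what a histogram of all claims pooled together shows.
pooled = low_group + high_group

skew_low = skewness(low_group)
skew_high = skewness(high_group)
skew_pooled = skewness(pooled)

print(f"skewness within low group  = {skew_low:.2f}")
print(f"skewness within high group = {skew_high:.2f}")
print(f"skewness of pooled claims  = {skew_pooled:.2f}")
```

In this setup the within-group skewness stays clearly positive even though it is smaller than the pooled value, matching the remark that subgroups are typically still strongly right skew.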

Well, I believe that exam scores would also be continuous with only positive values, so why would we use a normal distribution there?

Because (under the conditions I mentioned before -- lots of components, not too dependent, not too hard or easy) the distribution tends to be fairly close to symmetric, unimodal and not heavy-tailed.

edited 20 hours ago

answered 22 hours ago

Glen_b♦
