Density plot looks normal, QQ plot does not, and the Shapiro test is significant


In the QQ plot below, too many points fall off the line, yet the density plot looks normal. The Shapiro test gives a p-value < 2.2e-16, which says this is not a normal distribution. But I have read that the Shapiro test should not be trusted when there are about 1000 data points. Should I conclude that this distribution is normal?

[Figures: QQ plot and density plot]

asked Feb 17 at 15:24

AnaHochma


Clear asymmetry suggests something is up. Did you calculate the skewness? A nominal rule of thumb is that if Pearson skewness $\ge$ 0.1, you have to take corrective action, e.g., performing statistics on the log() of your measurements rather than on the measurements directly. Also, are there additional factors (metadata, etc.) that you can use to subselect your data? This may also be a mixture of models: you might have two or three normal distributions sitting close together, with the second and third small enough not to create an obviously multimodal histogram.
– Peter Leopold
Feb 17 at 16:07

@Peter Where does that rule of thumb come from? It's not generally applicable, so it would be of interest to know its limitations and assumptions.
– whuber♦
Feb 17 at 16:12

@PeterLeopold "Pearson skewness" is not uniquely defined. Pearson himself put most emphasis on measuring skewness relative to the mode, which had a major role in his system of distributions. But he also used a dimensionless ratio based on the third and second moments around the mean. And yet again, (mean $-$ median) / SD appears in his work. Regardless, I wouldn't regard a skewness of about 0.1 on any measure I've encountered as requiring transformation. I would always want to see the data, however.
– Nick Cox
Feb 17 at 16:18

@Peter The problem here is more about kurtosis than skewness, and it's not clear that taking the log is justified even if the data were skewed. It depends on what the OP is going to use the data for.
– Peter Flom♦
Feb 17 at 16:18

I would not trust a scale for happiness [NB] with such results!
– Nick Cox
Feb 17 at 16:20
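The comments above refer to more than one skewness measure and to kurtosis. As a rough sketch only (the function names and the sample values are made up for illustration, and population moments are an assumption; other conventions exist), the moment-based ratio, the (mean $-$ median) / SD ratio, and excess kurtosis could be computed like this:

```python
from statistics import fmean, median, pstdev


def moment_skewness(data):
    """Third central moment divided by the cube of the population SD."""
    m, s = fmean(data), pstdev(data)
    return fmean([(x - m) ** 3 for x in data]) / s ** 3


def median_skewness(data):
    """(mean - median) / SD, using the population SD."""
    return (fmean(data) - median(data)) / pstdev(data)


def excess_kurtosis(data):
    """Fourth central moment over SD^4, minus 3 (zero for a normal distribution)."""
    m, s = fmean(data), pstdev(data)
    return fmean([(x - m) ** 4 for x in data]) / s ** 4 - 3.0


# Made-up, right-skewed numbers; not the data from the question.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]

print("moment skewness:", round(moment_skewness(values), 3))
print("median skewness:", round(median_skewness(values), 3))
print("excess kurtosis:", round(excess_kurtosis(values), 3))
```

The two skewness measures need not agree in size, which is one reason a single cutoff such as 0.1 is hard to interpret without knowing which definition is meant.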

normal-distribution


1 Answer

First, the density plot does not really look normal. It's symmetric, but the shape is wrong. I suggest generating a normal distribution with the same mean and variance as yours and then overlaying that density on the one you've got. I am fairly sure you will see a mismatch.
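One way to try the overlay suggested here, as a sketch under stated assumptions: the original data are not available, so the sample below is a hypothetical symmetric, heavy-tailed stand-in, and the kernel density estimate with Silverman's rule-of-thumb bandwidth is my choice rather than anything from the question. Only the Python standard library is used, so the two densities are printed side by side instead of plotted.

```python
import random
from statistics import NormalDist, fmean, stdev


def gaussian_kde(data, grid, bandwidth):
    """Gaussian kernel density estimate of `data` at each point of `grid`."""
    kernel = NormalDist(0.0, bandwidth)
    n = len(data)
    return [sum(kernel.pdf(g - x) for x in data) / n for g in grid]


# Hypothetical stand-in data: the difference of two exponentials is a
# Laplace variable, which is symmetric but heavier-tailed than a normal.
rng = random.Random(0)
data = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(1000)]

mu, sigma = fmean(data), stdev(data)
fitted = NormalDist(mu, sigma)  # normal with the same mean and SD as the data

# Silverman's rule-of-thumb bandwidth (an assumption, not the only choice).
h = 1.06 * sigma * len(data) ** (-1 / 5)

# Evaluate both densities from mu - 3 SD to mu + 3 SD in half-SD steps.
grid = [mu + k * sigma / 2 for k in range(-6, 7)]
kde = gaussian_kde(data, grid, h)

for g, d in zip(grid, kde):
    print(f"x={g:7.2f}  data density={d:.4f}  normal density={fitted.pdf(g):.4f}")
```

For data like this stand-in, the estimated density is noticeably higher than the fitted normal near the center, which is the kind of mismatch a symmetric density plot can hide at first glance.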

Second, a quantile normal plot is often a better clue to nonnormality.
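For reference, the points of a normal quantile plot can be computed by hand. This is a minimal sketch: the function name, the plotting positions (i - 0.5) / n, and the hypothetical heavy-tailed sample are all assumptions, and statistical packages may use slightly different conventions.

```python
import random
from statistics import NormalDist, fmean, stdev


def normal_qq_points(data):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    n = len(data)
    fitted = NormalDist(fmean(data), stdev(data))
    observed = sorted(data)
    # Plotting positions (i - 0.5) / n; other conventions exist.
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


# Hypothetical stand-in data (Laplace), not the data from the question.
rng = random.Random(1)
sample = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(1000)]
pairs = normal_qq_points(sample)

# For normal data the pairs lie near the line y = x; departures in the
# tails are easier to see here than in a density plot.
for idx in (0, 9, 99, 499, 899, 989, 999):
    t, o = pairs[idx]
    print(f"rank {idx + 1:4d}: theoretical={t:7.2f}  observed={o:7.2f}")
```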

Third, and probably most importantly, why are you concerned about the normality of this variable? What are you going to do with the variable?

answered Feb 17 at 16:15

Peter Flom♦


Thanks for all the comments and answers. @Peter Flom, I have measures of "Happyness" for two groups of people. I plotted them against time, and one group appears to get higher values, so I'm trying to compare them statistically. I ran the Shapiro test for both groups and got p-values << 0.05, so I don't know whether to use a t-test or a Wilcoxon test.
– AnaHochma
Feb 17 at 18:42

I'd go with Wilcoxon. Or maybe a bootstrap.
– Peter Flom♦
Feb 17 at 19:08

Ah, two distributions! Thanks for confirming what the data were hinting strongly at.
– Peter Leopold
Feb 17 at 19:15

Thanks! So I'll go with Wilcoxon too :) Is the main reason that the Shapiro test p-value is << 0.05?
– AnaHochma
Feb 17 at 19:22
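To make the two options from the comments concrete, here is a hedged sketch of both. Everything in it is an assumption for illustration: the function names, the made-up scores, the normal approximation for the rank-sum p-value (midranks for ties, but no tie or continuity correction), and the percentile method for the bootstrap interval. A statistics package would normally be used instead.

```python
import random
from statistics import NormalDist, fmean


def midranks(values):
    """Ranks 1..n of `values`, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(a, b):
    """Return (U statistic for `a`, approximate two-sided p-value)."""
    n1, n2 = len(a), len(b)
    ranks = midranks(list(a) + list(b))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mean_u) / sd_u
    return u, 2 * (1 - NormalDist().cdf(abs(z)))


def bootstrap_mean_diff(a, b, reps=2000, seed=0):
    """Percentile 95% interval for mean(a) - mean(b), resampling each group."""
    rng = random.Random(seed)
    diffs = sorted(
        fmean(rng.choices(a, k=len(a))) - fmean(rng.choices(b, k=len(b)))
        for _ in range(reps)
    )
    return diffs[int(0.025 * reps)], diffs[int(0.975 * reps) - 1]


# Made-up scores for two groups; not the data from the question.
rng = random.Random(42)
group_a = [rng.gauss(6.0, 1.5) for _ in range(200)]
group_b = [rng.gauss(5.5, 1.5) for _ in range(200)]

u, p = rank_sum_test(group_a, group_b)
lo, hi = bootstrap_mean_diff(group_a, group_b)
print(f"U = {u:.1f}, approximate p = {p:.4f}")
print(f"bootstrap 95% interval for the mean difference: ({lo:.3f}, {hi:.3f})")
```

Note that the two approaches answer slightly different questions: the rank-sum test compares the groups through their ranks, while this bootstrap targets the difference in means.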
