Two sample z test: Applicability
The following exercise is taken from Statistics by Freedman:
A geography test was given to a simple random sample of 250 high-school students
in a certain large school district. One question involved an outline map of Europe,
with the countries identified only by number. The students were asked to pick out
Great Britain and France. As it turned out, 65.6% could find France, compared to
70.4% for Great Britain. Is the difference statistically significant? Or can this be
determined from the information given?
The author says
Exercise 5 on p. 515 (the geography test)
is an example of when not to use the formulas. Each subject makes two responses,
by answering (i) the question on Great Britain, and (ii) the question on France.
Both responses are observed, because each subject answers both questions. And
the responses are correlated, because a geography whiz is likely to be able to
answer both questions correctly, while someone who does not pay attention to
maps is likely to get both of them wrong. By contrast, if you took two independent
samples (asking one group about France and the other about Great Britain), the
formula would be fine. (That would be an inefficient way to do the study.)
The author is talking about the two-sample z-test, and the formula he is referring to is
$$\operatorname{Var}(\bar{X}-\bar{Y})=\operatorname{Var}(\bar{X})+\operatorname{Var}(\bar{Y})$$
I understand that the variables are correlated, so $\operatorname{Cov}(\bar{X},\bar{Y})$ should also be present in the formula.
What I don't understand is this passage:
By contrast, if you took two independent samples (asking one group about France and the other about Great Britain), the formula would be fine. (That would be an inefficient way to do the study.)
- Why don't we need to consider the covariance in this case, but only in the single-sample case?
Geography whizzes are going to be present in the second independent sample in approximately the same proportion as in the first sample.
The author says that the example problem can be solved using more advanced mathematics if we have the percentages for the following categories:
1 1 found Great Britain and France on the map
1 0 found Great Britain; could not find France
0 1 could not find Great Britain; found France
0 0 could not find either country
I would like to know what this advanced mathematics is.
Thanks.
statistics hypothesis-testing
asked Dec 1 '18 at 7:36 by q126y
My guess is they have a $2 \times 2$ contingency table in mind: rows for France (Yes and No), columns for GB (Yes and No). With enough subjects, the test statistic would have approximately a chi-squared distribution with 1 degree of freedom. Alternatively, one might use a Fisher exact test.
– BruceET, Dec 1 '18 at 8:48
1 Answer
Here is Minitab output for fake data in such a table.
I did not try to match the percentages you give in your problem. The null hypothesis is that recognition of GB and of France are independent abilities. The small p-value indicates the null hypothesis is rejected.
    Chi-Square Test for Association: France, GB

    Rows: France   Columns: GB

              Yes      No     All

    Yes        43      21      64
            33.58   30.42
            2.640   2.915

    No         10      27      37
            19.42   17.58
            4.566   5.042

    All        53      48     101

    Cell Contents:      Count
                        Expected count
                        Contribution to Chi-square

    Pearson Chi-Square = 15.163, DF = 1, P-Value = 0.000
Computations:
The observed count for the upper-left cell
is $X_{11} = 43.$
The expected count for the upper-left cell
is $E_{11} = 64(53)/101 = 33.58.$
The contribution for that cell is
$(X_{11} - E_{11})^2/E_{11} = 2.64.$
The chi-squared statistic $15.163$ is the sum of
the 'contributions' from all four cells.
From the row with DF = 1 in a printed table
of chi-squared distributions, you can see that the value $3.8415$ cuts 5% from the upper
tail of the distribution $\mathsf{Chisq}(1),$
so that any value of the chi-squared statistic
above 3.8415 would lead you to believe that
identification of GB and identification of France are not independent abilities (at the 5% level of significance). The
chi-squared statistic here is $15.16 > 3.84.$
Perhaps you can find a more complete discussion of this kind
of test later in your text.
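The arithmetic behind the Minitab output can be retraced by hand. The following is a minimal sketch in Python (standard library only), not the method Minitab uses internally; the function names `pearson_chi_square` and `chi_square_1df_upper_tail` are made up for this illustration. The tail probability relies on the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math

# Observed counts from the fake table (rows: France Yes/No, columns: GB Yes/No).
observed = [[43, 21],
            [10, 27]]

def pearson_chi_square(table):
    """Return (statistic, expected counts) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: (row total) * (column total) / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Sum the per-cell contributions (observed - expected)^2 / expected.
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    return stat, expected

def chi_square_1df_upper_tail(x):
    """P(chi-squared with 1 df > x), using chi-squared(1) = Z^2 for standard normal Z."""
    return math.erfc(math.sqrt(x / 2))

stat, expected = pearson_chi_square(observed)
p_value = chi_square_1df_upper_tail(stat)

print(round(stat, 3))             # about 15.163, as in the Minitab output
print(round(expected[0][0], 2))   # about 33.58, the upper-left expected count
print(p_value)                    # far below 0.05, shown by Minitab as 0.000
```

The same tail function applied to 3.8415 returns roughly 0.05, which is the printed-table cutoff quoted above.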
Addendum. Suppose my data are real. The 43 + 27 = 70 subjects who got both countries right, or neither, give no information about whether GB or France is easier to identify on a map. Of the other 31, who got exactly one country right, 21 found France but not GB, and only 10 found GB but not France.
Those 10 are in the lower tail of $\mathsf{Binom}(31, .5).$
That is, assuming
both countries are equally easy to identify, there is only probability 0.0354 < 5% that 10 or fewer of the 31 would find GB but not France. I would hesitate to draw strong conclusions from only 31 useful responses, but in these fake data there does seem to be evidence that more people recognize France than GB on a map.
(With real data the opposite direction would not be surprising, because many people know GB is
an island nation, and there aren't many big
islands on a map of Europe; in the exercise itself, 70.4% found GB and 65.6% found France.)
In R:
    pbinom(10, 31, .5)   # P(X <= 10) for X ~ Binom(31, 0.5)
    [1] 0.03537777
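For readers without R, the same tail probability can be reproduced with a short Python sketch (standard library only). The helper `binom_cdf` and the variable names are made up for this illustration. A test that uses only the two off-diagonal (discordant) cells in this way is commonly called an exact McNemar test; the doubled tail below is one common way to make it two-sided, and it is an addition here, not part of the answer above.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed directly from the pmf."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# The two off-diagonal cells of the fake table: subjects who found exactly one country.
smaller, larger = 10, 21
n_discordant = smaller + larger     # 31 subjects carry information about the difference

# Same quantity as pbinom(10, 31, .5): lower tail under "equally easy" (p = 0.5).
lower_tail = binom_cdf(smaller, n_discordant, 0.5)

# Two-sided version: double the tail, which is valid because Binomial(n, 0.5) is symmetric.
two_sided = min(1.0, 2 * lower_tail)

print(round(lower_tail, 8))
print(round(two_sided, 4))
```

Whether the one-sided or the two-sided figure is appropriate depends on whether the direction of the difference was specified before looking at the data.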
answered Dec 1 '18 at 9:00 by BruceET (edited Dec 1 '18 at 18:18)
Thanks, Mr. Bruce. Can you also shed some light on Question 1?
– q126y, Dec 1 '18 at 15:30
I don't understand what Q#1 is asking, but see the addendum.
– BruceET, Dec 1 '18 at 17:59
The author says that if we had two groups of students, one asked to identify France and the other asked to identify GB, we could use the two-sample z-test to decide whether the difference is significant, and that this relation would hold: $$\operatorname{Var}(\bar{X}-\bar{Y})=\operatorname{Var}(\bar{X})+\operatorname{Var}(\bar{Y})$$ But I think that, even if we take two samples, the presence of geography whizzes in both samples will mean $$\operatorname{Var}(\bar{X}-\bar{Y})=\operatorname{Var}(\bar{X})+\operatorname{Var}(\bar{Y})-2\operatorname{Cov}(\bar{X},\bar{Y})$$ So why can we ignore the covariance in the case of two samples, but not when we ask the same sample to identify both countries?
– q126y, Dec 4 '18 at 18:08
If you have two groups chosen independently at random, then there is no covariance to ignore.
– BruceET, Dec 4 '18 at 18:14
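A short derivation may make this concrete. It is only a sketch, under assumptions not stated in the thread: each student is coded by indicators $X_i$ (found GB) and $Y_i$ (found France); the symbols $p_X$, $p_Y$, $p_{11}$, $p_{10}$, $p_{01}$ are introduced here for the population proportions, with $p_{11}$, $p_{10}$, $p_{01}$ matching the 1 1, 1 0, 0 1 categories in the question; and draws are treated as independent across students, ignoring any finite-population correction.

For one sample of $n$ students who answer both questions, only the two answers of the same student are correlated, so
$$\operatorname{Cov}(\bar{X},\bar{Y})=\frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Cov}(X_i,Y_i)=\frac{p_{11}-p_Xp_Y}{n},$$
$$\operatorname{Var}(\bar{X}-\bar{Y})=\operatorname{Var}(\bar{X})+\operatorname{Var}(\bar{Y})-2\operatorname{Cov}(\bar{X},\bar{Y})=\frac{p_{10}+p_{01}-(p_{10}-p_{01})^2}{n}.$$
For two independent samples of sizes $n$ and $m$, every $X_i$ comes from a different, independently drawn student than every $Y_j$, so $\operatorname{Cov}(X_i,Y_j)=0$ for all pairs and
$$\operatorname{Cov}(\bar{X},\bar{Y})=\frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m}\operatorname{Cov}(X_i,Y_j)=0.$$
Geography whizzes raise both $p_X$ and $p_Y$, but finding a whiz in one sample says nothing about who was drawn into the other, which is why the plain sum of variances applies there. The paired formula also shows why the four category percentages are needed: the variance depends on $p_{10}$ and $p_{01}$, which cannot be recovered from 65.6% and 70.4% alone.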
Ah, yes! Thanks.
– q126y, Dec 4 '18 at 18:37