calculate sample size for specified type II error probability, comparing 2 proportions
























$begingroup$


This is from Devore, Probability and Statistics for Engineering and the Sciences, 9th edition, chapter 9, section 9.4, exercise #53 on p.398.



The problem involves a hypothesis test comparing two proportions. Part b of the problem states:

"If the true percentages for the two treatments were 15% and 20%, respectively, what sample sizes $(m=n)$ would be necessary to detect such a difference?"



I understand that it is asking us to raise the probability of rejecting the null hypothesis, given a specific alternative hypothesis, to $1-\beta$ (i.e. to reduce the probability of a type II error to $\beta$ by making the sample size large enough). The wording of the question differs markedly from that of the definitions, theorems, and examples in the text and can easily confuse students, but that is a separate issue from my question here.



Equation 9.7 on p.395 provides the required formula:



With $p_1$ and $p_2$ ($q_i=1-p_i$) being the alternative proportions and $z_p$ being the $p^{\text{th}}$ percentile of the standard normal, the sample size is
$$n=\frac{\left(z_{1-\alpha/2}\sqrt{(p_1+p_2)(q_1+q_2)/2}+z_{1-\beta}\sqrt{p_1q_1+p_2q_2}\right)^2}{(p_1-p_2)^2}.$$
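As a quick numerical check, the formula can be evaluated directly. This is only a sketch: the function name `sample_size` is made up for illustration, and the defaults $\alpha=0.05$ and $\beta=0.10$ are assumptions (the post does not state them).

```python
from math import sqrt
from statistics import NormalDist


def sample_size(p1, p2, alpha=0.05, beta=0.10):
    """Per-group sample size (m = n) from the formula above.

    alpha and beta are assumed values; they are not given in the post.
    """
    q1, q2 = 1 - p1, 1 - p2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1-alpha/2}
    z_beta = NormalDist().inv_cdf(1 - beta)        # z_{1-beta}
    # Term built from the averaged proportions (null-hypothesis spread).
    null_term = z_alpha * sqrt((p1 + p2) * (q1 + q2) / 2)
    # Term built from the alternative proportions.
    alt_term = z_beta * sqrt(p1 * q1 + p2 * q2)
    return (null_term + alt_term) ** 2 / (p1 - p2) ** 2


n = sample_size(0.20, 0.15)
print(n)  # unrounded value; round up to get a usable group size
```

Under these assumed values the result is a little above 1211, so each group would need 1212 observations.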



I will assume $p_1>p_2$ in the alternative (the choice doesn't matter). I will therefore ignore the left tail of the rejection region, since it has extremely small probability under the alternative hypothesis.
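The claim about the left tail can be checked numerically with the normal approximation for $\hat p_1-\hat p_2$. The sketch below assumes $\alpha=0.05$ and a per-group size of $n=1212$, neither of which is stated in the post, and it uses the averaged proportion $\frac12(p_1+p_2)$ for the null standard error, which is the very choice being questioned here.

```python
from math import sqrt
from statistics import NormalDist

p1, p2 = 0.20, 0.15
alpha = 0.05   # assumed significance level (not stated in the post)
n = 1212       # assumed per-group size, for illustration only

p_bar = (p1 + p2) / 2
# Standard error of the difference under the null, with p set to p_bar.
se_null = sqrt(2 * p_bar * (1 - p_bar) / n)
# Standard error of the difference under the alternative.
se_alt = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)
# Two-sided critical value for the difference in sample proportions.
crit = NormalDist().inv_cdf(1 - alpha / 2) * se_null

# Approximate distribution of the difference under the alternative.
diff = NormalDist(mu=p1 - p2, sigma=se_alt)
left_tail = diff.cdf(-crit)        # rejecting in the "wrong" direction
right_tail = 1 - diff.cdf(crit)    # rejecting in the expected direction
print(left_tail, right_tail)
```

With these inputs the left-tail probability is below $10^{-6}$ while the right tail carries essentially all of the power, so dropping the left tail changes very little.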



We have that:
$$\hat{p}_i\sim N\left(p_i,\sqrt{\frac{p_iq_i}{n}}\right)$$
$$\hat{p}_1-\hat p_2\sim N\left(p_1-p_2,\sqrt{\frac{p_1q_1+p_2q_2}{n}}\right)$$
Thus under the null hypothesis, $\hat{p}_1-\hat p_2\sim N\left(0,\sqrt{\frac{2pq}{n}}\right)$, where we assume that $p_1=p_2=p$. However, the null hypothesis only specifies the difference $p_1-p_2=0$; it does not posit any specific value for $p$.
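For reference, here is a sketch of how a formula of this shape arises when $p$ is left unspecified and only the right tail is kept, as assumed above. The test rejects when $\hat p_1-\hat p_2$ exceeds $z_{1-\alpha/2}\sqrt{2pq/n}$; requiring that event to have probability $1-\beta$ under the alternative gives
$$\frac{z_{1-\alpha/2}\sqrt{2pq/n}-(p_1-p_2)}{\sqrt{(p_1q_1+p_2q_2)/n}}=-z_{1-\beta},$$
and solving for $n$,
$$n=\frac{\left(z_{1-\alpha/2}\sqrt{2pq}+z_{1-\beta}\sqrt{p_1q_1+p_2q_2}\right)^2}{(p_1-p_2)^2}.$$
Substituting $p=\frac12(p_1+p_2)$ and $q=\frac12(q_1+q_2)$ turns $2pq$ into $(p_1+p_2)(q_1+q_2)/2$, which is the term that appears in Equation 9.7.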



My question is: what justifies setting $p=\frac12(p_1+p_2)$ (as used in the sample size formula above)? In other words, why are we using the alternative proportions to calculate the percentile under the null hypothesis distribution?



It seems that we might want to maximize this $n$ by using $p=\frac12$. I guess there is a built-in assumption that the alternative proportions will naturally be somewhat close to our null proportions, or maybe that we can set our null proportions to whatever we want.



There can be a wide range of sample sizes depending on what we input for $p$. For example, with $p_1=0.2$ and $p_2=0.15$, letting $p$ range from 0 to 0.5 gives $n$ ranging from 479 to 1719 (requiring also that $np>10$ and $n(1-p)>10$; otherwise the lower bound on $n$ is 189).
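The dependence on $p$ can be tabulated with a short script. This is a sketch under assumptions: the helper name `n_for_null_p` is hypothetical, and $\alpha=0.05$, $\beta=0.10$ are not stated in the post, although they do appear to be consistent with the endpoints of 189 and 1719 quoted above.

```python
from math import sqrt
from statistics import NormalDist


def n_for_null_p(p, p1=0.20, p2=0.15, alpha=0.05, beta=0.10):
    """Sample size when the common null proportion is set to p.

    alpha and beta are assumed values; they are not given in the post.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    # Null-hypothesis term, with the common proportion p chosen freely.
    null_term = z_alpha * sqrt(2 * p * (1 - p))
    # Alternative-hypothesis term, fixed by p1 and p2.
    alt_term = z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return (null_term + alt_term) ** 2 / (p1 - p2) ** 2


# Sweep a few candidate null proportions, including the average 0.175.
for p in (0.0, 0.05, 0.175, 0.35, 0.5):
    print(f"p = {p:5.3f}  n = {n_for_null_p(p):8.1f}")
```

The value at $p=0.175$ is what the textbook formula produces; the values at $p=0$ and $p=0.5$ bracket it from below and above.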



















$endgroup$











































      statistics hypothesis-testing
















      asked Dec 10 '18 at 19:57









      jdods





















