Determining sample size of a set of boolean data where the probability is not 50%












I'll lay out the problem as a simplified puzzle version of what I am attempting to calculate. Some of this may seem fairly straightforward to many of you, but I'm starting to get a bit lost while trying to think it through.



Let's say I roll a 1000-sided die until it lands on the number 1, and that it takes me 700 rolls to get there. I want to prove that the first 699 rolls were not 1, and the only way to do this deterministically is to include the first 699 failures as part of the result, showing that they were in fact "not 1".



However, that is a lot of data: I would have to include all 700 rolls. Instead, I want to demonstrate probabilistically that I rolled 699 "not 1"s before rolling a 1. To do this, I will randomly sample my "not 1" rolls, reducing the set to a statistically meaningful but more manageable size. That should be good enough to show that I very probably did not roll a 1 before roll 700.
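To make this concrete, here is a small hypothetical Python sketch of the spot check I have in mind (the roll values are made up, and the sample size of 193 is the number I try to derive further down):

    import random

    # Hypothetical record of my 699 pre-success rolls; by construction none is a 1,
    # which is exactly the claim a verifier would want to check.
    rolls = [random.randint(2, 1000) for _ in range(699)]

    sample_size = 193                           # derived from the formula below
    sample = random.sample(rolls, sample_size)  # the subset I would hand over
    print(all(r != 1 for r in sample))          # verifier's check: no sampled roll is a 1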



Here are my current assumptions about the state of this problem:




  • My initial experiment of rolling until the first success follows a geometric distribution.

  • However, my goal here is to demonstrate to a third party that I am not lying, so the skeptical third party is not concerned with the geometric distribution and would view this simply as a binomial-distribution problem.


Plenty of sample size calculators exist on the web, and from what I can tell they are all based on the binomial distribution. Here is the formula I am considering:



$$
n = \frac{N \times X}{X + N - 1}
$$

$$
X = \frac{Z_{\alpha/2}^2 \times p \times (1-p)}{\mathsf{MOE}^2}
$$





  • $n$ is sample size


  • $N$ is population size


  • $Z_{\alpha/2}$ is the critical value ($\alpha$ is $1$ minus the confidence level, expressed as a probability)


  • $p$ is sample proportion


  • $mathsf{MOE}$ is margin of error


As an aside, the website where I got this formula says it implements a "finite population correction"; is this desirable for my requirements?



Here is the math executed on the numbers above. I will use $Z_{\alpha/2}=2.58$ for $\alpha=0.01$, $p=0.001$, and $\mathsf{MOE}=0.005$. As stated above, $N=699$, since there are 699 failure cases that I would like to sample with a certain level of confidence.



Based on my understanding, what this math will do is recommend a sample size that will show, with 99% confidence, that the sample result is within 0.5 percentage points of reality.



Doing the math, $X=265.989744$ and $n=192.8722086653\approx193$, implying that a sample size of 193 fulfills this confidence level and margin of error.
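For reference, here is a minimal Python sketch of that arithmetic (my own check of the formula above; the helper name is mine, not code from any calculator site):

    import math

    def sample_size_with_fpc(N, z, p, moe):
        # X is the infinite-population sample size; n applies the finite population correction.
        X = z**2 * p * (1 - p) / moe**2
        n = N * X / (X + N - 1)
        return X, n

    X, n = sample_size_with_fpc(N=699, z=2.58, p=0.001, moe=0.005)
    print(X, n, math.ceil(n))   # ~265.99, ~192.87, 193

    # The conservative choice p = 0.5 (discussed below) inflates the requirement:
    _, n_cons = sample_size_with_fpc(N=699, z=2.58, p=0.5, moe=0.005)
    print(math.ceil(n_cons))    # 692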



My main question is whether my assumption of $p=\frac{1}{1000}$ is valid. If it is not, and I use the conservative $p=0.5$, then my sample size shoots up to $\approx692$. So I would like to know whether I am using the right value for the sample proportion.



More broadly, am I on the right track at all with this, from the idea of demonstrating it probabilistically to my current thought process? Thank you.










probability

asked Jan 1 at 15:00 – Samuel Horwitz
  • I don't understand what your question is. Are you asking how large a sample size one needs to be sure that a biased proportion estimate is within a confidence interval? The fact that $p \neq 1/2$ seems irrelevant to me.
    – LoveTooNap29, Jan 1 at 17:42

  • I guess what I'm asking is: how can I get a sample size such that I can demonstrate that the trials I personally know all failed actually did all fail (probabilistically speaking)? Deterministically, I could give a skeptical verifier 699 failed dice rolls. Probabilistically, I could give them a random sample of $n$ failed dice rolls. How can I find $n$? Or, and I'm open to this as well, am I completely off base about this being a valid way of proving my dice rolls?
    – Samuel Horwitz, Jan 1 at 17:56

  • I still do not follow, sorry. If I wanted to convince a skeptic about the failure rate of a random experiment, I would just measure the proportion of successes, $\hat{p}$, find how many trials $N$ I need to perform so that my confidence interval is sufficiently small, and then translate this into a CI for the failure proportion, $\hat{q}=1-\hat{p}$. I still fail to see how this doesn't at least marginally satisfy your aim.
    – LoveTooNap29, Jan 1 at 18:37

  • Let me be a bit less abstract about the problem. I wrote it out in a more isolated way, but the actual problem I am trying to solve requires the experiment, as it is run, to be the canonical result. Imagine a proof-of-work (POW) scheme where a nonce, incrementing by 1 from 0, is used to find a hash with a specific number of trailing 0 bits. Once a nonce is reached that satisfies this, the POW is valid. Unlike most POWs, I want there to be only one valid answer (the first success). Is it possible to sample my failures in a way that probabilistically shows I didn't illicitly skip a success?
    – Samuel Horwitz, Jan 1 at 19:02

  • Now you've really lost me. This is a math forum, remember; not everyone here is going to know what a POW scheme refers to... Anyway, if I run an experiment $N$ times with the only and first success on trial $n=N$, and you want to validate this, then yes, the easiest way is to show you the first recorded $N-1$ failure outcomes. If the outcomes are recorded in some data structure/spreadsheet, one can easily count them with a simple function. No probabilistic method is required, because there is no longer any source of uncertainty after the experiment is done.
    – LoveTooNap29, Jan 1 at 19:25
















2 Answers

If the probability of success is $S=0.001$, then the probability of failure is $F=0.999$, and the probability of 699 failures without a success is $F^{699}\approx 0.4969$.
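A quick numerical check of that figure (a one-line Python sketch of the same calculation):

    # probability that 699 independent rolls of a fair 1000-sided die all avoid the face 1
    print((1 - 1/1000) ** 699)  # ~0.4969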






answered Jan 1 at 15:13 – Daniel Mathias
  • When I read about sample proportions online, they always use examples like "60% of college students are women, therefore the sample proportion would lean towards 0.6", which seems to map to my problem as using 0.001 or 0.999 (both give the same result) rather than using failure^(number of rolls) as the sample proportion. Am I misunderstanding?
    – Samuel Horwitz, Jan 1 at 15:20





















The method you have chosen to analyze your die-rolling problem is selecting a sample from a population while taking into account a finite population (correction factor), which means sampling "without replacement". Hence there can be a difference between $p = .001$ applying to all rolls and $p$ effectively increasing as your sample size increases, which is "not desirable for your requirements".

However, analyzing it as a one-proportion $Z$ test, with $n = 699$, $x = 0$, and $p_0 = .001$, the $p$-value is $.4029$, versus $.4969$ by Daniel Mathias's method for a die roll. In both cases such a high $p$-value indicates that getting $699$ failures is not statistically significant for either a proportion or a probability of $.001$.
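For reference, a minimal Python sketch of the test described above (assuming the two-sided normal-approximation version; the helper names are illustrative). It reproduces both figures:

    import math

    def normal_cdf(t):
        # standard normal CDF via the error function
        return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

    def one_proportion_z_p_value(x, n, p0):
        # two-sided p-value for a one-proportion z-test (normal approximation)
        z = (x / n - p0) / math.sqrt(p0 * (1 - p0) / n)
        return 2 * normal_cdf(-abs(z))

    print(one_proportion_z_p_value(x=0, n=699, p0=0.001))  # ~0.403
    print(0.999 ** 699)                                     # ~0.4969 (exact binomial/geometric tail)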






answered Jan 1 at 17:14 – Phil H