Split a set keeping the proportion of its elements on each subset
I have a set of elements. To simplify here, these elements are $+$ and $-$.
I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.
For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.
To solve this, I have done the following:
- Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$
- Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$
Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.
If $m$ is the number of elements on a subset, I calculate:
$mp = m * p$
$mn = m * n$
Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.
Those numbers always have decimals, so I need to round then.
If $mp > mn$ I do:
$mp = lceil mprceil$
$mn = lfloor mn rfloor $
And, if $mp < mn$ I do:
$mp = lfloor mp rfloor $
$mn = lceil mnrceil$
But it doesn't work in this example, because when I do:
$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$
I get than $mn > mp$, so:
$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$
I don't use any of the $+$ elements in four subsets. This is an error.
My problem is I don't know what to do with the decimals.
How can I do to keep the proportion and avoid these kind of problems?
elementary-set-theory algorithms
add a comment |
I have a set of elements. To simplify here, these elements are $+$ and $-$.
I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.
For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.
To solve this, I have done the following:
- Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$
- Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$
Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.
If $m$ is the number of elements on a subset, I calculate:
$mp = m * p$
$mn = m * n$
Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.
Those numbers always have decimals, so I need to round then.
If $mp > mn$ I do:
$mp = lceil mprceil$
$mn = lfloor mn rfloor $
And, if $mp < mn$ I do:
$mp = lfloor mp rfloor $
$mn = lceil mnrceil$
But it doesn't work in this example, because when I do:
$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$
I get than $mn > mp$, so:
$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$
I don't use any of the $+$ elements in four subsets. This is an error.
My problem is I don't know what to do with the decimals.
How can I do to keep the proportion and avoid these kind of problems?
elementary-set-theory algorithms
1
Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33
add a comment |
I have a set of elements. To simplify here, these elements are $+$ and $-$.
I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.
For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.
To solve this, I have done the following:
- Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$
- Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$
Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.
If $m$ is the number of elements on a subset, I calculate:
$mp = m * p$
$mn = m * n$
Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.
Those numbers always have decimals, so I need to round then.
If $mp > mn$ I do:
$mp = lceil mprceil$
$mn = lfloor mn rfloor $
And, if $mp < mn$ I do:
$mp = lfloor mp rfloor $
$mn = lceil mnrceil$
But it doesn't work in this example, because when I do:
$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$
I get than $mn > mp$, so:
$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$
I don't use any of the $+$ elements in four subsets. This is an error.
My problem is I don't know what to do with the decimals.
How can I do to keep the proportion and avoid these kind of problems?
elementary-set-theory algorithms
I have a set of elements. To simplify here, these elements are $+$ and $-$.
I need to split this set into n sets keeping the proportion of $+$ and $-$. These new sets need to have similar size: their size must differ in one or zero.
For example, if I have a set with 7 $+$ and 4 $-$, and I need to split it into 5 sets. Each of these sets will have: 3 elements, 2 elements, 2 elements, 2 elements and 2 elements.
To solve this, I have done the following:
- Calculate the percentage of $+$ elements: $p = 4/11 = 0.37 => 37%$
- Calculate the percentage of $-$ elements: $n = 7/11 = 0.63 => 63%$
Now I have the percentages, I use the following formulas to calculate how many $+$ and $-$ elements I will have on each subset.
If $m$ is the number of elements on a subset, I calculate:
$mp = m * p$
$mn = m * n$
Where $mp$ is the number of $+$ elements, and $mn$ is the number of $-$ elements on each subset.
Those numbers always have decimals, so I need to round then.
If $mp > mn$ I do:
$mp = lceil mprceil$
$mn = lfloor mn rfloor $
And, if $mp < mn$ I do:
$mp = lfloor mp rfloor $
$mn = lceil mnrceil$
But it doesn't work in this example, because when I do:
$mp = 2 * 0.37 = 0.74$
$mn = 2 * 0.63 = 1.26$
I get than $mn > mp$, so:
$mp = lfloor 0.74 rfloor = 0 $
$mn = lceil 1.26 rceil = 2$
I don't use any of the $+$ elements in four subsets. This is an error.
My problem is I don't know what to do with the decimals.
How can I do to keep the proportion and avoid these kind of problems?
elementary-set-theory algorithms
elementary-set-theory algorithms
asked Nov 20 at 15:23
VansFannel
122113
122113
1
Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33
add a comment |
1
Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33
1
1
Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33
Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33
add a comment |
1 Answer
1
active
oldest
votes
As was mentioned in the comments, there's only so much you can achieve with discrete quantities, so you have to compromise somewhere. In general there are different ways to achieve a compromise, so you'd want to check what are the desirable properties that must be enforced. I'm not too sure whether this applies in this case though.
First an example.
Say you want to play a card game, and have to deal 52 cards evenly between 5 people. Around here it is common to give one card to each player in a fixed order, until you run out of cards. That way you end up with a distribution of $11$, $11$, $10$, $10$, $10$ cards.
Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.
To do so, sort your elements in an array, say $A$ with:
- for $0le i<P$, $Aleft[iright] = +$
- for $Ple i<T$, $Aleft[iright] = -$
Then for $0le k < N$, let the $k$-th bucket/subset be:
$B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
Basically each bucket gets one element every $N$ elements.
With this method, you're guaranteed that:
- All elements are used
- Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$
- A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements
- A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements
- Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.
- In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.
I was too lazy to look how far/close the proportions you obtain are from/to the target proportion... but assuming all elements must be distributed, the four possible ratio look like best approximations.
On a side note, this method simply generalizes to more than two types of elements.
Also if you let $ell$ be the number of types of elements, I think this method can be implemented in $O(ell+N)$ if you just need the quantities of each element in each set. Kinda too lazy to check that last statement.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
});
});
}, "mathjax-editing");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "69"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3006450%2fsplit-a-set-keeping-the-proportion-of-its-elements-on-each-subset%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
As was mentioned in the comments, there's only so much you can achieve with discrete quantities, so you have to compromise somewhere. In general there are different ways to achieve a compromise, so you'd want to check what are the desirable properties that must be enforced. I'm not too sure whether this applies in this case though.
First an example.
Say you want to play a card game, and have to deal 52 cards evenly between 5 people. Around here it is common to give one card to each player in a fixed order, until you run out of cards. That way you end up with a distribution of $11$, $11$, $10$, $10$, $10$ cards.
Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.
To do so, sort your elements in an array, say $A$ with:
- for $0le i<P$, $Aleft[iright] = +$
- for $Ple i<T$, $Aleft[iright] = -$
Then for $0le k < N$, let the $k$-th bucket/subset be:
$B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
Basically each bucket gets one element every $N$ elements.
With this method, you're guaranteed that:
- All elements are used
- Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$
- A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements
- A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements
- Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.
- In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.
I was too lazy to look how far/close the proportions you obtain are from/to the target proportion... but assuming all elements must be distributed, the four possible ratio look like best approximations.
On a side note, this method simply generalizes to more than two types of elements.
Also if you let $ell$ be the number of types of elements, I think this method can be implemented in $O(ell+N)$ if you just need the quantities of each element in each set. Kinda too lazy to check that last statement.
add a comment |
As was mentioned in the comments, there's only so much you can achieve with discrete quantities, so you have to compromise somewhere. In general there are different ways to achieve a compromise, so you'd want to check what are the desirable properties that must be enforced. I'm not too sure whether this applies in this case though.
First an example.
Say you want to play a card game, and have to deal 52 cards evenly between 5 people. Around here it is common to give one card to each player in a fixed order, until you run out of cards. That way you end up with a distribution of $11$, $11$, $10$, $10$, $10$ cards.
Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.
To do so, sort your elements in an array, say $A$ with:
- for $0le i<P$, $Aleft[iright] = +$
- for $Ple i<T$, $Aleft[iright] = -$
Then for $0le k < N$, let the $k$-th bucket/subset be:
$B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
Basically each bucket gets one element every $N$ elements.
With this method, you're guaranteed that:
- All elements are used
- Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$
- A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements
- A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements
- Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.
- In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.
I was too lazy to look how far/close the proportions you obtain are from/to the target proportion... but assuming all elements must be distributed, the four possible ratio look like best approximations.
On a side note, this method simply generalizes to more than two types of elements.
Also if you let $ell$ be the number of types of elements, I think this method can be implemented in $O(ell+N)$ if you just need the quantities of each element in each set. Kinda too lazy to check that last statement.
add a comment |
As was mentioned in the comments, there's only so much you can achieve with discrete quantities, so you have to compromise somewhere. In general there are different ways to achieve a compromise, so you'd want to check what are the desirable properties that must be enforced. I'm not too sure whether this applies in this case though.
First an example.
Say you want to play a card game, and have to deal 52 cards evenly between 5 people. Around here it is common to give one card to each player in a fixed order, until you run out of cards. That way you end up with a distribution of $11$, $11$, $10$, $10$, $10$ cards.
Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.
To do so, sort your elements in an array, say $A$ with:
- for $0le i<P$, $Aleft[iright] = +$
- for $Ple i<T$, $Aleft[iright] = -$
Then for $0le k < N$, let the $k$-th bucket/subset be:
$B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
Basically each bucket gets one element every $N$ elements.
With this method, you're guaranteed that:
- All elements are used
- Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$
- A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements
- A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements
- Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.
- In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.
I was too lazy to look how far/close the proportions you obtain are from/to the target proportion... but assuming all elements must be distributed, the four possible ratio look like best approximations.
On a side note, this method simply generalizes to more than two types of elements.
Also if you let $ell$ be the number of types of elements, I think this method can be implemented in $O(ell+N)$ if you just need the quantities of each element in each set. Kinda too lazy to check that last statement.
As was mentioned in the comments, there's only so much you can achieve with discrete quantities, so you have to compromise somewhere. In general there are different ways to achieve a compromise, so you'd want to check what are the desirable properties that must be enforced. I'm not too sure whether this applies in this case though.
First an example.
Say you want to play a card game, and have to deal 52 cards evenly between 5 people. Around here it is common to give one card to each player in a fixed order, until you run out of cards. That way you end up with a distribution of $11$, $11$, $10$, $10$, $10$ cards.
Now if we get back to your problem, say you have $P$ $+$ elements, $M$ $-$ elements, for a total of $T=P+M$. You want to split these into $N$ sets.
To do so, sort your elements in an array, say $A$ with:
- for $0le i<P$, $Aleft[iright] = +$
- for $Ple i<T$, $Aleft[iright] = -$
Then for $0le k < N$, let the $k$-th bucket/subset be:
$B_k = left{ Aleft[qN+kright], qinmathbb Z right}$.
Basically each bucket gets one element every $N$ elements.
With this method, you're guaranteed that:
- All elements are used
- Bucket size is either $leftlceilfrac TNrightrceil$ or $leftlfloorfrac TNrightrfloor$
- A bucket contains either $leftlceilfrac PNrightrceil$ or $leftlfloorfrac PNrightrfloor$ $+$ elements
- A bucket contains either $leftlceilfrac MNrightrceil$ or $leftlfloorfrac MNrightrfloor$ $-$ elements
- Assuming that the floor and ceil values are distinct, there are always bigger buckets with more $+$ elements, and smaller buckets with fewer $+$ elements.
- In the binary case (only two types of elements), the remaining two types of buckets (bigger bucket with fewer $+$, smaller buckets with more $+$) cannot coexist.
I was too lazy to look how far/close the proportions you obtain are from/to the target proportion... but assuming all elements must be distributed, the four possible ratio look like best approximations.
On a side note, this method simply generalizes to more than two types of elements.
Also if you let $ell$ be the number of types of elements, I think this method can be implemented in $O(ell+N)$ if you just need the quantities of each element in each set. Kinda too lazy to check that last statement.
answered Nov 21 at 23:11
N.Bach
1,526310
1,526310
add a comment |
add a comment |
Thanks for contributing an answer to Mathematics Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3006450%2fsplit-a-set-keeping-the-proportion-of-its-elements-on-each-subset%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
Note that if you have four "-" symbols to distribute among five (multi)sets, there will be (at least) one multiset that has none. It seems to me the best you can do is to distribute the symbols with lower frequency as equally as possible among your "buckets", then fill the remaining slots with the more numerous symbols.
– hardmath
Nov 21 at 2:33