How can we count the r-combinations of a multiset?
up vote
0
down vote
favorite
Problem
Define a split of a multiset M to be an ordered pair of multisets of equal size whose union is M.
How can we count the splits of a multiset?
Discussion
This question stems from this one.
This problem is identical to counting the number of r-combinations of a multiset with 2r elements, for which a solution is shown in section 6.2 of Introductory Combinatorics by Richard A. Brualdi, Second Edition, 1992, Prentice-Hall, Inc. To see they are identical, observe there is a 1-1 correspondence between any r-combination X of a multiset M of 2r elements and the split (X, Y), where Y is all the elements of M that are not in X, that is, Y = M−X.
Brualdi’s solution using the inclusion-exclusion principle is well known to mathematicians and is suitable for algorithmic implementation.
In spite of the fact that it is well known to mathematicians, it was only partially answered here at Mathematics Stack Exchange and here at Quora.
c algorithm math
add a comment |
up vote
0
down vote
favorite
Problem
Define a split of a multiset M to be an ordered pair of multisets of equal size whose union is M.
How can we count the splits of a multiset?
Discussion
This question stems from this one.
This problem is identical to counting the number of r-combinations of a multiset with 2r elements, for which a solution is shown in section 6.2 of Introductory Combinatorics by Richard A. Brualdi, Second Edition, 1992, Prentice-Hall, Inc. To see they are identical, observe there is a 1-1 correspondence between any r-combination X of a multiset M of 2r elements and the split (X, Y), where Y is all the elements of M that are not in X, that is, Y = M−X.
Brualdi’s solution using the inclusion-exclusion principle is well known to mathematicians and is suitable for algorithmic implementation.
In spite of the fact that it is well known to mathematicians, it was only partially answered here at Mathematics Stack Exchange and here at Quora.
c algorithm math
add a comment |
up vote
0
down vote
favorite
up vote
0
down vote
favorite
Problem
Define a split of a multiset M to be an ordered pair of multisets of equal size whose union is M.
How can we count the splits of a multiset?
Discussion
This question stems from this one.
This problem is identical to counting the number of r-combinations of a multiset with 2r elements, for which a solution is shown in section 6.2 of Introductory Combinatorics by Richard A. Brualdi, Second Edition, 1992, Prentice-Hall, Inc. To see they are identical, observe there is a 1-1 correspondence between any r-combination X of a multiset M of 2r elements and the split (X, Y), where Y is all the elements of M that are not in X, that is, Y = M−X.
Brualdi’s solution using the inclusion-exclusion principle is well known to mathematicians and is suitable for algorithmic implementation.
In spite of the fact that it is well known to mathematicians, it was only partially answered here at Mathematics Stack Exchange and here at Quora.
c algorithm math
Problem
Define a split of a multiset M to be an ordered pair of multisets of equal size whose union is M.
How can we count the splits of a multiset?
Discussion
This question stems from this one.
This problem is identical to counting the number of r-combinations of a multiset with 2r elements, for which a solution is shown in section 6.2 of Introductory Combinatorics by Richard A. Brualdi, Second Edition, 1992, Prentice-Hall, Inc. To see they are identical, observe there is a 1-1 correspondence between any r-combination X of a multiset M of 2r elements and the split (X, Y), where Y is all the elements of M that are not in X, that is, Y = M−X.
Brualdi’s solution using the inclusion-exclusion principle is well known to mathematicians and is suitable for algorithmic implementation.
In spite of the fact that it is well known to mathematicians, it was only partially answered here at Mathematics Stack Exchange and here at Quora.
c algorithm math
c algorithm math
asked Nov 12 at 21:39
Eric Postpischil
69.1k873150
69.1k873150
add a comment |
add a comment |
3 Answers
3
active
oldest
votes
up vote
1
down vote
Here is a solution. I have included explanation in the comments.
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
typedef uintmax_t UInt;
#define UIntFormat PRIuMAX
#define NumberOf(a) (sizeof (a) / sizeof *(a))
/* In this code, a multiset is represented with:
an integer N0 which is the number of types of elements in the set,
an integer N1 which is the number of types of elements in the set that
each appear finitely many times (N0-N1 types each appear infinitely
many times), and
an array M in which each M[i] is the number of times that an element
of type i appears.
Collectively, this is referred to as the multiset M.
*/
/* Return the number of ways to choose r things from n things. This is
n! / (r! * (n-r)!).
*/
static UInt Choose(UInt r, UInt n)
{
UInt result = 1;
for (UInt i = 1; i <= r; ++i)
result = result * n-- / i;
return result;
}
// Count the number of r-combinations of a multiset.
static UInt CountRCombinations(UInt r, size_t N0, size_t N1, UInt M)
{
/* If we have only the unlimited types, there is a one-to-one
correspondence between r objects with N0-1 dividers placed between
them, each divider marking a transition from one type to another. For
example, consider four objects of three types. Below "o" represents
any object, and "|" is a divider. For each arrangement of four o's and
two |s, we show how it defines a selection of four objects of three
types:
oooo|| -> aaaa||
ooo|o| -> aaa|b|
ooo||o -> aaa||c
oo|oo| -> aa|bb|
oo|o|o -> aa|b|c
oo||oo -> aa||cc
o|ooo| -> a|bbb|
o|oo|o -> a|bb|c
o|o|oo -> a|b|cc
o||ooo -> a||ccc
|oooo| -> |bbbb|
|ooo|o -> |bbb|c
|oo|oo -> |bb|cc
|o|ooo -> |b|ccc
||oooo -> ||cccc
Therefore, the number of combinations equals the number of ways to
arrange r indistinguishable objects of one type with N0-1
indistinguishable objects of a different type.
*/
if (N1 == 0)
return Choose(r, r+N0-1);
/* Otherwise, we count the combinations:
Select one of the limited types (we use the last one, N1-1, because
it is easily removed from the array simply by reducing the size of
the array).
Count the number of combinations there would be if that type were
unlimited.
Count the number of combinations there would be if there were at
least M[i]+1 instances of elements of that type.
Subtract to get the number of combinations that have 0 to M[i]
instances of elements of that type.
*/
else
{
/* Let M' be the multiset M with the last type changed to unlimited.
So, where M is represented with N0, N1, M, M' is represented with
N0, N1-1, M.
*/
// Change the last limited type to unlimited.
N1 -= 1;
// Count the r-combinations drawn from M'.
UInt C = CountRCombinations(r, N0, N1, M);
/* Now we count the combinations which have at least M[N1]+1 instances
of the (formerly) last type.
Consider that each such combination has M[N1]+1 instances of that
type plus some combination of r - (M[N1]+1) elements drawn from M',
including zero or more instances of the last type. (Note that if r
<= M[N1], there are no such combinations, since we would be asking
for a negative number of elements.)
So the number of combinations which have at least M[N1]+1 instances
of the last type equals the number of combinations of that type
plus some combination of r - (M[N1]+1) elements drawn from M'.
*/
if (M[N1] < r)
C -= CountRCombinations(r - (M[N1] + 1), N0, N1, M);
return C;
}
}
// Count the number of splits of a multiset M that contains N types of things.
static UInt CountSplits(size_t N, UInt M)
{
// Count the number of elements.
UInt T = 0;
for (size_t i = 0; i < N; ++i)
T += M[i];
// Return the number of T/2-combinations of M.
return T % 2 ? 0 : CountRCombinations(T/2, N, N, M);
}
int main(void)
{
UInt M = { 3, 4, 5 };
size_t N = NumberOf(M);
printf("The number of splits of {");
for (size_t i = 0; i < N; ++i) printf(" %" UIntFormat, M[i]);
printf(" } is %" UIntFormat ".n", CountSplits(N, M));
}
add a comment |
up vote
1
down vote
It's certainly the case that the original problem is isomorphic with the problem of counting r-combinations of a multiset of size 2r.
As an alternative to the inclusion-exclusion formula (which is certainly of analytic value, but possibly of less algorithmic value), we can construct a recursive solution which computes the counts of k-combinations of the set for all values of k, using a very similar approach to recursion employed in the recursive algorithm for counting the binomial coefficients C(n,k), the counts of k-combinations of a set of n elements.
Suppose we represent the multiset as a sorted vector V of size n (where n = 2r). (Technically, it doesn't have to be sorted; it is sufficient for it to be "clumped" so that all the identical elements are consecutive. However, the easiest way to do this is to sort the vector.) We want to produce all unique k-combinations of this vector. All such combinations have one of two forms:
"Choose the first element". The combination starts with V1 and continues with a (k−1)-combination of (V2, V3, …, Vn).
"Skip the first element". The combination is a k-combination of (Vi, Vi+1, … Vn) where i is the smallest index such that Vi ≠ V1. (We need to skip all elements identical to the first element in order to avoid duplication.)
The only difference here with the binomial recursion is the use of the index i in the second option; if the set has no repeated elements, this reduces to i=2, which yields the recursion C(n, k) = C(n − 1, k − 1) + C(n − 1, k).
A naive implementation of this recursion will take exponential time because each computation requires two recursive calls; however, there are only a quadratic number of unique calls so the computation can be reduced to quadratic time by memoisation or dynamic programming. The solution below uses dynamic programming, because that only requires linear space. (Memoisation would require quadratic space since there are a quadratic number of subproblems.)
The dynamic programming solution inverts the recursion by computing the number of k-combinations for the successive suffixes of the vector. It needs to keep the values for only two suffixes: the previous suffix and the previous suffix with a different first element, corresponding to the counts needed by the first and second option of the above recursion. (The actual code uses prefixes instead of suffixes, but it makes absolutely no difference.)
As a minor optimisation, we only compute the counts of k-combinations for values of k between 0 and ⌈n/2⌉. As with binomial coefficients, the counts are symmetrical: the number of k-combinations is equal to the number of (n−k)-combinations because every k-combination corresponds to a unique (n−k)-combination consisting of all the unselected elements. An additional optimisation is possible based on the fact that we only need one count at the end, but the additional complication only obscures the basic algorithm.
The fact that the solution is O(n2) is a little annoying, but since n will generally be a smallish number (otherwise the counts will be astronomical) the computation time seems reasonable. There are undoubtedly further optimisations possible, and it's quite possible that a sub-quadratic algorithm exists.
Here's a basic implementation in C (using an array of strings):
/* Given a *sorted* vector v of size n, compute the number of unique k-combinations
* of the elements of the vector for values of k between 0 and (n/2)+1.
* The counts are stored in the vector r, which must be large enough.
* Counts for larger values of k can be trivially looked up in the returned
* vector, using the identity r[k] == r[n - k].
* If v is not sorted, the result will be incorrect. The function does not
* check for overflow, but the results will be correct modulo (UINTMAX + 1)
*/
void multicomb(const char** v, size_t n, uintmax_t* r) {
size_t lim = n / 2 + 1;
// Prev retains the counts for the previous prefix ending with
// a different element
uintmax_t prev[lim];
// If there are no elements, there is 1 0-combination and no other combinations.
memset(r, 0, sizeof prev);
r[0] = 1;
// Iterate over the unique elements of v
for (size_t k = 0; k < n; ) {
// Save the counts for this prefix
memcpy(prev, r, sizeof prev);
// Iterate over the prefixes with the same ending value
do {
for (size_t i = lim - 1; i > 0; --i)
r[i] = r[i - 1] + prev[i];
} while (++k < n && strcmp(v[k - 1], v[k]) == 0);
};
}
In comparison with the solution in OP's self-answer, this version:
- overflows later because it only depends on addition. (The fact that there is no division also makes it much easier to compute the count modulo a prime.)
- takes quadratic time instead of exponential time.
add a comment |
up vote
0
down vote
I don't think that that's quite right, Eric. The original question asks, if my source array contains A, A, B, B then how many ways can it be uniquely split between two recipients - the answer in this case will be three because the array could be split as follows:
Child Array 1 | Child Array 2
-----------------------------
A A | B B
A B | A B
B B | A A
This source code (C++ function) works but it's horribly slow and inefficient. Brute force and ignorance.
int countPermutations(vector<int> itemset) {
vector<vector<int> > permutations;
do {
vector<int> match;
for(int i = 0; i < itemset.size(); i+=2) {
match.push_back(itemset.at(i));
}
sort(match.begin(), match.end());
int j = 0;
bool found = false;
while (!found && (j < permutations.size())) {
found = (match == permutations.at(j))?true:false;
j++;
}
if (!found) {
permutations.push_back(match);
}
} while (next_permutation(itemset.begin(), itemset.end()));
return permutations.size();
}
Any thoughts on optimisation?
1
Eric's algorithm produces the value 3 for this case, as expected (you have to call the function with the array {2, 2}, indicating "two As and two Bs"). Why do you think it's wrong?
– rici
Nov 14 at 18:30
My first thought on optimisation, by the way: Don't accumulate the possibilities; just count them. For any significant problem size, there will be far too many to keep in memory, and anyway you don't do anything with them. If you actually need the individual splits, that's a very different question (because counting can be optimised a lot more).
– rici
Nov 14 at 18:35
rici is correct, to compute the count for { A, A, B, B }, you need to changeUInt M = { 3, 4, 5 };
toUInt M = { 2, 2 };
.
– Eric Postpischil
Nov 14 at 20:11
add a comment |
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
Here is a solution. I have included explanation in the comments.
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
typedef uintmax_t UInt;
#define UIntFormat PRIuMAX
#define NumberOf(a) (sizeof (a) / sizeof *(a))
/* In this code, a multiset is represented with:
an integer N0 which is the number of types of elements in the set,
an integer N1 which is the number of types of elements in the set that
each appear finitely many times (N0-N1 types each appear infinitely
many times), and
an array M in which each M[i] is the number of times that an element
of type i appears.
Collectively, this is referred to as the multiset M.
*/
/* Return the number of ways to choose r things from n things. This is
n! / (r! * (n-r)!).
*/
static UInt Choose(UInt r, UInt n)
{
UInt result = 1;
for (UInt i = 1; i <= r; ++i)
result = result * n-- / i;
return result;
}
// Count the number of r-combinations of a multiset.
static UInt CountRCombinations(UInt r, size_t N0, size_t N1, UInt M)
{
/* If we have only the unlimited types, there is a one-to-one
correspondence between r objects with N0-1 dividers placed between
them, each divider marking a transition from one type to another. For
example, consider four objects of three types. Below "o" represents
any object, and "|" is a divider. For each arrangement of four o's and
two |s, we show how it defines a selection of four objects of three
types:
oooo|| -> aaaa||
ooo|o| -> aaa|b|
ooo||o -> aaa||c
oo|oo| -> aa|bb|
oo|o|o -> aa|b|c
oo||oo -> aa||cc
o|ooo| -> a|bbb|
o|oo|o -> a|bb|c
o|o|oo -> a|b|cc
o||ooo -> a||ccc
|oooo| -> |bbbb|
|ooo|o -> |bbb|c
|oo|oo -> |bb|cc
|o|ooo -> |b|ccc
||oooo -> ||cccc
Therefore, the number of combinations equals the number of ways to
arrange r indistinguishable objects of one type with N0-1
indistinguishable objects of a different type.
*/
if (N1 == 0)
return Choose(r, r+N0-1);
/* Otherwise, we count the combinations:
Select one of the limited types (we use the last one, N1-1, because
it is easily removed from the array simply by reducing the size of
the array).
Count the number of combinations there would be if that type were
unlimited.
Count the number of combinations there would be if there were at
least M[i]+1 instances of elements of that type.
Subtract to get the number of combinations that have 0 to M[i]
instances of elements of that type.
*/
else
{
/* Let M' be the multiset M with the last type changed to unlimited.
So, where M is represented with N0, N1, M, M' is represented with
N0, N1-1, M.
*/
// Change the last limited type to unlimited.
N1 -= 1;
// Count the r-combinations drawn from M'.
UInt C = CountRCombinations(r, N0, N1, M);
/* Now we count the combinations which have at least M[N1]+1 instances
of the (formerly) last type.
Consider that each such combination has M[N1]+1 instances of that
type plus some combination of r - (M[N1]+1) elements drawn from M',
including zero or more instances of the last type. (Note that if r
<= M[N1], there are no such combinations, since we would be asking
for a negative number of elements.)
So the number of combinations which have at least M[N1]+1 instances
of the last type equals the number of combinations of that type
plus some combination of r - (M[N1]+1) elements drawn from M'.
*/
if (M[N1] < r)
C -= CountRCombinations(r - (M[N1] + 1), N0, N1, M);
return C;
}
}
// Count the number of splits of a multiset M that contains N types of things.
static UInt CountSplits(size_t N, UInt M)
{
// Count the number of elements.
UInt T = 0;
for (size_t i = 0; i < N; ++i)
T += M[i];
// Return the number of T/2-combinations of M.
return T % 2 ? 0 : CountRCombinations(T/2, N, N, M);
}
int main(void)
{
UInt M = { 3, 4, 5 };
size_t N = NumberOf(M);
printf("The number of splits of {");
for (size_t i = 0; i < N; ++i) printf(" %" UIntFormat, M[i]);
printf(" } is %" UIntFormat ".n", CountSplits(N, M));
}
add a comment |
up vote
1
down vote
Here is a solution. I have included explanation in the comments.
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
typedef uintmax_t UInt;
#define UIntFormat PRIuMAX
#define NumberOf(a) (sizeof (a) / sizeof *(a))
/* In this code, a multiset is represented with:
an integer N0 which is the number of types of elements in the set,
an integer N1 which is the number of types of elements in the set that
each appear finitely many times (N0-N1 types each appear infinitely
many times), and
an array M in which each M[i] is the number of times that an element
of type i appears.
Collectively, this is referred to as the multiset M.
*/
/* Return the number of ways to choose r things from n things. This is
n! / (r! * (n-r)!).
*/
static UInt Choose(UInt r, UInt n)
{
UInt result = 1;
for (UInt i = 1; i <= r; ++i)
result = result * n-- / i;
return result;
}
// Count the number of r-combinations of a multiset.
static UInt CountRCombinations(UInt r, size_t N0, size_t N1, UInt M)
{
/* If we have only the unlimited types, there is a one-to-one
correspondence between r objects with N0-1 dividers placed between
them, each divider marking a transition from one type to another. For
example, consider four objects of three types. Below "o" represents
any object, and "|" is a divider. For each arrangement of four o's and
two |s, we show how it defines a selection of four objects of three
types:
oooo|| -> aaaa||
ooo|o| -> aaa|b|
ooo||o -> aaa||c
oo|oo| -> aa|bb|
oo|o|o -> aa|b|c
oo||oo -> aa||cc
o|ooo| -> a|bbb|
o|oo|o -> a|bb|c
o|o|oo -> a|b|cc
o||ooo -> a||ccc
|oooo| -> |bbbb|
|ooo|o -> |bbb|c
|oo|oo -> |bb|cc
|o|ooo -> |b|ccc
||oooo -> ||cccc
Therefore, the number of combinations equals the number of ways to
arrange r indistinguishable objects of one type with N0-1
indistinguishable objects of a different type.
*/
if (N1 == 0)
return Choose(r, r+N0-1);
/* Otherwise, we count the combinations:
Select one of the limited types (we use the last one, N1-1, because
it is easily removed from the array simply by reducing the size of
the array).
Count the number of combinations there would be if that type were
unlimited.
Count the number of combinations there would be if there were at
least M[i]+1 instances of elements of that type.
Subtract to get the number of combinations that have 0 to M[i]
instances of elements of that type.
*/
else
{
/* Let M' be the multiset M with the last type changed to unlimited.
So, where M is represented with N0, N1, M, M' is represented with
N0, N1-1, M.
*/
// Change the last limited type to unlimited.
N1 -= 1;
// Count the r-combinations drawn from M'.
UInt C = CountRCombinations(r, N0, N1, M);
/* Now we count the combinations which have at least M[N1]+1 instances
of the (formerly) last type.
Consider that each such combination has M[N1]+1 instances of that
type plus some combination of r - (M[N1]+1) elements drawn from M',
including zero or more instances of the last type. (Note that if r
<= M[N1], there are no such combinations, since we would be asking
for a negative number of elements.)
So the number of combinations which have at least M[N1]+1 instances
of the last type equals the number of combinations of that type
plus some combination of r - (M[N1]+1) elements drawn from M'.
*/
if (M[N1] < r)
C -= CountRCombinations(r - (M[N1] + 1), N0, N1, M);
return C;
}
}
// Count the number of splits of a multiset M that contains N types of things.
static UInt CountSplits(size_t N, UInt M)
{
// Count the number of elements.
UInt T = 0;
for (size_t i = 0; i < N; ++i)
T += M[i];
// Return the number of T/2-combinations of M.
return T % 2 ? 0 : CountRCombinations(T/2, N, N, M);
}
int main(void)
{
UInt M = { 3, 4, 5 };
size_t N = NumberOf(M);
printf("The number of splits of {");
for (size_t i = 0; i < N; ++i) printf(" %" UIntFormat, M[i]);
printf(" } is %" UIntFormat ".n", CountSplits(N, M));
}
add a comment |
up vote
1
down vote
up vote
1
down vote
Here is a solution. I have included explanation in the comments.
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
typedef uintmax_t UInt;
#define UIntFormat PRIuMAX
#define NumberOf(a) (sizeof (a) / sizeof *(a))
/* In this code, a multiset is represented with:
an integer N0 which is the number of types of elements in the set,
an integer N1 which is the number of types of elements in the set that
each appear finitely many times (N0-N1 types each appear infinitely
many times), and
an array M in which each M[i] is the number of times that an element
of type i appears.
Collectively, this is referred to as the multiset M.
*/
/* Return the number of ways to choose r things from n things. This is
n! / (r! * (n-r)!).
*/
static UInt Choose(UInt r, UInt n)
{
UInt result = 1;
for (UInt i = 1; i <= r; ++i)
result = result * n-- / i;
return result;
}
// Count the number of r-combinations of a multiset.
static UInt CountRCombinations(UInt r, size_t N0, size_t N1, UInt M)
{
/* If we have only the unlimited types, there is a one-to-one
correspondence between r objects with N0-1 dividers placed between
them, each divider marking a transition from one type to another. For
example, consider four objects of three types. Below "o" represents
any object, and "|" is a divider. For each arrangement of four o's and
two |s, we show how it defines a selection of four objects of three
types:
oooo|| -> aaaa||
ooo|o| -> aaa|b|
ooo||o -> aaa||c
oo|oo| -> aa|bb|
oo|o|o -> aa|b|c
oo||oo -> aa||cc
o|ooo| -> a|bbb|
o|oo|o -> a|bb|c
o|o|oo -> a|b|cc
o||ooo -> a||ccc
|oooo| -> |bbbb|
|ooo|o -> |bbb|c
|oo|oo -> |bb|cc
|o|ooo -> |b|ccc
||oooo -> ||cccc
Therefore, the number of combinations equals the number of ways to
arrange r indistinguishable objects of one type with N0-1
indistinguishable objects of a different type.
*/
if (N1 == 0)
return Choose(r, r+N0-1);
/* Otherwise, we count the combinations:
Select one of the limited types (we use the last one, N1-1, because
it is easily removed from the array simply by reducing the size of
the array).
Count the number of combinations there would be if that type were
unlimited.
Count the number of combinations there would be if there were at
least M[i]+1 instances of elements of that type.
Subtract to get the number of combinations that have 0 to M[i]
instances of elements of that type.
*/
else
{
/* Let M' be the multiset M with the last type changed to unlimited.
So, where M is represented with N0, N1, M, M' is represented with
N0, N1-1, M.
*/
// Change the last limited type to unlimited.
N1 -= 1;
// Count the r-combinations drawn from M'.
UInt C = CountRCombinations(r, N0, N1, M);
/* Now we count the combinations which have at least M[N1]+1 instances
of the (formerly) last type.
Consider that each such combination has M[N1]+1 instances of that
type plus some combination of r - (M[N1]+1) elements drawn from M',
including zero or more instances of the last type. (Note that if r
<= M[N1], there are no such combinations, since we would be asking
for a negative number of elements.)
So the number of combinations which have at least M[N1]+1 instances
of the last type equals the number of combinations of that type
plus some combination of r - (M[N1]+1) elements drawn from M'.
*/
if (M[N1] < r)
C -= CountRCombinations(r - (M[N1] + 1), N0, N1, M);
return C;
}
}
// Count the number of splits of a multiset M that contains N types of things.
static UInt CountSplits(size_t N, UInt M)
{
// Count the number of elements.
UInt T = 0;
for (size_t i = 0; i < N; ++i)
T += M[i];
// Return the number of T/2-combinations of M.
return T % 2 ? 0 : CountRCombinations(T/2, N, N, M);
}
int main(void)
{
UInt M = { 3, 4, 5 };
size_t N = NumberOf(M);
printf("The number of splits of {");
for (size_t i = 0; i < N; ++i) printf(" %" UIntFormat, M[i]);
printf(" } is %" UIntFormat ".n", CountSplits(N, M));
}
Here is a solution. I have included explanation in the comments.
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
typedef uintmax_t UInt;
#define UIntFormat PRIuMAX
#define NumberOf(a) (sizeof (a) / sizeof *(a))
/* In this code, a multiset is represented with:
an integer N0 which is the number of types of elements in the set,
an integer N1 which is the number of types of elements in the set that
each appear finitely many times (N0-N1 types each appear infinitely
many times), and
an array M in which each M[i] is the number of times that an element
of type i appears.
Collectively, this is referred to as the multiset M.
*/
/* Return the number of ways to choose r things from n things. This is
n! / (r! * (n-r)!).
*/
static UInt Choose(UInt r, UInt n)
{
UInt result = 1;
for (UInt i = 1; i <= r; ++i)
result = result * n-- / i;
return result;
}
// Count the number of r-combinations of a multiset.
static UInt CountRCombinations(UInt r, size_t N0, size_t N1, UInt M)
{
/* If we have only the unlimited types, there is a one-to-one
correspondence between r objects with N0-1 dividers placed between
them, each divider marking a transition from one type to another. For
example, consider four objects of three types. Below "o" represents
any object, and "|" is a divider. For each arrangement of four o's and
two |s, we show how it defines a selection of four objects of three
types:
oooo|| -> aaaa||
ooo|o| -> aaa|b|
ooo||o -> aaa||c
oo|oo| -> aa|bb|
oo|o|o -> aa|b|c
oo||oo -> aa||cc
o|ooo| -> a|bbb|
o|oo|o -> a|bb|c
o|o|oo -> a|b|cc
o||ooo -> a||ccc
|oooo| -> |bbbb|
|ooo|o -> |bbb|c
|oo|oo -> |bb|cc
|o|ooo -> |b|ccc
||oooo -> ||cccc
Therefore, the number of combinations equals the number of ways to
arrange r indistinguishable objects of one type with N0-1
indistinguishable objects of a different type.
*/
if (N1 == 0)
return Choose(r, r+N0-1);
/* Otherwise, we count the combinations:
Select one of the limited types (we use the last one, N1-1, because
it is easily removed from the array simply by reducing the size of
the array).
Count the number of combinations there would be if that type were
unlimited.
Count the number of combinations there would be if there were at
least M[i]+1 instances of elements of that type.
Subtract to get the number of combinations that have 0 to M[i]
instances of elements of that type.
*/
else
{
/* Let M' be the multiset M with the last type changed to unlimited.
So, where M is represented with N0, N1, M, M' is represented with
N0, N1-1, M.
*/
// Change the last limited type to unlimited.
N1 -= 1;
// Count the r-combinations drawn from M'.
UInt C = CountRCombinations(r, N0, N1, M);
/* Now we count the combinations which have at least M[N1]+1 instances
of the (formerly) last type.
Consider that each such combination has M[N1]+1 instances of that
type plus some combination of r - (M[N1]+1) elements drawn from M',
including zero or more instances of the last type. (Note that if r
<= M[N1], there are no such combinations, since we would be asking
for a negative number of elements.)
So the number of combinations which have at least M[N1]+1 instances
of the last type equals the number of combinations of that type
plus some combination of r - (M[N1]+1) elements drawn from M'.
*/
if (M[N1] < r)
C -= CountRCombinations(r - (M[N1] + 1), N0, N1, M);
return C;
}
}
// Count the number of splits of a multiset M that contains N types of things.
static UInt CountSplits(size_t N, UInt M)
{
// Count the number of elements.
UInt T = 0;
for (size_t i = 0; i < N; ++i)
T += M[i];
// Return the number of T/2-combinations of M.
return T % 2 ? 0 : CountRCombinations(T/2, N, N, M);
}
int main(void)
{
UInt M = { 3, 4, 5 };
size_t N = NumberOf(M);
printf("The number of splits of {");
for (size_t i = 0; i < N; ++i) printf(" %" UIntFormat, M[i]);
printf(" } is %" UIntFormat ".n", CountSplits(N, M));
}
answered Nov 12 at 23:49
Eric Postpischil
69.1k873150
69.1k873150
add a comment |
add a comment |
up vote
1
down vote
It's certainly the case that the original problem is isomorphic with the problem of counting r-combinations of a multiset of size 2r.
As an alternative to the inclusion-exclusion formula (which is certainly of analytic value, but possibly of less algorithmic value), we can construct a recursive solution which computes the counts of k-combinations of the set for all values of k, using a very similar approach to recursion employed in the recursive algorithm for counting the binomial coefficients C(n,k), the counts of k-combinations of a set of n elements.
Suppose we represent the multiset as a sorted vector V of size n (where n = 2r). (Technically, it doesn't have to be sorted; it is sufficient for it to be "clumped" so that all the identical elements are consecutive. However, the easiest way to do this is to sort the vector.) We want to produce all unique k-combinations of this vector. All such combinations have one of two forms:
"Choose the first element". The combination starts with V1 and continues with a (k−1)-combination of (V2, V3, …, Vn).
"Skip the first element". The combination is a k-combination of (Vi, Vi+1, … Vn) where i is the smallest index such that Vi ≠ V1. (We need to skip all elements identical to the first element in order to avoid duplication.)
The only difference here with the binomial recursion is the use of the index i in the second option; if the set has no repeated elements, this reduces to i=2, which yields the recursion C(n, k) = C(n − 1, k − 1) + C(n − 1, k).
A naive implementation of this recursion will take exponential time because each computation requires two recursive calls; however, there are only a quadratic number of unique calls so the computation can be reduced to quadratic time by memoisation or dynamic programming. The solution below uses dynamic programming, because that only requires linear space. (Memoisation would require quadratic space since there are a quadratic number of subproblems.)
The dynamic programming solution inverts the recursion by computing the number of k-combinations for the successive suffixes of the vector. It needs to keep the values for only two suffixes: the previous suffix and the previous suffix with a different first element, corresponding to the counts needed by the first and second option of the above recursion. (The actual code uses prefixes instead of suffixes, but it makes absolutely no difference.)
As a minor optimisation, we only compute the counts of k-combinations for values of k between 0 and ⌈n/2⌉. As with binomial coefficients, the counts are symmetrical: the number of k-combinations is equal to the number of (n−k)-combinations because every k-combination corresponds to a unique (n−k)-combination consisting of all the unselected elements. An additional optimisation is possible based on the fact that we only need one count at the end, but the additional complication only obscures the basic algorithm.
The fact that the solution is O(n2) is a little annoying, but since n will generally be a smallish number (otherwise the counts will be astronomical) the computation time seems reasonable. There are undoubtedly further optimisations possible, and it's quite possible that a sub-quadratic algorithm exists.
Here's a basic implementation in C (using an array of strings):
/* Given a *sorted* vector v of size n, compute the number of unique k-combinations
* of the elements of the vector for values of k between 0 and (n/2)+1.
* The counts are stored in the vector r, which must be large enough.
* Counts for larger values of k can be trivially looked up in the returned
* vector, using the identity r[k] == r[n - k].
* If v is not sorted, the result will be incorrect. The function does not
* check for overflow, but the results will be correct modulo (UINTMAX + 1)
*/
void multicomb(const char** v, size_t n, uintmax_t* r) {
size_t lim = n / 2 + 1;
// Prev retains the counts for the previous prefix ending with
// a different element
uintmax_t prev[lim];
// If there are no elements, there is 1 0-combination and no other combinations.
memset(r, 0, sizeof prev);
r[0] = 1;
// Iterate over the unique elements of v
for (size_t k = 0; k < n; ) {
// Save the counts for this prefix
memcpy(prev, r, sizeof prev);
// Iterate over the prefixes with the same ending value
do {
for (size_t i = lim - 1; i > 0; --i)
r[i] = r[i - 1] + prev[i];
} while (++k < n && strcmp(v[k - 1], v[k]) == 0);
};
}
In comparison with the solution in OP's self-answer, this version:
- overflows later because it only depends on addition. (The fact that there is no division also makes it much easier to compute the count modulo a prime.)
- takes quadratic time instead of exponential time.
add a comment |
up vote
1
down vote
It's certainly the case that the original problem is isomorphic with the problem of counting r-combinations of a multiset of size 2r.
As an alternative to the inclusion-exclusion formula (which is certainly of analytic value, but possibly of less algorithmic value), we can construct a recursive solution which computes the counts of k-combinations of the set for all values of k, using a very similar approach to recursion employed in the recursive algorithm for counting the binomial coefficients C(n,k), the counts of k-combinations of a set of n elements.
Suppose we represent the multiset as a sorted vector V of size n (where n = 2r). (Technically, it doesn't have to be sorted; it is sufficient for it to be "clumped" so that all the identical elements are consecutive. However, the easiest way to do this is to sort the vector.) We want to produce all unique k-combinations of this vector. All such combinations have one of two forms:
"Choose the first element". The combination starts with V1 and continues with a (k−1)-combination of (V2, V3, …, Vn).
"Skip the first element". The combination is a k-combination of (Vi, Vi+1, … Vn) where i is the smallest index such that Vi ≠ V1. (We need to skip all elements identical to the first element in order to avoid duplication.)
The only difference here with the binomial recursion is the use of the index i in the second option; if the set has no repeated elements, this reduces to i=2, which yields the recursion C(n, k) = C(n − 1, k − 1) + C(n − 1, k).
A naive implementation of this recursion will take exponential time because each computation requires two recursive calls; however, there are only a quadratic number of unique calls so the computation can be reduced to quadratic time by memoisation or dynamic programming. The solution below uses dynamic programming, because that only requires linear space. (Memoisation would require quadratic space since there are a quadratic number of subproblems.)
The dynamic programming solution inverts the recursion by computing the number of k-combinations for the successive suffixes of the vector. It needs to keep the values for only two suffixes: the previous suffix and the previous suffix with a different first element, corresponding to the counts needed by the first and second option of the above recursion. (The actual code uses prefixes instead of suffixes, but it makes absolutely no difference.)
As a minor optimisation, we only compute the counts of k-combinations for values of k between 0 and ⌈n/2⌉. As with binomial coefficients, the counts are symmetrical: the number of k-combinations is equal to the number of (n−k)-combinations because every k-combination corresponds to a unique (n−k)-combination consisting of all the unselected elements. An additional optimisation is possible based on the fact that we only need one count at the end, but the additional complication only obscures the basic algorithm.
The fact that the solution is O(n2) is a little annoying, but since n will generally be a smallish number (otherwise the counts will be astronomical) the computation time seems reasonable. There are undoubtedly further optimisations possible, and it's quite possible that a sub-quadratic algorithm exists.
Here's a basic implementation in C (using an array of strings):
/* Given a *sorted* vector v of size n, compute the number of unique k-combinations
* of the elements of the vector for values of k between 0 and (n/2)+1.
* The counts are stored in the vector r, which must be large enough.
* Counts for larger values of k can be trivially looked up in the returned
* vector, using the identity r[k] == r[n - k].
* If v is not sorted, the result will be incorrect. The function does not
* check for overflow, but the results will be correct modulo (UINTMAX + 1)
*/
void multicomb(const char** v, size_t n, uintmax_t* r) {
size_t lim = n / 2 + 1;
// Prev retains the counts for the previous prefix ending with
// a different element
uintmax_t prev[lim];
// If there are no elements, there is 1 0-combination and no other combinations.
memset(r, 0, sizeof prev);
r[0] = 1;
// Iterate over the unique elements of v
for (size_t k = 0; k < n; ) {
// Save the counts for this prefix
memcpy(prev, r, sizeof prev);
// Iterate over the prefixes with the same ending value
do {
for (size_t i = lim - 1; i > 0; --i)
r[i] = r[i - 1] + prev[i];
} while (++k < n && strcmp(v[k - 1], v[k]) == 0);
};
}
In comparison with the solution in OP's self-answer, this version:
- overflows later because it only depends on addition. (The fact that there is no division also makes it much easier to compute the count modulo a prime.)
- takes quadratic time instead of exponential time.
add a comment |
up vote
1
down vote
up vote
1
down vote
It's certainly the case that the original problem is isomorphic with the problem of counting r-combinations of a multiset of size 2r.
As an alternative to the inclusion-exclusion formula (which is certainly of analytic value, but possibly of less algorithmic value), we can construct a recursive solution which computes the counts of k-combinations of the set for all values of k, using a very similar approach to recursion employed in the recursive algorithm for counting the binomial coefficients C(n,k), the counts of k-combinations of a set of n elements.
Suppose we represent the multiset as a sorted vector V of size n (where n = 2r). (Technically, it doesn't have to be sorted; it is sufficient for it to be "clumped" so that all the identical elements are consecutive. However, the easiest way to do this is to sort the vector.) We want to produce all unique k-combinations of this vector. All such combinations have one of two forms:
"Choose the first element". The combination starts with V1 and continues with a (k−1)-combination of (V2, V3, …, Vn).
"Skip the first element". The combination is a k-combination of (Vi, Vi+1, … Vn) where i is the smallest index such that Vi ≠ V1. (We need to skip all elements identical to the first element in order to avoid duplication.)
The only difference here with the binomial recursion is the use of the index i in the second option; if the set has no repeated elements, this reduces to i=2, which yields the recursion C(n, k) = C(n − 1, k − 1) + C(n − 1, k).
A naive implementation of this recursion will take exponential time because each computation requires two recursive calls; however, there are only a quadratic number of unique calls so the computation can be reduced to quadratic time by memoisation or dynamic programming. The solution below uses dynamic programming, because that only requires linear space. (Memoisation would require quadratic space since there are a quadratic number of subproblems.)
The dynamic programming solution inverts the recursion by computing the number of k-combinations for the successive suffixes of the vector. It needs to keep the values for only two suffixes: the previous suffix and the previous suffix with a different first element, corresponding to the counts needed by the first and second option of the above recursion. (The actual code uses prefixes instead of suffixes, but it makes absolutely no difference.)
As a minor optimisation, we only compute the counts of k-combinations for values of k between 0 and ⌈n/2⌉. As with binomial coefficients, the counts are symmetrical: the number of k-combinations is equal to the number of (n−k)-combinations because every k-combination corresponds to a unique (n−k)-combination consisting of all the unselected elements. An additional optimisation is possible based on the fact that we only need one count at the end, but the additional complication only obscures the basic algorithm.
The fact that the solution is O(n2) is a little annoying, but since n will generally be a smallish number (otherwise the counts will be astronomical) the computation time seems reasonable. There are undoubtedly further optimisations possible, and it's quite possible that a sub-quadratic algorithm exists.
Here's a basic implementation in C (using an array of strings):
/* Given a *sorted* vector v of size n, compute the number of unique k-combinations
* of the elements of the vector for values of k between 0 and (n/2)+1.
* The counts are stored in the vector r, which must be large enough.
* Counts for larger values of k can be trivially looked up in the returned
* vector, using the identity r[k] == r[n - k].
* If v is not sorted, the result will be incorrect. The function does not
* check for overflow, but the results will be correct modulo (UINTMAX + 1)
*/
void multicomb(const char** v, size_t n, uintmax_t* r) {
size_t lim = n / 2 + 1;
// Prev retains the counts for the previous prefix ending with
// a different element
uintmax_t prev[lim];
// If there are no elements, there is 1 0-combination and no other combinations.
memset(r, 0, sizeof prev);
r[0] = 1;
// Iterate over the unique elements of v
for (size_t k = 0; k < n; ) {
// Save the counts for this prefix
memcpy(prev, r, sizeof prev);
// Iterate over the prefixes with the same ending value
do {
for (size_t i = lim - 1; i > 0; --i)
r[i] = r[i - 1] + prev[i];
} while (++k < n && strcmp(v[k - 1], v[k]) == 0);
};
}
In comparison with the solution in OP's self-answer, this version:
- overflows later because it only depends on addition. (The fact that there is no division also makes it much easier to compute the count modulo a prime.)
- takes quadratic time instead of exponential time.
It's certainly the case that the original problem is isomorphic with the problem of counting r-combinations of a multiset of size 2r.
As an alternative to the inclusion-exclusion formula (which is certainly of analytic value, but possibly of less algorithmic value), we can construct a recursive solution which computes the counts of k-combinations of the set for all values of k, using a very similar approach to recursion employed in the recursive algorithm for counting the binomial coefficients C(n,k), the counts of k-combinations of a set of n elements.
Suppose we represent the multiset as a sorted vector V of size n (where n = 2r). (Technically, it doesn't have to be sorted; it is sufficient for it to be "clumped" so that all the identical elements are consecutive. However, the easiest way to do this is to sort the vector.) We want to produce all unique k-combinations of this vector. All such combinations have one of two forms:
"Choose the first element". The combination starts with V1 and continues with a (k−1)-combination of (V2, V3, …, Vn).
"Skip the first element". The combination is a k-combination of (Vi, Vi+1, … Vn) where i is the smallest index such that Vi ≠ V1. (We need to skip all elements identical to the first element in order to avoid duplication.)
The only difference here with the binomial recursion is the use of the index i in the second option; if the set has no repeated elements, this reduces to i=2, which yields the recursion C(n, k) = C(n − 1, k − 1) + C(n − 1, k).
A naive implementation of this recursion will take exponential time because each computation requires two recursive calls; however, there are only a quadratic number of unique calls so the computation can be reduced to quadratic time by memoisation or dynamic programming. The solution below uses dynamic programming, because that only requires linear space. (Memoisation would require quadratic space since there are a quadratic number of subproblems.)
The dynamic programming solution inverts the recursion by computing the number of k-combinations for the successive suffixes of the vector. It needs to keep the values for only two suffixes: the previous suffix and the previous suffix with a different first element, corresponding to the counts needed by the first and second option of the above recursion. (The actual code uses prefixes instead of suffixes, but it makes absolutely no difference.)
As a minor optimisation, we only compute the counts of k-combinations for values of k between 0 and ⌈n/2⌉. As with binomial coefficients, the counts are symmetrical: the number of k-combinations is equal to the number of (n−k)-combinations because every k-combination corresponds to a unique (n−k)-combination consisting of all the unselected elements. An additional optimisation is possible based on the fact that we only need one count at the end, but the additional complication only obscures the basic algorithm.
The fact that the solution is O(n2) is a little annoying, but since n will generally be a smallish number (otherwise the counts will be astronomical) the computation time seems reasonable. There are undoubtedly further optimisations possible, and it's quite possible that a sub-quadratic algorithm exists.
Here's a basic implementation in C (using an array of strings):
/* Given a *sorted* vector v of size n, compute the number of unique k-combinations
* of the elements of the vector for values of k between 0 and (n/2)+1.
* The counts are stored in the vector r, which must be large enough.
* Counts for larger values of k can be trivially looked up in the returned
* vector, using the identity r[k] == r[n - k].
* If v is not sorted, the result will be incorrect. The function does not
* check for overflow, but the results will be correct modulo (UINTMAX + 1)
*/
void multicomb(const char** v, size_t n, uintmax_t* r) {
size_t lim = n / 2 + 1;
// Prev retains the counts for the previous prefix ending with
// a different element
uintmax_t prev[lim];
// If there are no elements, there is 1 0-combination and no other combinations.
memset(r, 0, sizeof prev);
r[0] = 1;
// Iterate over the unique elements of v
for (size_t k = 0; k < n; ) {
// Save the counts for this prefix
memcpy(prev, r, sizeof prev);
// Iterate over the prefixes with the same ending value
do {
for (size_t i = lim - 1; i > 0; --i)
r[i] = r[i - 1] + prev[i];
} while (++k < n && strcmp(v[k - 1], v[k]) == 0);
};
}
In comparison with the solution in OP's self-answer, this version:
- overflows later because it only depends on addition. (The fact that there is no division also makes it much easier to compute the count modulo a prime.)
- takes quadratic time instead of exponential time.
edited Nov 14 at 20:57
answered Nov 14 at 19:31
rici
149k19128192
149k19128192
add a comment |
add a comment |
up vote
0
down vote
I don't think that that's quite right, Eric. The original question asks, if my source array contains A, A, B, B then how many ways can it be uniquely split between two recipients - the answer in this case will be three because the array could be split as follows:
Child Array 1 | Child Array 2
-----------------------------
A A | B B
A B | A B
B B | A A
This source code (C++ function) works but it's horribly slow and inefficient. Brute force and ignorance.
int countPermutations(vector<int> itemset) {
vector<vector<int> > permutations;
do {
vector<int> match;
for(int i = 0; i < itemset.size(); i+=2) {
match.push_back(itemset.at(i));
}
sort(match.begin(), match.end());
int j = 0;
bool found = false;
while (!found && (j < permutations.size())) {
found = (match == permutations.at(j))?true:false;
j++;
}
if (!found) {
permutations.push_back(match);
}
} while (next_permutation(itemset.begin(), itemset.end()));
return permutations.size();
}
Any thoughts on optimisation?
1
Eric's algorithm produces the value 3 for this case, as expected (you have to call the function with the array {2, 2}, indicating "two As and two Bs"). Why do you think it's wrong?
– rici
Nov 14 at 18:30
My first thought on optimisation, by the way: Don't accumulate the possibilities; just count them. For any significant problem size, there will be far too many to keep in memory, and anyway you don't do anything with them. If you actually need the individual splits, that's a very different question (because counting can be optimised a lot more).
– rici
Nov 14 at 18:35
rici is correct, to compute the count for { A, A, B, B }, you need to changeUInt M = { 3, 4, 5 };
toUInt M = { 2, 2 };
.
– Eric Postpischil
Nov 14 at 20:11
add a comment |
up vote
0
down vote
I don't think that that's quite right, Eric. The original question asks, if my source array contains A, A, B, B then how many ways can it be uniquely split between two recipients - the answer in this case will be three because the array could be split as follows:
Child Array 1 | Child Array 2
-----------------------------
A A | B B
A B | A B
B B | A A
This source code (C++ function) works but it's horribly slow and inefficient. Brute force and ignorance.
int countPermutations(vector<int> itemset) {
vector<vector<int> > permutations;
do {
vector<int> match;
for(int i = 0; i < itemset.size(); i+=2) {
match.push_back(itemset.at(i));
}
sort(match.begin(), match.end());
int j = 0;
bool found = false;
while (!found && (j < permutations.size())) {
found = (match == permutations.at(j))?true:false;
j++;
}
if (!found) {
permutations.push_back(match);
}
} while (next_permutation(itemset.begin(), itemset.end()));
return permutations.size();
}
Any thoughts on optimisation?
1
Eric's algorithm produces the value 3 for this case, as expected (you have to call the function with the array {2, 2}, indicating "two As and two Bs"). Why do you think it's wrong?
– rici
Nov 14 at 18:30
My first thought on optimisation, by the way: Don't accumulate the possibilities; just count them. For any significant problem size, there will be far too many to keep in memory, and anyway you don't do anything with them. If you actually need the individual splits, that's a very different question (because counting can be optimised a lot more).
– rici
Nov 14 at 18:35
rici is correct, to compute the count for { A, A, B, B }, you need to changeUInt M = { 3, 4, 5 };
toUInt M = { 2, 2 };
.
– Eric Postpischil
Nov 14 at 20:11
add a comment |
up vote
0
down vote
up vote
0
down vote
I don't think that that's quite right, Eric. The original question asks, if my source array contains A, A, B, B then how many ways can it be uniquely split between two recipients - the answer in this case will be three because the array could be split as follows:
Child Array 1 | Child Array 2
-----------------------------
A A | B B
A B | A B
B B | A A
This source code (C++ function) works but it's horribly slow and inefficient. Brute force and ignorance.
int countPermutations(vector<int> itemset) {
vector<vector<int> > permutations;
do {
vector<int> match;
for(int i = 0; i < itemset.size(); i+=2) {
match.push_back(itemset.at(i));
}
sort(match.begin(), match.end());
int j = 0;
bool found = false;
while (!found && (j < permutations.size())) {
found = (match == permutations.at(j))?true:false;
j++;
}
if (!found) {
permutations.push_back(match);
}
} while (next_permutation(itemset.begin(), itemset.end()));
return permutations.size();
}
Any thoughts on optimisation?
I don't think that that's quite right, Eric. The original question asks, if my source array contains A, A, B, B then how many ways can it be uniquely split between two recipients - the answer in this case will be three because the array could be split as follows:
Child Array 1 | Child Array 2
-----------------------------
A A | B B
A B | A B
B B | A A
This source code (C++ function) works but it's horribly slow and inefficient. Brute force and ignorance.
int countPermutations(vector<int> itemset) {
vector<vector<int> > permutations;
do {
vector<int> match;
for(int i = 0; i < itemset.size(); i+=2) {
match.push_back(itemset.at(i));
}
sort(match.begin(), match.end());
int j = 0;
bool found = false;
while (!found && (j < permutations.size())) {
found = (match == permutations.at(j))?true:false;
j++;
}
if (!found) {
permutations.push_back(match);
}
} while (next_permutation(itemset.begin(), itemset.end()));
return permutations.size();
}
Any thoughts on optimisation?
answered Nov 14 at 15:32
headbanger
406416
406416
1
Eric's algorithm produces the value 3 for this case, as expected (you have to call the function with the array {2, 2}, indicating "two As and two Bs"). Why do you think it's wrong?
– rici
Nov 14 at 18:30
My first thought on optimisation, by the way: Don't accumulate the possibilities; just count them. For any significant problem size, there will be far too many to keep in memory, and anyway you don't do anything with them. If you actually need the individual splits, that's a very different question (because counting can be optimised a lot more).
– rici
Nov 14 at 18:35
rici is correct, to compute the count for { A, A, B, B }, you need to changeUInt M = { 3, 4, 5 };
toUInt M = { 2, 2 };
.
– Eric Postpischil
Nov 14 at 20:11
add a comment |
1
Eric's algorithm produces the value 3 for this case, as expected (you have to call the function with the array {2, 2}, indicating "two As and two Bs"). Why do you think it's wrong?
– rici
Nov 14 at 18:30
My first thought on optimisation, by the way: Don't accumulate the possibilities; just count them. For any significant problem size, there will be far too many to keep in memory, and anyway you don't do anything with them. If you actually need the individual splits, that's a very different question (because counting can be optimised a lot more).
– rici
Nov 14 at 18:35
rici is correct, to compute the count for { A, A, B, B }, you need to changeUInt M = { 3, 4, 5 };
toUInt M = { 2, 2 };
.
– Eric Postpischil
Nov 14 at 20:11
1
1
Eric's algorithm produces the value 3 for this case, as expected (you have to call the function with the array {2, 2}, indicating "two As and two Bs"). Why do you think it's wrong?
– rici
Nov 14 at 18:30
Eric's algorithm produces the value 3 for this case, as expected (you have to call the function with the array {2, 2}, indicating "two As and two Bs"). Why do you think it's wrong?
– rici
Nov 14 at 18:30
My first thought on optimisation, by the way: Don't accumulate the possibilities; just count them. For any significant problem size, there will be far too many to keep in memory, and anyway you don't do anything with them. If you actually need the individual splits, that's a very different question (because counting can be optimised a lot more).
– rici
Nov 14 at 18:35
My first thought on optimisation, by the way: Don't accumulate the possibilities; just count them. For any significant problem size, there will be far too many to keep in memory, and anyway you don't do anything with them. If you actually need the individual splits, that's a very different question (because counting can be optimised a lot more).
– rici
Nov 14 at 18:35
rici is correct, to compute the count for { A, A, B, B }, you need to change
UInt M = { 3, 4, 5 };
to UInt M = { 2, 2 };
.– Eric Postpischil
Nov 14 at 20:11
rici is correct, to compute the count for { A, A, B, B }, you need to change
UInt M = { 3, 4, 5 };
to UInt M = { 2, 2 };
.– Eric Postpischil
Nov 14 at 20:11
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53270489%2fhow-can-we-count-the-r-combinations-of-a-multiset%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown