Pandas: issues with groupby. Error: 'ValueError: Grouper for not 1-dimensional'
I've been looking for this error here, but the only solution I found doesn't work in my case. Can anybody guide me on how to solve it?
My dataset (df2) looks like this:
id_cl id_sup total_t cl_ind cl_city sup_ind sup_city same_city
0 1000135 1797029 414.85 I5610 11308.0 G4711 10901.0 no
1 1000135 1798069 19.76 I5610 11308.0 G4719 10901.0 no
2 1000135 1923186 302.73 I5610 11308.0 G4630 10901.0 no
3 1000135 2502927 1262.86 I5610 11308.0 G4630 11308.0 yes
4 1000135 2504288 155.04 I5610 11308.0 G4711 11308.0 yes
I need to group this dataset as follows:
df_sup = df2.groupby(['cl_city','cl_ind','same_city']).agg({'id_sup':'nunique', 'total_t':'sum'})
But when I run this, I get this error:
ValueError: Grouper for 'cl_city' not 1-dimensional
The result I need looks like this:
id_sup total_t
cl_city cl_ind same_city
10701 A0112 no 2 21964.22
yes 31 3530.40
A0122 no 2374 23328061.47
yes 1228 2684408.12
A0127 no 11 19962.68
yes 7 915.44
A0163 no 357 574827.97
yes 140 60385.7
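For what it's worth, the groupby/agg call itself is valid pandas. A minimal sketch, using a flat-columned sample reconstructed from the rows shown above (dtypes assumed), runs cleanly and produces this shape of output:

```python
import pandas as pd

# Small sample modeled on the rows printed above; with ordinary string
# column labels, the exact groupby/agg call from the question works.
df2 = pd.DataFrame({
    'id_sup':    [1797029, 1798069, 2502927, 2504288],
    'total_t':   [414.85, 19.76, 1262.86, 155.04],
    'cl_ind':    ['I5610'] * 4,
    'cl_city':   [11308.0] * 4,
    'same_city': ['no', 'no', 'yes', 'yes'],
})

df_sup = df2.groupby(['cl_city', 'cl_ind', 'same_city']).agg(
    {'id_sup': 'nunique', 'total_t': 'sum'})
print(df_sup)
```

So the error is not about the groupby syntax; it must be about what `df2` actually contains.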
Tags: python, pandas
What is the output of df2.columns?
– Peter Leimbigler
Nov 21 '18 at 22:47
As output I need: the number of unique 'id_sup' values and the sum of 'total_t' for every combination of cl_city, cl_ind and same_city. Basically a table with columns: cl_city | cl_ind | same_city | count of unique id_sup | sum of total_t
– PAstudilloE
Nov 21 '18 at 22:54
Possible duplicate of stackoverflow.com/questions/43298192/…
– Kevin Fang
Nov 21 '18 at 22:58
@PAstudilloE, thanks! This does reveal the immediate cause of the error: for some terrible reason, the columns of df2 are a highly nested MultiIndex, where instead of each string being a label, each string is actually the name of a separate *index level*(!) The fix for this comes from stackoverflow.com/q/14507794, and is this: df.columns = [' '.join(col).strip() for col in df.columns.values]. You mentioned df2 is the result of many merges; I suspect those merges are not written optimally, leading to this pathological MultiIndex situation.
– Peter Leimbigler
Nov 21 '18 at 23:21
Thanks @PeterLeimbigler!! That's exactly what happened. I solved the issue as you recommended.
– PAstudilloE
Nov 21 '18 at 23:26
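A minimal sketch of the situation diagnosed in the comments, with made-up sample values: when the column labels live in a column MultiIndex, `df2['cl_city']` returns a 2-D DataFrame rather than a Series, which is exactly what the error complains about. Flattening the labels with the join trick restores plain string columns:

```python
import pandas as pd

# Hypothetical sample; values loosely modeled on the question's rows.
df2 = pd.DataFrame({
    'id_sup':    [1797029, 2502927],
    'total_t':   [414.85, 1262.86],
    'cl_ind':    ['I5610', 'I5610'],
    'cl_city':   [10901.0, 11308.0],
    'same_city': ['no', 'yes'],
})

# Simulate the pathological state left behind by the merges: every column
# label becomes its own entry in a column MultiIndex.
df2.columns = pd.MultiIndex.from_tuples([(c, '') for c in df2.columns])

# Selecting 'cl_city' now yields a DataFrame (2-D), so groupby refuses it:
try:
    df2.groupby(['cl_city', 'cl_ind', 'same_city'])
except ValueError as e:
    print(e)  # Grouper for 'cl_city' not 1-dimensional

# The fix from the comments: flatten the MultiIndex back to string labels.
df2.columns = [' '.join(col).strip() for col in df2.columns.values]

df_sup = df2.groupby(['cl_city', 'cl_ind', 'same_city']).agg(
    {'id_sup': 'nunique', 'total_t': 'sum'})
print(df_sup)
```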
edited Nov 21 '18 at 23:03
PAstudilloE
asked Nov 21 '18 at 22:37
1 Answer
I don't know why Python was showing that error; df2 is the result of merging several previous datasets, and it doesn't have any duplicate columns.
I solved the issue in a roundabout way, but it worked: I exported df2 to a CSV file and then loaded it again. After that, everything works fine (though I still can't figure out why Python was showing that error). Hope it helps.
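As a sketch of why the CSV round-trip works, and of a more direct alternative (df2 below is a hypothetical stand-in for a frame left with MultiIndex columns by the merges): a reloaded CSV has plain string column labels, so the round-trip implicitly flattened the pathological MultiIndex identified in the comments. Flattening in place gets there without touching the filesystem:

```python
import pandas as pd

# Hypothetical stand-in for a df2 whose merges produced MultiIndex columns.
df2 = pd.DataFrame([[414.85], [1262.86]])
df2.columns = pd.MultiIndex.from_tuples([('total_t', '')])

# A CSV round-trip reloads with flat string labels; flattening in place
# achieves the same result directly.
if isinstance(df2.columns, pd.MultiIndex):
    df2.columns = [' '.join(map(str, col)).strip() for col in df2.columns]

print(df2.columns.tolist())  # ['total_t']
```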
answered Nov 21 '18 at 23:09
PAstudilloE