Pandas: issues with groupby. Error: 'ValueError: Grouper for not 1-dimensional'












I've been searching for this error here, but I only found one solution, which doesn't work in my case. Can anybody guide me on how to solve it?



My dataset (df2) looks like this:



   id_cl    id_sup   total_t  cl_ind  cl_city  sup_ind  sup_city  same_city
0  1000135  1797029   414.85  I5610   11308.0  G4711    10901.0   no
1  1000135  1798069    19.76  I5610   11308.0  G4719    10901.0   no
2  1000135  1923186   302.73  I5610   11308.0  G4630    10901.0   no
3  1000135  2502927  1262.86  I5610   11308.0  G4630    11308.0   yes
4  1000135  2504288   155.04  I5610   11308.0  G4711    11308.0   yes


I need to group this dataset as follows:



df_sup = df2.groupby(['cl_city','cl_ind','same_city']).agg({'id_sup':'nunique', 'total_t':'sum'})


But when I run this, I get the following error:



ValueError: Grouper for 'cl_city' not 1-dimensional


As a result I need something like this:



                           id_sup      total_t
cl_city cl_ind same_city
10701   A0112  no               2     21964.22
               yes             31      3530.40
        A0122  no            2374  23328061.47
               yes           1228   2684408.12
        A0127  no              11     19962.68
               yes              7       915.44
        A0163  no             357    574827.97
               yes            140     60385.70
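For reference, this aggregation works as intended when the columns are ordinary one-dimensional labels. A minimal sketch that rebuilds the five sample rows above with plain string columns:

```python
import pandas as pd

# Rebuild the five sample rows shown above, with plain string column labels.
df2 = pd.DataFrame({
    'id_cl':     [1000135] * 5,
    'id_sup':    [1797029, 1798069, 1923186, 2502927, 2504288],
    'total_t':   [414.85, 19.76, 302.73, 1262.86, 155.04],
    'cl_ind':    ['I5610'] * 5,
    'cl_city':   [11308.0] * 5,
    'sup_ind':   ['G4711', 'G4719', 'G4630', 'G4630', 'G4711'],
    'sup_city':  [10901.0, 10901.0, 10901.0, 11308.0, 11308.0],
    'same_city': ['no', 'no', 'no', 'yes', 'yes'],
})

# Count distinct suppliers and total spend per (cl_city, cl_ind, same_city).
df_sup = df2.groupby(['cl_city', 'cl_ind', 'same_city']).agg(
    {'id_sup': 'nunique', 'total_t': 'sum'})
print(df_sup)
```

On these five rows, the "no" group contains 3 distinct suppliers and the "yes" group 2, with `total_t` sums of 737.34 and 1417.90 respectively.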









  • What is the output of df2.columns?

    – Peter Leimbigler
    Nov 21 '18 at 22:47











  • As output I need to know how many unique 'id_sup' there are, and the sum of 'total_t', for every combination of cl_city, cl_ind, and same_city. Basically a table with columns: cl_city | cl_ind | same_city | count of unique id_sup | sum of total_t

    – PAstudilloE
    Nov 21 '18 at 22:54











  • Possible duplicate of stackoverflow.com/questions/43298192/…

    – Kevin Fang
    Nov 21 '18 at 22:58








  • @PAstudilloE, thanks! This does reveal the immediate cause of the error: for some terrible reason, the columns of df2 are a highly nested MultiIndex, where instead of each string being a label, each string is actually the name of a separate *index level*(!) The fix for this comes from stackoverflow.com/q/14507794, and is this: df.columns = [' '.join(col).strip() for col in df.columns.values]. You mentioned df2 is the result of many merges; I suspect those merges are not written optimally, leading to this pathological MultiIndex situation.

    – Peter Leimbigler
    Nov 21 '18 at 23:21






  • Thanks @PeterLeimbigler!! That's exactly what happened. I solved the issue as you recommended.

    – PAstudilloE
    Nov 21 '18 at 23:26
















python pandas






edited Nov 21 '18 at 23:03







PAstudilloE

















asked Nov 21 '18 at 22:37









PAstudilloE
1 Answer
I don't know why Python was raising this error; df2 is the result of merging several earlier datasets, and it does not have any duplicate columns.



I solved the issue in a roundabout way, but it worked: I wrote df2 out to a CSV file and then loaded it back in. After that, everything works fine (though I still can't figure out why the error was raised in the first place). Hope it helps.
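The CSV round trip most likely works because to_csv serializes the column labels to plain text, so read_csv returns ordinary one-dimensional string columns. A sketch of that workaround on toy data, with io.StringIO standing in for the file on disk (the skiprows handling of the second, empty header row is an assumption about how the MultiIndex header is written and may vary by pandas version):

```python
import io
import pandas as pd

df = pd.DataFrame({'cl_city': [1, 1], 'total_t': [10.0, 20.0]})
# Simulate the pathological MultiIndex columns from the question.
df.columns = pd.MultiIndex.from_tuples([('cl_city', ''), ('total_t', '')])

# Round-trip through CSV text: the labels are written out as plain strings.
buf = io.StringIO()
df.to_csv(buf)
buf.seek(0)
# The MultiIndex is written as two header rows; skip the second (empty) one.
df2 = pd.read_csv(buf, index_col=0, skiprows=[1])

print(df2.columns)  # a plain Index, no longer a MultiIndex
print(df2.groupby('cl_city')['total_t'].sum())
```

Flattening the columns directly (as in the accepted comment) avoids the disk round trip, but the two approaches end in the same state: flat string column labels.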






        answered Nov 21 '18 at 23:09









PAstudilloE































