How to transform the result of a Pandas `groupby` function to the original dataframe

Suppose I have a pandas DataFrame with 6 columns and a custom function that counts the elements in 2 or 3 of those columns and returns a boolean. When I create a groupby object from the original dataframe and apply the custom function with `df.groupby('col1').apply(myfunc)`, the result is a Series whose length equals the number of categories in `col1`. How do I expand this output to match the length of the original dataframe? I tried `transform`, but could not get it to work with the custom function `myfunc`.

EDIT:

Here is some example code:

import pandas as pd

A = pd.DataFrame({'X':['a','b','c','a','c'], 'Y':['at','bt','ct','at','ct'], 'Z':['q','q','r','r','s']})
print(A)

def myfunc(df):
    # True when the group has at least 2 distinct Z values and only 1 distinct Y value
    return (df['Z'].nunique() >= 2) and (df['Y'].nunique() < 2)

A.groupby('X').apply(myfunc)

Output:

X
a     True
b    False
c     True
dtype: bool

I would like to expand this output into a new column Result so that, for example, every row where column X is a gets Result = True.

edited Nov 16 at 3:24

asked Nov 16 at 2:58

bluetooth

768

Could you show us some of your code?
– user7374610
Nov 16 at 3:00

@user7374610, I just added a simple sample code.
– bluetooth
Nov 16 at 3:25

python pandas dataframe

2 Answers

You can map the groupby result back onto the original dataframe, using column X as the lookup key:

A['Result'] = A['X'].map(A.groupby('X').apply(myfunc))

The result would look like:

    X   Y   Z   Result
0   a   at  q   True
1   b   bt  q   False
2   c   ct  r   True
3   a   at  r   True
4   c   ct  s   True
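Since the question mentions trying `transform`, here is a minimal sketch of one way it could be used for this particular `myfunc`. It is not part of the original answer: it assumes the custom function can be decomposed into per-column aggregations (here `nunique` on Z and Y), which will not be true of every custom function. Each `transform('nunique')` call returns a Series aligned with the original rows, so the two conditions can be combined element-wise.

```python
import pandas as pd

A = pd.DataFrame({'X': ['a', 'b', 'c', 'a', 'c'],
                  'Y': ['at', 'bt', 'ct', 'at', 'ct'],
                  'Z': ['q', 'q', 'r', 'r', 's']})

g = A.groupby('X')

# transform broadcasts each group's nunique back to every row of that group,
# so both Series have the same length and index as A.
z_distinct = g['Z'].transform('nunique')
y_distinct = g['Y'].transform('nunique')

# Same condition as myfunc, written element-wise (& instead of `and`).
A['Result'] = (z_distinct >= 2) & (y_distinct < 2)
print(A)
```

When the function cannot be split up this way, the `map` approach above remains the more general option.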

answered Nov 16 at 3:32

user7374610

6981422


My solution uses a loop, so it may not be the best one, but I think it works well enough.

The core idea is that you can iterate over every sub-dataframe (`gdf`) with `for i, gdf in gp`, add the result column (`c` in my example) to each sub-dataframe, and finally concatenate all the sub-dataframes into one.

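Here is an example:

import pandas as pd

df = pd.DataFrame({'a':[1,2,1,2],'b':['a','b','c','d']})
gp = df.groupby('a')  # group by column a
s = gp['a'].sum()  # apply a function per group (here: the sum of column a)
adf = []  # collects one sub-dataframe per group

# then build the new dataframe group by group
for i, gdf in gp:
    tdf = gdf.copy()
    tdf.loc[:,'c'] = s.loc[i]  # broadcast the group's result to all of its rows
    adf.append(tdf)

pd.concat(adf)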

from:

    a   b
0   1   a
1   2   b
2   1   c
3   2   d

to:

    a   b   c
0   1   a   2
2   1   c   2
1   2   b   4
3   2   d   4
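For comparison, the same column can likely be produced without an explicit loop. The sketch below is an alternative to the answer's code, not a restatement of it: it assumes the per-group function is a built-in aggregation such as `sum`, and it keeps the rows in their original order rather than grouped order.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 1, 2], 'b': ['a', 'b', 'c', 'd']})

# transform('sum') computes the sum of column a within each group and
# repeats it for every row of that group, aligned with df's index.
df['c'] = df.groupby('a')['a'].transform('sum')
print(df)
```

Sorting the result by column a afterwards would reproduce the grouped row order shown above.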

edited Nov 16 at 3:39

answered Nov 16 at 3:32

Zealseeker

352114
