How to transform the result of a Pandas `groupby` function to the original dataframe

Suppose I have a Pandas DataFrame with 6 columns and a custom function that takes counts of the elements in 2 or 3 of those columns and produces a boolean output. When I group the original dataframe and apply the custom function with `df.groupby('col1').apply(myfunc)`, the result is a Series whose length equals the number of categories in `col1`. How do I expand this output to match the length of the original dataframe? I tried `transform`, but was not able to use the custom function `myfunc` with it.

EDIT:

Here is some example code:

    import pandas as pd

    A = pd.DataFrame({'X': ['a', 'b', 'c', 'a', 'c'],
                      'Y': ['at', 'bt', 'ct', 'at', 'ct'],
                      'Z': ['q', 'q', 'r', 'r', 's']})
    print(A)

    def myfunc(df):
        # True when the group has at least 2 distinct Z values and fewer than 2 distinct Y values
        return (df['Z'].nunique() >= 2) and (df['Y'].nunique() < 2)

    A.groupby('X').apply(myfunc)

Output (one boolean per group):

    X
    a     True
    b    False
    c     True
    dtype: bool

I would like to expand this output into a new column `Result` so that, for example, wherever column `X` contains `a`, `Result` is `True`.

Tags: python, pandas, dataframe

edited Nov 16 at 3:24

asked Nov 16 at 2:58
bluetooth

  • Could you show us some of your code?
    – user7374610
    Nov 16 at 3:00

  • @user7374610, I just added some simple sample code.
    – bluetooth
    Nov 16 at 3:25

2 Answers

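You can map the groupby result back to the original dataframe. `A.groupby('X').apply(myfunc)` returns a Series indexed by the values of `X`, and `map` looks up each row's `X` value in that Series:

    A['Result'] = A['X'].map(A.groupby('X').apply(myfunc))

The result would look like:

       X   Y  Z  Result
    0  a  at  q    True
    1  b  bt  q   False
    2  c  ct  r    True
    3  a  at  r    True
    4  c  ct  s    True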
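
For reference, the mapping approach can be sketched end to end with the sample data from the question. This is a minimal sketch, not the only way to write it: the intermediate name `per_group` is introduced here for illustration, and selecting the `['Y', 'Z']` columns before `apply` is an assumption made so that the grouping column is not passed to `myfunc`.

```python
import pandas as pd

A = pd.DataFrame({
    'X': ['a', 'b', 'c', 'a', 'c'],
    'Y': ['at', 'bt', 'ct', 'at', 'ct'],
    'Z': ['q', 'q', 'r', 'r', 's'],
})


def myfunc(df):
    # True when the group has at least 2 distinct Z values
    # and fewer than 2 distinct Y values.
    return (df['Z'].nunique() >= 2) and (df['Y'].nunique() < 2)


# One boolean per group, indexed by the distinct values of X.
# per_group is a hypothetical name; selecting ['Y', 'Z'] is a choice
# made in this sketch, not something the original answer does.
per_group = A.groupby('X')[['Y', 'Z']].apply(myfunc)

# Series.map looks up each row's X value in per_group's index,
# so the new column has the same length and index as A.
A['Result'] = A['X'].map(per_group)
print(A)
```

Because `map` works row by row on `A['X']`, the original row order and index are left untouched.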






answered Nov 16 at 3:32
user7374610


My solution uses a loop, so it may not be the best one, but I think it works well enough.

The core idea is that you can traverse all the sub-dataframes (`gdf`) with `for i, gdf in gp`, add the result column (`c` in my example) to each sub-dataframe, and finally concatenate all the sub-dataframes into one.

Here is an example:

    import pandas as pd

    df = pd.DataFrame({'a': [1, 2, 1, 2], 'b': ['a', 'b', 'c', 'd']})
    gp = df.groupby('a')   # group by column 'a'
    s = gp['a'].sum()      # apply a function per group (here: sum), indexed by group key
    adf = []               # collects one sub-dataframe per group

    # then create a new dataframe, group by group
    for i, gdf in gp:
        tdf = gdf.copy()
        tdf.loc[:, 'c'] = s.loc[i]   # broadcast the group's result to all of its rows
        adf.append(tdf)
    pd.concat(adf)

From:

       a  b
    0  1  a
    1  2  b
    2  1  c
    3  2  d

To:

       a  b  c
    0  1  a  2
    2  1  c  2
    1  2  b  4
    3  2  d  4
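
When the per-group function is a built-in aggregation over a single column, as the sum is in this example, the loop can likely be replaced by `transform`, which returns a result aligned with the original index. The following is a sketch under that assumption; it does not by itself cover a function that needs several columns at once, such as the question's `myfunc`.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 1, 2], 'b': ['a', 'b', 'c', 'd']})

# transform('sum') computes the sum of column 'a' within each group and
# returns a Series with the same index as df, one value per original row.
df['c'] = df.groupby('a')['a'].transform('sum')
print(df)
```

Unlike the concatenation in the loop version, this keeps the rows in their original order rather than grouping them together.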






edited Nov 16 at 3:39

answered Nov 16 at 3:32
Zealseeker