Pandas Dataframe: Finding entries that share values (e.g. all games that contain a player)























I have a CSV file of game history for my badminton club. I would like to be able to find out information about games that contain a given player (e.g. who did "Bill" play most with?). Here's an example of what two rounds of three games might look like:



import pandas as pd
player_data = pd.DataFrame(data=[
('2018-06-12', 1, 1, 1, 'Adam'),
('2018-06-12', 1, 1, 2, 'Bill'),
('2018-06-12', 1, 1, 3, 'Cindy'),
('2018-06-12', 1, 1, 4, 'Derek'),
('2018-06-12', 1, 2, 1, 'Edward'),
('2018-06-12', 1, 2, 2, 'Fred'),
('2018-06-12', 1, 2, 3, 'George'),
('2018-06-12', 1, 2, 4, 'Harry'),
('2018-06-12', 1, 3, 1, 'Ian'),
('2018-06-12', 1, 3, 2, 'Jack'),
('2018-06-12', 1, 3, 3, 'Karl'),
('2018-06-12', 1, 3, 4, 'Laura'),
('2018-06-12', 2, 1, 1, 'Karl'),
('2018-06-12', 2, 1, 2, 'Cindy'),
('2018-06-12', 2, 1, 3, 'Bill'),
('2018-06-12', 2, 1, 4, 'Derek'),
('2018-06-12', 2, 2, 1, 'Max'),
('2018-06-12', 2, 2, 2, 'George'),
('2018-06-12', 2, 2, 3, 'Fred'),
('2018-06-12', 2, 2, 4, 'Ian'),
('2018-06-12', 2, 3, 1, 'Nigel'),
('2018-06-12', 2, 3, 2, 'Edward'),
('2018-06-12', 2, 3, 3, 'Harry'),
('2018-06-12', 2, 3, 4, 'Adam')],
columns=['Date', 'Round #', 'Court #', 'Space', 'Name'])


However, as each row is an individual player's entry, simply locating by name, e.g.



player_data.loc[player_data['Name'] == 'Bill']


returns only Bill's individual entries, like so:



          Date  Round #  Court #  Space  Name
1   2018-06-12        1        1      2  Bill
14  2018-06-12        2        1      3  Bill


... when what I want is a new dataframe that contains ALL entries of games that Bill has played in, such that in this case it would display as:



          Date  Round #  Court #  Space   Name
0   2018-06-12        1        1      1   Adam
1   2018-06-12        1        1      2   Bill
2   2018-06-12        1        1      3  Cindy
3   2018-06-12        1        1      4  Derek
12  2018-06-12        2        1      1   Karl
13  2018-06-12        2        1      2  Cindy
14  2018-06-12        2        1      3   Bill
15  2018-06-12        2        1      4  Derek


I'm thinking it might be easier to convert the original dataframe to one where each entry is a separate game with all the player names for that game listed in a tuple, so then it'd be relatively simple to check "if name in Names"? e.g.



         Date  Round #  Court #                       Names
0  2018-06-12        1        1  (Adam, Bill, Cindy, Derek)


... but maybe that'd cause other problems.
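For reference, the one-row-per-game layout described above could be sketched roughly as follows. This is only an illustration on a subset of the sample data; the names `game_keys`, `games` and `bill_games` are made up for the example, and it assumes a game is fully identified by date, round and court.

```python
import pandas as pd

# A subset of the sample data: round 1 on courts 1 and 2,
# plus Bill's round-2 game on court 1
player_data = pd.DataFrame(
    data=[
        ('2018-06-12', 1, 1, 1, 'Adam'),
        ('2018-06-12', 1, 1, 2, 'Bill'),
        ('2018-06-12', 1, 1, 3, 'Cindy'),
        ('2018-06-12', 1, 1, 4, 'Derek'),
        ('2018-06-12', 1, 2, 1, 'Edward'),
        ('2018-06-12', 1, 2, 2, 'Fred'),
        ('2018-06-12', 1, 2, 3, 'George'),
        ('2018-06-12', 1, 2, 4, 'Harry'),
        ('2018-06-12', 2, 1, 1, 'Karl'),
        ('2018-06-12', 2, 1, 2, 'Cindy'),
        ('2018-06-12', 2, 1, 3, 'Bill'),
        ('2018-06-12', 2, 1, 4, 'Derek'),
    ],
    columns=['Date', 'Round #', 'Court #', 'Space', 'Name'],
)

# Assumption: these three columns together identify one game
game_keys = ['Date', 'Round #', 'Court #']

# Collapse each game into a single row holding a tuple of its players
games = (
    player_data.groupby(game_keys)['Name']
    .agg(tuple)
    .reset_index(name='Names')
)

# Keep only the games whose tuple of names contains Bill
bill_games = games[games['Names'].apply(lambda names: 'Bill' in names)]
print(bill_games)
```

The membership check runs as a Python-level function per row, so this layout is convenient for reading but may be slower than column-based filtering on a large history.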










      python python-3.x pandas
















asked Nov 15 at 1:49 by Plato's Cave
























          2 Answers
Accepted answer, score 3


















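Filter for Bill's rows, keeping only the columns that identify a game, then merge the result back into the original dataframe:

# One row per game Bill played in, identified by date, round and court
s1 = player_data.loc[player_data['Name'] == 'Bill', ['Date', 'Round #', 'Court #']]
# merge joins on the shared columns (Date, Round #, Court #) by default,
# pulling in every player from each of those games
s2 = s1.merge(player_data, how='left')
s2
Out[12]:
         Date  Round #  Court #  Space   Name
0  2018-06-12        1        1      1   Adam
1  2018-06-12        1        1      2   Bill
2  2018-06-12        1        1      3  Cindy
3  2018-06-12        1        1      4  Derek
4  2018-06-12        2        1      1   Karl
5  2018-06-12        2        1      2  Cindy
6  2018-06-12        2        1      3   Bill
7  2018-06-12        2        1      4  Derek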
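Building on the merge approach, the original "who did Bill play most with?" question could be answered with a sketch along these lines. It runs on a subset of the sample data, the names `game_keys`, `with_bill` and `partner_counts` are illustrative, and it counts everyone who shared a game with Bill, since the data does not record who was on which side.

```python
import pandas as pd

# Bill's two games from the sample data, plus one game without him
player_data = pd.DataFrame(
    data=[
        ('2018-06-12', 1, 1, 1, 'Adam'),
        ('2018-06-12', 1, 1, 2, 'Bill'),
        ('2018-06-12', 1, 1, 3, 'Cindy'),
        ('2018-06-12', 1, 1, 4, 'Derek'),
        ('2018-06-12', 1, 2, 1, 'Edward'),
        ('2018-06-12', 1, 2, 2, 'Fred'),
        ('2018-06-12', 1, 2, 3, 'George'),
        ('2018-06-12', 1, 2, 4, 'Harry'),
        ('2018-06-12', 2, 1, 1, 'Karl'),
        ('2018-06-12', 2, 1, 2, 'Cindy'),
        ('2018-06-12', 2, 1, 3, 'Bill'),
        ('2018-06-12', 2, 1, 4, 'Derek'),
    ],
    columns=['Date', 'Round #', 'Court #', 'Space', 'Name'],
)

# Assumption: these three columns together identify one game
game_keys = ['Date', 'Round #', 'Court #']

# Every row belonging to a game that Bill played in
bill_games = player_data.loc[player_data['Name'] == 'Bill', game_keys]
with_bill = bill_games.merge(player_data, how='left')

# Drop Bill himself, then count how often each other player appears
partner_counts = (
    with_bill.loc[with_bill['Name'] != 'Bill', 'Name'].value_counts()
)
print(partner_counts)
```

On this subset Cindy and Derek each appear twice alongside Bill, while Adam and Karl appear once.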
















answered Nov 15 at 1:53 by W-B
























Answer, score 0























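My method is to filter on the date, round and court of Bill's games, matched together:

keys = ['Date', 'Round #', 'Court #']
# Rows for Bill only
bill_player_data = player_data.loc[player_data['Name'] == 'Bill']
# (date, round, court) combinations of the games Bill played in
bill_games = pd.MultiIndex.from_frame(bill_player_data[keys])
# Keep every row whose combination matches one of Bill's games
bill = player_data.loc[pd.MultiIndex.from_frame(player_data[keys]).isin(bill_games)]
bill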
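To see why round and court need to be matched as a pair rather than filtered one column at a time, here is a minimal sketch on hypothetical data (not the sample from the question) in which Bill changes court between rounds. The `groupby(...).transform('any')` mask is one possible way to match games jointly; the names `df`, `independent`, `in_bill_game` and `joint` are made up for the example.

```python
import pandas as pd

# Hypothetical schedule: Bill is on court 1 in round 1
# and on court 2 in round 2
df = pd.DataFrame(
    data=[
        ('2018-06-12', 1, 1, 'Bill'),
        ('2018-06-12', 1, 1, 'Adam'),
        ('2018-06-12', 1, 2, 'Cindy'),
        ('2018-06-12', 1, 2, 'Derek'),
        ('2018-06-12', 2, 1, 'Adam'),
        ('2018-06-12', 2, 1, 'Cindy'),
        ('2018-06-12', 2, 2, 'Bill'),
        ('2018-06-12', 2, 2, 'Derek'),
    ],
    columns=['Date', 'Round #', 'Court #', 'Name'],
)
game_keys = ['Date', 'Round #', 'Court #']
bill_rows = df[df['Name'] == 'Bill']

# Independent filters: Bill's rounds are {1, 2} and his courts are {1, 2},
# so every row passes both checks, including games he never played in
independent = df[
    df['Round #'].isin(bill_rows['Round #'])
    & df['Court #'].isin(bill_rows['Court #'])
]

# Joint filter: flag every row of a game in which at least one row is Bill's
in_bill_game = (
    df['Name'].eq('Bill')
    .groupby([df[key] for key in game_keys])
    .transform('any')
)
joint = df[in_bill_game]

print(len(independent), len(joint))
```

The joint mask also keeps the original row index, which matches the indexing shown in the question's desired output.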














answered Nov 15 at 2:10 (edited Nov 15 at 2:42) by Railey Shahril












                    • This didn't work: the 'bill' dataframe was the same as the original 'player_data' dataframe
                      – Plato's Cave
                      Nov 15 at 2:19










                    • Sorry, edited the code. I forgot to group the court, cheers
                      – Railey Shahril
                      Nov 15 at 2:43

















