Creating dummies with apply in R











I have data on the study strategies individuals use, stored in columns named Strategy_A, Strategy_B, and Strategy_C. The strategies are coded 1-15. Because each student can list up to three strategies, I want to create one dummy per strategy (e.g. Strategy1, Strategy2, etc.) rather than one per column.



Example Data



ID = c(1, 2, 3, 4, 5)
Strategy_A = c(10, 12, 13, 1, 2)
Strategy_B = c(1, 2, 1, 4, 5)
Strategy_C = c(2, 3, 6, 8, 15)
all = data.frame(ID, Strategy_A, Strategy_B, Strategy_C)


I thought about using apply with a small wrapper function around dummy_cols() from the fastDummies package.



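library(fastDummies)

# wrapper around fastDummies::dummy_cols()
dummies = function(x){
  dummy_cols(x)
}

# call the wrapper once per strategy column (every column except ID)
new = apply(all[, -1], 2, dummies)
new = as.data.frame(new)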


However, this creates separate dummies for each column (Strategy_A_1, Strategy_A_2, Strategy_A_3, ...) rather than combining them into Strategy1, Strategy2, Strategy3, and so on. Any ideas how to fix this?







Tags: r, apply
















asked 5 hours ago by Student












please describe the output you expected.
– Darren Tsai
5 hours ago
















Sounds like you're interested in creating a dummy for each combination of the three variables? In which case you might need to create another variable that combines them and then create the dummy variables from that.
– Cleland
5 hours ago
















2 Answers

















After reshaping all so that the three strategy columns are stacked into a single Strategy column, you can use dummy.data.frame() from the dummies package (dummy_cols() from fastDummies also works) and then aggregate per ID.



# stack the three strategy columns into one column "Strategy", repeating ID to match
all <- data.frame(ID = rep(all$ID, 3),
                  Strategy = c(all$Strategy_A, all$Strategy_B, all$Strategy_C))
library(dummies)
all <- dummy.data.frame(all, "Strategy") # or fastDummies::dummy_cols(all, "Strategy")
# each dummy sums to 0 or 1 per ID, provided no student lists the same strategy twice
aggregate(. ~ ID, all, sum)
# output
#   ID Strategy1 Strategy2 Strategy3 Strategy4 Strategy5 Strategy6 Strategy8 Strategy10 Strategy12 Strategy13 Strategy15
# 1  1         1         1         0         0         0         0         0          1          0          0          0
# 2  2         0         1         1         0         0         0         0          0          1          0          0
# 3  3         1         0         0         0         0         1         0          0          0          1          0
# 4  4         1         0         0         1         0         0         1          0          0          0          0
# 5  5         0         1         0         0         1         0         0          0          0          0          1














answered 4 hours ago by ANG (edited 4 hours ago)
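For readers without the dummies package, the same reshape-then-aggregate idea can be sketched in base R. This is only an illustration and not part of the answer above: the names strategies, long, counts and result are made up for the example, and the data frame is called strategies rather than all so that it does not share a name with base R's all() function. Cross-tabulating ID against the stacked strategy codes and capping the counts at 1 should give one 0/1 column per strategy that occurs in the data.

```r
# Example data from the question (data frame name is hypothetical)
strategies <- data.frame(
  ID = c(1, 2, 3, 4, 5),
  Strategy_A = c(10, 12, 13, 1, 2),
  Strategy_B = c(1, 2, 1, 4, 5),
  Strategy_C = c(2, 3, 6, 8, 15)
)

# Stack the three strategy columns into one long column, repeating ID to match
long <- data.frame(
  ID = rep(strategies$ID, times = 3),
  Strategy = c(strategies$Strategy_A, strategies$Strategy_B, strategies$Strategy_C)
)

# Count how often each ID lists each strategy code (rows = ID, columns = code)
counts <- unclass(table(long$ID, long$Strategy))

# Turn counts into 0/1 indicators so a repeated code is not counted twice
dummies <- (counts > 0) * 1L
colnames(dummies) <- paste0("Strategy", colnames(dummies))

# Put the ID back in front of the indicator columns
result <- data.frame(ID = as.numeric(rownames(dummies)), dummies, row.names = NULL)
result
```

Strategy codes that no student listed (7, 9, 11 and 14 in the example data) get no column here. If all 15 columns are needed, one option is to pass factor(long$Strategy, levels = 1:15) to table() instead of the raw codes.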
























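Here is a tidyverse approach: reshape the strategy columns to long format, drop the column-name key, and spread the strategy codes back out into one indicator column per strategy.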



library(tidyverse)

new <- all %>%
  gather(key, value, -ID) %>%      # long format: one row per ID and strategy column
  mutate(key = NULL, num = 1) %>%  # drop the source column name, add an indicator
  spread(value, num)               # one column per strategy code

#   ID  1  2  3  4  5  6  8 10 12 13 15
# 1  1  1  1 NA NA NA NA NA  1 NA NA NA
# 2  2 NA  1  1 NA NA NA NA NA  1 NA NA
# 3  3  1 NA NA NA NA  1 NA NA NA  1 NA
# 4  4  1 NA NA  1 NA NA  1 NA NA NA NA
# 5  5 NA  1 NA NA  1 NA NA NA NA NA  1

new[is.na(new)] <- 0  # NA means the student did not list that strategy
new

#   ID  1  2  3  4  5  6  8 10 12 13 15
# 1  1  1  1  0  0  0  0  0  1  0  0  0
# 2  2  0  1  1  0  0  0  0  0  1  0  0
# 3  3  1  0  0  0  0  1  0  0  0  1  0
# 4  4  1  0  0  1  0  0  1  0  0  0  0
# 5  5  0  1  0  0  1  0  0  0  0  0  1
















answered 4 hours ago by Darren Tsai
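In current tidyr, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A rough equivalent of the pipeline above might look like the sketch below, assuming dplyr and a reasonably recent tidyr (1.1.0 or later, as far as I know, for a scalar values_fill). The names strategies, slot, strategy, used and wide are hypothetical and chosen only for this example; it is not the answerer's code.

```r
library(dplyr)
library(tidyr)

# Example data from the question (data frame name is hypothetical)
strategies <- data.frame(
  ID = c(1, 2, 3, 4, 5),
  Strategy_A = c(10, 12, 13, 1, 2),
  Strategy_B = c(1, 2, 1, 4, 5),
  Strategy_C = c(2, 3, 6, 8, 15)
)

wide <- strategies %>%
  # long format: one row per ID and strategy slot (A, B or C)
  pivot_longer(-ID, names_to = "slot", values_to = "strategy") %>%
  # keep each ID/strategy pair once, in case a student repeats a code
  distinct(ID, strategy) %>%
  # sort so the new columns are created in numeric order of the code
  arrange(strategy) %>%
  mutate(used = 1L) %>%
  # one column per strategy code, filled with 0 where the code was not listed
  pivot_wider(names_from = strategy, values_from = used,
              names_prefix = "Strategy", values_fill = 0L) %>%
  arrange(ID)

wide
```

Compared with the gather()/spread() version, names_prefix gives the columns the Strategy1, Strategy2, ... names asked for in the question, and values_fill removes the need for a separate NA replacement step.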






























                       
