missing_compare error from “finalfit” package











I am trying to get this command 'missing_compare' from the 'finalfit' package to work for my dataset:



library(finalfit)
library(dplyr)   # provides the %>% pipe

proced <- c(1, NA, 0, 1, 0, 1, 0)
asa <- c(4, 3, 4, 2, 5, 1, NA)
albumin <- c(NA, NA, 3.572, NA, NA, NA, 4.262)
death <- c(0, 0, 1, 0, 1, 1, 0)
bmi <- c(26.04, NA, 31.23, 36.93, 28.9, NA, 30.01)
dataframe <- data.frame(proced, asa, albumin, death, bmi)


(This data frame is actually a lot bigger)



Then:



dataframe$death = factor(dataframe$death)
dataframe$proced = factor(dataframe$proced)
dataframe$asa = factor(dataframe$asa)


And then:



explanatory = c("proced", "asa", "bmi", "albumin")
dependent = "death"

dataframe %>%
  summary_factorlist(dependent, explanatory,
                     na_include = TRUE, p = TRUE)


But I can't get this to work:



dataframe %>%
  missing_compare(dependent, explanatory)


I get this error when I try to do the missing_compare command with my entire dataset:



Error in `[.default`(x, , 2) : subscript out of bounds
In addition: Warning messages:
1: In cor(x, rank(y)) : the standard deviation is zero
2: In cor(x, rank(y)) : the standard deviation is zero


Help!










  • Can you use SO edit facilities to fix the code that produces an error at the step where dataframe is assigned? Error in data.frame(proced, as, albumin, death, bmi) : arguments imply differing number of rows: 7, 0
    – 42-
    Nov 15 at 0:32















r missing-data






asked Nov 14 at 22:06 by CBA

2 Answers
As stated by @astrofunkswag, the purpose of this function is to compare the distribution of missingness across a particular variable.



You would be better starting with a visualisation of your missing data e.g.




dataframe %>%
  missing_pairs(dependent, explanatory)



That will help you understand what data you have.
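As a quick numeric complement to the plot, the amount of missing data per variable can also be tabulated in base R. This is only a sketch on the toy data from the question (assuming the second column is meant to be `asa`); it does not use finalfit at all.

```r
# Toy data from the question, with the column name `asa` spelled out.
dataframe <- data.frame(
  proced  = c(1, NA, 0, 1, 0, 1, 0),
  asa     = c(4, 3, 4, 2, 5, 1, NA),
  albumin = c(NA, NA, 3.572, NA, NA, NA, 4.262),
  death   = c(0, 0, 1, 0, 1, 1, 0),
  bmi     = c(26.04, NA, 31.23, 36.93, 28.9, NA, 30.01)
)

# Number and percentage of missing values in each column.
n_missing   <- colSums(is.na(dataframe))
pct_missing <- round(100 * n_missing / nrow(dataframe), 1)
print(data.frame(n_missing, pct_missing))
```

On this toy data `death` has no missing values at all, while `albumin` is missing in 5 of 7 rows.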



One sweats away at extensive vignettes and people accuse you of sparse documentation :)
http://finalfit.org/articles/missing.html



Let me know if you still can't get it to work.






Your dependent variable death has no missing values, and comparing the rows where the dependent variable is missing with the rows where it is not is the whole point of missing_compare. Check the documentation for that function for more information, though it is fairly sparse.



The missing_compare function compares each explanatory variable between the rows where the dependent variable is missing and the rows where it is not. It applies statistical tests to assess whether the two groups come from the same distribution.
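To see why a dependent variable without missing values leaves nothing to compare, the grouping can be sketched in base R. This is an illustration of the single-group situation only, not a trace of what finalfit does internally; the names `death_missing` and `tab` are made up for the example.

```r
death <- c(0, 0, 1, 0, 1, 1, 0)
bmi   <- c(26.04, NA, 31.23, 36.93, 28.9, NA, 30.01)

# Indicator for missingness of the dependent variable.
death_missing <- factor(is.na(death), levels = c(FALSE, TRUE),
                        labels = c("Not missing", "Missing"))

# Every row falls into "Not missing", so there is no second group.
print(table(death_missing))

# With the empty level dropped, a cross-tabulation has a single column,
# and asking for a second column fails.
tab <- table(bmi_missing = is.na(bmi),
             death_missing = droplevels(death_missing))
print(dim(tab))
msg <- tryCatch(tab[, 2], error = function(e) conditionMessage(e))
print(msg)
```

A one-column table like this is consistent with the "subscript out of bounds" error in the question, although the exact code path inside the package may differ.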



Using your example to illustrate this (note that I reduce the number of explanatory variables for simplicity):



explanatory = c("proced", "bmi")
dependent = "death"

dataframe2 <- dataframe
dataframe2$death[3:4] = NA

dataframe2 %>%
  missing_compare(dependent, explanatory)

  Missing data analysis: death            Not missing   Missing      p
2                       proced         0     2 (66.7)  1 (33.3)  1.000
3                                      1     2 (66.7)  1 (33.3)
1                          bmi Mean (SD)     28.3 (2)  34.1 (4)  0.058
Warning message:
In chisq.test(tab, correct = FALSE) :
  Chi-squared approximation may be incorrect


I added two NA values to the dependent variable death, and the code now runs. For example, the function compares the bmi values in rows where death is missing with those in rows where it is not. The p column indicates whether the difference between the groups is statistically significant (chi-squared for the categorical variables, Kruskal-Wallis for the continuous ones). I'd caution against relying solely on a p-value for this type of analysis, but that is beside the point of how the code works.
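The same kind of comparison can be sketched by hand in base R, which may help when checking what the table reports. This is a rough equivalent under the assumption that a chi-squared test and a rank-based test are the relevant ones; it is not finalfit's own implementation, and `death_missing` is a name made up for the example.

```r
proced <- factor(c(1, NA, 0, 1, 0, 1, 0))
bmi    <- c(26.04, NA, 31.23, 36.93, 28.9, NA, 30.01)
death  <- c(0, 0, 1, 0, 1, 1, 0)
death[3:4] <- NA   # the same artificial NAs as in the example above

death_missing <- factor(is.na(death), levels = c(FALSE, TRUE),
                        labels = c("Not missing", "Missing"))

# Categorical explanatory variable: cross-tabulate, then chi-squared test.
tab <- table(proced, death_missing)
print(tab)
chi <- suppressWarnings(chisq.test(tab, correct = FALSE))

# Continuous explanatory variable: rank-based comparison of the two groups.
kw <- kruskal.test(bmi ~ death_missing)

print(c(chisq_p = chi$p.value, kruskal_p = kw$p.value))
```

The counts in `tab` match the proced rows of the table, and the chi-squared p-value agrees with the 1.000 shown. The p-value from `kruskal.test` need not match the 0.058 reported for bmi, since the package may use a different approximation for its rank test.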



And welcome to Stack Overflow!

Edit: great vignette






answered Nov 26 at 21:51 by Ewen

edited Nov 26 at 22:06
answered Nov 14 at 22:44 by astrofunkswag
