Comparing Character Variables of IDs Across 2 Data Frames




















merge1 <- within(merge(df1, df2, by=c("ID"), all=F), {
  AD <- A.x - A.y
  BD <- B.x - B.y
  CD <- C.x - C.y
  DC <- ifelse(df1$D != df2$D | df1$D == "TOT" | df2$D == "TOT", 1, 0)
})[,c("ID","AD","BD","CD","DC")]


I want to compare statistics for IDs across two data sets; imagine each df holds one year of data. The code above works exactly as I want except when I add the "DC" variable with an ifelse statement. The two data frames are not of equal length, and IDs that exist in df1 may not exist in df2 and vice versa.

The D variable in each data frame holds the ID's organization. In the merged df I want DC to be a binary flag for whether the ID changed organizations: 1 if D from df1 does not match D from df2, and also 1 if D in either or both data frames is labeled TOT. It should be 0 only if df1$D == df2$D and TOT is not assigned to the ID.

Can ifelse() be used this way, or am I doing something wrong? I am a bit new to R, so I appreciate the help in advance.



Edit



Here is the resulting error output:



Error in `[<-.data.frame`(`*tmp*`, nl, value = list(TmC = c(1, 1, 0, 1,  : 
replacement element 1 has 486 rows, need 576
In addition: Warning messages:
1: In if (all.x) all.x <- (nxx <- length(m$x.alone)) > 0L :
the condition has length > 1 and only the first element will be used
2: In if (all.y) all.y <- (nyy <- length(m$y.alone)) > 0L :
the condition has length > 1 and only the first element will be used
3: In is.na(e1) | is.na(e2) :
longer object length is not a multiple of shorter object length
4: In `!=.default`(nbasumadv1617$Tm, nbasumadv1516$Tm) :
longer object length is not a multiple of shorter object length
5: In nbasumadv1617$Tm != nbasumadv1516$Tm | nbasumadv1617$Tm == "TOT" | :
longer object length is not a multiple of shorter object length


I used simplified variable names in the description above. The code worked as I wanted without the fourth variable, "DC", which is labeled TmC in my actual code. With it, I believe the code produces nothing new, since the result shows me nothing different from what I had without that variable. The first two warnings also appeared without "DC", but that's fine. The last three warnings and the TmC replacement element mismatch error are new.



































  • Please describe current undesired result or error. What does code do now?

    – Parfait
    Nov 23 '18 at 5:01













  • @Parfait Sorry about that. Edit made.

    – Nisode
    Nov 23 '18 at 5:35


















edited Nov 23 '18 at 13:47







Nisode

















asked Nov 23 '18 at 4:20









Nisode

























1 Answer














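Simply adjust the ifelse call to use the merged columns, D.x and D.y, rather than the original df1$D and df2$D. Inside within(), every column of the merged data frame has the same length, since a data frame is by definition a list of equal-length vectors. df1$D and df2$D, however, keep the lengths of their source data frames, which differ from each other and from the merged result. That is what triggers the "longer object length is not a multiple of shorter object length" warnings and the "replacement element 1 has 486 rows, need 576" error.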



merge1 <- within(merge(df1, df2, by=c("ID"), all=FALSE),
  {
    AD <- A.x - A.y
    BD <- B.x - B.y
    CD <- C.x - C.y
    DC <- ifelse(D.x != D.y | D.x == "TOT" | D.y == "TOT", 1, 0)
  }
)[,c("ID","AD","BD","CD","DC")]
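To see the effect on a small scale, here is a minimal sketch with made-up data. The column names follow the question, but the values, the two toy data frames, and the names merged and result are assumptions for illustration only, and only one numeric column (A) is included to keep it short.

```r
# Toy data: values are invented, only the column names come from the question.
# stringsAsFactors = FALSE keeps D as character in older R versions too.
df1 <- data.frame(ID = c(1, 2, 3, 4),
                  A  = c(10, 20, 30, 40),
                  D  = c("BOS", "TOT", "LAL", "NYK"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ID = c(2, 3, 4, 5, 6),
                  A  = c(18, 33, 41, 50, 60),
                  D  = c("MIA", "LAL", "CHI", "DEN", "UTA"),
                  stringsAsFactors = FALSE)

# Inner join on ID: only IDs 2, 3 and 4 appear in both frames.
merged <- merge(df1, df2, by = "ID", all = FALSE)

# df1$D has 4 elements and df2$D has 5, but the merged frame has 3 rows,
# so only D.x and D.y line up row by row with the merged result.
c(nrow(df1), nrow(df2), nrow(merged))

result <- within(merged, {
  AD <- A.x - A.y
  DC <- ifelse(D.x != D.y | D.x == "TOT" | D.y == "TOT", 1, 0)
})[, c("ID", "AD", "DC")]

# DC is 1, 0, 1 for IDs 2, 3, 4:
# ID 2 has TOT in df1, ID 3 stayed with LAL, ID 4 moved from NYK to CHI.
print(result)
```

Because D.x and D.y are columns of the merged frame, the comparison is done per matched ID and the result always has as many elements as the merged frame has rows.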





























  • Thanks a bunch! This does exactly what I need it to. The solution was much simpler than I thought it would be. I don't know why I didn't think of following the previous merged variable conventions.

    – Nisode
    Nov 24 '18 at 1:59











  • Glad to help! Haha...All part of the learning process. We all did many facepalms!

    – Parfait
    Nov 24 '18 at 2:52












answered Nov 23 '18 at 16:07









Parfait












