Comparing Character Variables of IDs Across 2 Data Frames
merge1 <- within(merge(df1, df2, by=c("ID"),all=F), {
  AD <- A.x - A.y
  BD <- B.x - B.y
  CD <- C.x - C.y
  DC <- ifelse(df1$D != df2$D | df1$D == "TOT" | df2$D == "TOT", 1, 0)
})[,c("ID","AD","BD","CD","DC")]
I want to compare statistics of IDs across two data sets. Imagine each df holds the data for one year. The code above works exactly as I want, except when I try to add the "DC" variable with an ifelse statement. The two data frames are not of equal length, and an ID that exists in df1 may not exist in df2 and vice versa.
The D variable in each data frame holds the organization for each ID. In the merged df, I want a binary variable that indicates whether or not the ID changed organizations. That is why I use ifelse: if D from df1 does not match D from df2, it should output 1. If D in either or both data frames is labeled TOT, it should also output 1. It should output 0 only if df1$D == df2$D and TOT is not assigned to the ID. Can ifelse() be used this way, or am I doing something wrong? I am a bit new to R, so I appreciate the help in advance.
Edit
Here is the resulting error code:
Error in `[<-.data.frame`(`*tmp*`, nl, value = list(TmC = c(1, 1, 0, 1, :
replacement element 1 has 486 rows, need 576
In addition: Warning messages:
1: In if (all.x) all.x <- (nxx <- length(m$x.alone)) > 0L :
the condition has length > 1 and only the first element will be used
2: In if (all.y) all.y <- (nyy <- length(m$y.alone)) > 0L :
the condition has length > 1 and only the first element will be used
3: In is.na(e1) | is.na(e2) :
longer object length is not a multiple of shorter object length
4: In `!=.default`(nbasumadv1617$Tm, nbasumadv1516$Tm) :
longer object length is not a multiple of shorter object length
5: In nbasumadv1617$Tm != nbasumadv1516$Tm | nbasumadv1617$Tm == "TOT" | :
longer object length is not a multiple of shorter object length
The variable names I described here are a simplified version of the ones in my actual code. The code worked as I wanted without the fourth "DC" variable, which is labeled TmC in my actual code. I believe the failing code produces nothing, since the result looks no different from what I had created without that fourth variable. The first two warnings also appeared without the fourth "DC" variable, but that's fine. The last three warnings and the TmC replacement element mismatch error are new.
r
Please describe current undesired result or error. What does code do now?
– Parfait
Nov 23 '18 at 5:01
@Parfait Sorry about that. Edit made.
– Nisode
Nov 23 '18 at 5:35
asked Nov 23 '18 at 4:20 by Nisode, edited Nov 23 '18 at 13:47
1 Answer
Simply adjust the ifelse call to use the actual merged columns, D.x and D.y, rather than their original sources. Merging makes all columns the same length, since by definition a data frame is a list of equal-length atomic vectors. Referencing df1$D and df2$D, as you originally do, can involve vectors of different lengths, hence your error.
merge1 <- within(merge(df1, df2, by=c("ID"), all=FALSE),
  {
    AD <- A.x - A.y
    BD <- B.x - B.y
    CD <- C.x - C.y
    DC <- ifelse(D.x != D.y | D.x == "TOT" | D.y == "TOT", 1, 0)
  }
)[,c("ID","AD","BD","CD","DC")]
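To see the difference on something small, here is a minimal sketch with made-up data. The values, the organization codes other than "TOT", and the object names `merged` and `toy` are assumptions for illustration only, since the real data frames are not shown in the question. It checks that the source columns have different lengths while the merged columns do not, and then builds the indicator from D.x and D.y.

```r
# Toy data: hypothetical values, not the asker's real data
df1 <- data.frame(ID = c(1, 2, 3, 4),
                  A  = c(10, 20, 30, 40),
                  D  = c("ATL", "BOS", "TOT", "CHI"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ID = c(2, 3, 4, 5, 6),
                  A  = c(15, 25, 35, 45, 55),
                  D  = c("BOS", "DEN", "CHI", "LAL", "MIA"),
                  stringsAsFactors = FALSE)

# The source columns differ in length, so comparing them element-wise
# cannot line up with the rows of the merged data frame
length(df1$D)
length(df2$D)

# The inner join keeps only IDs present in both data frames
merged <- merge(df1, df2, by = "ID", all = FALSE)
nrow(merged)

# Inside within(), D.x and D.y are the merged columns, so they always
# have exactly nrow(merged) elements
toy <- within(merged, {
  AD <- A.x - A.y
  DC <- ifelse(D.x != D.y | D.x == "TOT" | D.y == "TOT", 1, 0)
})[, c("ID", "AD", "DC")]
toy
```

In this toy case only ID 3 gets DC equal to 1, because its D.x is "TOT"; IDs 2 and 4 keep the same organization in both data frames and get 0.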
Thanks a bunch! This does exactly what I need it to. The solution was much simpler than I thought it would be. I don't know why I didn't think of following the previous merged variable conventions.
– Nisode
Nov 24 '18 at 1:59
Glad to help! Haha...All part of the learning process. We all did many facepalms!
– Parfait
Nov 24 '18 at 2:52
answered Nov 23 '18 at 16:07 by Parfait