Compare dataframe columns with conditions












I have two DataFrames, as below:

df1:

ID  col1  col2
1   A1    B1
2   A2    B2
3   A3    B3
4   A4    B4
5   A5    B5
6   A6    B6

df2:

col1  col2
A1    B1
A2    O5
H3    B3
A4    B4
A5    66
A6    C6

Expected result: I would like to generate a result DataFrame based on this condition: each value in col1 and col2 of df1 should exist in the col1 and col2 values of df2.

Expected result df:

ID  col1  col2  Error
1   A1    B1    No mismatch with df2
2   A2    B2    col2 mismatch with df2
3   A3    B3    col1 mismatch with df2
4   A4    B4    No mismatch with df2
5   A5    B5    col2 mismatch with df2
6   A6    B6    col2 mismatch with df2

python pandas dataframe






edited Nov 22 '18 at 11:45
Osceria

asked Nov 21 '18 at 23:54
Osceria

  • You do not have list in df2

    – Wen-Ben
    Nov 22 '18 at 1:25

  • list is a column in df1, and its values list1 and list2 are just dropdown list names; the accepted values are given in columns list1 and list2 of df2. So the data in the "value" column of df1 should be checked, based on its list value, against the list1 and list2 values of df2.

    – Osceria
    Nov 22 '18 at 9:54

  • Edited the question.

    – Osceria
    Nov 22 '18 at 11:46

2 Answers

Create a helper DataFrame with a dictionary comprehension, testing membership column by column with isin:

    import numpy as np
    import pandas as pd

    # True where a df1 value does not appear in the same-named df2 column
    m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in ['col1', 'col2']})
    print(m)

        col1   col2
    0  False  False
    1  False   True
    2   True  False
    3  False  False
    4  False   True
    5  False   True

Then use numpy.where with a mask built by any(axis=1), which tests for at least one True per row, and dot (matrix multiplication of the boolean mask with the column names) to get the names of the mismatched columns:

    df1['Error'] = np.where(m.any(axis=1),
                            m.dot(m.columns + ', ').str.rstrip(', ') + ' mismatch with df2',
                            'No mismatch with df2')
    print(df1)

       ID col1 col2                   Error
    0   1   A1   B1    No mismatch with df2
    1   2   A2   B2  col2 mismatch with df2
    2   3   A3   B3  col1 mismatch with df2
    3   4   A4   B4    No mismatch with df2
    4   5   A5   B5  col2 mismatch with df2
    5   6   A6   B6  col2 mismatch with df2
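
The two steps above can be run end to end with the sketch below. The construction of df1 and df2 is an assumption rebuilt from the tables in the question, and the intermediate variable `names` is introduced only for readability. Note that isin tests whether a value occurs anywhere in the df2 column; it is not a row-by-row equality check.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the tables in the question (an assumption about the real data)
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "col1": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "col2": ["B1", "B2", "B3", "B4", "B5", "B6"],
})
df2 = pd.DataFrame({
    "col1": ["A1", "A2", "H3", "A4", "A5", "A6"],
    "col2": ["B1", "O5", "B3", "B4", "66", "C6"],
})

# Boolean mask: True where the df1 value is missing from the same-named df2 column
m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in ["col1", "col2"]})

# A bool times a string gives the string or '', so the dot product
# concatenates the names of the columns flagged True in each row
names = m.dot(m.columns + ", ").str.rstrip(", ")

df1["Error"] = np.where(m.any(axis=1),
                        names + " mismatch with df2",
                        "No mismatch with df2")
print(df1)
```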






answered Nov 22 '18 at 12:08
jezrael

  • m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in ['col1','col2']}) - col1 and col2 are hard-coded. When I try to pass the column names directly from the DataFrame using df.cols, I get the error "ValueError: Must pass DataFrame with boolean values only". Any help with this?

    – Osceria
    Nov 22 '18 at 14:37

  • The code should work if I pass all the columns from the DataFrame like this: lovcols = df2.columns; m = pd.DataFrame({c: ~dfCSDataset[c].isin(dfLOVRules[c]) for c in [lovcols]})

    – Osceria
    Nov 22 '18 at 14:41

  • @Osceria - yes, you are right. You can also pass the columns to the dict comprehension, like m = pd.DataFrame({c: ~df1[c].isin(df2[c]) for c in df2.columns})

    – jezrael
    Nov 22 '18 at 14:43

  • Yeah, it works this way too.

    – Osceria
    Nov 22 '18 at 15:07
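
As discussed in the comments, the column list does not have to be hard-coded. One way to sketch this, assuming the goal is to check every column the two frames share, is to intersect the column indexes so that columns present in only one frame (such as ID) are skipped. The helper name `mismatch_mask` and the three-row sample frames are hypothetical, not from the answer.

```python
import pandas as pd

def mismatch_mask(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Flag values of `left` that are absent from the same-named column of `right`.

    Hypothetical helper for illustration; only columns present in both frames are checked.
    """
    shared = left.columns.intersection(right.columns)
    # Iterate over the individual column labels, not a list wrapping the Index
    return pd.DataFrame({c: ~left[c].isin(right[c]) for c in shared})

# Shortened sample data (an assumption, cut down from the question's tables)
df1 = pd.DataFrame({"ID": [1, 2, 3],
                    "col1": ["A1", "A2", "A3"],
                    "col2": ["B1", "B2", "B3"]})
df2 = pd.DataFrame({"col1": ["A1", "A2", "H3"],
                    "col2": ["B1", "O5", "B3"]})

m = mismatch_mask(df1, df2)
print(m)
```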



















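Something like this should do the trick, but there may be an easier way.

    import pandas as pd

    # Row-by-row equality for the columns that df2 shares with df1;
    # both frames must have the same index, or == raises a ValueError
    diff = pd.concat([df1[col] == df2[col] for col in df2.columns], axis=1)

    def m(row):
        # Collect the names of the columns whose values differ in this row
        mismatches = []
        for col in diff.columns:
            if not row[col]:
                mismatches.append(col)
        if not mismatches:
            return 'No mismatch'
        return 'Mismatches: ' + ', '.join(mismatches)

    df1['Error'] = diff.apply(m, axis=1)
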
answered Nov 22 '18 at 0:20
lieblos

  • When I try this, I get the error "ValueError: Can only compare identically-labeled Series objects"

    – Osceria
    Nov 22 '18 at 10:23

  • Edited the Question

    – Osceria
    Nov 22 '18 at 11:46

  • @Osceria do you get the same error with the following reproducible datasets: df1 = pd.DataFrame({'col1': ["A1", "A2", "A3", "A4", "A5", "A6"], 'col2': ["B1", "B2", "B3", "B4", "B5", "B6"]}) df2 = pd.DataFrame({'col1': ["A1", "A2", "H3", "A4", "A5", "A6"], 'col2': ["B1", "O5", "B3", "B4", "66", "C6"]})

    – leoburgy
    Nov 22 '18 at 12:01

  • It's because your df1 and df2 had different columns, right? I noticed you edited the question now, does it work with those dataframes?

    – lieblos
    Nov 22 '18 at 12:47

  • If I run what I answered with the dataframes above, it seems like it works.

    – lieblos
    Nov 22 '18 at 12:49
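
The ValueError quoted in these comments is what pandas raises when == is applied to two Series whose index labels differ. A minimal sketch, assuming a purely positional (row-by-row) comparison is intended and both frames have the same number of rows, is to compare the underlying NumPy arrays instead. The shortened sample data and the shifted index on df2 are made up for the demonstration.

```python
import pandas as pd

df1 = pd.DataFrame({"col1": ["A1", "A2", "A3"], "col2": ["B1", "B2", "B3"]})
# Same length as df1 but different index labels (made up for the demonstration)
df2 = pd.DataFrame({"col1": ["A1", "A2", "H3"], "col2": ["B1", "O5", "B3"]},
                   index=[10, 11, 12])

try:
    df1["col1"] == df2["col1"]  # labels differ, so pandas refuses to compare
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Positional comparison: drop the labels by comparing NumPy arrays
equal = pd.DataFrame(
    {c: df1[c].to_numpy() == df2[c].to_numpy() for c in df2.columns},
    index=df1.index,
)
print(equal)
```

If the row labels do carry meaning, aligning the two frames on a shared key before comparing would be the safer choice; dropping the labels is only appropriate when row order is the thing being compared.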


















