Python Pandas - difference between 'loc' and 'where'?
I'm curious about the behavior of 'where' and why you would use it over 'loc'.
If I create a dataframe:
import pandas as pd
df = pd.DataFrame({'ID':[1,2,3,4,5,6,7,8,9,10],
                   'Run Distance':[234,35,77,787,243,5435,775,123,355,123],
                   'Goals':[12,23,56,7,8,0,4,2,1,34],
                   'Gender':['m','m','m','f','f','m','f','m','f','m']})
And then apply the 'where' function:
df2 = df.where(df['Goals']>10)
I get the following, which keeps the rows where Goals > 10 but leaves everything else as NaN:
  Gender  Goals    ID  Run Distance
0      m   12.0   1.0         234.0
1      m   23.0   2.0          35.0
2      m   56.0   3.0          77.0
3    NaN    NaN   NaN           NaN
4    NaN    NaN   NaN           NaN
5    NaN    NaN   NaN           NaN
6    NaN    NaN   NaN           NaN
7    NaN    NaN   NaN           NaN
8    NaN    NaN   NaN           NaN
9      m   34.0  10.0         123.0
If, however, I use 'loc':
df2 = df.loc[df['Goals']>10]
It returns only the matching rows, without the NaN values:
  Gender  Goals  ID  Run Distance
0      m     12   1           234
1      m     23   2            35
2      m     56   3            77
9      m     34  10           123
So essentially I am curious why you would use 'where' over 'loc/iloc', and why it returns NaN values?
python pandas
Related: Pandas mask / where methods versus NumPy np.where. Summary: Pandas where rarely outperforms (or is more readable than) the more popular NumPy np.where, so the former is often irrelevant. – jpp, Feb 27 at 15:10
Thank you jpp. Interesting question by you and response by 'ead'. I will look at numpy for using 'where'. – ScoutEU, Feb 27 at 15:28
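For anyone comparing the two spellings mentioned in the comment above, here is a minimal sketch. It only uses the Goals values from the question as a standalone Series (the variable names are mine), and it makes no claim about speed, only about what each call returns.

```python
import numpy as np
import pandas as pd

# Goals column from the question, as a standalone Series.
goals = pd.Series([12, 23, 56, 7, 8, 0, 4, 2, 1, 34], name="Goals")
cond = goals > 10

# pandas: a method on the data; returns a Series and keeps index and name.
s = goals.where(cond, 0)

# NumPy: both branches are passed explicitly; returns a plain ndarray.
a = np.where(cond, goals, 0)

print(type(s).__name__, type(a).__name__)  # Series ndarray
print(s.tolist() == a.tolist())            # True
```

The values agree; the practical difference is that the pandas result still carries the index, while the NumPy result has to be put back into a Series or column if you need one.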
asked Feb 27 at 8:06 by ScoutEU (edited Feb 27 at 8:17)
3 Answers
Think of loc as a filter: give me only the parts of the df that conform to a condition.
where mirrors an idea from numpy (np.where). It runs over the data and checks whether each element fits a condition, so it gives you back an object of the same shape, with either the original value or NaN. A nice feature of where is that you can also get back something different, e.g. df2 = df.where(df['Goals']>10, other=0), to replace values that don't meet the condition with 0.
   ID  Run Distance  Goals Gender
0   1           234     12      m
1   2            35     23      m
2   3            77     56      m
3   0             0      0      0
4   0             0      0      0
5   0             0      0      0
6   0             0      0      0
7   0             0      0      0
8   0             0      0      0
9  10           123     34      m
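To make the contrast concrete, here is a small sketch that applies the same condition both ways to a single numeric column. The frame is a cut-down version of the one in the question (Gender is left out on purpose so the replacement value has the same type as the data), and the variable names are mine.

```python
import pandas as pd

# Numeric columns from the question's frame.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Goals": [12, 23, 56, 7, 8, 0, 4, 2, 1, 34],
})

cond = df["Goals"] > 10

# where keeps the shape: values where cond is False are replaced by `other`.
replaced = df["Goals"].where(cond, other=0)

# loc with the same condition drops the non-matching rows instead.
filtered = df.loc[cond, "Goals"]

print(replaced.tolist())  # [12, 23, 56, 0, 0, 0, 0, 0, 0, 34]
print(filtered.tolist())  # [12, 23, 56, 34]
```

So where is the tool when the result has to stay aligned with the original rows, and loc is the tool when you only want the matching rows.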
Also, while where is only for conditional replacement, loc is the standard way of selecting in Pandas, along with iloc. loc uses row and column labels, while iloc uses their integer positions. So with loc you could choose to return, say, df.loc[0:1, ['Gender', 'Goals']] (a loc slice includes both endpoints):
  Gender  Goals
0      m     12
1      m     23
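With the default 0..n index, labels and positions look the same, which hides the difference. As a sketch, here is a hypothetical frame (not the one from the question) with string row labels, selecting the same block once by label and once by position.

```python
import pandas as pd

# Hypothetical frame with string row labels so labels and positions differ.
players = pd.DataFrame(
    {"Goals": [12, 23, 56], "Gender": ["m", "m", "f"]},
    index=["a", "b", "c"],
)

# Label-based: the slice "a":"b" includes both endpoints.
by_label = players.loc["a":"b", ["Gender", "Goals"]]

# Position-based: the slice 0:2 excludes the end, like normal Python slicing.
by_position = players.iloc[0:2, [1, 0]]

print(by_label.equals(by_position))  # True
```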
That is super helpful, thank you. So 'loc' filters, and 'where' is more for where you want to change values that do not fit the condition to something else. Perfect, thank you! – ScoutEU, Feb 27 at 8:15
According to the docs for DataFrame.where, it replaces values where the condition is False (here, whole rows), by default with NaN, but it is possible to specify a different value:
df2 = df.where(df['Goals']>10)
print(df2)
     ID  Run Distance  Goals Gender
0   1.0         234.0   12.0      m
1   2.0          35.0   23.0      m
2   3.0          77.0   56.0      m
3   NaN           NaN    NaN    NaN
4   NaN           NaN    NaN    NaN
5   NaN           NaN    NaN    NaN
6   NaN           NaN    NaN    NaN
7   NaN           NaN    NaN    NaN
8   NaN           NaN    NaN    NaN
9  10.0         123.0   34.0      m
df2 = df.where(df['Goals']>10, 100)
print(df2)
    ID  Run Distance  Goals Gender
0    1           234     12      m
1    2            35     23      m
2    3            77     56      m
3  100           100    100    100
4  100           100    100    100
5  100           100    100    100
6  100           100    100    100
7  100           100    100    100
8  100           100    100    100
9   10           123     34      m
Another syntax is called boolean indexing and is for filtering rows: it keeps the rows that match the condition and removes the rest.
df2 = df.loc[df['Goals']>10]
# alternative
df2 = df[df['Goals']>10]
print(df2)
   ID  Run Distance  Goals Gender
0   1           234     12      m
1   2            35     23      m
2   3            77     56      m
9  10           123     34      m
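It can help to look at the mask on its own. The sketch below rebuilds part of the question's frame, stores the condition in a variable (the name mask is mine), and checks that the two spellings above select the same rows.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Goals": [12, 23, 56, 7, 8, 0, 4, 2, 1, 34],
    "Gender": ["m", "m", "m", "f", "f", "m", "f", "m", "f", "m"],
})

# The condition is just a boolean Series with one True/False per row.
mask = df["Goals"] > 10
print(mask.tolist())
# [True, True, True, False, False, False, False, False, False, True]

# Both spellings keep exactly the rows where the mask is True.
print(df[mask].equals(df.loc[mask]))  # True

# Inverting the mask with ~ gives the rows that were dropped.
print(df[~mask]["ID"].tolist())       # [4, 5, 6, 7, 8, 9]
```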
With loc it is also possible to filter rows by condition and select columns by name(s):
s = df.loc[df['Goals']>10, 'ID']
print(s)
0     1
1     2
2     3
9    10
Name: ID, dtype: int64
df2 = df.loc[df['Goals']>10, ['ID','Gender']]
print(df2)
   ID Gender
0   1      m
1   2      m
2   3      m
9  10      m
That makes a lot of sense, thank you very much. Also thanks for the tip on the alternative! – ScoutEU, Feb 27 at 8:12
loc retrieves only the rows that match the condition.
where returns the whole dataframe, replacing the rows that don't match the condition (with NaN by default).
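On the "why NaN" part of the question, a small sketch with just the Goals values (variable names are mine): the result keeps one slot per original row, NaN marks the slots that failed the condition, and because NaN is a float the integer column comes back as floats (12.0 instead of 12).

```python
import pandas as pd

goals = pd.Series([12, 23, 56, 7, 8, 0, 4, 2, 1, 34], name="Goals")
cond = goals > 10

# Same length as the input; NaN wherever cond is False.
masked = goals.where(cond)
print(len(masked), int(masked.isna().sum()))  # 10 6
print(masked.dtype)                           # float64

# Dropping the NaN rows leaves the same row labels that loc selects.
print(masked.dropna().index.tolist())  # [0, 1, 2, 9]
print(goals.loc[cond].index.tolist())  # [0, 1, 2, 9]
```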
Great, thank you. 'Where' is a lot more useful than originally thought! – ScoutEU, Feb 27 at 8:12
First answer: answered Feb 27 at 8:11 by Josh Friedlander (edited Feb 27 at 8:28)
Second answer: answered Feb 27 at 8:09 by jezrael (edited Feb 27 at 8:18)
Third answer: answered Feb 27 at 8:11 by Casti