How can I select rows from a DataFrame based on values in some column in Pandas?
In SQL, I would use:
SELECT *
FROM table
WHERE column_name = some_value
I tried to look at Pandas' documentation, but I did not immediately find the answer.
Best Solution
To select rows whose column value equals a scalar, some_value, use ==.
To select rows whose column value is in an iterable, some_values, use isin.
Combine multiple conditions with &, wrapping each condition in parentheses, as in (df['column_name'] >= A) & (df['column_name'] <= B) for two bounds A and B.
Note the parentheses. Due to Python's operator precedence rules, & binds more tightly than <= and >=, so the parentheses are necessary. Without them, df['column_name'] >= A & df['column_name'] <= B is parsed as df['column_name'] >= (A & df['column_name']) <= B, which results in a "Truth value of a Series is ambiguous" error.
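As a minimal sketch of the points above, the snippet below selects by equality and by a parenthesized range, then shows the error raised when the parentheses are dropped. The DataFrame and its column names (name, score) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "name": ["foo", "bar", "foo", "baz", "bar"],
    "score": [1, 5, 8, 3, 10],
})

# Rows whose "name" equals a scalar value.
equal_rows = df.loc[df["name"] == "foo"]

# Rows whose "score" lies in a closed range; each comparison is parenthesized.
in_range = df.loc[(df["score"] >= 3) & (df["score"] <= 8)]

# Without parentheses, & is evaluated first and the rest becomes a chained
# comparison, which asks for the truth value of a whole Series.
try:
    df.loc[df["score"] >= 3 & df["score"] <= 8]
    raised = False
except ValueError:
    raised = True

print(equal_rows)
print(in_range)
print("ValueError raised without parentheses:", raised)
```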
To select rows whose column value does not equal some_value, use !=.
isin returns a boolean Series, so to select rows whose value is not in some_values, negate the boolean Series using ~.
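The two negated forms can be sketched as follows, again on a hypothetical DataFrame (the name and score columns and the excluded list are assumptions made for this example).

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "name": ["foo", "bar", "foo", "baz", "bar"],
    "score": [1, 5, 8, 3, 10],
})

# Rows whose "name" is anything other than "foo".
not_foo = df.loc[df["name"] != "foo"]

# Rows whose "name" is not in a list of values: build the isin mask, then negate it.
excluded = ["foo", "baz"]
not_in = df.loc[~df["name"].isin(excluded)]

print(not_foo)
print(not_in)
```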
If you have multiple values you want to include, put them in a list (or, more generally, any list-like iterable such as a tuple or set) and use isin.
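A short illustration of isin with several values, assuming a hypothetical DataFrame with a name column; the wanted list is likewise made up for the example.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "name": ["foo", "bar", "foo", "baz", "bar"],
    "score": [1, 5, 8, 3, 10],
})

# Keep the rows whose "name" matches any entry in the list.
wanted = ["foo", "baz"]
subset = df.loc[df["name"].isin(wanted)]

print(subset)
```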
Note, however, that if you wish to do this many times, it is more efficient to set that column as the index first, and then use df.loc.
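One way this might look is sketched below. The data and column names are hypothetical, and the snippet only shows the lookup pattern; it does not measure the speed difference.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "name": ["foo", "bar", "foo", "baz", "bar"],
    "score": [1, 5, 8, 3, 10],
})

# Build the index once, then reuse it for repeated label lookups.
indexed = df.set_index("name")

# "foo" appears more than once in the index, so this returns a DataFrame.
foo_rows = indexed.loc["foo"]

print(foo_rows)
```

If the label occurred only once, indexed.loc["foo"] would return a Series instead; indexed.loc[["foo"]] always returns a DataFrame.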
Or, to include multiple values from the index, use df.index.isin.
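A minimal sketch of selecting several index labels at once, assuming the same kind of hypothetical DataFrame indexed by a name column.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "name": ["foo", "bar", "foo", "baz", "bar"],
    "score": [1, 5, 8, 3, 10],
})
indexed = df.set_index("name")

# Index.isin returns a boolean array that df.loc accepts as a row mask.
mask = indexed.index.isin(["foo", "baz"])
subset = indexed.loc[mask]

print(subset)
```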