Pandas isin() in Python

In Python’s Pandas, isin() is a method that checks whether each element of a DataFrame or Series is present in a list, set, or another Series, and returns a boolean result of the same shape. It’s most commonly used to filter rows whose column values match specific values.

Example

import pandas as pd

# Sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40]}
df = pd.DataFrame(data)

# Check if 'Name' is in the list
filtered_df = df[df['Name'].isin(['Alice', 'David'])]
print(filtered_df)

Output:

    Name  Age
0  Alice   25
3  David   40

What it does

The isin() method checks whether each element in the ‘Name’ column exists in the provided list (['Alice', 'David']). It returns a boolean mask that can be used to filter the DataFrame, keeping only the rows where the values match.

  • True: the column value is present in the provided list, so the row is kept.
  • False: the column value is not in the list, so the row is filtered out.
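
To make the mask itself visible, the short sketch below prints the boolean Series that isin() produces before it is used for filtering. It reuses the sample data from the example above; the variable name mask is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 40]})

# isin() returns a boolean Series aligned with the DataFrame's index
mask = df['Name'].isin(['Alice', 'David'])
print(mask.tolist())  # [True, False, False, True]

# Indexing with the mask keeps only the rows marked True
filtered_df = df[mask]
print(filtered_df.index.tolist())  # [0, 3]
```

Storing the mask in a variable is optional, but it can make longer filters easier to read and reuse.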

Examples

Example 1: Using isin() with a list of values

import pandas as pd

# Sample DataFrame
data = {'Fruit': ['Apple', 'Banana', 'Mango', 'Orange'],
        'Quantity': [5, 3, 8, 2]}
df = pd.DataFrame(data)

# Filter rows where 'Fruit' is in the list
selected_fruits = df[df['Fruit'].isin(['Apple', 'Mango'])]
print(selected_fruits)

Output:

   Fruit  Quantity
0  Apple         5
2  Mango         8

This filters the DataFrame to include only rows where ‘Fruit’ is either ‘Apple’ or ‘Mango’.

Example 2: Using isin() with a Series

import pandas as pd

# Sample DataFrame
data = {'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
        'Population': [8000000, 4000000, 2700000, 2300000]}
df = pd.DataFrame(data)

# Another Series of cities
cities_to_check = pd.Series(['Chicago', 'Houston'])

# Filter using `isin()`
selected_cities = df[df['City'].isin(cities_to_check)]
print(selected_cities)

Output:

      City  Population
2  Chicago     2700000
3  Houston     2300000

Here, isin() checks if each city in the ‘City’ column is present in the cities_to_check Series.
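
One detail worth illustrating: when a Series is passed to Series.isin(), only its values are compared and its index labels play no part. The sketch below gives the lookup Series deliberately unrelated labels ('a' and 'b', chosen here purely for illustration) to show that the result is unchanged.

```python
import pandas as pd

df = pd.DataFrame({'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']})

# The lookup Series has index labels that do not match df's index
cities_to_check = pd.Series(['Chicago', 'Houston'], index=['a', 'b'])

# Membership is decided by value only, so the labels above are ignored
mask = df['City'].isin(cities_to_check)
print(mask.tolist())  # [False, False, True, True]
```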

Example 3: Inverting the filter using ~

import pandas as pd

# Sample DataFrame
data = {'Color': ['Red', 'Blue', 'Green', 'Yellow'],
        'Code': [1, 2, 3, 4]}
df = pd.DataFrame(data)

# Exclude rows where 'Color' is in the list
excluded_colors = df[~df['Color'].isin(['Red', 'Green'])]
print(excluded_colors)

Output:

    Color  Code
1    Blue     2
3  Yellow     4

The ~ operator negates the boolean mask element by element, so the filter returns the rows whose values are not in the provided list.
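
A point to keep in mind when inverting: a missing value is not in the list either, so it passes an inverted filter. The following is a minimal sketch using a made-up Series with one missing entry; combining the inverted mask with notna() is one possible way to exclude such rows, not the only one.

```python
import pandas as pd

# Hypothetical data with a missing value in the middle
colors = pd.Series(['Red', None, 'Blue'])

mask = colors.isin(['Red', 'Green'])
print(mask.tolist())     # [True, False, False]

# Inverting the mask also selects the missing entry
print((~mask).tolist())  # [False, True, True]

# Add notna() to keep only real values that are not in the list
kept = colors[~mask & colors.notna()]
print(kept.tolist())     # ['Blue']
```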

Example 4: Using isin() with multiple columns

import pandas as pd

# Sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)

# Filter using conditions on multiple columns
selected_rows = df[df['Name'].isin(['Alice', 'David']) & df['City'].isin(['New York', 'Houston'])]
print(selected_rows)

Output:

    Name  Age      City
0  Alice   25  New York
3  David   40   Houston

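This example calls isin() on two columns and combines the resulting masks with &, so a row is kept only when both its ‘Name’ and its ‘City’ appear in their respective lists. Use | instead of & to keep rows that match either condition.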
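
As an alternative sketch of the same filter, DataFrame.isin() also accepts a dict that maps column names to allowed values. The dict name allowed is an illustrative choice; the DataFrame is first restricted to the listed columns because any column missing from the dict would be reported as all False.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 40],
                   'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']})

# Allowed values per column (illustrative name)
allowed = {'Name': ['Alice', 'David'],
           'City': ['New York', 'Houston']}

# Check each listed column against its own list of values
cell_mask = df[list(allowed)].isin(allowed)

# all(axis=1) is True only for rows where every checked column matched
selected_rows = df[cell_mask.all(axis=1)]
print(selected_rows['Name'].tolist())  # ['Alice', 'David']
```

This form may be convenient when the set of columns to check is built at runtime, since adding a condition only means adding a key to the dict.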