Pandas Pivot Table in pyhton

Share this article

In Python, a Pandas pivot table is a powerful tool used to summarize, aggregate, and analyze data in a flexible way. It allows you to quickly group and manipulate data by creating a table that reshapes it based on specified columns.

Example

Here’s how to create a simple pivot table in Pandas:

import pandas as pd

# Sample data
data = {
    'Region': ['East', 'West', 'East', 'West', 'East'],
    'Product': ['A', 'B', 'A', 'A', 'B'],
    'Sales': [100, 200, 150, 300, 250]
}

df = pd.DataFrame(data)

# Creating a pivot table
pivot_table = df.pivot_table(values='Sales', index='Region', columns='Product', aggfunc='sum')
print(pivot_table)

What it does

  • DataFrame: Uses a sample dataset containing sales data for different regions and products.
  • pivot_table(): Aggregates the data by summing the sales for each region and product. The index parameter specifies the rows, the columns parameter specifies the columns, and values specifies the data to aggregate.

Examples

Example 1: Basic pivot table with sum aggregation

pivot_table = df.pivot_table(values='Sales', index='Region', aggfunc='sum')
print(pivot_table)

This example creates a pivot table summarizing total sales for each region. The aggfunc='sum' calculates the total sales within each region.

Example 2: Using different aggregation functions

pivot_table = df.pivot_table(values='Sales', index='Region', columns='Product', aggfunc='mean')
print(pivot_table)

Here, the pivot table calculates the mean (average) sales for each product in each region using aggfunc='mean'. You can use other functions like 'count', 'max', or 'min'.

Example 3: Filling missing values

pivot_table = df.pivot_table(values='Sales', index='Region', columns='Product', aggfunc='sum', fill_value=0)
print(pivot_table)

This example adds the fill_value parameter to fill missing values with 0, ensuring that the table remains clean and complete.

Example 4: Adding margins (totals)

pivot_table = df.pivot_table(values='Sales', index='Region', columns='Product', aggfunc='sum', margins=True)
print(pivot_table)

Including margins=True adds a total row and column to the pivot table, giving the sum of sales across all regions and products.