Pandas rolling() in Python

In Python, rolling() is a pandas method, available on both DataFrame and Series, that calculates rolling statistics over a sliding window of data. It returns a window object on which you call an aggregation such as mean(), sum(), or max(); the aggregation is then computed by sliding a fixed-size window across the data.

Example

import pandas as pd

data = [1, 2, 3, 4, 5, 6]
df = pd.DataFrame(data, columns=['Numbers'])

# Calculate the rolling mean with a window size of 3
df['Rolling_Mean'] = df['Numbers'].rolling(window=3).mean()
print(df)

This will output:

   Numbers  Rolling_Mean
0        1           NaN
1        2           NaN
2        3           2.0
3        4           3.0
4        5           4.0
5        6           5.0

What it does

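The rolling() method creates a sliding window of a specified size (in this case, 3), and mean() calculates the average of the values in each window. Each window ends at the current row and reaches back over the two rows before it. The first two results are NaN because, by default, a result is only produced once the window holds a full 3 values.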
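
To make the window mechanics concrete, the following is a minimal sketch that rebuilds the rolling mean with an explicit loop and compares it with the built-in result. The variable names (numbers, manual, builtin_mean) are illustrative choices, not part of pandas.

```python
import pandas as pd

numbers = pd.Series([1, 2, 3, 4, 5, 6], name="Numbers")
window = 3

# Build the rolling mean by hand: for each position, take the current
# value together with the (window - 1) values before it.
manual = []
for i in range(len(numbers)):
    if i + 1 < window:
        # Fewer than `window` values are available so far, so emit NaN.
        manual.append(float("nan"))
    else:
        chunk = numbers.iloc[i - window + 1 : i + 1]
        manual.append(float(chunk.mean()))

builtin_mean = numbers.rolling(window=window).mean()

print(manual)
print(builtin_mean.tolist())
```

Both lists should start with two NaN values, followed by the means of (1, 2, 3), (2, 3, 4), (3, 4, 5), and (4, 5, 6).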

Examples

Example 1: Calculating rolling sum

df['Rolling_Sum'] = df['Numbers'].rolling(window=2).sum()
print(df[['Numbers', 'Rolling_Sum']])

This calculates the sum of values within each window of size 2. It will display:

   Numbers  Rolling_Sum
0        1          NaN
1        2          3.0
2        3          5.0
3        4          7.0
4        5          9.0
5        6         11.0

Each result is the sum of the current value and the one before it. The first row is NaN because it has no preceding value, so results start from the second row.
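
One way to check this is to compare the rolling sum with the same calculation written using shift(). This is only an illustrative sketch for a window of 2; the names rolling_sum and shifted_sum are made up for the example.

```python
import pandas as pd

numbers = pd.Series([1, 2, 3, 4, 5, 6], name="Numbers")

rolling_sum = numbers.rolling(window=2).sum()

# shift(1) moves every value down one row, so adding it to the original
# pairs each value with its predecessor. The first row has no predecessor
# and therefore becomes NaN, just as in the rolling version.
shifted_sum = numbers + numbers.shift(1)

print(rolling_sum.tolist())
print(shifted_sum.tolist())
```

The two printed lists should match, which shows that a window of 2 simply combines each row with the row before it.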

Example 2: Rolling with different window sizes

df['Rolling_Max'] = df['Numbers'].rolling(window=4).max()
print(df[['Numbers', 'Rolling_Max']])

This finds the maximum value in each rolling window of size 4:

   Numbers  Rolling_Max
0        1          NaN
1        2          NaN
2        3          NaN
3        4          4.0
4        5          5.0
5        6          6.0

The maximum value is calculated for every window of 4, starting from the fourth row.

Example 3: Applying a custom function

df['Rolling_Custom'] = df['Numbers'].rolling(window=3).apply(lambda x: x.max() - x.min())
print(df[['Numbers', 'Rolling_Custom']])

This example uses a custom function to find the difference between the maximum and minimum values in each rolling window of size 3:

   Numbers  Rolling_Custom
0        1             NaN
1        2             NaN
2        3             2.0
3        4             2.0
4        5             2.0
5        6             2.0

It calculates the range (max – min) for each window. Because the values here are consecutive integers, every full window of 3 has a range of 2.
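
The same calculation can also be written with a named function instead of a lambda. The sketch below assumes a helper called value_range (a hypothetical name) and passes raw=True, which makes pandas hand each window to the function as a NumPy array rather than a Series.

```python
import numpy as np
import pandas as pd

numbers = pd.Series([1, 2, 3, 4, 5, 6], name="Numbers")


def value_range(window_values: np.ndarray) -> float:
    """Return max minus min for one window (hypothetical helper)."""
    return float(window_values.max() - window_values.min())


# raw=True passes each window as a NumPy array instead of a Series.
rolling_range = numbers.rolling(window=3).apply(value_range, raw=True)

print(rolling_range.tolist())
```

A named function is easier to reuse and test than a lambda, and working on plain arrays often avoids some per-window overhead, although the actual speed difference depends on the data and the function.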

Example 4: Rolling with a specified minimum number of periods

df['Rolling_Mean_Min_Periods'] = df['Numbers'].rolling(window=3, min_periods=1).mean()
print(df[['Numbers', 'Rolling_Mean_Min_Periods']])

Here, min_periods=1 allows the rolling calculation even if there aren’t enough values to fill the entire window:

   Numbers  Rolling_Mean_Min_Periods
0        1                      1.0
1        2                      1.5
2        3                      2.0
3        4                      3.0
4        5                      4.0
5        6                      5.0

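With min_periods=1, a result is produced as soon as the window contains at least one value, so the first row is the mean of 1 alone and the second row is the mean of 1 and 2. This is useful at the start of a series or when dealing with incomplete datasets.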
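
As a sketch of the incomplete-data case, the example below uses a small made-up series with one missing value and compares the default behaviour with min_periods=1. The data and the names readings, strict, and lenient are invented for illustration.

```python
import numpy as np
import pandas as pd

# Made-up readings with one missing value in the second position.
readings = pd.Series([10.0, np.nan, 30.0, 40.0], name="Readings")

# Default: every window needs 3 non-missing values to produce a result.
strict = readings.rolling(window=3).mean()

# min_periods=1: one non-missing value in the window is enough.
lenient = readings.rolling(window=3, min_periods=1).mean()

print(strict.tolist())
print(lenient.tolist())
```

With this data, no window of 3 ever contains three non-missing values, so the default version is NaN throughout, while the min_periods=1 version averages whatever values are present in each window.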