Applying Custom Functions to GroupBy Object

February 23, 2018

If you are interested in learning about the in-built, commonly used groupby operations, visit - link to commonly used groupby functions -

If what you intend to do cannot be accomplished with the standard groupby operations, pandas also lets you apply a custom function to each group within a DataFrameGroupBy object.


Import Libraries

import pandas as pd
import seaborn as sns # to retrieve the tips dataset


Load Data

The tips dataset records the total bill and the tip for each restaurant party, along with the patron's gender, whether they are a smoker, the day, the time of the meal, and the party size.

tips = sns.load_dataset('tips')
tips.head()
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4


Recap - Applying In-built Functions to GroupBy Objects

Suppose we want to calculate the average tip amount, broken down by gender and by whether the patron was a smoker. This should yield 4 values, one for each combination.

gb_sex_smoker_tips = tips.groupby(['sex', 'smoker'])['tip'].mean().reset_index()
gb_sex_smoker_tips = gb_sex_smoker_tips.rename(columns={'tip': 'avg_tip'})
gb_sex_smoker_tips
      sex smoker   avg_tip
0    Male    Yes  3.051167
1    Male     No  3.113402
2  Female    Yes  2.931515
3  Female     No  2.773519
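
The mean-then-rename steps above can also be written in one pass with named aggregation, which sets the output column name directly. The following is only a minimal sketch: the `demo` DataFrame and its values are invented for illustration (they are not rows from the tips dataset), so that the example runs without downloading anything.

```python
import pandas as pd

# Hypothetical sample rows that mimic three columns of the tips dataset
demo = pd.DataFrame({
    'sex':    ['Male', 'Male', 'Female', 'Female', 'Male', 'Female'],
    'smoker': ['Yes',  'No',   'Yes',    'No',     'Yes',  'No'],
    'tip':    [3.0,    2.0,    4.0,      1.0,      5.0,    3.0],
})

# Named aggregation: output_column=(input_column, aggregation)
demo_avg = (
    demo.groupby(['sex', 'smoker'])
        .agg(avg_tip=('tip', 'mean'))
        .reset_index()
)
print(demo_avg)
```

The result has the same shape as the table above (one row per sex and smoker combination, with an `avg_tip` column), so either style can be used depending on preference.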


Applying Custom Functions

We will recreate the results above to demonstrate how we can apply custom functions to a DataFrameGroupBy object. This also allows us to gain a better understanding of what happens under the hood of the in-built functions.

def group_average_tip(g):
    # g is the sub-DataFrame holding the rows of one (sex, smoker) group
    total_tips = g['tip'].sum()
    count = len(g)  # number of rows in the group

    return total_tips / count

group_average = tips.groupby(['sex', 'smoker']).apply(group_average_tip)
group_average = group_average.reset_index().rename(columns={0: 'avg_tip'})
group_average
      sex smoker   avg_tip
0    Male    Yes  3.051167
1    Male     No  3.113402
2  Female    Yes  2.931515
3  Female     No  2.773519
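
To see more concretely what the custom function receives, the sketch below loops over the groups by hand and calls the function once per group, which is roughly what `apply` does for us. It then compares the results with the in-built `mean()`. The `demo` DataFrame is invented for illustration and is not taken from the tips dataset; the explicit loop is an assumption about how to picture the process, not a description of the pandas internals.

```python
import pandas as pd

# Hypothetical sample rows that mimic three columns of the tips dataset
demo = pd.DataFrame({
    'sex':    ['Male', 'Male', 'Female', 'Female', 'Male', 'Female'],
    'smoker': ['Yes',  'No',   'Yes',    'No',     'Yes',  'No'],
    'tip':    [3.0,    2.0,    4.0,      1.0,      5.0,    3.0],
})

def group_average_tip(g):
    # g is the sub-DataFrame holding the rows of one group
    return g['tip'].sum() / len(g)

# Iterating over a GroupBy object yields (key, sub-DataFrame) pairs;
# with two grouping columns, each key is a (sex, smoker) tuple
manual = {}
for key, g in demo.groupby(['sex', 'smoker']):
    manual[key] = group_average_tip(g)

# The in-built equivalent, indexed by the same (sex, smoker) keys
builtin = demo.groupby(['sex', 'smoker'])['tip'].mean()

for key, value in manual.items():
    print(key, value, builtin.loc[key])
```

One difference worth keeping in mind: `len(g)` counts every row in the group, while `mean()` skips missing values, so the two agree only when the `tip` column has no missing entries, as is the case in this sketch.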