Appending Rows to DataFrame

February 1, 2018

Appending Rows to DataFrame

Suppose you have multiple files that contain the same columns (for instance, a machine that records and generates a file on a daily basis), and you want to combine them to perform trend analysis. What is the easiest way to do this?

Enter Pandas’ very aptly named append function.

Import Libraries

import pandas as pd

Load Data

We simuate data generated by a teacher taking daily attendance of her students, and load them to a DataFrame.

mon = {'Day': ['Mon', 'Mon', 'Mon'], 
       'Name': ['Amy', 'Barry', 'Cory'], 
       'Status': ['Absent', 'Present', 'Present']}
tue = {'Day': ['Tue', 'Tue', 'Tue'], 
       'Name': ['Amy', 'Barry', 'Cory'], 
       'Status': ['Present', 'Present', 'Present']}

mon_df = pd.DataFrame(mon)
tue_df = pd.DataFrame(tue)

To combine data for both days, we simply call the append function using the first DataFrame, and supply the second DataFrame as our input to the function.

combined = mon_df.append(tue_df)
combined.head(6)

	Day	Name	Status
0	Mon	Amy	Absent
1	Mon	Barry	Present
2	Mon	Cory	Present
0	Tue	Amy	Present
1	Tue	Barry	Present
2	Tue	Cory	Present

What happens when the columns of the DataFrames do not match? We will simulate this by creating a third DataFrame with an additional column (Punctual), while at the same time removing the Status column.

wed = {'Day': ['Wed', 'Wed', 'Wed'], 
       'Name': ['Amy', 'Barry', 'Cory'], 
       'Punctual': ['Yes', 'Yes', 'No']}

wed_df = pd.DataFrame(wed)

For columns that are not common between the appended DataFrames, Pandas will autofill rows of the non-common columns with np.nan.

combined = combined.append(wed_df)
combined

	Day	Name	Punctual	Status
0	Mon	Amy	NaN	Absent
1	Mon	Barry	NaN	Present
2	Mon	Cory	NaN	Present
0	Tue	Amy	NaN	Present
1	Tue	Barry	NaN	Present
2	Tue	Cory	NaN	Present
0	Wed	Amy	Yes	NaN
1	Wed	Barry	Yes	NaN
2	Wed	Cory	No	NaN

Appending Rows to DataFrame

February 1, 2018