Python – Pandas: groupby and sum with two grouping variables


I'm grouping by two variables and summing a third. The second grouping variable is the year; in my example there are only two years (2015 and 2016). In the second row of the result, the first variable (the ID) is not displayed. How do I force it to be displayed?

Code:

totals = df.groupby(by=['id', 'year'])['sales'].sum()
print(totals)

Output sample:

1234567             2015             596407.81
                    2016            7224148.34

How do I get the ID of the second row to be 1234567?

Solution

Pass as_index=False to groupby:

totals = df.groupby(by=['id', 'year'], as_index=False)['sales'].sum()
print(totals)

Or call reset_index() on the result:

totals = df.groupby(by=['id', 'year'])['sales'].sum().reset_index()
print(totals)

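The reason you don't see the ID in the second row is that the result is a Series with a MultiIndex built from id and year. When pandas prints a MultiIndex, it leaves out repeated values in the outer level. The ID is still stored in the index for every row; it is only hidden in the printed output.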
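To make the difference concrete, here is a minimal sketch on a small made-up DataFrame. The column names follow the question, but the rows and sales figures are hypothetical. It checks that the ID really is present in the MultiIndex, builds the flat result both ways, and shows a third option for when you only want to change how the grouped Series is printed: the display.multi_sparse setting.

```python
import pandas as pd

# Hypothetical sample data: one ID, two years (values are made up for illustration)
df = pd.DataFrame({
    "id": [1234567, 1234567, 1234567],
    "year": [2015, 2016, 2016],
    "sales": [100.0, 200.0, 50.0],
})

# Grouping by two keys returns a Series indexed by a MultiIndex (id, year)
totals = df.groupby(by=["id", "year"])["sales"].sum()

# The ID is stored for every row, even though print() hides the repeats
index_tuples = totals.index.tolist()

# Option 1: keep the group keys as ordinary columns
flat_as_index = df.groupby(by=["id", "year"], as_index=False)["sales"].sum()

# Option 2: move the index levels back into columns afterwards
flat_reset = totals.reset_index()

# Option 3: keep the MultiIndex but print every level value on every row
with pd.option_context("display.multi_sparse", False):
    print(totals)
```

Options 1 and 2 give you a regular DataFrame with id, year and sales columns, which is usually what you want for exporting or merging. Option 3 leaves the data untouched and only affects the printed output.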
