Pandas groupby and sum with two variables
I’m grouping by two variables and summing. The second variable is the year; in my example there are only two years (2015 and 2016). In the second row of the result, the first variable (the ID) is not displayed. How do I force it to be displayed?
Code:
totals = df.groupby(by=['id', 'year'])['sales'].sum()
print(totals)
Output sample:
1234567 2015 596407.81
2016 7224148.34
How do I get the ID of the second row to be 1234567?
Solution
Use groupby with the parameter as_index=False:
totals = df.groupby(by=['id', 'year'], as_index=False)['sales'].sum()
print(totals)
Or call reset_index() on the result:
totals = df.groupby(by=['id', 'year'])['sales'].sum().reset_index()
print(totals)
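The reason you don’t see the ID repeated on the second row is that the result is a Series with a MultiIndex on (id, year). When printing a MultiIndex, pandas by default leaves repeated outer-level labels blank. The value is still there; it is only hidden in the display. Both approaches above turn id and year into ordinary columns, so every row shows its ID.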
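To make this concrete, here is a minimal sketch on a small made-up DataFrame (the rows and sales figures are hypothetical, not the asker's data). It checks that the grouped result carries a MultiIndex, shows that the ID is present for every row, and also shows the display.multi_sparse option, which should make pandas print the repeated labels without changing the data.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "id": [1234567, 1234567, 1234567, 7654321],
    "year": [2015, 2016, 2016, 2015],
    "sales": [100.0, 200.0, 50.0, 10.0],
})

totals = df.groupby(by=["id", "year"])["sales"].sum()

# The result is a Series indexed by a two-level MultiIndex (id, year).
print(type(totals.index).__name__)

# Every row still has its id; it is only blanked out when printed.
print(totals.index.tolist())

# Option 1: move the index levels back into ordinary columns.
flat = totals.reset_index()

# Option 2: never build the MultiIndex in the first place.
flat_alt = df.groupby(by=["id", "year"], as_index=False)["sales"].sum()
print(flat_alt)

# Option 3: keep the MultiIndex but ask pandas to print repeated labels.
with pd.option_context("display.multi_sparse", False):
    print(totals)
```

Options 1 and 2 give a flat DataFrame with id, year and sales columns, which is usually easier to export or merge. Option 3 only changes how the Series is printed.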