How do I reshape this DataFrame in Python?
I have a DataFrame df_sale in Python and I want to reshape it, calculate the sum of the price column, and add a new column total.
df_sale:
b_no  a_id  price  c_id
120   24    50     2
120   56    100    2
120   90    25     2
120   45    20     2
231   89    55     3
231   45    20     3
231   10    250    3
Expected output after reshaping:
b_no  a_id_1  a_id_2  a_id_3  a_id_4  total  c_id
120   24      56      90      45      195    2
231   89      45      10      0       325    3
So far, I've tried using sum() on df_sale['price'] separately for 120 and 231. I don't understand how I should reshape the data, add new column headers, and get the totals without being computationally inefficient. Thank you.
Solution
This may not be the cleanest method (at all), but it will get the results you want:
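import pandas as pd

# df is the DataFrame from the question (df_sale)
reshaped_df = (df.groupby('b_no')[['price', 'c_id']]
               .first()                          # one row per b_no, keeps c_id
               .join(df.groupby('b_no')['a_id']
                     .apply(list)                # collect a_id values per b_no
                     .apply(pd.Series)           # spread each list across columns
                     .add_prefix('a_id_'))
               .drop(columns='price')            # first price is not needed
               .join(df.groupby('b_no')['price'].sum().to_frame('total'))
               .fillna(0))                       # shorter groups are padded with 0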
>>> reshaped_df
      c_id  a_id_0  a_id_1  a_id_2  a_id_3  total
b_no
120      2    24.0    56.0    90.0    45.0    195
231      3    89.0    45.0    10.0     0.0    325
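
The result above numbers the columns from a_id_0 and stores the a_id values as floats, while the question asks for a_id_1 to a_id_4 as integers. A possible alternative, which is not part of the original answer, is to number the rows within each b_no group with cumcount() and then pivot. The sketch below assumes that c_id is constant within each b_no, as it is in the sample data. The names position, wide, summary, and reshaped are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df_sale = pd.DataFrame({
    "b_no":  [120, 120, 120, 120, 231, 231, 231],
    "a_id":  [24, 56, 90, 45, 89, 45, 10],
    "price": [50, 100, 25, 20, 55, 20, 250],
    "c_id":  [2, 2, 2, 2, 3, 3, 3],
})

# Number the rows within each b_no group: 1, 2, 3, ...
position = df_sale.groupby("b_no").cumcount() + 1

# Spread a_id across columns, one column per position within the group.
wide = (
    df_sale.assign(position=position)
    .pivot(index="b_no", columns="position", values="a_id")
    .add_prefix("a_id_")
    .fillna(0)       # groups with fewer rows get 0 in the missing slots
    .astype(int)     # undo the float conversion caused by the NaN padding
)

# Per-group sum of price, plus the c_id of each group
# (assumed constant within a group).
summary = df_sale.groupby("b_no").agg(
    total=("price", "sum"),
    c_id=("c_id", "first"),
)

reshaped = wide.join(summary).reset_index()
reshaped.columns.name = None  # drop the leftover "position" axis label
print(reshaped)
```

This produces the columns b_no, a_id_1 to a_id_4, total, and c_id in the order shown in the question. Both groupby calls and the pivot are vectorized, so no Python-level loop over groups is needed.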