Group by a data frame and concatenate strings on multiple columns in Python
I have the following data frame:
A,B,C,D
91102,1,john,
91102,2,john,
91102,3,john,
91102,1,,mary
91102,2,,mary
91102,3,,mary
91103,1,sarah,
91103,2,sarah,
91103,3,sarah,
91103,1,,khan
91103,2,,khan
91103,3,,khan
I want to group by columns A and B and get the output shown below:
A,B,C,D
91102,1,john,mary
91102,2,john,mary
91102,3,john,mary
91103,1,sarah,khan
91103,2,sarah,khan
91103,3,sarah,khan
I tried the following, but it did not give the desired output:
df = df.groupby(['A', 'B'], as_index=False).agg(''.join)
Solution
Within each group, you can backfill the missing values and then take the first row of the group.
df.groupby(['A', 'B'], as_index=False).apply(lambda x: x.bfill().iloc[0])
Result
A B C D
0 91102 1 john mary
1 91102 2 john mary
2 91102 3 john mary
3 91103 1 sarah khan
4 91103 2 sarah khan
5 91103 3 sarah khan
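As a minimal sketch of an alternative, assuming the empty cells are read in as NaN (as `pd.read_csv` does by default), `GroupBy.first()` returns the first non-null value of each column per group, which avoids `apply` altogether. Newer pandas versions may emit a warning when `apply` operates on the grouping columns. The second variant shows one way to make the original `''.join` idea work by dropping the missing values before joining. The names `csv_text`, `result_first`, and `result_join` are only illustrative.

```python
import io

import pandas as pd

# Sample data from the question; empty fields are parsed as NaN by read_csv.
csv_text = """A,B,C,D
91102,1,john,
91102,2,john,
91102,3,john,
91102,1,,mary
91102,2,,mary
91102,3,,mary
91103,1,sarah,
91103,2,sarah,
91103,3,sarah,
91103,1,,khan
91103,2,,khan
91103,3,,khan
"""
df = pd.read_csv(io.StringIO(csv_text))

# Variant 1: take the first non-null value of every column in each (A, B) group.
result_first = df.groupby(['A', 'B'], as_index=False).first()

# Variant 2: concatenate the non-missing strings of every column in each group.
# Calling ''.join directly on a column that contains NaN fails, because NaN is
# a float and not a string, so the missing values are dropped first.
result_join = df.groupby(['A', 'B'], as_index=False).agg(
    lambda s: ''.join(s.dropna())
)

print(result_first)
print(result_join)
```

For this data both variants give one row per (A, B) pair with C and D filled in. They differ when a group holds more than one non-missing value in a column: `first()` keeps only the first one, while the join variant concatenates all of them.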