Python Pandas new data frame columns with group by and criteria

Python Pandas new data frame columns with group by and criteria … here is a solution to the problem.

Python Pandas new data frame columns with group by and criteria

I have a Pandas data frame like this.

player  count1  count2
A       1       1
A       2       1
A       3       1
A       4       2
A       5       2
B       1       1
B       2       2
B       3       2
B       4       2

The column player contains the name, count1 is additive and the column count2 contains other counts.

I now want to create a new column with the value count1, where column count2 contains the value 2 first

Therefore, the result should be like this:

player  count1  count2  new
A       1       1       4
A       2       1       4
A       3       1       4
A       4       2       4
A       5       2       4
B       1       1       2
B       2       2       2
B       3       2       2
B       4       2       2

I

tried doing it with transform, but I couldn’t figure out how to combine it with count2-based criteria for the column (and the value of the count1 column).

It works like this without groupby, but I don’t know where and how to add groupby:

df['new'] = df.loc[matches['count2'] == 2, 'count1'].min()

Solution

Use map by series :

s = df[df['count2'] == 2].drop_duplicates(['player']).set_index('player')['count1']

df['new'] = df['player'].map(s)
print (df)
  player  count1  count2  new
0      A       1       1    4
1      A       2       1    4
2      A       3       1    4
3      A       4       2    4
4      A       5       2    4
5      B       1       1    2
6      B       2       2    2
7      B       3       2    2
8      B       4       2    2

Details:

Start filtering by boolean indexing 2 lines:

print (df[df['count2'] == 2])
  player  count1  count2
3      A       4       2
4      A       5       2
6      B       2       2
7      B       3       2
8      B       4       2

Then drop_duplicates by Remove spoofing of player columns:

print (df[df['count2'] == 2].drop_duplicates(['player']))
  player  count1  count2
3      A       4       2
6      B       2       2

Related Problems and Solutions