Update a pandas DataFrame column with the sum of its own values and those from another DataFrame
I have two data frames like this:
df1
posting_period  name     sales   profit
1               client1   50.00  10.00
1               client2  100.00  20.00
2               client1  150.00  30.00
df2 (this one does not have the 'profit' column that df1 has)
posting_period  name     sales
1               client1  10.00
2               client1  20.00
I want to update client1's sales in df1 with the sum of client1's sales in df1 and client1's sales in df2, where the posting_period values match. That is, the desired result is:
posting_period  name     sales   profit
1               client1   60.00  10.00
1               client2  100.00  20.00
2               client1  170.00  30.00
The actual data frame I'm working with is much larger, but these examples capture what I'm struggling to accomplish. I came up with a very roundabout approach that not only doesn't work but also isn't very Pythonic. A further challenge is that df1 has additional columns that df2 does not. I hope someone can come up with an alternative. Thanks!
Solution
Start by creating a series that maps the index columns from df2 to sales:
import pandas as pd
idx_cols = ['posting_period', 'name']
# Series of df2 sales, keyed by (posting_period, name)
s = df2.set_index(idx_cols)['sales']
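To see why this series is useful, here is a minimal sketch of how it behaves as a lookup table. It rebuilds df2 from the question so that it runs on its own; the names hit and miss are only illustrative. A key that exists in df2 returns its sales value, while a key that does not exist makes s.get fall back to None, which is why the next step needs fillna(0).

```python
import pandas as pd

# df2 as given in the question
df2 = pd.DataFrame({
    "posting_period": [1, 2],
    "name": ["client1", "client1"],
    "sales": [10.0, 20.0],
})

idx_cols = ["posting_period", "name"]
s = df2.set_index(idx_cols)["sales"]

# Look up by a (posting_period, name) tuple
hit = s.get((1, "client1"))   # key present in df2
miss = s.get((1, "client2"))  # key absent from df2, so the default (None) is returned

print(hit, miss)
```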
Then update df1['sales'] with this series:
df1['sales'] += pd.Series(df1.set_index(idx_cols).index.map(s.get)).fillna(0)
Result:
print(df1)
   posting_period     name  sales  profit
0               1  client1   60.0    10.0
1               1  client2  100.0    20.0
2               2  client1  170.0    30.0
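As an alternative to the answer above (not the same method), the update can also be sketched with a left merge. This assumes the two frames from the question; the names extra, extra_sales and merged are made up for illustration. Summing df2 per key first means that repeated (posting_period, name) rows in df2 would be added together rather than duplicating rows of df1.

```python
import pandas as pd

# Frames as given in the question
df1 = pd.DataFrame({
    "posting_period": [1, 1, 2],
    "name": ["client1", "client2", "client1"],
    "sales": [50.0, 100.0, 150.0],
    "profit": [10.0, 20.0, 30.0],
})
df2 = pd.DataFrame({
    "posting_period": [1, 2],
    "name": ["client1", "client1"],
    "sales": [10.0, 20.0],
})

idx_cols = ["posting_period", "name"]

# Collapse df2 to one row per key so the left merge cannot add rows to df1
extra = (
    df2.groupby(idx_cols, as_index=False)["sales"].sum()
       .rename(columns={"sales": "extra_sales"})
)

# A left merge keeps every df1 row, in df1's order; unmatched keys get NaN
merged = df1.merge(extra, on=idx_cols, how="left")

# to_numpy() adds by position, so this also works if df1 has a non-default index
df1["sales"] = df1["sales"] + merged["extra_sales"].fillna(0).to_numpy()

print(df1)
```

The extra columns in df1, such as profit, are untouched because only the sales column is reassigned.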