Python Pandas scrolls the sum bit value at the top of the window

Python Pandas scrolls the sum bit value at the top of the window … here is a solution to the problem.

Python Pandas scrolls the sum bit value at the top of the window

I want to scroll the way with Pandas. I need to tweak it a bit, however, I want to put the “value” at the top of the “window”.

Currently, I’m using this :

self.df['new_col'] = self.df['Zone3'].rolling(4).sum()

The one that produced this:

       Date     Time     Period  Zone4  Zone3  new_col
0   2018-02-23  00:00     900     11      2      NaN
1   2018-02-23  00:15     900     11      1      NaN
2   2018-02-23  00:30     900      7      3      NaN
3   2018-02-23  00:45     900      2      0      6.0
4   2018-02-23  01:00     900      3      2      6.0
5   2018-02-23  01:15     900      7      0      5.0
6   2018-02-23  01:30     900      2      4      6.0

What I really want is:

       Date     Time     Period  Zone4  Zone3  new_col
0   2018-02-23  00:00     900     11      2      6.0
1   2018-02-23  00:15     900     11      1      6.0
2   2018-02-23  00:30     900      7      3      5.0
3   2018-02-23  00:45     900      2      0      6.0
4   2018-02-23  01:00     900      3      2      NaN
5   2018-02-23  01:15     900      7      0      NaN
6   2018-02-23  01:30     900      2      4      NaN

Note that the sum value is at the beginning of the window (position 1), not at the end (position 4).

If scrolling is the wrong way, fine, any method will help. I know how to do this in a “pythonic” way (using a for loop), I just want to do this in a dataframe using pandas.

Thanks in advance

Solution

Use shift :

self.df['new_col'] = self.df['Zone3'].rolling(4).sum().shift(-3)

Or more general:

N = 4
df['new_col'] = df['Zone3'].rolling(N).sum().shift(-N+1)
print (df)
         Date   Time  Period  Zone4  Zone3  new_col
0  2018-02-23  00:00     900     11      2      6.0
1  2018-02-23  00:15     900     11      1      6.0
2  2018-02-23  00:30     900      7      3      5.0
3  2018-02-23  00:45     900      2      0      6.0
4  2018-02-23  01:00     900      3      2      NaN
5  2018-02-23  01:15     900      7      0      NaN
6  2018-02-23  01:30     900      2      4      NaN

N = 2
df['new_col'] = df['Zone3'].rolling(N).sum().shift(-N+1)
print (df)
         Date   Time  Period  Zone4  Zone3  new_col
0  2018-02-23  00:00     900     11      2      3.0
1  2018-02-23  00:15     900     11      1      4.0
2  2018-02-23  00:30     900      7      3      3.0
3  2018-02-23  00:45     900      2      0      2.0
4  2018-02-23  01:00     900      3      2      2.0
5  2018-02-23  01:15     900      7      0      4.0
6  2018-02-23  01:30     900      2      4      NaN

Related Problems and Solutions