Python – Enumerates data frames based on columns

Enumerates data frames based on columns… here is a solution to the problem.

Enumerates data frames based on columns

I’m dealing with a time series data frame that looks like this, except that there are over a few thousand rows. I want to create a new column to enumerate rows with the same “symbol” value of block. i.e. row 0 will be 0, lines 1 to 23 will be 1, lines 24 to 30 will be 2 and so on… (Chronology is important) What is the most pythonic way to accomplish this? Thank you in advance

    Date       sign
0   2011-01-27  1
1   2011-01-28  -1
2   2011-01-31  -1
3   2011-02-01  -1
4   2011-02-02  -1
5   2011-02-07  -1
6   2011-02-08  -1
7   2011-02-09  -1
8   2011-02-10  -1
9   2011-02-11  -1
10  2011-02-14  -1
11  2011-02-15  -1
12  2011-02-16  -1
13  2011-02-17  -1
14  2011-02-18  -1
15  2011-02-21  -1
16  2011-02-22  -1
17  2011-02-23  -1
18  2011-02-24  -1
19  2011-02-25  -1
20  2011-02-28  -1
21  2011-03-01  -1
22  2011-03-02  -1
23  2011-03-03  -1
24  2011-03-04  1
25  2011-03-07  1
26  2011-03-08  1
27  2011-03-09  1
28  2011-03-10  1
29  2011-03-11  1
30  2011-03-14  1
31  2011-03-15  -1
32  2011-03-16  -1
33  2011-03-17  -1
34  2011-03-18  -1
35  2011-03-21  -1
36  2011-03-22  -1
37  2011-03-23  -1
38  2011-03-24  -1
39  2011-03-25  -1
40  2011-03-28  -1
41  2011-03-29  1
42  2011-03-30  1

Solution

You can get the cumsum at the symbol change, use diff to get:

df['new_column'] = (df.sign.diff()!=0).cumsum()-1

>>> df
          Date  sign  new_column
0   2011-01-27     1      0
1   2011-01-28    -1      1
2   2011-01-31    -1      1
3   2011-02-01    -1      1
4   2011-02-02    -1      1
5   2011-02-07    -1      1
6   2011-02-08    -1      1
7   2011-02-09    -1      1
8   2011-02-10    -1      1
9   2011-02-11    -1      1
10  2011-02-14    -1      1
11  2011-02-15    -1      1
12  2011-02-16    -1      1
13  2011-02-17    -1      1
14  2011-02-18    -1      1
15  2011-02-21    -1      1
16  2011-02-22    -1      1
17  2011-02-23    -1      1
18  2011-02-24    -1      1
19  2011-02-25    -1      1
20  2011-02-28    -1      1
21  2011-03-01    -1      1
22  2011-03-02    -1      1
23  2011-03-03    -1      1
24  2011-03-04     1      2
25  2011-03-07     1      2
26  2011-03-08     1      2
27  2011-03-09     1      2
28  2011-03-10     1      2
29  2011-03-11     1      2
30  2011-03-14     1      2
31  2011-03-15    -1      3
32  2011-03-16    -1      3
33  2011-03-17    -1      3
34  2011-03-18    -1      3
35  2011-03-21    -1      3
36  2011-03-22    -1      3
37  2011-03-23    -1      3
38  2011-03-24    -1      3
39  2011-03-25    -1      3
40  2011-03-28    -1      3
41  2011-03-29     1      4
42  2011-03-30     1      4

Related Problems and Solutions