Python – Show True for the remainder of the day after the same value appears consecutively

Given the following DataFrame, how can I add a new column out that shows True for the rest of the day once two consecutive "y" values have appeared in the val column that day (and False otherwise)?

  • The logic is reset every day.

True should be shown on every row of the day after the condition is met.

Code

import pandas as pd

df_so = pd.DataFrame(
    {
        "val": list("yynnnyyynn")
    },
    index=pd.date_range(start="1/1/2018", periods=10, freq="6h"),
)

                    val
2018-01-01 00:00:00 y
2018-01-01 06:00:00 y
2018-01-01 12:00:00 n
2018-01-01 18:00:00 n
2018-01-02 00:00:00 n
2018-01-02 06:00:00 y
2018-01-02 12:00:00 y
2018-01-02 18:00:00 y
2018-01-03 00:00:00 n
2018-01-03 06:00:00 n

Expected output

                    val  out
2018-01-01 00:00:00  y   False
2018-01-01 06:00:00  y   False
2018-01-01 12:00:00  n   True
2018-01-01 18:00:00  n   True
2018-01-02 00:00:00  n   False
2018-01-02 06:00:00  y   False
2018-01-02 12:00:00  y   False
2018-01-02 18:00:00  y   True
2018-01-03 00:00:00  n   False
2018-01-03 06:00:00  n   False
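
As a cross-check of the rule described above, here is a minimal sketch of the same logic as a plain loop. The helper name flag_after_run and its parameters target and match are hypothetical and not part of the original question; the sketch assumes rows are already sorted by time.

```python
import pandas as pd


def flag_after_run(values, days, target=2, match="y"):
    """Hypothetical reference helper: True once `target` consecutive
    `match` values have been seen earlier on the same day."""
    out = []
    current_day = None
    run = 0            # length of the current run of `match` values
    triggered = False  # whether the condition was already met today
    for value, day in zip(values, days):
        if day != current_day:
            # A new day starts: reset the logic.
            current_day, run, triggered = day, 0, False
        # The flag only applies to rows after the run completed.
        out.append(triggered)
        run = run + 1 if value == match else 0
        if run >= target:
            triggered = True
    return out


df_so = pd.DataFrame(
    {"val": list("yynnnyyynn")},
    index=pd.date_range(start="1/1/2018", periods=10, freq="6h"),
)
expected = flag_after_run(df_so["val"], df_so.index.normalize())
print(expected)
```

A loop like this is slow on large frames, but it is handy for validating a vectorized answer against the expected output.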

Solution

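You can use a rolling sum to detect target consecutive "y" values, shift the result by one row so that the flag starts on the row after the run, and then apply cummax so that the flag stays True once the condition has held at some point in the past. Grouping by the normalized index (the date at midnight) resets the logic every day: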

target = 2
df_so['out'] = (df_so['val'].eq('y')
                    .groupby(df_so.index.normalize())
                    .transform(lambda x: x.rolling(target).sum().shift().eq(target).cummax())
               )

Output:

                    val    out
2018-01-01 00:00:00   y  False
2018-01-01 06:00:00   y  False
2018-01-01 12:00:00   n   True
2018-01-01 18:00:00   n   True
2018-01-02 00:00:00   n  False
2018-01-02 06:00:00   y  False
2018-01-02 12:00:00   y  False
2018-01-02 18:00:00   y   True
2018-01-03 00:00:00   n  False
2018-01-03 06:00:00   n  False
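
To see why the one-liner works, the chain can be unrolled into intermediate columns. This is only an illustrative sketch: the steps frame and its column names (is_y, window_sum, prev_window_sum, hit) are made up here for inspection and are not part of the original answer.

```python
import pandas as pd

df_so = pd.DataFrame(
    {"val": list("yynnnyyynn")},
    index=pd.date_range(start="1/1/2018", periods=10, freq="6h"),
)

target = 2
day = df_so.index.normalize()  # midnight of each row's date, used as group key

steps = pd.DataFrame({"val": df_so["val"]})
steps["is_y"] = steps["val"].eq("y")

# Count of "y" in the current row plus the (target - 1) rows before it,
# computed per day; NaN until a full window is available.
steps["window_sum"] = (
    steps["is_y"].astype(int)
    .groupby(day)
    .transform(lambda x: x.rolling(target).sum())
)

# Shift within each day so a row looks at the window ending one row earlier.
steps["prev_window_sum"] = steps["window_sum"].groupby(day).shift()

# True exactly where the previous `target` rows were all "y".
steps["hit"] = steps["prev_window_sum"].eq(target)

# Once True, stay True for the rest of the day.
steps["out"] = steps["hit"].groupby(day).transform(lambda x: x.cummax())

print(steps)
```

Without the shift, True would already appear on the second "y" of the run; the expected output only flags the rows after it.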
