Python – How do I create a new column in a data frame based on the criteria of another column?

I have a data frame like this:

                             TransactionId   Value
Timestamp                                     
2018-01-07 22:00:00.000         633025      674.87
2018-01-07 22:15:00.000         633025      676.11
2018-01-07 22:30:00.000         633025      677.06

I want to create a third column with three possible classes, based on the other two columns. I tried writing the function below, but it doesn't work: after calling it, df.head() doesn't show the result I expect.

b = df.shape[0]
def charger_state(df):
    a = 1
    while a <= b: 
        if df.Value[a]-df.Value[(a-1)] > 0.1 :
            df['Charger State']= "Charging"
        elif df.Value[a]-df.Value[(a-1)] < 0.1 \
        and df['TransactionId'] > 0:
            df['Charger State']= "Not Charging"
        else: 
            df['Charger State']= "Vacant"
    a = a+1

Other answers on this topic don't seem to cover a new column with three classes, but I'm new to this, so I may have misunderstood them.

Solution

First, set your criteria:

c1 = df.Value.sub(df.Value.shift()).gt(0.1)
c2 = df.Value.diff().lt(0.1) & df.TransactionId.gt(0)

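Now use np.select, which assigns the label of the first condition that is True and falls back to the default ('Vacant') when none is:

import numpy as np

df.assign(ChargerState=np.select([c1, c2], ['Charging', 'Not Charging'], 'Vacant'))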

                     TransactionId   Value ChargerState
Timestamp
2018-01-07 22:00:00         633025  674.87       Vacant
2018-01-07 22:15:00         633025  676.11     Charging
2018-01-07 22:30:00         633025  677.06     Charging
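For readers who want to reproduce the result, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt by hand from the three rows in the question, and the intermediate name `delta` is introduced here for readability; it is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; Timestamp is the index.
df = pd.DataFrame(
    {
        "TransactionId": [633025, 633025, 633025],
        "Value": [674.87, 676.11, 677.06],
    },
    index=pd.to_datetime(
        ["2018-01-07 22:00:00", "2018-01-07 22:15:00", "2018-01-07 22:30:00"]
    ).rename("Timestamp"),
)

# Change in Value relative to the previous row (NaN for the first row).
delta = df["Value"].diff()

c1 = delta.gt(0.1)                              # Value rose by more than 0.1
c2 = delta.lt(0.1) & df["TransactionId"].gt(0)  # no such rise, but a transaction exists

# np.select takes the label of the first True condition per row,
# and the default when no condition is True.
df = df.assign(
    ChargerState=np.select([c1, c2], ["Charging", "Not Charging"], default="Vacant")
)
print(df)
```

Note that `assign` returns a new DataFrame rather than modifying `df` in place, so the result has to be assigned back (as done here) if the column should persist.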

You may need to adjust c1: in this example the first row is labeled Vacant even though it has both a TransactionId and a Value. There is no previous row to compare it with, so the difference is NaN and neither condition is True.
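The first-row behavior can be checked in isolation. This small sketch uses only the three Value entries from the question (the Series name `values` is chosen here for illustration) and shows that comparisons against NaN evaluate to False in both directions.

```python
import pandas as pd

values = pd.Series([674.87, 676.11, 677.06])

# diff() has nothing to subtract from the first element, so it yields NaN there.
delta = values.diff()

print(delta.isna().tolist())   # [True, False, False]
print(delta.gt(0.1).tolist())  # [False, True, True]
print(delta.lt(0.1).tolist())  # [False, False, False]
```

Because both `gt` and `lt` are False for the first row, neither c1 nor c2 matches it and np.select falls through to the default.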

One option is to assume that a device with both a Value and a TransactionId has already started charging. We can express that by adding fillna to c1:

c1 = df.Value.sub(df.Value.shift().fillna(0)).gt(0.1)    # Notice the fillna
c2 = df.Value.diff().lt(0.1) & df.TransactionId.gt(0)

df.assign(ChargerState=np.select([c1, c2], ['Charging', 'Not Charging'], 'Vacant'))

                     TransactionId   Value ChargerState
Timestamp
2018-01-07 22:00:00         633025  674.87     Charging
2018-01-07 22:15:00         633025  676.11     Charging
2018-01-07 22:30:00         633025  677.06     Charging
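It may help to see what the fillna(0) variant actually compares. In this sketch (again using only the Value entries from the question, with illustrative variable names), the missing "previous" value for the first row is replaced by 0, so the first difference is simply the first Value itself.

```python
import pandas as pd

values = pd.Series([674.87, 676.11, 677.06])

# With the missing previous value filled as 0, the first row's
# difference equals its own Value.
diff_from_zero = values.sub(values.shift().fillna(0))
c1 = diff_from_zero.gt(0.1)

print(diff_from_zero.iloc[0])  # 674.87
print(c1.tolist())             # [True, True, True]
```

One consequence of this choice: the first row counts as Charging only when its Value exceeds 0.1, regardless of TransactionId. Whether that matches the intended meaning of "started charging" depends on the data.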
