Replace NaN values based on row conditions
Here is my raw data frame df_:
index_label,id_label,morning,evening,night
a,x,nan,eating,sleep
b,x,shower,eating,nan
c,x,nan,nan,nan
d,y,work,reading,travel
e,y,nan,reading,nan
f,y,work,nan,nan
g,z,shower,nan,travel
h,z,shower,eating,nan
I am trying to replace each NaN value with a non-NaN value taken from the same data frame df_, using a row that has the same id_label. The "morning" and "evening" columns need to be cleared of NaN. The "night" column should remain unchanged.
For example, I wrote this for the "morning" column:
crit_nan_ = pd.isna(df_['morning'])
df_nan_ = df_.loc[crit_nan_]
df_clean_ = df_.loc[~crit_nan_]
But how do I get this resulting data frame?
index_label,id_label,morning,evening,night
a,x,shower,eating,sleep
b,x,shower,eating,nan
c,x,shower,eating,nan
d,y,work,reading,travel
e,y,work,reading,nan
f,y,work,reading,nan
g,z,shower,eating,travel
h,z,shower,eating,nan
Solution
You can get the resulting data frame with df.groupby and fillna (forward fill, then backward fill, within each id_label group):
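def fill_na(x):
    # Forward-fill, then back-fill, so every NaN takes a value from its own group.
    # ffill()/bfill() replace fillna(method=...), which is deprecated in recent pandas.
    return x.ffill().bfill()

for col in ("morning", "evening"):
    df_[col] = df_.groupby("id_label")[col].transform(fill_na)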
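To check the approach end to end, here is a minimal, self-contained sketch. It assumes the data is loaded from CSV text with pd.read_csv, which treats the literal string "nan" as a missing value by default; the variable name raw is only for illustration and does not come from the question.

```python
import io

import pandas as pd

# Sample data from the question. "raw" is a hypothetical name used only here.
raw = """index_label,id_label,morning,evening,night
a,x,nan,eating,sleep
b,x,shower,eating,nan
c,x,nan,nan,nan
d,y,work,reading,travel
e,y,nan,reading,nan
f,y,work,nan,nan
g,z,shower,nan,travel
h,z,shower,eating,nan
"""

# read_csv parses the string "nan" as a missing value by default.
df_ = pd.read_csv(io.StringIO(raw))


def fill_na(x):
    # Forward-fill, then back-fill, within a single id_label group.
    return x.ffill().bfill()


# Only "morning" and "evening" are filled; "night" is left untouched.
for col in ("morning", "evening"):
    df_[col] = df_.groupby("id_label")[col].transform(fill_na)

print(df_)
```

Note that if a group has no non-NaN value at all in a column, there is nothing to fill from, so those cells stay NaN.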