Python: Pandas merge results in NaN
I’m trying to perform a merge with Pandas. The two files share a common key (“KEY_PLA”), which I tried to use for a left join. Unfortunately, all columns brought over from the second file into the first contain only NaN values.
Here’s what I’ve done so far:
import pandas as pd

df_1 = pd.read_excel(path1, skiprows=1)
df_2 = pd.read_excel(path2, skiprows=1)
df_1.columns = ["Index", "KEY", "KEY_PLA", "INFO1", "INFO2"]
df_2.columns = ["Index", "KEY_PLA", "INFO4"]
df_1.drop(["Index"], axis=1, inplace=True)
df_2.drop(["Index"], axis=1, inplace=True)
# Merge all dataframes
df_merge = pd.DataFrame()
df_merge = df_1.merge(df_2, left_on="KEY_PLA", right_on="KEY_PLA", how="left")
print(df_merge)
This is the Excel file:
What’s wrong with the code? I also checked the types and even converted the column to a string, but nothing works.
Solution
I think the problem is that the join column KEY_PLA has a different type in each DataFrame: evidently one is an integer and the other is a string.
The solution is to convert both to the same type, for example to int:
print(df_1['KEY_PLA'].dtype)
object
print(df_2['KEY_PLA'].dtype)
int64
df_1['KEY_PLA'] = df_1['KEY_PLA'].astype(int)
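To check that the conversion fixes the merge, here is a minimal sketch using small made-up DataFrames in place of the two Excel files (the sample values are assumptions, not the asker's data). It converts the key column as above and passes indicator=True, which adds a "_merge" column showing whether each left row found a match. Note that, depending on the pandas version, merging a string key against an integer key may raise an error rather than silently produce NaN, so the sketch only runs the merge after the conversion.

```python
import pandas as pd

# Hypothetical sample data standing in for the two Excel files.
df_1 = pd.DataFrame({
    "KEY": ["a", "b", "c"],
    "KEY_PLA": ["1", "2", "3"],   # keys stored as strings
    "INFO1": [10, 20, 30],
})
df_2 = pd.DataFrame({
    "KEY_PLA": [1, 2, 4],         # keys stored as integers
    "INFO4": ["x", "y", "z"],
})

# Bring both key columns to the same dtype before merging.
df_1["KEY_PLA"] = df_1["KEY_PLA"].astype(int)

# indicator=True adds a "_merge" column: "both" for matched rows,
# "left_only" for rows whose key is missing from df_2.
df_merge = df_1.merge(df_2, on="KEY_PLA", how="left", indicator=True)
print(df_merge)
```

Rows marked "left_only" are the ones that still have NaN in INFO4. If every row is "left_only" even after the conversion, the keys probably differ in some other way, such as stray whitespace.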