Python – Filling NaN values after concatenation with groupby

I have this data frame:

df:
companycode    name    address    A     B     C     ...
1234           asd     qwe,56     Tyh   123   923
1234           asd     qwe,56     Zfhs  4828  01992
6472           yui     iop,56     Retgh 8484  8484
...

I also have a second one that looks like this:

df2:
companycode    A     B     C       ...
1234           Jid   4123  141
6472           Low   1312  3234
...

The name and address are always the same for a given company code.

I'd like to join, merge, or append them so that the result looks like this:

companycode    name    address    A     B     C     ...
1234           asd     qwe,56     Tyh   123   923
1234           asd     qwe,56     Zfhs  4828  01992
6472           yui     iop,56     Retgh 8484  8484
1234           asd     qwe,56     Jid   4123  141
6472           yui     iop,56     Low   1312  3234
...

Since the name and address of a given company code are always the same, I basically want to concatenate df2 to df along axis=0 and, for each new row, pull the name and address from the rows of the original df that have the same company code. It's confusing to describe in words, but I think the tables above make it clearer.

What should I do?

Solution

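pd.concat followed by a per-group forward fill (groupby + ffill) should do the trick.

import pandas as pd

# Stack df2 under df; name and address are NaN for the df2 rows
df = pd.concat([df, df2], axis=0, ignore_index=True)
# Fill those NaNs per company code from the earlier df rows
df[['name', 'address']] = df.groupby('companycode')[['name', 'address']].ffill()
df

   companycode  name  address      A     B     C
0         1234   asd   qwe,56    Tyh   123   923
1         1234   asd   qwe,56   Zfhs  4828  1992
2         6472   yui   iop,56  Retgh  8484  8484
3         1234   asd   qwe,56    Jid  4123   141
4         6472   yui   iop,56    Low  1312  3234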

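  • ignore_index=True discards the original row labels and builds a fresh 0..n-1 index for the concatenated frame.
  • The concatenation leaves NaN in the name and address columns for the rows that come from df2, because df2 does not have those columns.
  • Grouping by companycode and then forward-filling (ffill) fills those NaNs with the values from earlier rows of the same company. This relies on the df rows preceding the df2 rows within each group, which holds when df is passed to pd.concat first.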
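If the row order cannot be relied on, an alternative that does not depend on forward-filling is to attach name and address to df2 before concatenating. The following is a minimal sketch, not the answer's method: the sample frames are rebuilt from the tables in the question (C is assumed to be an integer column), and the names lookup, df2_full, result, and ffilled are made up for illustration.

```python
import pandas as pd

# Sample frames rebuilt from the question; values are illustrative.
df = pd.DataFrame({
    "companycode": [1234, 1234, 6472],
    "name": ["asd", "asd", "yui"],
    "address": ["qwe,56", "qwe,56", "iop,56"],
    "A": ["Tyh", "Zfhs", "Retgh"],
    "B": [123, 4828, 8484],
    "C": [923, 1992, 8484],
})
df2 = pd.DataFrame({
    "companycode": [1234, 6472],
    "A": ["Jid", "Low"],
    "B": [4123, 1312],
    "C": [141, 3234],
})

# Hypothetical helper table: one row per company code with its name and address.
lookup = df[["companycode", "name", "address"]].drop_duplicates("companycode")

# Attach name/address to df2 first, then stack the frames,
# so no NaN is created in those columns.
df2_full = df2.merge(lookup, on="companycode", how="left")
result = pd.concat([df, df2_full], axis=0, ignore_index=True)[df.columns]

# For comparison: the concat + per-group forward fill described above.
ffilled = pd.concat([df, df2], axis=0, ignore_index=True)
ffilled[["name", "address"]] = (
    ffilled.groupby("companycode")[["name", "address"]].ffill()
)

print(result)
```

Both result and ffilled should hold the same five rows here. The merge variant would leave NaN only for a company code in df2 that never appears in df, which makes such cases easy to spot.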
