Python – Merge in pandas and output only selected columns

Is there a way to merge in pandas to limit the columns you want to see?

What I have:

df1

ID Col1 Col2 Col3 Col4
1   1    1    1    D
2   A    C    C    4
3   B    B    B    d
4   X    2    3    6

df2

ID ColA ColB ColC ColD
1   1    1    1    D
2   A    C    X    4
3   B    B    Y    d

What I want:

df_final

ID ColA ColB ColC ColD
1   NA   NA   NA   NA
2   A    C    X    4
3   B    B    Y    d
4   NA   NA   NA   NA

I want to left join the two dataframes (keep all IDs in df1), but keep only the columns from df2. I also only need the df2 values for rows where Col3 in df1 is C or B.

The following works, but the resulting DataFrame includes all columns from both DataFrames.
I could add a third line to select only the columns I want, but this is a simplified example. In practice I have a larger dataset, and it is hard to type out every column name I want to keep.

df = pd.merge(df1, df2, how='left', on='ID')
df_final = df[df['Col3'].isin(['C', 'B'])]

The equivalent SQL is

create table df_final as 
select b.*
from df1 a
left join df2 b
on a.ID=b.ID
where a.Col3 in ('C','B')

Solution

Mask df1 with your isin condition before merging:

df1.where(df1.Col3.isin(['C', 'B']))[['ID']].merge(df2, how='left', on='ID')

Or,

df1.mask(~df1.Col3.isin(['C', 'B']))[['ID']].merge(df2, how='left', on='ID')

    ID ColA ColB ColC ColD
0  NaN  NaN  NaN  NaN  NaN
1    2    A    C    X    4
2    3    B    B    Y    d
3  NaN  NaN  NaN  NaN  NaN
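
Note that in the output above the ID is also NaN on the rows that fail the condition, while the desired df_final keeps all four IDs. If that matters, one possible variant is to filter df2 down to the qualifying IDs first and then left join it onto the ID column of df1. The following is a minimal sketch, not the method from the answer above: the sample frames are rebuilt from the question with all values as strings (an assumption about the real dtypes), and the names keep and passing_ids are illustrative only.

```python
import pandas as pd

# Sample frames rebuilt from the question (values stored as strings: an assumption).
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Col1": ["1", "A", "B", "X"],
    "Col2": ["1", "C", "B", "2"],
    "Col3": ["1", "C", "B", "3"],
    "Col4": ["D", "4", "d", "6"],
})
df2 = pd.DataFrame({
    "ID": [1, 2, 3],
    "ColA": ["1", "A", "B"],
    "ColB": ["1", "C", "B"],
    "ColC": ["1", "X", "Y"],
    "ColD": ["D", "4", "d"],
})

# Rows of df1 that satisfy the condition (the SQL "where a.Col3 in ('C','B')").
keep = df1["Col3"].isin(["C", "B"])
passing_ids = df1.loc[keep, "ID"]

# Keep only the df2 rows whose ID passed, then left join onto df1's key column.
# Taking df1[["ID"]] means the result has df2's columns without listing them by hand.
df_final = df1[["ID"]].merge(
    df2[df2["ID"].isin(passing_ids)],
    how="left",
    on="ID",
)

print(df_final)
```

Because the left side of the join never contains a missing key, the ID column keeps every ID from df1, and only the df2 columns are empty for IDs 1 and 4.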
