Get a list of DataFrame column names for non-floating-point columns
I’m trying to get a list of the column names in a DataFrame whose columns are not of float type. So far I have:
categorical = (df.dtypes.values != np.dtype('float64'))
This gives me a boolean array indicating which columns are not float64, but that’s not what I’m after. What I want is the list of column names that correspond to the True values in that array.
Solution
Use boolean indexing on df.columns:
categorical = df.columns[(df.dtypes.values != np.dtype('float64'))]
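As a minimal sketch of how the mask selects names, the snippet below applies the same expression to a small made-up frame (df_demo and its column names are hypothetical, not from the question). Since the question asks for a list, it also calls tolist() on the resulting Index.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame, for illustration only.
df_demo = pd.DataFrame({
    "name": ["a", "b", "c"],
    "count": [1, 2, 3],
    "score": [0.5, 1.5, 2.5],
})

# Boolean mask: True where the column dtype is not float64.
mask = df_demo.dtypes.values != np.dtype("float64")

# Indexing df.columns with the mask keeps only the matching names;
# tolist() turns the resulting Index into a plain Python list.
non_float_cols = df_demo.columns[mask].tolist()
print(non_float_cols)  # ['name', 'count']
```

Note that this comparison only excludes float64 columns; a float32 column would still be reported as non-float.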
Or take the difference with the columns selected by select_dtypes:
categorical = df.columns.difference(df.select_dtypes('float64').columns)
Example:
import pandas as pd

df = pd.DataFrame({'A': list('abcdef'),
                   'B': [4, 5, 4, 5, 5, 4],
                   'C': [7., 8, 9, 4, 2, 3],
                   'D': [1, 3, 5., 7, 1, 0],
                   'E': [5, 3, 6, 9, 2, 4],
                   'F': list('aaabbb')})
print(df)
   A  B    C    D  E  F
0  a  4  7.0  1.0  5  a
1  b  5  8.0  3.0  3  a
2  c  4  9.0  5.0  6  a
3  d  5  4.0  7.0  9  b
4  e  5  2.0  1.0  2  b
5  f  4  3.0  0.0  4  b
categorical = df.columns.difference(df.select_dtypes('float64').columns)
print(categorical)
Index(['A', 'B', 'E', 'F'], dtype='object')
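One detail the example above does not expose, because its columns are already in alphabetical order: Index.difference sorts its result by default. The sketch below uses a hypothetical frame (df_demo, with made-up column names) to show that, and compares it with select_dtypes(exclude=...), which is a possible alternative that keeps the original column order. Treat it as an illustration rather than the answer's own method.

```python
import pandas as pd

# Hypothetical frame whose columns are deliberately not in alphabetical order.
df_demo = pd.DataFrame({
    "z_label": ["x", "y"],
    "m_value": [1.0, 2.0],
    "a_count": [3, 4],
})

# difference() removes the float64 column names and returns the rest sorted.
by_difference = df_demo.columns.difference(
    df_demo.select_dtypes("float64").columns
).tolist()

# exclude= drops float64 columns directly and keeps the original column order.
by_exclude = df_demo.select_dtypes(exclude="float64").columns.tolist()

print(by_difference)  # ['a_count', 'z_label']
print(by_exclude)     # ['z_label', 'a_count']
```

If the order of the names matters downstream, the exclude form (or the boolean-indexing form) may be the safer choice.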