Python – Check whether a pandas DataFrame column contains strings


I have a fairly large pandas DataFrame (11k rows and 20 columns). One column has mixed data types: mostly numbers (floats), with a small number of strings scattered throughout.

Before running some statistical analysis on the data in the mixed column, I subset the DataFrame by querying other columns (the query does not check whether a string is present). 99% of the time the subset of this column is purely numeric, but occasionally a string value ends up in the subset, and I need to catch that.

What is the most efficient/Pythonic way to go through a mixed-type pandas column and check for strings (or, conversely, to check whether the entire column is numeric)?

If there is a string in the column I want to raise an error; otherwise, continue.

Solution

Here’s one way. I’m not sure if it can be vectorized.

import pandas as pd

df = pd.DataFrame({'A': [1, None, 'hello', True, 'world', 'mystr', 34.11]})

df['stringy'] = [isinstance(x, str) for x in df.A]

#        A stringy
# 0      1   False
# 1   None   False
# 2  hello    True
# 3   True   False
# 4  world    True
# 5  mystr    True
# 6  34.11   False
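
The question asks for an error to be raised when a string is present, so the element-wise `isinstance` check above can be wrapped in a small helper. This is only a sketch: the function name `assert_no_strings` and the wording of the error message are made up for illustration, and `Series.map` still calls the Python function once per element, so it is not a truly vectorized check.

```python
import pandas as pd


def assert_no_strings(series: pd.Series) -> None:
    """Raise ValueError if any element of `series` is a Python str.

    Hypothetical helper; the name and message are illustrative only.
    """
    # Boolean mask: True wherever the element is a str instance
    is_str = series.map(lambda x: isinstance(x, str))
    if is_str.any():
        bad = series[is_str]
        raise ValueError(
            f"{len(bad)} string value(s) found at index {bad.index.tolist()}"
        )


df = pd.DataFrame({'A': [1, None, 'hello', True, 'world', 'mystr', 34.11]})

# A subset without strings passes silently
clean = df.loc[[0, 1, 6], 'A']
assert_no_strings(clean)

# The full column contains strings, so the check raises
try:
    assert_no_strings(df['A'])
except ValueError as exc:
    message = str(exc)
    print(message)
```

If numeric-looking strings such as `'3.5'` are acceptable, `pd.to_numeric(series, errors='raise')` may be an alternative: it raises on values like `'hello'` but parses `'3.5'` into a number instead of flagging it.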
