Python - Pandas read_excel : nan values forcing others in the same column to be converted to float

Pandas read_excel : nan values forcing others in the same column to be converted to float… here is a solution to the problem.

Pandas read_excel : nan values forcing others in the same column to be converted to float

Let’s say I want to read the following Excel file:

What I want is a simple solution (preferably a row) that reads excel in order to convert dates to str (or at least int), blank values are nan or nat or whatever can be pd.isnull The value detected.

If I use df = pd.read_excel(file_path), what I get is

df
Out[8]: 
              001002.XY  600123.AB  123456.YZ   555555.GO
ipo_date     20100203.0   20150605        NaN  20090501.0
delist_date         NaN   20170801        NaN         NaN

So pandas recognizes blank cells as NaN, which is great, but annoying that all other values are forced to float64, even if they are meant to just str or int. (EDIT: It seems that if a column, e.g. [1] column does not have nan, then other values are not forced to float.) However, in my case, most of the columns have blank delist_date because most stocks have an IPO date but have not yet been delisted. )

As far as I know, I tried the dtype=str keyword arg and it gave me

df
Out[10]: 
            001002.XY 600123.AB 123456.YZ 555555.GO
ipo_date     20100203  20150605       nan  20090501
delist_date       nan  20170801       nan       nan

Does it look good? That’s right, the date is now str, but it’s ridiculous that nan is now a literal string! For example

df.iloc[1, 0]
Out[12]: 
'nan'

This will make me have to add some weird things later, like df.replace.

didn’t try using the converter because it requires specifying the data type column by column, and the actual excel file I’m using is a very long spreadsheet (about 3k columns). I also don’t want to transpose the spreadsheet in Excel itself.

Can someone help? Thanks in advance.

Solution

Use dtype=object as a parameter.

Here’s a good explanation: pandas distinction between str and object types

Python – Pandas read_excel : nan values forcing others in the same column to be converted to float

Pandas read_excel : nan values forcing others in the same column to be converted to float

Solution

Related Problems and Solutions