Python – Why can't I use pandas to set one Series equal to another Series

I'm new to Python, so forgive me if this seems like a simple question.

I have a DataFrame. My goal is to take the values of one of its columns, convert them to another type, and replace the column. Here is the code:

strtotime = {}
for x in range(0,len(results['CreationDate'])):
    strtotime[x] = datetime.strptime(results['CreationDate'][x], '%Y-%m-%dT%H:%M:%S.%f')
results['CreationDate'] = pd.to_datetime(pd.Series(strtotime))

I store the values in a dictionary and use pd.Series to convert it to a Series, at which point I was fairly sure I could replace one Series with another:

i.e. results['CreationDate'] = pd.to_datetime(pd.Series(strtotime))

But what I get is a column of NaT instead of neat datetimes such as 2015-01-01 10:59:37.403.

Then I used results['CreationDate'] = list(pd.to_datetime(pd.Series(strtotime)))

It worked perfectly, as I had hoped. So my question is, why is this so? Does it even have anything to do with object types?

Solution

When you assign a Series to a DataFrame column, pandas aligns the new values with the DataFrame's index. Your original DataFrame may have some meaningful index, but your new Series only has the default index 0, 1, 2, 3, ... because those are the keys in the dictionary. Here's a simple example:

>>> d = pandas.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=[10, 11, 12])
>>> d
    A  B
10  1  4
11  2  5
12  3  6
>>> d["C"] = pandas.Series([8, 88, 888])
>>> d
    A  B   C
10  1  4 NaN
11  2  5 NaN
12  3  6 NaN
>>> d["C"] = pandas.Series([8, 88, 888], index=[10, 11, 12])
>>> d
    A  B    C
10  1  4    8
11  2  5   88
12  3  6  888

Note that assigning a Series whose index does not match the DataFrame's index results in NaN, whereas a new Series with the same index as the DataFrame puts the values in as expected.
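This is also why the list(...) version in the question worked: a list carries no index, so pandas has nothing to align on and assigns the values by position. A minimal sketch of the difference follows; the frame and the column names aligned, from_list and from_array are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with a non-default index, mirroring the example above.
d = pd.DataFrame({"A": [1, 2, 3]}, index=[10, 11, 12])
s = pd.Series([8, 88, 888])  # default index 0, 1, 2

# Assigning a Series aligns on index labels; none match here, so every row is NaN.
d["aligned"] = s

# A list has no index, so the values are assigned by position.
d["from_list"] = list(s)

# A NumPy array also has no index and is assigned by position.
d["from_array"] = s.to_numpy()

print(d)
```

Positional assignment only works when the lengths match, and it silently assumes the rows are in the same order, so keeping the index intact is usually the safer route.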

In your case, you create the new Series by applying a function to each element of the original column. Don't do this with an explicit loop; use the .map method instead. In this case, there is a built-in pandas function that converts a string to a datetime:

results['CreationDate'] = results['CreationDate'].map(pandas.to_datetime)

.map gives a new Series with the same index as the old Series. (If your dates don't parse correctly, you can map a lambda that passes a format argument to to_datetime.)
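A rough sketch of the lambda variant, assuming the timestamps look like the one in the question. The sample frame, its string index and the name fmt are invented for illustration; the format string is the one from the question's strptime call.

```python
import pandas as pd

# Hypothetical data in the question's format, with a non-default index.
results = pd.DataFrame(
    {"CreationDate": ["2015-01-01T10:59:37.403", "2015-01-02T08:15:00.000"]},
    index=["a", "b"],
)

# Format string taken from the question's strptime call.
fmt = "%Y-%m-%dT%H:%M:%S.%f"

# map returns a Series that keeps the original index, so assignment aligns cleanly.
results["CreationDate"] = results["CreationDate"].map(
    lambda s: pd.to_datetime(s, format=fmt)
)

print(results)
```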

(As piRsquared points out in the comments, to_datetime actually accepts a Series argument, so you only need results['CreationDate'] = pandas.to_datetime(results['CreationDate']).)
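For completeness, a small sketch of the whole-column call, again on made-up sample data. Passing format is optional and is included here only as an assumption about how the strings are laid out; the name converted is illustrative.

```python
import pandas as pd

# Hypothetical data in the question's format, with a non-default index.
results = pd.DataFrame(
    {"CreationDate": ["2015-01-01T10:59:37.403", "2015-01-02T08:15:00.000"]},
    index=["a", "b"],
)

# to_datetime converts the whole Series at once and preserves its index.
converted = pd.to_datetime(results["CreationDate"], format="%Y-%m-%dT%H:%M:%S.%f")
results["CreationDate"] = converted

print(results.dtypes)
```

Because the returned Series has the same index as the column it came from, the assignment lines up row for row and no NaT values appear.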
