Python: compare a list of dates with the start and end date columns of a DataFrame

Question: I have a data frame with two columns: start date and end date. I also have a list of dates. So suppose the data looks like this:

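import pandas as pd

# Dates must be quoted strings; bare 1/1/2018 would be evaluated as division.
data = [['1/1/2018', '3/1/2018'], ['2/1/2018', '3/1/2018'], ['4/1/2018', '6/1/2018']]
df = pd.DataFrame(data, columns=['startdate', 'enddate']).apply(pd.to_datetime)

dates = ['1/1/2018', '2/1/2018']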

What I need to do is:

1) Create a new column for each date in the date list

2) For each row in df, assign 1 if the date of the new column is between the start date and end date, and assign a 0 if not.

I tried using zip, but then I realized that df will have thousands of rows while the date list will contain only about 24 items (spanning 2 years), so zip stops when the date list runs out, i.e., after 24 rows.
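
The truncation described above can be reproduced in isolation. This is a minimal sketch with made-up stand-in values (the `rows` list is hypothetical and only stands in for the DataFrame rows); it shows that zip pairs items positionally and ends with the shorter input, which is why it cannot test every date against every row.

```python
# Hypothetical stand-ins: five "rows" for a long DataFrame, two dates for the ~24-item list.
rows = ["row0", "row1", "row2", "row3", "row4"]
dates = ["1/1/2018", "2/1/2018"]

# zip pairs elements by position and stops as soon as the shorter input is exhausted.
paired = list(zip(rows, dates))

print(paired)
print(len(paired))  # equals len(dates), not len(rows)
```

What the task needs instead is every (row, date) combination, which is a two-dimensional comparison rather than a positional pairing.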

Here is what the original df looks like and what the desired result looks like:

Before:

   startdate    enddate
0 2018-01-01 2018-03-01
1 2018-02-01 2018-03-01
2 2018-04-01 2018-06-01

After:

  startdate   enddate 1/1/2018 2/1/2018
0  1/1/2018  3/1/2018        1        1
1  2/1/2018  3/1/2018        0        1
2  4/1/2018  6/1/2018        0        0

Any help would be appreciated, thank you!

Solution

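Use NumPy broadcasting. Reshaping the dates into a column of shape (n_dates, 1) and comparing it with the start and end arrays of shape (n_rows,) produces an (n_dates, n_rows) boolean matrix; transposing it gives one 0/1 column per date. This assumes startdate and enddate are already datetime64 columns, as in the "Before" table.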

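s1 = df.startdate.values                               # start dates, shape (n_rows,)
s2 = df.enddate.values                                 # end dates, shape (n_rows,)
v = pd.to_datetime(pd.Series(dates)).values[:, None]   # dates as a column, shape (n_dates, 1)

# The comparison broadcasts to (n_dates, n_rows); transpose to get one column per date.
newdf = pd.DataFrame(((s1 <= v) & (s2 >= v)).T.astype(int), columns=dates, index=df.index)
pd.concat([df, newdf], axis=1)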
   startdate    enddate  1/1/2018  2/1/2018
0 2018-01-01 2018-03-01         1         1
1 2018-02-01 2018-03-01         0         1
2 2018-04-01 2018-06-01         0         0
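
For a date list of only about 24 items, a plain loop over the dates may be easier to read than broadcasting. The sketch below is not part of the original answer. It rebuilds the sample frame from the question, assumes the date strings are month/day/year, and treats both endpoints as inclusive, matching the `<=` and `>=` used above.

```python
import pandas as pd

# Sample data from the question, parsed to datetime64 (month/day/year assumed).
df = pd.DataFrame({
    "startdate": pd.to_datetime(["1/1/2018", "2/1/2018", "4/1/2018"]),
    "enddate": pd.to_datetime(["3/1/2018", "3/1/2018", "6/1/2018"]),
})
dates = ["1/1/2018", "2/1/2018"]

result = df.copy()
for d in dates:
    ts = pd.Timestamp(d)
    # 1 where startdate <= ts <= enddate (inclusive on both ends), else 0.
    result[d] = ((df["startdate"] <= ts) & (df["enddate"] >= ts)).astype(int)

print(result)
```

Each pass of the loop is still a vectorized comparison over all rows, so the Python-level loop runs once per date rather than once per row.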
