Read values from a data frame based on criteria in a different data frame
I have two data frames, and I need to read values from one of them based on values in the other.
Words:
import pandas as pd

words = pd.DataFrame()
words['no'] = [1,2,3,4,5,6,7,8,9]
words['word'] = ['cat', 'in', 'hat', 'the', 'dog', 'in', 'love', '!', '<3']
words
Sentences:
sentences = pd.DataFrame()
sentences['no'] = [1, 2, 3]
sentences['start'] = [1, 4, 6]
sentences['stop'] = [3, 5, 9]
sentences
The desired output is a text file:
cat in hat
***
the dog
***
in love ! <3
But I can't get past this step. I tried running the following code:
for x in sentences:
    print(words['word'][words['no'].between(sentences['start'], sentences['stop'], inclusive = True)
But it returned this error:
File "<ipython-input-16-ae3f5333be66>", line 3
print(words['word'][words['no'].between(sentences['start'], sentences['stop'], inclusive = True)
^
SyntaxError: unexpected EOF while parsing
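The SyntaxError itself comes from the unclosed `[` in the print call. Even with the bracket closed, passing the whole sentences['start'] and sentences['stop'] columns to between would likely fail, because they do not line up with the rows of words. A minimal sketch of the same idea with scalar bounds, one sentence at a time, might look like this (the name `lines` is made up for illustration, and the `inclusive` argument is left at its default, which includes both ends):

```python
import pandas as pd

words = pd.DataFrame({
    'no': [1, 2, 3, 4, 5, 6, 7, 8, 9],
    'word': ['cat', 'in', 'hat', 'the', 'dog', 'in', 'love', '!', '<3'],
})
sentences = pd.DataFrame({
    'no': [1, 2, 3],
    'start': [1, 4, 6],
    'stop': [3, 5, 9],
})

lines = []  # hypothetical name: one joined string per sentence
for start, stop in zip(sentences['start'], sentences['stop']):
    # Boolean mask of the words whose number lies in [start, stop]
    mask = words['no'].between(start, stop)
    lines.append(' '.join(words.loc[mask, 'word']))

print(lines)
```

This builds one boolean mask per sentence, which is fine for small inputs.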
Solution
Set no as the index of words, and then use a list comprehension to iterate over sentences:
v = words.set_index('no')['word']
sentences = [
    ' '.join(v.loc[i:j]) for i, j in zip(sentences['start'], sentences['stop'])
]
Or, without setting the index (this assumes no runs consecutively from 1):
v = words['word'].tolist()
sentences = [
    # list slices exclude the end, so the inclusive 1-based range i..j is v[i - 1:j]
    ' '.join(v[i - 1:j]) for i, j in zip(sentences['start'], sentences['stop'])
]
['cat in hat', 'the dog', 'in love ! <3']
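The two variants differ in how the end of the range is treated: label-based slicing with .loc includes the stop label, while plain list slicing excludes the end position. A small sketch of the difference, using the first five words only (the variable names are made up for illustration):

```python
import pandas as pd

# Series of words labelled 1..5, like words.set_index('no')['word']
v = pd.Series(['cat', 'in', 'hat', 'the', 'dog'], index=[1, 2, 3, 4, 5])

# .loc slices by label and includes both endpoints
label_slice = v.loc[1:3].tolist()

# A Python list slices by 0-based position and excludes the end,
# so the 1-based inclusive range 1..3 becomes [1 - 1:3]
word_list = v.tolist()
position_slice = word_list[1 - 1:3]

print(label_slice, position_slice)
```

Both slices pick out the same three words.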
Saving to a file from here should be simple:
with open('file.txt', 'w') as f:
    for sent in sentences:
        f.write(sent + '\n')
        f.write('***\n')
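Note that the loop above also writes a *** line after the last sentence, whereas the desired output in the question only has separators between sentences. If that matters, one possible variation is to join the sentences with the separator before writing. This is a sketch, with `sentence_texts` and `output` as made-up names and the list hard-coded so that the snippet runs on its own:

```python
# Result of the list comprehension from the answer, hard-coded here
sentence_texts = ['cat in hat', 'the dog', 'in love ! <3']

# Put the separator line between sentences only, then end with a newline
output = '\n***\n'.join(sentence_texts) + '\n'

with open('file.txt', 'w', encoding='utf-8') as f:
    f.write(output)
```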