Find a long word that is interrupted by a new line
I’m trying to find every occurrence of each term from a list in a block of text, so I wrote the following code:
narrative = "Lasix 40 mg b.i.d., for three days along with potassium chloride slow release 20 mEq b.i.d. for three days, Motrin 400 mg q.8h"
meds_name_final_list = ["lasix", "potassium chloride slow release", ...]
def all_occurences(file, str):
    initial = 0
    while True:
        initial = file.find(str, initial)
        if initial == -1:
            return
        yield initial
        initial += len(str)

offset = []
for item in meds_name_final_list:
    number = list(all_occurences(narrative.lower(), item))
    offset.append(number)
Expected output: for each search term, a list of the starting indexes at which it occurs in the text, for example:
offset = [[1], [3, 10], [5, 50].....]
This code works well for short terms such as antibiotics, emergency ward, insulin, etc. However, the function does not detect a longer, multi-word term when a line break falls in the middle of it.
Term that should be found: potassium chloride sustained release
Any suggestions for solving this problem?
Solution
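How about this? Replace each newline with a space before searching. Because one character is swapped for exactly one character, the offsets that are returned still line up with the original text.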
def all_occurences(file, str):
    initial = 0
    file = file.replace('\n', ' ')
    while True:
        initial = file.find(str, initial)
        if initial == -1: return
        yield initial
        initial += len(str)
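The replacement above only covers a single '\n' between two words. If the text may contain Windows line endings ('\r\n') or indentation after the break, a regular expression that accepts any run of whitespace between the words of the term is one possible alternative. The following is a minimal sketch under that assumption, not part of the original answer; the function name all_occurrences_ws and the sample narrative are made up for illustration.

```python
import re

def all_occurrences_ws(text, term):
    """Yield the start offset of each match of `term` in `text`.

    Any run of whitespace in `text` (spaces, tabs, '\n', '\r\n') is accepted
    wherever `term` has a gap between words. Matching is case-insensitive,
    so the text does not need to be lowercased first.
    """
    # Escape each word so characters like '.' are matched literally,
    # then join the words with "one or more whitespace characters".
    pattern = r"\s+".join(re.escape(word) for word in term.split())
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        yield match.start()

# Hypothetical sample: the term is broken by '\r\n' plus indentation.
narrative = "Lasix 40 mg b.i.d., along with potassium chloride\r\n   slow release 20 mEq b.i.d."
print(list(all_occurrences_ws(narrative, "potassium chloride slow release")))
```

The offsets refer to the original, unmodified text, so they can be used directly to slice it. Like the find-based loop, re.finditer reports non-overlapping matches only.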