Python – How do I replace a value in a Pandas Column multiple times?

I have a data frame df1

Questions                             Purpose
what is scientific name of <input>    scientific name
what is english name of <input>       english name

I have 2 lists below:

name1 = ['salt','water','sugar']
name2 = ['sodium chloride','dihydrogen monoxide','sucrose']

I want to create a new data frame by replacing <input> by values in the list depending on the purpose.

If the purpose is "english name", replace <input> with the values in name2.
Otherwise, replace <input> with the values in name1.

Expected output data frame:

Questions                                   Purpose
what is scientific name of salt             scientific name
what is scientific name of water            scientific name
what is scientific name of sugar            scientific name
what is english name of sodium chloride     english name
what is english name of dihydrogen monoxide english name
what is english name of sucrose             english name

My efforts

import pandas as pd

questions = []
purposes = []

for i, row in df1.iterrows():
    if row['Purpose'] == 'scientific name':
        for name in name1:
            ques = row['Questions'].replace('<input>', name)
            questions.append(ques)
            purposes.append(row['Purpose'])
    else:
        for name in name2:
            ques = row['Questions'].replace('<input>', name)
            questions.append(ques)
            purposes.append(row['Purpose'])

df = pd.DataFrame({'Questions': questions, 'Purpose': purposes})

The above code produces the expected output, but it is too slow because the original dataframe contains a lot of questions. (I also have more than two purposes, but for now I'm sticking to 2.)

I’m looking for a more efficient solution to get rid of the for loop.

Solution

One way is to iterate through the questions with a list comprehension and replace <input> with each corresponding name, using itertools.cycle to repeat the question once per name:

from itertools import cycle
import pandas as pd

# names[k] must line up with row k of df1
names = [name1, name2]
new = [[i.replace('<input>', j), purpose]
       for row, purpose, name in zip(df1.Questions, df1.Purpose, names)
       for i, j in zip(cycle([row]), name)]

pd.DataFrame(new, columns=df1.columns)

                                     Questions          Purpose
0              what is scientific name of salt  scientific name
1             what is scientific name of water  scientific name
2             what is scientific name of sugar  scientific name
3      what is english name of sodium chloride     english name
4  what is english name of dihydrogen monoxide     english name
5              what is english name of sucrose     english name
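
The comprehension in this answer pairs each row with a list of names by position, so it relies on the lists being given in the same order as the rows. As an alternative sketch (not the approach from the answer), the expansion can be keyed on the Purpose value with a merge, assuming every Purpose maps to exactly one list of names. The names `names_by_purpose`, `lookup`, and the helper column `name` are illustrative and do not come from the original question.

```python
import pandas as pd

# Sample data from the question.
df1 = pd.DataFrame({
    "Questions": [
        "what is scientific name of <input>",
        "what is english name of <input>",
    ],
    "Purpose": ["scientific name", "english name"],
})
name1 = ["salt", "water", "sugar"]
name2 = ["sodium chloride", "dihydrogen monoxide", "sucrose"]

# Assumed mapping from each Purpose value to the names to substitute.
names_by_purpose = {"scientific name": name1, "english name": name2}

# Long-form lookup table (illustrative): one row per (Purpose, name) pair.
lookup = pd.DataFrame(
    [(purpose, name)
     for purpose, names in names_by_purpose.items()
     for name in names],
    columns=["Purpose", "name"],
)

# A left merge repeats each question once per matching name.
out = df1.merge(lookup, on="Purpose", how="left")

# The replacement text differs per row, so the substitution itself is
# still done with a plain comprehension over the merged rows.
out["Questions"] = [
    q.replace("<input>", n) for q, n in zip(out["Questions"], out["name"])
]
out = out[["Questions", "Purpose"]]
print(out)
```

With more than two purposes, only `names_by_purpose` needs new entries. This sketch assumes every Purpose in df1 has an entry in the mapping; a missing one would leave a NaN in `name` and the replacement step would fail.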
