
How to replace a value in Pandas Column multiple times?

I have a dataframe df1

Questions                             Purpose
what is scientific name of <input>    scientific name
what is english name of <input>       english name

And I have 2 lists as below:

name1 = ['salt','water','sugar']
name2 = ['sodium chloride','dihydrogen monoxide','sucrose']

I want to create a new dataframe by replacing <input> with values from one of the lists, depending on the purpose.

If the purpose is english name, replace <input> with the values in name2; otherwise replace <input> with the values in name1.

Expected Output DataFrame:

Questions                                   Purpose
what is scientific name of salt             scientific name
what is scientific name of water            scientific name
what is scientific name of sugar            scientific name
what is english name of sodium chloride     english name
what is english name of dihydrogen monoxide english name
what is english name of sucrose             english name

My Efforts

import pandas as pd

questions = []
purposes = []

# Expand every template row once per name in the list chosen by its purpose
for i, row in df1.iterrows():
    if row['Purpose'] == 'scientific name':
        for name in name1:
            ques = row['Questions'].replace('<input>', name)
            questions.append(ques)
            purposes.append(row['Purpose'])
    else:
        for name in name2:
            ques = row['Questions'].replace('<input>', name)
            questions.append(ques)
            purposes.append(row['Purpose'])

df = pd.DataFrame({'Questions': questions, 'Purpose': purposes})

The above code produces the expected output, but it is too slow because the original dataframe has many questions. (I have multiple purposes too, but for now I'm sticking with only 2.)

I am looking for a more efficient solution, ideally one that gets rid of the for loops.

asked Dec 06 '25 18:12 by Sociopath


2 Answers

One way to do it is to iterate over the Questions with a list comprehension and replace <input> with the corresponding name. To repeat each question once per entry in its list of names, you can use itertools.cycle. Here df1 is the dataframe from the question, and the lists in names must be in the same order as its rows:

from itertools import cycle

import pandas as pd

names = [name1, name2]  # one list per row of df1, in row order
new = [[i.replace('<input>', j), purpose]
       for row, purpose, name in zip(df1.Questions, df1.Purpose, names)
       for i, j in zip(cycle([row]), name)]  # cycle repeats the template for each name

pd.DataFrame(new, columns=df1.columns)

                                    Questions          Purpose
0              what is scientific name of salt  scientific name
1             what is scientific name of water  scientific name
2             what is scientific name of sugar  scientific name
3      what is english name of sodium chloride     english name
4  what is english name of dihydrogen monoxide     english name
5              what is english name of sucrose     english name
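
The comprehension above pairs rows with name lists by position. As a minimal sketch of the same idea keyed on the Purpose value instead, assuming pandas 1.1 or later for explode(..., ignore_index=True): the names_by_purpose dictionary and the temporary name column are hypothetical helpers, not part of the question, and the sample df1 is rebuilt here only so the snippet runs on its own. Whether this is faster on the real data would need to be measured.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "Questions": ["what is scientific name of <input>",
                  "what is english name of <input>"],
    "Purpose": ["scientific name", "english name"],
})
name1 = ["salt", "water", "sugar"]
name2 = ["sodium chloride", "dihydrogen monoxide", "sucrose"]

# Hypothetical lookup: which list of names belongs to which purpose
names_by_purpose = {"scientific name": name1, "english name": name2}

# Attach the matching list to every row, then turn each list entry into its own row
out = df1.assign(name=df1["Purpose"].map(names_by_purpose))
out = out.explode("name", ignore_index=True)

# Fill the placeholder row by row (plain str.replace, still no iterrows)
out["Questions"] = [q.replace("<input>", n)
                    for q, n in zip(out["Questions"], out["name"])]
out = out.drop(columns="name")
print(out)
```

With more than two purposes, only the dictionary needs new entries.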
answered Dec 09 '25 14:12 by yatu


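I did something like the below using pd.concat(), which you can try. It assumes df1 (the dataframe from the question) has exactly one row per purpose:

names = name1 + name2
# DataFrame.append was removed in pandas 2.0, so all the pieces go through pd.concat
df_new = pd.concat(
    [df1.loc[df1.Purpose.eq('scientific name')]] * len(name1)
    + [df1.loc[df1.Purpose.eq('english name')]] * len(name2),
    ignore_index=True,
)

for e, i in enumerate(names):
    # assign through a single .loc call to avoid chained assignment
    df_new.loc[e, 'Questions'] = df_new.loc[e, 'Questions'].replace('<input>', i)
print(df_new)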

                                     Questions          Purpose
0              what is scientific name of salt  scientific name
1             what is scientific name of water  scientific name
2             what is scientific name of sugar  scientific name
3      what is english name of sodium chloride     english name
4  what is english name of dihydrogen monoxide     english name
5              what is english name of sucrose     english name
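
The repeat-then-fill idea above can also be sketched without naming each purpose explicitly, by repeating every row as many times as its list has entries. This is only an illustration under assumptions: names_by_purpose, counts and flat_names are hypothetical names, df1 is rebuilt from the question's sample data, and its index is assumed to be unique.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "Questions": ["what is scientific name of <input>",
                  "what is english name of <input>"],
    "Purpose": ["scientific name", "english name"],
})
name1 = ["salt", "water", "sugar"]
name2 = ["sodium chloride", "dihydrogen monoxide", "sucrose"]

# Hypothetical lookup from purpose to its list of names
names_by_purpose = {"scientific name": name1, "english name": name2}

# How many copies each row needs: one per name in its list
counts = [len(names_by_purpose[p]) for p in df1["Purpose"]]

# Repeat each row in place, keeping the copies of a row next to each other
rep = df1.loc[df1.index.repeat(counts)].reset_index(drop=True)

# Names flattened in the same row order as the repeated frame
flat_names = [n for p in df1["Purpose"] for n in names_by_purpose[p]]

rep["Questions"] = [q.replace("<input>", n)
                    for q, n in zip(rep["Questions"], flat_names)]
print(rep)
```

Unlike the concat version, this keeps working when several rows share the same purpose, because each row is expanded independently.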
answered Dec 09 '25 13:12 by anky


