I have data frame like this:
title price weight
0 Crloni Model145 $45,$50,$60 200gm,500gm,800gm
here $45 for 200gm, $50 for 500gm and $60 for 800 gm. My expected data frame will be look like this:
title price weight
0 Crloni Model145 $45 200gm
1 Crloni Model145 $50 500gm
2 Crloni Model145 $60 800gm
But now I getting data frame like this
title price weight
0 Crloni Model145 $45 200gm
1 Crloni Model145 $45 500gm
2 Crloni Model145 $45 800gm
3 Crloni Model145 $50 200gm
4 Crloni Model145 $50 500gm
5 Crloni Model145 $50 800gm
6 Crloni Model145 $45 200gm
7 Crloni Model145 $45 500gm
8 Crloni Model145 $45 800gm
9 Crloni Model145 $60 200gm
10 Crloni Model145 $60 500gm
11 Crloni Model145 $60 800gm
here is my code:
data['price']=data['price'].str.split(',')
data = data.explode('price')
data['weight']=data['weight'].str.split(',')
data = data.explode('weight')
updated question:
I applied Bill the Lizard solution. I am not getting any error but I am not seeing any price, weight in my csv when export the csv file.
data['price']=data['price'].str.split(',')
data['weight']=df['weight'].str.split(',')
data = data.set_index(['title']).apply(pd.Series.explode).reset_index()
data.to_csv('...mypath/data.csv')
see the pic:

here is BENY solution but I am getting cannot reindex from a duplicate axis
data['price'] = data['price'].str.split(',')
data['weight'] = data['weight'].str.split(',')
out = data.explode(['price','weight'])
data['description'] = data['description'].mask(data['description].shift() == data['description'])
#update2
Bill the Lizard solution worked but I am not understanding why BENY solution not working? why I am getting cannot reindex from a duplicate axis this error when applying BENY solution
update3 few row of my original excel file
category title price weight description
Shirt men-shirt 20,25,35 100gm,50gm,150gm shirt description....
pant men-pent 40,35,90 200gm,350gm,150gm pant description....
If you have a Pandas version prior to 1.3.0, where multi-column explode was added:
Since the lists after splitting the strings have the same number of elements, you can apply Series.explode to the price and weight columns to the the expected output.
import pandas as pd
df = pd.DataFrame({'title': ['Crloni Model145'],
'price': ['$45,$50,$60'],
'weight': ['200gm,500gm,800gm']})
df['price']=df['price'].str.split(',')
df['weight']=df['weight'].str.split(',')
df = df.set_index(['title']).apply(pd.Series.explode).reset_index()
print(df)
I set the index to title because I don't want explode to be applied to that column, then I reset the index at the end so title becomes a regular column again.
Output:
title price weight
0 Crloni Model145 $45 200gm
1 Crloni Model145 $50 500gm
2 Crloni Model145 $60 800gm
Update your pandas and explode now can accept two columns
df['price'] = df['price'].str.split(',')
df['weight'] = df['weight'].str.split(',')
out = df.explode(['price','weight'])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With