I have a pandas DataFrame df, which looks like this:
import pandas as pd

df = pd.DataFrame({'Name': ['Harry', 'Sam', 'Raj', 'Jamie', 'Rupert'],
                   'Country': ['USA', "['USA', 'UK', 'India']", "['India', 'USA']", 'Russia', 'China']})
Name    Country
Harry   USA
Sam     ['USA', 'UK', 'India']
Raj     ['India', 'USA']
Jamie   Russia
Rupert  China
Some values in the Country column are lists, and I want to replace each of those lists with its first element, so that the result looks like this:
Name    Country
Harry   USA
Sam     USA
Raj     India
Jamie   Russia
Rupert  China
As you have strings, you could use a regex here:
df['Country'] = df['Country'].str.extract(r'((?<=\[["\'])[^"\']*|^[^"\']+$)', expand=False)
Output (shown as a new column, Country2, for clarity):
     Name                 Country Country2
0   Harry                     USA      USA
1     Sam  ['USA', 'UK', 'India']      USA
2     Raj        ['India', 'USA']    India
3   Jamie                  Russia   Russia
4  Rupert                   China    China
regex:
( # start capturing
(?<=\[["\']) # if preceded by [" or ['
[^"\']* # get all text until " or '
| # OR
^[^"\']+$ # get whole string if it doesn't contain " or '
) # stop capturing
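For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the example frame and applies the same pattern, written in verbose form so the breakdown above lives inside the regex itself. The names pattern and Country2, and the use of flags=re.VERBOSE and expand=False, are my own choices for illustration, not something the question requires.

```python
import re

import pandas as pd

df = pd.DataFrame({
    'Name': ['Harry', 'Sam', 'Raj', 'Jamie', 'Rupert'],
    'Country': ['USA', "['USA', 'UK', 'India']", "['India', 'USA']", 'Russia', 'China'],
})

# Same regex as in the answer; with re.VERBOSE, whitespace and
# "#" comments outside character classes are ignored by the engine.
pattern = r'''
(                # start capturing
  (?<=\[["'])    # if preceded by [" or ['
  [^"']*         # get all text until " or '
  |              # OR
  ^[^"']+$       # get whole string if it doesn't contain " or '
)                # stop capturing
'''

# expand=False returns a Series for a single capture group
# (the default would be a one-column DataFrame).
df['Country2'] = df['Country'].str.extract(pattern, flags=re.VERBOSE, expand=False)

print(df)
```

Assigning to df['Country'] instead of df['Country2'] overwrites the original column, which is what the question asks for. Note that this only works because the lists are stored as strings; if the column held real Python list objects, the .str accessor would not match them.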