I have this data frame:
import pandas as pd
columns = ['ID','Data']
data = [['26A20',123],
['12A20',123],
['23A20',123]]
df = pd.DataFrame.from_records(data=data, columns=columns)
>>df
ID Data
0 26A20 123
1 12A20 123
2 23A20 123
And a simple task: remove the letter A from ID when ID starts with 26 or 23:
df.loc[df['ID'].str.startswith(('23','26'))]['ID'] = df['ID'].str.replace('A','')
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead
And nothing changes:
>>df
ID Data
0 26A20 123
1 12A20 123
2 23A20 123
I'm using loc, so what am I doing wrong?
Remove the double ][ to avoid chained assignment. Pass the row mask and the column label to a single .loc call:
df.loc[df['ID'].str.startswith(('23','26')), 'ID'] = df['ID'].str.replace('A','')
print(df)
ID Data
0 2620 123
1 12A20 123
2 2320 123
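To see why the chained form leaves df untouched, here is a minimal sketch that splits the chained statement into its two steps. It rebuilds the frame from the question; the names subset, chained_result and fixed_result are only illustrative, and the warning is silenced purely to keep the demo quiet.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'ID': ['26A20', '12A20', '23A20'],
                   'Data': [123, 123, 123]})
mask = df['ID'].str.startswith(('23', '26'))

# Chained form, written as two steps: df.loc[mask] is evaluated first and,
# for a boolean mask, gives back a new object. The assignment then writes
# into that object, not into df.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # hide SettingWithCopyWarning for this demo
    subset = df.loc[mask]
    subset['ID'] = subset['ID'].str.replace('A', '')

chained_result = df['ID'].tolist()  # still the original values

# Single .loc call: rows and column are selected together,
# so the assignment is applied to df itself.
df.loc[mask, 'ID'] = df['ID'].str.replace('A', '')
fixed_result = df['ID'].tolist()

print(chained_result)  # ['26A20', '12A20', '23A20']
print(fixed_result)    # ['2620', '12A20', '2320']
```

The only difference between the two forms is whether the column label sits inside the same .loc[...] brackets as the row mask.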
It is also possible to filter on both sides, so that replace runs only on the matching rows:
mask = df['ID'].str.startswith(('23','26'))
df.loc[mask, 'ID'] = df.loc[mask, 'ID'].str.replace('A','')
print(df)
ID Data
0 2620 123
1 12A20 123
2 2320 123
There is also the np.where() approach (NumPy has to be imported first):
import numpy as np
df['ID'] = np.where(df['ID'].str.startswith(('23','26')), df['ID'].str.replace('A', ''), df['ID'])
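For completeness, a self-contained sketch of the np.where() variant on the sample frame from the question. Note that np.where() builds a complete new array for the column (taking the replaced value where the condition is true and the original value elsewhere), so the whole ID column is reassigned rather than only the matching rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': ['26A20', '12A20', '23A20'],
                   'Data': [123, 123, 123]})

# Condition: True for IDs that start with 23 or 26.
starts_with_23_or_26 = df['ID'].str.startswith(('23', '26'))

# Where the condition holds take the ID without 'A', otherwise keep the ID.
df['ID'] = np.where(starts_with_23_or_26,
                    df['ID'].str.replace('A', ''),
                    df['ID'])

print(df['ID'].tolist())  # ['2620', '12A20', '2320']
```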