
Python replace string with empty if length not equal to x

I have the following dataframe:

import pandas as pd
df = pd.DataFrame({'ssn': [12345, 54321, 111, 47895, 222311], 'Name': ['john', 'mike', 'adam', 'doug', 'liz']})

The DataFrame contains an 'ssn' column that is supposed to hold only 5-digit values. I want to replace every value that has fewer or more than 5 digits with an empty string.

The desired output is as below:

   Name   ssn
0  john   12345
1  mike   54321
2  adam   
3  doug   47895
4  liz    

I referred to the SO post "replace string if length is less than x". However, using the same solution with the following commands gives me an error:

mask = df['ssn'].str.len() == 5
df['ssn'] = df['ssn'].mask(mask, df['ssn'].str.replace(df['ssn'], ''))
Traceback (most recent call last): 
TypeError: 'Series' objects are mutable, thus they cannot be hashed

I would appreciate any suggestions.

ds_student asked Jan 26 '26 08:01


2 Answers

You can also do this by calling apply on the column: df['ssn'] = df['ssn'].apply(lambda a: a if len(str(a))==5 else '').
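For reference, here is the apply approach as a small runnable sketch on the sample data from the question. The helper name keep_if_five_digits is only illustrative and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    "ssn": [12345, 54321, 111, 47895, 222311],
    "Name": ["john", "mike", "adam", "doug", "liz"],
})

def keep_if_five_digits(value):
    # Hypothetical helper: compare the length of the value's string form
    return value if len(str(value)) == 5 else ""

# apply calls the helper once per element of the column
df["ssn"] = df["ssn"].apply(keep_if_five_digits)
print(df)
```

Note that the column ends up with object dtype, because it now mixes integers and empty strings. A negative number such as -1234 would also pass this check, since the minus sign counts toward the string length.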

like image 101
Mason Caiby Avatar answered Jan 27 '26 23:01

Mason Caiby


Your ssn column contains numbers, not strings, which is why it is not working. Try the following:

mask = df['ssn'].astype(str).str.len() != 5
df.loc[mask, 'ssn'] = ''

In [1]: print(df)
   Name    ssn
0  john  12345
1  mike  54321
2  adam       
3  doug  47895
4   liz      
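A variation on the same idea, as a sketch only: convert the column to strings once and then use Series.where to keep the valid entries. This sidesteps writing a string into an integer column, which recent pandas versions warn about (and, as far as I know, may reject). The names ssn_text and is_valid are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "ssn": [12345, 54321, 111, 47895, 222311],
    "Name": ["john", "mike", "adam", "doug", "liz"],
})

# The .str accessor needs string values, so convert the integers first
ssn_text = df["ssn"].astype(str)

# True where the value has exactly 5 characters
is_valid = ssn_text.str.len() == 5

# where keeps entries where the condition is True and fills the rest with ''
df["ssn"] = ssn_text.where(is_valid, "")
print(df)
```

The trade-off is that every remaining ssn value is now a string rather than an integer, which is usually what you want for identifiers anyway.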
vlemaistre answered Jan 27 '26 22:01


