 

Python/NumPy: create a DataFrame column that holds a value only when a condition is met, otherwise null

I am trying to create a Termination_Date column. It should contain the effective date only if the Cancellations or Lapses flag is set to Yes, and be null otherwise. Each of the following three approaches raises an error:

df['Termination_Date'] = np.where((df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes'), df['Effective Date'])
ValueError: either both or neither of x and y should be given

df['Termination_Date'] = np.where((df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes'), df['Effective Date'], "")
TypeError: invalid type promotion

df['Termination_Date'] = np.where((df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes'), df['Effective Date'], np.nan)
TypeError: invalid type promotion

thanks

wuz asked Apr 23 '26 14:04


2 Answers

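An alternative is Series.where, which keeps the values where the condition is True and fills the remaining rows with missing values (NaT for dates):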

Sample:

import pandas as pd

df = pd.DataFrame({
    'Effective Date': pd.date_range('2019-01-01', periods=6),
    'Cancellations': ['Yes'] * 4 + ['No'] * 2,
    'Lapses': ['yes'] * 2 + ['No'] * 4,
})

df['Termination_Date'] = df['Effective Date'].where((df['Cancellations'] == 'Yes') | 
                                                     (df['Lapses'] == 'Yes')) 

Or:

m = (df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes')
df.loc[m, 'Termination_Date'] = df['Effective Date']

print(df)
  Effective Date Cancellations Lapses Termination_Date
0     2019-01-01           Yes    yes       2019-01-01
1     2019-01-02           Yes    yes       2019-01-02
2     2019-01-03           Yes     No       2019-01-03
3     2019-01-04           Yes     No       2019-01-04
4     2019-01-05            No     No              NaT
5     2019-01-06            No     No              NaT
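
As a side note on why the attempts in the question fail: np.where needs either the condition alone or both fallback arguments, and the "invalid type promotion" error most likely arises because NumPy cannot find a common dtype for datetime64 values and a string or float fallback. A minimal sketch, assuming the same column names as the sample above, passes a datetime NaT as the fallback so both branches share a datetime dtype:

```python
import numpy as np
import pandas as pd

# Sample frame modelled on the one above (flag values assumed to be 'Yes'/'No')
df = pd.DataFrame({
    'Effective Date': pd.date_range('2019-01-01', periods=6),
    'Cancellations': ['Yes'] * 4 + ['No'] * 2,
    'Lapses': ['Yes'] * 2 + ['No'] * 4,
})

m = (df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes')

# The fallback is a datetime NaT rather than "" or np.nan,
# so NumPy does not have to promote datetime64 to str or float
df['Termination_Date'] = np.where(m, df['Effective Date'], np.datetime64('NaT'))

print(df)
```

Series.where does the same thing without an explicit fallback, which is why it is usually the simpler choice here.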
jezrael answered Apr 26 '26 03:04


Use Series.where:

df['Termination_Date'] = df['Effective Date'].where( (df['Cancellations'] == 'Yes') |
                                                     (df['Lapses'] == 'Yes') )

Or Series.mask, which hides the rows where neither flag is 'Yes':

df['Termination_Date'] = df['Effective Date'].mask(df['Cancellations'].ne('Yes') &
                                                   df['Lapses'].ne('Yes'))

We can also check both flag columns at once with DataFrame.any:

df['Termination_Date'] = df['Effective Date'].where( df[['Lapses','Cancellations']].eq('Yes').any(axis = 1) )
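
To check that these variants agree, here is a small sketch that builds each result on a made-up frame and compares them. The data and the variable names (via_where, via_mask, via_any, same) are illustrative assumptions, and the flag is assumed to be spelled 'Yes' throughout. The mask form is the De Morgan complement of the where form: hiding rows where neither flag is 'Yes' is the same as keeping rows where at least one is.

```python
import pandas as pd

# Illustrative data: rows 0, 1 and 4 have at least one flag set to 'Yes'
df = pd.DataFrame({
    'Effective Date': pd.date_range('2019-01-01', periods=6),
    'Cancellations': ['Yes'] * 2 + ['No'] * 4,
    'Lapses': ['No', 'Yes', 'No', 'No', 'Yes', 'No'],
})

date = df['Effective Date']

# Keep the date where either flag is 'Yes'
via_where = date.where((df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes'))

# Hide the date where both flags differ from 'Yes'
via_mask = date.mask(df['Cancellations'].ne('Yes') & df['Lapses'].ne('Yes'))

# Row-wise check across both flag columns
via_any = date.where(df[['Lapses', 'Cancellations']].eq('Yes').any(axis=1))

# Series.equals treats missing values in the same positions as equal
same = via_where.equals(via_mask) and via_where.equals(via_any)
print(same)
```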
ansev answered Apr 26 '26 03:04


