I am trying to create a Termination_Date column. It should contain the effective date only if the Cancellations or Lapses flag is set to Yes; otherwise it should be null. I get the following error messages with these three approaches.
df['Termination_Date'] = np.where((df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes'), df['Effective Date'])
ValueError: either both or neither of x and y should be given
df['Termination_Date'] = np.where((df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes'), df['Effective Date'], "")
TypeError: invalid type promotion
df['Termination_Date'] = np.where((df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes'), df['Effective Date'], np.nan)
TypeError: invalid type promotion
thanks
Is it possible to use an alternative with Series.where?
Sample:
df = pd.DataFrame({
'Effective Date':pd.date_range('2019-01-01', periods=6),
'Cancellations':['Yes'] * 4 + ['No'] * 2,
'Lapses':['yes'] * 2 + ['No'] * 4,
})
df['Termination_Date'] = df['Effective Date'].where((df['Cancellations'] == 'Yes') |
(df['Lapses'] == 'Yes'))
Or:
m = (df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes')
df.loc[m, 'Termination_Date'] = df['Effective Date']
print (df)
  Effective Date Cancellations Lapses Termination_Date
0     2019-01-01           Yes    yes       2019-01-01
1     2019-01-02           Yes    yes       2019-01-02
2     2019-01-03           Yes     No       2019-01-03
3     2019-01-04           Yes     No       2019-01-04
4     2019-01-05            No     No              NaT
5     2019-01-06            No     No              NaT
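If you would rather keep np.where, here is a minimal sketch of one way it can work. np.where requires both x and y, and it has to find a common dtype for them; a datetime column has no common dtype with an empty string or a float NaN, which is where the promotion errors come from. Passing the datetime missing value np.datetime64('NaT') as the fallback avoids that. The sample frame is the one from above.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({
    'Effective Date': pd.date_range('2019-01-01', periods=6),
    'Cancellations': ['Yes'] * 4 + ['No'] * 2,
    'Lapses': ['yes'] * 2 + ['No'] * 4,
})

# True where either flag is exactly 'Yes'
m = (df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes')

# Both branches are datetime-like, so no type promotion problem arises:
# NaT is the missing value for datetime64 data
df['Termination_Date'] = np.where(
    m.to_numpy(),
    df['Effective Date'].to_numpy(),
    np.datetime64('NaT'),
)
print(df)
```

The result should match the Series.where output shown above, with NaT in the rows where neither flag is 'Yes'.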
Use Series.where:
df['Termination_Date'] = df['Effective Date'].where( (df['Cancellations'] == 'Yes') |
(df['Lapses'] == 'Yes') )
Or Series.mask, which hides the rows where neither flag is 'Yes':
df['Termination_Date'] = df['Effective Date'].mask( df['Cancellations'].ne('Yes') &
df['Lapses'].ne('Yes') )
We can also check with DataFrame.any:
df['Termination_Date'] = df['Effective Date'].where( df[['Lapses','Cancellations']].eq('Yes').any(axis = 1) )
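As a quick sanity check, the three variants can be compared on the sample frame from the earlier answer. This is only a sketch; the names eff, keep, via_where, via_mask, via_any and same are made up for illustration.

```python
import pandas as pd

# Sample data reused from the earlier answer
df = pd.DataFrame({
    'Effective Date': pd.date_range('2019-01-01', periods=6),
    'Cancellations': ['Yes'] * 4 + ['No'] * 2,
    'Lapses': ['yes'] * 2 + ['No'] * 4,
})

eff = df['Effective Date']

# where: keep the date where at least one flag is 'Yes'
keep = (df['Cancellations'] == 'Yes') | (df['Lapses'] == 'Yes')
via_where = eff.where(keep)

# mask: hide the date where neither flag is 'Yes' (the negated condition)
via_mask = eff.mask(df['Cancellations'].ne('Yes') & df['Lapses'].ne('Yes'))

# any: row-wise check across both flag columns
via_any = eff.where(df[['Lapses', 'Cancellations']].eq('Yes').any(axis=1))

# Series.equals treats missing values in the same positions as equal
same = via_where.equals(via_mask) and via_where.equals(via_any)
print(same)
```

Note that these comparisons are case-sensitive, so the lowercase 'yes' values in the sample's Lapses column do not match 'Yes'. Rows 0 to 3 are kept only because Cancellations is 'Yes' there.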