I have a DataFrame say - df
data = [['00898','22123'],['87','99878'],['22123','66454'],['00898','87'],
['2109','312'],['6412','3078'],['6412','3078'],['66454','03158'],['3268','87']]
df = pd.DataFrame(data,columns = ['emp_ids','master_emp_ids'])
df
emp_ids master_emp_ids
0 00898 22123
1 87 99878
2 22123 66454
3 00898 87
4 2109 312
5 6412 3078
6 6412 3078
7 66454 03158
8 3268 87
Now I want to check whether all the id's of the column emp_ids are present in the the column master_emp_ids or not. So each and every id of emp_ids should get compared with each and every id of master_emp_ids. And there should be a column created named status, where it gives yes if present and no if not. Basically, 87 in the first col is present in the second col so it's a yes and also to color green that particular cell where it says 'yes'. So my final DataFrame should be - df
data = [['no','00898','22123'],['yes','87','99878'],['yes','22123','66454'],['no','00898','87'],
['no','2109','312'],['no','6412','3078'],['no','6412','3078'],['yes','66454','03158'],['no','3268','87']]
df = pd.DataFrame(data,columns = ['Status','emp_ids','master_emp_ids'])
df
status emp_ids master_emp_ids
0 no 00898 22123
1 yes 87 99878
2 yes 22123 66454
3 no 00898 87
4 no 2109 312
5 no 6412 3078
6 no 6412 3078
7 yes 66454 03158
8 no 3268 87
please help me to get my desired OP.
Thanks!!!
Use numpy.where with Series.isin for add new last column:
df['status'] = np.where(df['emp_ids'].isin(df['master_emp_ids']), 'yes','no')
print (df)
Status emp_ids master_emp_ids status
0 no 00898 22123 no
1 yes 87 99878 yes
2 yes 22123 66454 yes
3 no 00898 87 no
4 no 2109 312 no
5 no 6412 3078 no
6 no 6412 3078 no
7 yes 66454 03158 yes
8 no 3268 87 no
If need first column use DataFrame.insert:
df.insert(0, 'status', np.where(df['emp_ids'].isin(df['master_emp_ids']), 'yes','no'))
print (df)
status Status emp_ids master_emp_ids
0 no no 00898 22123
1 yes yes 87 99878
2 yes yes 22123 66454
3 no no 00898 87
4 no no 2109 312
5 no no 6412 3078
6 no no 6412 3078
7 yes yes 66454 03158
8 no no 3268 87
EDIT: For coloring is possible use style:
def color(val):
if val == 'yes':
return 'background-color: green'
elif val == 'no':
return 'background-color: red'
else:
return ''
df.insert(0, 'status', np.where(df['emp_ids'].isin(df['master_emp_ids']), 'yes','no'))
df.style.applymap(color).to_excel('styled.xlsx', engine='openpyxl')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With