I have a DataFrame like the one below (labels changed for posting). SSN, Student ID and Driving License are each unique to a person. How do I merge these rows in Python (pandas/NumPy)?
Name SSN Student_ID DrivingLicenseNumber
Smith None 1234 DL1234
Smith None None DL1234
Smith 2222 1234 None
None 2222 None None
As you can see, for Smith not all values are present in each row. I am trying to get to a single row for Smith, like below. Any pointers will be much appreciated. I know I can load this into MySQL and do it there, but I can't figure out the best way to do it in a DataFrame.
Name SSN Student_ID DrivingLicenseNumber
Smith 2222 1234 DL1234
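For each column, find the index of the first non-null value and shift the column up by that amount, then drop any row that still contains a null (thresh=4 keeps only rows with all four values present).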
df.apply(lambda x:x.shift(-(x.notna().idxmax()))).dropna(thresh=4)
Name SSN Student_ID DrivingLicenseNumber
0 Smith 2222 1234 DL1234
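The shift approach above relies on a default 0..n-1 index and on the frame holding a single person. As an alternative sketch (not the answerer's method), grouping by Name and taking the first non-null value per column gives the same row and extends to several names. The frame below is rebuilt from the question with the values stored as strings, which is an assumption about the real data. Note that the row whose Name is missing is ignored by groupby rather than matched through its SSN.

```python
import pandas as pd

# Sample data from the question; None marks a missing value.
# Storing every value as a string is an assumption made for this sketch.
df = pd.DataFrame({
    "Name": ["Smith", "Smith", "Smith", None],
    "SSN": [None, None, "2222", "2222"],
    "Student_ID": ["1234", None, "1234", None],
    "DrivingLicenseNumber": ["DL1234", "DL1234", None, None],
})

# GroupBy.first() returns the first non-null entry of each column per group.
# Rows with a null Name are dropped, since groupby excludes null keys by default.
merged = df.groupby("Name", as_index=False).first()

print(merged)
```

If rows without a Name need to be attached through a shared SSN, the missing names would have to be filled in first, which this sketch does not attempt.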