
How do I merge multiple rows in a DataFrame when some column values are empty?

I have a DataFrame like the one below (labels changed for posting). SSN, Student_ID and DrivingLicenseNumber are each unique to a person. How do I merge these rows in Python (pandas/NumPy)?

Name    SSN     Student_ID   DrivingLicenseNumber
Smith   None    1234         DL1234
Smith   None    None         DL1234
Smith   2222    1234         None
None    2222    None         None

As you can see, no single row holds all of Smith's values. I am trying to get one row for Smith, like the one below. Any pointers will be much appreciated. I know I can load the data into MySQL and do this there, but I can't figure out the best way to do it in a DataFrame.

Name    SSN    Student_ID    DrivingLicenseNumber
Smith   2222   1234          DL1234
Ravi P asked Dec 10 '25 23:12


1 Answer

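For each column, find the index of the first non-null value and shift the column up so that value lands in the first row. Then drop every row that still contains a null: thresh=4 keeps only rows where all four columns are filled. This assumes the default 0-based integer index, that all rows describe the same person, and that the missing values are real nulls rather than the string "None".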

 df.apply(lambda x: x.shift(-x.notna().idxmax())).dropna(thresh=4)

     Name   SSN  Student_ID  DrivingLicenseNumber
 0  Smith  2222        1234                DL1234
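
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same shift-then-drop idea. It rebuilds the question's sample data, which assumes the missing cells are real nulls and that every value is stored as a string. The helper name `first_non_null_to_top` is made up for illustration. It uses the position of the first non-null value rather than `idxmax()`, so it does not depend on the index labels, and it replaces the hard-coded `thresh=4` with the column count.

```python
import pandas as pd

# Sample data from the question. Assumption: missing cells are real
# nulls (None/NaN), not the literal string "None".
df = pd.DataFrame(
    {
        "Name": ["Smith", "Smith", "Smith", None],
        "SSN": [None, None, "2222", "2222"],
        "Student_ID": ["1234", None, "1234", None],
        "DrivingLicenseNumber": ["DL1234", "DL1234", None, None],
    }
)


def first_non_null_to_top(col: pd.Series) -> pd.Series:
    """Shift a column up so its first non-null value lands in row 0."""
    # Position (not label) of the first non-null entry in this column.
    offset = int(col.notna().to_numpy().argmax())
    return col.shift(-offset)


# Apply the shift column by column, then keep only rows in which
# every column is filled.
shifted = df.apply(first_non_null_to_top)
merged = shifted.dropna(thresh=df.shape[1])

print(merged)  # one row: Smith, 2222, 1234, DL1234
```

This only makes sense while the whole frame describes a single person. If the data holds several people and one column reliably identifies them, `df.groupby("Name", as_index=False).first()` may be a simpler route, since `GroupBy.first` takes the first non-null value of each column per group. Note that rows whose key is null (like the last row here) are left out of the grouping by default.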
wwnde answered Dec 13 '25 11:12


