
Pandas : splitting a dataframe based on null values in a column [duplicate]

I have a dataframe like below:

import pandas as pd
import numpy as np

data = [['lynda', 10,'F',125,'5/21/2018'],['tom', np.nan,'M',135,'7/21/2018'], ['nick', 15,'F',99,'6/21/2018'], ['juli', 14,np.nan,120,'1/21/2018'],['juli', 19,np.nan,140,'10/21/2018'],['juli', 18,np.nan,170,'9/21/2018']]
df = pd.DataFrame(data, columns = ['Name', 'Age','Gender','Height','Date'])

df


How can I split this dataframe based on the np.nan values in the Gender column?

I want the original dataframe df to be split into df1 (Name, Age, Gender, Height, Date), which holds the rows that have a Gender value (the first 3 rows of df),

and df2 (Name, Age, Height, Date), which holds the rows where Gender is missing (the last 3 rows of df) and does not include the Gender column.

zavy mola asked Sep 05 '25 01:09



1 Answer

One approach is to filter the rows with a boolean mask on Gender, then drop that column:

import pandas as pd
import numpy as np


data = [['lynda', 10,'F',125,'5/21/2018'],['tom', np.nan,'M',135,'7/21/2018'], ['nick', 15,'F',99,'6/21/2018'], ['juli', 14,np.nan,120,'1/21/2018'],['juli', 19,np.nan,140,'10/21/2018'],['juli', 18,np.nan,170,'9/21/2018']]
df = pd.DataFrame(data, columns = ['Name', 'Age','Gender','Height','Date'])

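# Rows where Gender is present keep every column.
df1 = df[df['Gender'].notnull()]

# Rows where Gender is missing; the now all-NaN Gender column is dropped.
df2 = df[df['Gender'].isnull()].drop("Gender", axis=1)
print(df2)

Output:

   Name   Age  Height        Date
3  juli  14.0     120   1/21/2018
4  juli  19.0     140  10/21/2018
5  juli  18.0     170   9/21/2018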
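
If the split is needed more than once, the mask can be computed a single time and reused for both halves. The following is only a sketch: `split_on_null` is a hypothetical helper name, not a pandas function, and it assumes the column should be dropped from the rows where it is missing, as the question describes.

```python
import numpy as np
import pandas as pd


def split_on_null(frame, column):
    """Split `frame` by whether `column` is missing.

    Returns (rows where `column` has a value, rows where it is NaN).
    The second frame has `column` dropped, since it would hold only NaN.
    Hypothetical helper for illustration, not part of pandas.
    """
    missing = frame[column].isna()        # boolean mask, True where NaN
    present_rows = frame[~missing]        # keeps every column
    missing_rows = frame[missing].drop(columns=column)
    return present_rows, missing_rows


data = [['lynda', 10, 'F', 125, '5/21/2018'],
        ['tom', np.nan, 'M', 135, '7/21/2018'],
        ['nick', 15, 'F', 99, '6/21/2018'],
        ['juli', 14, np.nan, 120, '1/21/2018'],
        ['juli', 19, np.nan, 140, '10/21/2018'],
        ['juli', 18, np.nan, 170, '9/21/2018']]
df = pd.DataFrame(data, columns=['Name', 'Age', 'Gender', 'Height', 'Date'])

df1, df2 = split_on_null(df, 'Gender')
print(df1)
print(df2)
```

Both results keep the original index labels (0 to 2 and 3 to 5). Calling `reset_index(drop=True)` on either one gives a fresh index starting at 0 if that is preferred.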
Rakesh answered Sep 06 '25 21:09
