I have the following df:
import numpy as np
import pandas as pd

df = pd.DataFrame({
'col1': [1, np.nan, np.nan, np.nan, 5, np.nan, np.nan, np.nan],
'col2': [np.nan, 2, np.nan, np.nan, np.nan, 6, np.nan, np.nan],
'col3': [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan],
'col4': [np.nan, np.nan, np.nan, 4, np.nan, np.nan, np.nan, 8]
})
It is displayed as follows:
   col1  col2  col3  col4
0   1.0   NaN   NaN   NaN
1   NaN   2.0   NaN   NaN
2   NaN   NaN   3.0   NaN
3   NaN   NaN   NaN   4.0
4   5.0   NaN   NaN   NaN
5   NaN   6.0   NaN   NaN
6   NaN   NaN   7.0   NaN
7   NaN   NaN   NaN   8.0
My goal is to keep all rows beginning with a float (a non-NaN value) and merge the remaining rows into them.
The new_df I want to get is:
   col1  col2  col3  col4
0     1     2     3     4
4     5     6     7     8
Any help from your side will be highly appreciated (I upvote all answers).
Thank you!
If you need to join the first values per group, where the groups are defined by the non-missing values in df['col1'], use:
df = (df.reset_index()
.groupby(df['col1'].notna().cumsum())
.first()
.set_index('index'))
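For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby idea. It assumes the sample values shown in the displayed table (1 to 8); the names group_id and out, plus the final rename_axis and astype(int) steps, are my own additions to match the integer output requested in the question.

```python
import numpy as np
import pandas as pd

# Sample frame, assumed to match the table displayed in the question.
df = pd.DataFrame({
    'col1': [1, np.nan, np.nan, np.nan, 5, np.nan, np.nan, np.nan],
    'col2': [np.nan, 2, np.nan, np.nan, np.nan, 6, np.nan, np.nan],
    'col3': [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan],
    'col4': [np.nan, np.nan, np.nan, 4, np.nan, np.nan, np.nan, 8],
})

# A new group starts at every non-missing value in col1.
group_id = df['col1'].notna().cumsum()

out = (df.reset_index()            # keep the original row labels as a column
         .groupby(group_id)
         .first()                  # first non-missing value per column in each group
         .set_index('index')       # restore the label of each group's first row
         .rename_axis(None)        # drop the leftover 'index' axis name
         .astype(int))             # only safe because no NaN remains here
print(out)
```

The cast to int would raise if any group lacked a value in some column, so drop that step if your real data can have gaps.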
Try this:
df.apply(lambda x: x.dropna().to_numpy())
Output:
   col1  col2  col3  col4
0   1.0   2.0   3.0   4.0
1   5.0   6.0   7.0   8.0
You can also cast to integers:
df.apply(lambda x: x.dropna().to_numpy(dtype='int'))
Output:
   col1  col2  col3  col4
0     1     2     3     4
1     5     6     7     8
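Note that this output is indexed 0 and 1, while the question asks for the original labels 0 and 4. Here is a rough sketch of one way to put them back, assuming every column holds the same number of non-NaN values (otherwise the apply call will not give a rectangular frame) and that the sample values are the ones from the displayed table. The name out is just for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame, assumed to match the table displayed in the question.
df = pd.DataFrame({
    'col1': [1, np.nan, np.nan, np.nan, 5, np.nan, np.nan, np.nan],
    'col2': [np.nan, 2, np.nan, np.nan, np.nan, 6, np.nan, np.nan],
    'col3': [np.nan, np.nan, 3, np.nan, np.nan, np.nan, 7, np.nan],
    'col4': [np.nan, np.nan, np.nan, 4, np.nan, np.nan, np.nan, 8],
})

# Compact each column to its non-NaN values, cast to integers.
out = df.apply(lambda x: x.dropna().to_numpy(dtype='int'))

# Reuse the labels of the rows where col1 is present (the first row of each block).
out.index = df.index[df['col1'].notna()]
print(out)
```

This pairs values purely by position within each column, so it only lines up correctly when the blocks are complete, as they are in the sample.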