Can I use Python to get a list of the column names in which all values are NaNs, returning c and d as the result from the dataframe below? Thanks.
df = pd.DataFrame({'a': [1,2,3],'b': [3,4,5], 'c':[np.nan, np.nan, np.nan],
'd':[np.nan, np.nan, np.nan]})
   a  b   c   d
0  1  3 NaN NaN
1  2  4 NaN NaN
2  3  5 NaN NaN
Use Boolean indexing with df.columns:
res = df.columns[df.isnull().all(0)]
# Index(['c', 'd'], dtype='object')
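If a plain Python list is wanted rather than an Index, the result can be converted with tolist(). A minimal sketch using the dataframe from the question; the names mask and all_nan_columns are only illustrative and not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5],
                   'c': [np.nan, np.nan, np.nan],
                   'd': [np.nan, np.nan, np.nan]})

# One boolean per column: True when every value in that column is NaN
mask = df.isna().all(axis=0)

# Boolean-index the column labels, then convert the Index to a list
all_nan_columns = df.columns[mask].tolist()
print(all_nan_columns)  # ['c', 'd']
```

Here isna() is an alias of isnull(), and axis=0 matches the positional 0 passed to all() above.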
@ahbon, you can try df.any(). See the following sequence of statements executed in Python's interactive terminal.
See also http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.any.html
>>> import numpy as np
>>> import pandas as pd
>>>
>>> df = pd.DataFrame({'a':[1,2,3],'b':[3,4,5],'c':[np.nan, np.nan, np.nan],'d':[np.nan, np.nan, np.nan]})
>>> df
   a  b   c   d
0  1  3 NaN NaN
1  2  4 NaN NaN
2  3  5 NaN NaN
>>>
>>> # Remove all columns having all NaN values using DataFrame.any()
...
>>> df_new = df.any()
>>> df_new
a     True
b     True
c    False
d    False
dtype: bool
>>>
Finally,
>>> columns = []
>>>
>>> # Series.iteritems() was removed in pandas 2.0; items() is the equivalent
>>> for key, value in df_new.items():
...     if value:
...         columns.append(key)
...
>>> df = pd.DataFrame({'a':[1,2,3],'b':[3,4,5],'c':[np.nan, np.nan, np.nan],'d':[np.nan, np.nan, np.nan]}, columns=columns)
>>>
>>> df
   a  b
0  1  3
1  2  4
2  3  5
>>>
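One caveat with df.any(): it tests truthiness, so a column containing only zeros is also reported as False and would be dropped along with the all-NaN columns. A minimal sketch of the difference, assuming a hypothetical extra column z of zeros that is not in the original question; df.notna().any() and df.dropna(axis=1, how='all') look only at missing values:

```python
import numpy as np
import pandas as pd

# 'z' is a hypothetical all-zero column added to show the pitfall
df = pd.DataFrame({'a': [1, 2, 3],
                   'z': [0, 0, 0],
                   'c': [np.nan, np.nan, np.nan]})

# any() treats 0 as False, so 'z' is flagged just like the all-NaN column
truthy = df.any()
kept_by_any = truthy[truthy].index.tolist()

# notna().any() keeps every column that has at least one non-missing value
kept_by_notna = df.columns[df.notna().any()].tolist()

# dropna with how='all' drops only the columns that are entirely NaN
cleaned = df.dropna(axis=1, how='all')

print(kept_by_any)               # ['a']
print(kept_by_notna)             # ['a', 'z']
print(cleaned.columns.tolist())  # ['a', 'z']
```

For the dataframe in the question both approaches give the same result, since no column there is all zeros.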