count values of each month, fill NaN if under certain limit

Question

I am working with a dataframe in which every column represents a company. The index is a DatetimeIndex with daily frequency. For each company, I would like to fill a month with NaN if that month contains fewer than 20 values. In the example below, this would mean that Company_1's entry 0.91 on 2012-08-31 would be changed to NaN, while Company_2 and Company_3 would be unchanged.

               Company_1      Company_2   Company_3
2012-08-01     NaN            0.99        0.11
2012-08-02     NaN            0.21        NaN
2012-08-03     NaN            0.32        0.40
...            ...            ...         ...
2012-08-29     NaN            0.50       -0.36
2012-08-30     NaN            0.48       -0.32
2012-08-31     0.91           0.51       -0.33

Total Values:  1              22          21

I am struggling to find an efficient way to count the number of values per month for each stock. I could write a function that builds a new dataframe holding the number of values for each month and each stock, and then use that dataframe to mask the original company data, but I am sure there has to be an easier way. Any help is highly appreciated. Thanks in advance.

Shubham Sharma · Accepted Answer

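Group the dataframe by month with pd.Grouper and use transform('count') to get, for every row, the number of non-NaN values in that month for each company. Then build a boolean mask with DataFrame.lt and pass it to DataFrame.mask to replace the values in those months with NaN: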

# Note: on pandas 2.2 and later the month-end alias is 'ME'; 'M' is deprecated there.
df1 = df.mask(df.groupby(pd.Grouper(freq='M')).transform('count').lt(20))

print(df1)
            Company_1  Company_2  Company_3
2012-08-01        NaN       0.99       0.11
2012-08-02        NaN       0.21        NaN
2012-08-03        NaN       0.32       0.40
...
2012-08-29        NaN       0.50      -0.36
2012-08-30        NaN       0.48      -0.32
2012-08-31        NaN       0.51      -0.33
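
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The sample data (two companies, business days in August and September 2012, random values) is made up for illustration and is not the asker's data. It groups by `df.index.to_period("M")` rather than `pd.Grouper(freq='M')`; that is a substitution, which should produce the same monthly groups while avoiding the month-end alias that differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: business days in August and September 2012.
idx = pd.bdate_range("2012-08-01", "2012-09-28")
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(len(idx), 2)),
    index=idx,
    columns=["Company_1", "Company_2"],
)

# Leave Company_1 with a single observation in August (2012-08-31).
df.loc["2012-08-01":"2012-08-30", "Company_1"] = np.nan

# One calendar-month label per row, used as the grouping key.
month = df.index.to_period("M")

# Per-row count of non-NaN values in that row's month, per company.
counts = df.groupby(month).transform("count")

# Replace every value that sits in a (month, company) cell with < 20 values.
result = df.mask(counts.lt(20))

print(result.loc["2012-08-29":"2012-09-04"])
```

Because `transform` returns a frame with the same shape and index as `df`, the mask lines up row by row, so only the sparse month of Company_1 is blanked and its September values are left alone.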

Henry Yik · Answer

IIUC:

df.loc[:, df.apply(lambda d: d.notnull().sum() < 20)] = np.nan

print(df)

            Company_1  Company_2  Company_3
2012-08-01        NaN       0.99       0.11
2012-08-02        NaN       0.21        NaN
2012-08-03        NaN       0.32       0.40
2012-08-29        NaN       0.50      -0.36
2012-08-30        NaN       0.48      -0.32
2012-08-31        NaN       0.51      -0.33
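
One thing to keep in mind with this one-liner: `d.notnull().sum()` counts over the whole column, not per month, so it matches the question only when the dataframe holds a single month. The sketch below uses made-up data (two hypothetical companies, constant values, business days in August and September 2012) to show how the two counts differ once a second month is present.

```python
import numpy as np
import pandas as pd

# Hypothetical data spanning two months; values are constant placeholders.
idx = pd.bdate_range("2012-08-01", "2012-09-28")
df = pd.DataFrame({"Company_1": 1.0, "Company_2": 1.0}, index=idx)

# Company_1 has one value in August and a full September.
df.loc["2012-08-01":"2012-08-30", "Company_1"] = np.nan

# Whole-column count, as in the one-liner above.
col_counts = df.notna().sum()

# Per-month count, one row per calendar month.
month_counts = df.groupby(df.index.to_period("M")).count()

print(col_counts)
print(month_counts)
```

Here the whole-column count for Company_1 is at least 20, so the column-wide check would leave its lone August value in place, while the per-month count flags August as below the limit.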

Tags: python, pandas, dataframe

Sanoj
