
Roll up dataframe values into a cell above it

I have a pandas dataframe grocery as follows:

#Code
import pandas as pd 
import numpy as np

data = {'Item':['Flour', 'Eggs', 'Cabbage', 'Detergent', 'No Item', 'No Item', 'No Item', 'Tissues', 'Batteries', 'No Item', 'No Item'], 
        'Price':[10, 11, 12, 13, np.nan, np.nan, np.nan, 14, 15, np.nan, np.nan],
        'Description':['one', '', 'three', '', 'five', 'six', 'seven', 'eight', '', 'ten', 'eleven'],
        'Type':['Grocery','Grocery','Grocery','Grocery','Grocery','Grocery','Grocery','Grocery','Grocery','Grocery','Grocery']} 
  
grocery = pd.DataFrame(data) 

print(grocery)

#Output

         Item  Price Description     Type
0       Flour   10.0         one  Grocery
1        Eggs   11.0              Grocery
2     Cabbage   12.0       three  Grocery
3   Detergent   13.0              Grocery
4     No Item    NaN        five  Grocery
5     No Item    NaN         six  Grocery
6     No Item    NaN       seven  Grocery
7     Tissues   14.0       eight  Grocery
8   Batteries   15.0              Grocery
9     No Item    NaN         ten  Grocery
10    No Item    NaN      eleven  Grocery

What set of operations should I perform to roll up the Description values of the No Item rows into the Description cell of the nearest real item above them?

For example, the descriptions ten and eleven from the two No Item rows should roll up into the Description of Batteries, and the No Item rows should then be dropped.

Desired output below:

    #Desired Output

        Item  Price     Description     Type
0      Flour     10             one  Grocery
1       Eggs     11                  Grocery
2    Cabbage     12           three  Grocery
3  Detergent     13  five six seven  Grocery
4    Tissues     14           eight  Grocery
5  Batteries     15      ten eleven  Grocery
asked Dec 19 '25 12:12 by Ashu Grover


1 Answer

Something like this?

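df2 = (grocery
    .replace('No Item', np.nan)    # mark the placeholder rows as missing
    .ffill()                       # carry the last real Item and Price down into them
    .groupby('Item', sort=False)   # keep the original row order
    .agg({'Price': 'first', 'Description': list, 'Type': 'first'})
)
# join each list of descriptions; strip() drops the leading space left by an empty first description
df2['Description'] = df2['Description'].apply(lambda l: ' '.join(l).strip())
df2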

produces

             Price  Description      Type
Item            
Flour        10.0   one              Grocery
Eggs         11.0                    Grocery
Cabbage      12.0   three            Grocery
Detergent    13.0   five six seven   Grocery
Tissues      14.0   eight            Grocery
Batteries    15.0   ten eleven       Grocery

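Note that this assumes all the Items in the original list are unique, because the rows are grouped by Item after the forward fill. Item also ends up as the index; call df2.reset_index() if you want it back as a regular column.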
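If the Items might repeat, one possible workaround is to group on a running block number instead of on the Item name. The sketch below is only an illustration under a few assumptions: the name block is made up for this example, the first row is a real item, and every Description is a string (a NaN in that column would break the join).

```python
import numpy as np
import pandas as pd

# Same sample data as in the question
grocery = pd.DataFrame({
    'Item': ['Flour', 'Eggs', 'Cabbage', 'Detergent', 'No Item', 'No Item',
             'No Item', 'Tissues', 'Batteries', 'No Item', 'No Item'],
    'Price': [10, 11, 12, 13, np.nan, np.nan, np.nan, 14, 15, np.nan, np.nan],
    'Description': ['one', '', 'three', '', 'five', 'six', 'seven',
                    'eight', '', 'ten', 'eleven'],
    'Type': ['Grocery'] * 11,
})

# Every real item starts a new block; the 'No Item' rows below it
# inherit the same running number ("block" is a hypothetical name)
block = grocery['Item'].ne('No Item').cumsum().rename('block')

rolled = (
    grocery
    .groupby(block, sort=False)
    .agg({
        'Item': 'first',       # the real item is the first row of its block
        'Price': 'first',      # 'first' skips the NaN prices of 'No Item' rows
        'Description': lambda s: ' '.join(x for x in s if x),  # skip empty strings
        'Type': 'first',
    })
    .reset_index(drop=True)    # drop the block number, keep a clean 0..n-1 index
)
print(rolled)
```

Because the grouping key is the block number rather than the name, two separate Batteries rows would stay separate instead of being merged into one.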

answered Dec 22 '25 01:12 by piterbarg


