Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to break a row into multiple rows in python?

Assume I have a sample figure like this:

l = [
['Visitors', '1 February 2020', 'Saturday', 'Shop A', 'In', '100', '20', '30','150', 'Out', '90', '10', '15', '115'],
['Visitors', '1 February 2020', 'Saturday', 'Shop B', 'In', '20', '10', '40', '70', 'Out', '10', '9', '0', '19'],
['Visitors', '1 February 2020', 'Saturday', 'Shop C', 'In', '42', '18', '20', '80', 'Out', '40', '10', '20', '70'],
['Visitors', '1 February 2020', 'Saturday', 'Shop D', 'In', '0', '0', '0', '0', 'Out', '0', '0', '0', '0'],
['Visitors', '1 February 2020', 'Saturday', 'Shop E', 'In', '0', '0', '0', '0', 'Out', '0', '0', '0', '0'],
['Visitors', '1 February 2020', 'Saturday', 'Shop F', 'In', '20', '19', '11', '50', 'Out', '10', '9', '5', '24'],
['Visitors', '1 February 2020', 'Saturday', 'Shop G', 'In', '25', '8', '33', '66', 'Out', '20', '6', '30', '56'],
['Visitors', '1 February 2020', 'Saturday', 'Shop H', 'In', '180', '88', '6', '274', 'Out', '170', '80', '5', '255'],
['Visitors', '1 February 2020', 'Saturday', 'Shop I', 'In', '0', '0', '0', '0', 'Out', '0', '0', '0', '0'],
['Visitors', '1 February 2020', 'Saturday', 'Total', 'In', '387', '163', '140', '690', 'Out', '340', '124', '75', '539'],
]

The figures show about how many men/women/children visit a shop a day and record their in and out record. The figure above can be interpreted as below:

[Ppl_type, Date, Weekday, Shop, In, Men, Women, Children, Total, Out, Men, Women, Children, Total]

enter image description here <-- the expected result

This is the result I would like to see. Generate the above figure into excel with the header below:

header= ['Ppl_type', 'Date', 'Weekday', 'Shop', 'In/Out', 'Visitor_Type', 'Number']

As a result, each shop will have six rows (i.e. three rows for "In" and three rows for "Out") summarizing the figures above.

I would like to know how can I do it by python and generate the result to excel. I have tried worksheet.write but seems only work for the first four columns. Thank you so much.

like image 820
poon cl Avatar asked May 19 '26 10:05

poon cl


1 Answers

For a fully programmatic solution, you can use:

header= ['Ppl_type', 'Date', 'Weekday', 'Shop',
         'In', 'Men', 'Women', 'Children', 'Total',
         'Out', 'Men', 'Women', 'Children', 'Total']

df = pd.DataFrame(l, columns=header)

m1 = df.columns.isin(['In', 'Out'])
grp = df.columns.to_series().where(m1).ffill()
m2 = grp.notna()
m = m2 & ~m1

out = (
 df.loc[:, m2==m]
   .set_index(list(grp[~m2].index))
   .astype(int)
   .set_axis(pd.MultiIndex.from_arrays([df.columns[m], grp[m]],
                                       names=('Visitor_Type', 'In/Out')), axis=1)
   .stack(['In/Out', 'Visitor_Type']).reset_index(name='Number')
   # uncomment the line below to remove the Total
   #.loc[lambda d: d['Visitor_Type'].ne('Total') 
)

Output:

    Ppl_type             Date   Weekday    Shop In/Out Visitor_Type  Number
0   Visitors  1 February 2020  Saturday  Shop A     In     Children      30
1   Visitors  1 February 2020  Saturday  Shop A     In          Men     100
2   Visitors  1 February 2020  Saturday  Shop A     In        Total     150
3   Visitors  1 February 2020  Saturday  Shop A     In        Women      20
4   Visitors  1 February 2020  Saturday  Shop A    Out     Children      15
5   Visitors  1 February 2020  Saturday  Shop A    Out          Men      90
6   Visitors  1 February 2020  Saturday  Shop A    Out        Total     115
7   Visitors  1 February 2020  Saturday  Shop A    Out        Women      10
8   Visitors  1 February 2020  Saturday  Shop B     In     Children      40
9   Visitors  1 February 2020  Saturday  Shop B     In          Men      20
10  Visitors  1 February 2020  Saturday  Shop B     In        Total      70
...
like image 156
mozway Avatar answered May 21 '26 23:05

mozway



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!