How to assign panda group data to a multi-index Dataframe?

Question

Aim

I have an initial data frame that looks like this:

   Serial No.  Data One  Data Two
0          01  0.258625  0.667996
1          01  0.192356  0.723055
2          01  0.738066  0.266488
3          01  0.374525  0.059664
4          01  0.193977  0.104213
5          01  0.213749  0.366608
⁞
49         05  0.948550  0.823888

I want to rearrange it into a multi-index data frame with columns of data type and serial number:

Data        Data One                  Data Two                
Serial No.        01  02  03  04  05        01  02  03  04  05

Code

To generate and fill the multi-index Data Frame I've written a small block of code:

serial_numbers = ["01", "02", "03", "04", "05"]
headings = ["Serial No.", "Data One", "Data Two"]
arrays = [[], []]
for ii, heading in enumerate(headings[1:]):
    arrays[0] += [heading]*len(serial_numbers)
    arrays[1] += serial_numbers

index = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=["Data", "Serial No."])

group = random_data.groupby(headings[0])
data = pd.DataFrame(np.zeros((len(group.get_group(serial_numbers[0])), len(index))), columns=index)
for heading in headings[1:]:
    for serial_number in serial_numbers:
        data.loc[:, (heading, serial_number)] = group.get_group(serial_number).loc[:, heading]

Problem

The above code creates a Data Frame where the first serial number of each data type is filled with values and the rest are NaNs:

Data        Data One                  Data Two                
Serial No.        01  02  03  04  05        01  02  03  04  05
0           0.258625 NaN NaN NaN NaN  0.667996 NaN NaN NaN NaN
1           0.192356 NaN NaN NaN NaN  0.723055 NaN NaN NaN NaN
2           0.738066 NaN NaN NaN NaN  0.266488 NaN NaN NaN NaN
3           0.374525 NaN NaN NaN NaN  0.059664 NaN NaN NaN NaN
4           0.193977 NaN NaN NaN NaN  0.104213 NaN NaN NaN NaN
5           0.213749 NaN NaN NaN NaN  0.366608 NaN NaN NaN NaN
6           0.829126 NaN NaN NaN NaN  0.972882 NaN NaN NaN NaN
7           0.494763 NaN NaN NaN NaN  0.482118 NaN NaN NaN NaN
8           0.024283 NaN NaN NaN NaN  0.538428 NaN NaN NaN NaN
9           0.700613 NaN NaN NaN NaN  0.067831 NaN NaN NaN NaN

Why does this work for the first serial number of each data type but not others?

Appendix

In the real world the input data comes from a CSV and pandas.read_csv but for the purpose of this question I've used numpy to generate random data and some other code to structure it.

serial_no_col = []
for serial_number in serial_numbers:
    for ii in range(10):
        serial_no_col.append(serial_number)
random_data = pd.DataFrame(np.random.rand(50, 3), columns=headings)
random_data.loc[:, "Serial No."] = serial_no_col

jezrael · Accepted Answer

Use cumcount for new indices created by set_index and then unstack:

print (df)
   Serial No.  Data One  Data Two
0          01  0.258625  0.667996
1          01  0.192356  0.723055
2          02  0.738066  0.266488
3          02  0.374525  0.059664
4          03  0.193977  0.104213
5          03  0.213749  0.366608
49         05  0.948550  0.823888

df = df.set_index([df.groupby('Serial No.').cumcount(), 'Serial No.']).unstack()
print (df)
            Data One                               Data Two            \
Serial No.        01        02        03       05        01        02   
0           0.258625  0.738066  0.193977  0.94855  0.667996  0.266488   
1           0.192356  0.374525  0.213749      NaN  0.723055  0.059664   


Serial No.        03        05  
0           0.104213  0.823888  
1           0.366608       NaN

How to assign panda group data to a multi-index Dataframe?

Tags:

python

pandas

pandas-groupby

Aim

Code

Problem

Appendix

RHP

1 Answers

jezrael

Recent Activity

Donate For Us

How to assign panda group data to a multi-index Dataframe?

Tags:

python

pandas

pandas-groupby

Aim

Code

Problem

Appendix

RHP

1 Answers

jezrael

Related questions

Recent Activity

Donate For Us