Suppose I have a dataframe like this:
import pandas as pd
df = pd.DataFrame({'ID':[1,1,1,1,1,2,2,2,2], 'Order':[1,2,3,4,5,1,2,3,4]})
df
   ID  Order
0   1      1
1   1      2
2   1      3
3   1      4
4   1      5
5   2      1
6   2      2
7   2      3
8   2      4
I need to upsample by entire blocks, that is, create new copies of the whole block for ID == 1 and of the whole block for ID == 2. The upsampled df (for n = 2, i.e. two blocks sampled with replacement) might look something like this:
df_upsampled = pd.DataFrame({'ID':[1,1,1,1,1,2,2,2,2,1,1,1,1,1,2,2,2,2], 'Order':[1,2,3,4,5,1,2,3,4,1,2,3,4,5,1,2,3,4]})
df_upsampled
    ID  Order
0    1      1
1    1      2
2    1      3
3    1      4
4    1      5
5    2      1
6    2      2
7    2      3
8    2      4
9    1      1
10   1      2
11   1      3
12   1      4
13   1      5
14   2      1
15   2      2
16   2      3
17   2      4
I thought I could handle this quickly with resample(), but I haven't figured out how to copy entire blocks (per ID).
To reproduce the example output exactly, concatenate the data frame with a copy of itself. Note that this duplicates every block once; it does not draw blocks at random:
df = pd.concat([df, df], ignore_index=True)
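If the blocks really should be drawn at random with replacement, one possible approach is to sample the ID values and then concatenate the matching blocks. The following is only a sketch under some assumptions: the helper name upsample_blocks and its parameters (n, key, seed) are made up for illustration, and it assumes the n sampled blocks should be appended to the original frame rather than replace it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 1, 1, 1, 2, 2, 2, 2],
                   'Order': [1, 2, 3, 4, 5, 1, 2, 3, 4]})

def upsample_blocks(df, n, key='ID', seed=None):
    """Hypothetical helper: append n whole blocks, sampled with replacement."""
    rng = np.random.default_rng(seed)
    # Draw n block labels with replacement from the distinct IDs.
    sampled_ids = rng.choice(df[key].unique(), size=n, replace=True)
    # Pull out the full block of rows for each sampled ID, in draw order.
    blocks = [df[df[key] == i] for i in sampled_ids]
    # Original rows first, then the sampled blocks, with a fresh 0..N-1 index.
    return pd.concat([df, *blocks], ignore_index=True)

df_upsampled = upsample_blocks(df, n=2, seed=0)
```

Because the draw is random, the same ID can be picked twice, so the result will not always match the example in the question. To get a frame made up only of the sampled blocks, drop df from the list passed to pd.concat.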