Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to create a repeated dataframe over time and map it to a list of time?

I have created the below dataframe:

df = pd.DataFrame()
df['date'] = pd.date_range(start="2019-12-01", end="2019-12-20", freq='D')
Output:
    date
0   2019-12-01
1   2019-12-02
2   2019-12-03
3   2019-12-04
4   2019-12-05
5   2019-12-06
6   2019-12-07
7   2019-12-08
8   2019-12-09
9   2019-12-10
10  2019-12-11
11  2019-12-12
12  2019-12-13
13  2019-12-14
14  2019-12-15
15  2019-12-16
16  2019-12-17
17  2019-12-18
18  2019-12-19
19  2019-12-20

And I have following two lines

Line Start        End         Amount
A    2019-12-01  2019-12-08   100
B    2019-12-06  2019-12-15   200

I want to have the following result:

Output:
    date         amount   line
0   2019-12-01   100       A
1   2019-12-02   100       A
2   2019-12-03   100       A
3   2019-12-04   100       A
4   2019-12-05   100       A
5   2019-12-06   300       A,B
6   2019-12-07   300       A,B
7   2019-12-08   300       A,B
8   2019-12-09   200       B
9   2019-12-10   200       B 
10  2019-12-11   200       B
11  2019-12-12   200       B
12  2019-12-13   200       B
13  2019-12-14   200       B
14  2019-12-15   200       B
15  2019-12-16   0
16  2019-12-17   0
17  2019-12-18   0
18  2019-12-19   0
19  2019-12-20   0

What can I do to achieve this? I have tried to use "map" function but I cant get the result.

Sorry guys, if the two lines have a index, how can i also add that column in the result?

like image 243
Kilometers Avatar asked Feb 01 '26 18:02

Kilometers


1 Answers

Try this.assume your second list as a dataframe

import pandas as pd
df = pd.DataFrame()
df['date'] = pd.date_range(start="2019-12-01", end="2019-12-20", freq='D')

df2 = pd.DataFrame({"Start":["2019-12-01","2019-12-06"],"End":["2019-12-08","2019-12-15"],"Amount":[100,200]})
df2["Start"] = pd.to_datetime(df2["Start"])
df2["End"] = pd.to_datetime(df2["End"])

def f(x):

    df_ = df2[(df2.Start<= x) & (df2.End>=x)]["Amount"]
    v = df_.values
    i = df_.index.values      
    return v,i

s=df.date.apply(lambda x: pd.Series({"amount":sum(f(x)[0]),"line":','.join(map(str, f(x)[1]))}))
df= pd.concat([df,s],axis=1)

in here line is index of the second data frame (df2).If you need line starting from 1, you can change this line by adding 1 to index values. i = df_.index.values+1
output

          date  amount line
0  2019-12-01     100    0
1  2019-12-02     100    0
2  2019-12-03     100    0
3  2019-12-04     100    0
4  2019-12-05     100    0
5  2019-12-06     300  0,1
6  2019-12-07     300  0,1
7  2019-12-08     300  0,1
8  2019-12-09     200    1
9  2019-12-10     200    1
10 2019-12-11     200    1
11 2019-12-12     200    1
12 2019-12-13     200    1
13 2019-12-14     200    1
14 2019-12-15     200    1
15 2019-12-16       0     
16 2019-12-17       0     
17 2019-12-18       0     
18 2019-12-19       0     
19 2019-12-20       0 
like image 77
Rajith Thennakoon Avatar answered Feb 03 '26 08:02

Rajith Thennakoon