Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fill columns with values from other dataframe with corresponding id in pandas

I have a dataframe in which I have to perform some operations. I got it everything all right like this:

 ID  Value      Date          Date_diff_cumsum     Val     Weight
  1   0.000000 2017-02-13 20:54:00     0.0       0.000000     nan
  1   0.029598 2017-02-13 21:02:00     8.0       0.029598     nan
  1   0.273000 2017-02-13 22:33:00    99.0       0.273000     nan
  1   0.153000 2017-02-13 23:24:00    150.0      0.15300      nan

I have another dataset in which I have the weights, like this:

ID   Value
 1   78.0
 2   75.0
 3   83.0
 4   60.0

And I would like to fill my original dataframe's weigth columns with the repetition of thr weight of each ID, like:

 ID  Value      Date          Date_diff_cumsum   Val        Weight
  1   0.000000 2017-02-13 20:54:00     0.0       0.000000     78.0
  1   0.029598 2017-02-13 21:02:00     8.0       0.029598     78.0
  1   0.273000 2017-02-13 22:33:00    99.0       0.273000     78.0
  1   0.153000 2017-02-13 23:24:00    150.0      0.15300      78.0
  ...    ...          ...              ...          ...         ...
  4   ....      .....      ....        ....        ...         60.0
  4   ....      .....      ....        ....        ...         60.0

That's because I need to calculate with this formula:

  • for each ID, (Val*1000)/(weight*Date_diff_cumsum), i.e.: multiply each Val by 1000, then divide it by the weight multiplied the difference of time between the i and the i-1 timeframe (the Date_diff_cumsum) and store it in a dataframe, where I can plot the res

That's my code:

df = df[['ID','Value', 'Date']]
df = df.sort_values(by=['Date'])
df['Date_diff_cumsum'] = df.groupby('ID').Date.diff().dt.seconds / 60.0
df['Date_diff_cumsum'] = 
df.groupby('ID').Date_diff_cumsum.cumsum().fillna(0)
df['TempVal'] = df.groupby('ID')['Value'].transform(lambda x:(x- 
x.iloc[0]*1000))

How can I do this operation of adding the repetition of the weigth from the second dataframe to the first one? Is there a more efficient way? Because I need to calculate the final result with the same way, but with 3 other dataframes with different names but similar values, for each ID, like:

score = df1[(Val*1000)/(weight*Date_diff_cumsum)]+ 
df2(Val*1000)/(weight*Date_diff_cumsum)]+...

Thank you very much

edit: now it is working, but whenever I try to find the final dataframe:

score = df1.TempVal + df2.TempVal + df3.TempVal

I get an empty dataframe full with nans. Do you know why? I need to print all the tempVal for each ID and to plot them

like image 805
erken Avatar asked Sep 06 '25 06:09

erken


1 Answers

You can use map to map values from df2 to Weight. Since you have already calculated date_diff_cumsum by grouping over ID, you can calculate tempval directly from df1,

df1['Weight'] = df1['ID'].map(df2.set_index('ID')['Value'])

df1['TempVal'] = df1['Value']*1000/(df1['Weight'] * df1['Date_diff_cumsum'])

    ID  Value       Date              Date_diff_cumsum  Val       Weight    TempVal
0   1   0.000000    2017-02-13 20:54:00 0.0             0.000000    78.0    NaN
1   1   0.029598    2017-02-13 21:02:00 8.0             0.029598    78.0    0.047433
2   1   0.273000    2017-02-13 22:33:00 99.0            0.273000    78.0    0.035354
3   1   0.153000    2017-02-13 23:24:00 150.0           0.153000    78.0    0.013077
like image 191
Vaishali Avatar answered Sep 09 '25 15:09

Vaishali