
Python: Standard Deviation within a list of dataframes

I have a list of, say, 50 dataframes called 'list1'. Each dataframe has the columns 'Speed' and 'Value', like this:

Speed   Value
1       12
2       17
3       19
4       21
5       25

I am trying to get the standard deviation of 'Value' for each speed, across all dataframes. The end goal is to get a list or dataframe of the standard deviation for each speed, like this:

Speed   Standard Deviation
1       1.23
2       2.5
3       1.98
4       5.6
5       5.77

I've tried to pull the values into a new dataframe using a for loop, so that I can then use 'statistics.stdev' on it, but I can't seem to get it working. Any help is really appreciated!

Update!

pd.concat([d.set_index('Speed') for d in df_power], axis=1).std(1)

This worked. However, I forgot to mention that the values for Speed are not always the same between dataframes. Some dataframes are missing a few, and this ends up returning NaN in those instances.

Iceberg_Slim asked Dec 07 '25 05:12



2 Answers

You can concat and use std:

list_dfs = [df1, df2, df3, ...]
# Index each dataframe by Speed, line them up side by side, then take the row-wise std
pd.concat([d.set_index('Speed') for d in list_dfs], axis=1).std(axis=1)
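
As a rough illustration of how this behaves when the Speed values differ between dataframes, here is a minimal sketch. The three small dataframes and their numbers are made up for the example. It assumes pandas' defaults, where `concat` aligns on the union of the indexes and `std` skips NaN and uses the sample standard deviation (ddof=1). Under those assumptions, a speed should only come back as NaN when fewer than two dataframes contain it.

```python
import pandas as pd

# Hypothetical sample data: df3 lacks Speed 3 and is the only one with Speed 4
df1 = pd.DataFrame({"Speed": [1, 2, 3], "Value": [12, 17, 19]})
df2 = pd.DataFrame({"Speed": [1, 2, 3], "Value": [14, 15, 23]})
df3 = pd.DataFrame({"Speed": [1, 2, 4], "Value": [16, 19, 21]})
list_dfs = [df1, df2, df3]

# One column per dataframe, rows aligned on Speed; missing speeds become NaN
wide = pd.concat([d.set_index("Speed")["Value"] for d in list_dfs], axis=1)

# Row-wise sample standard deviation; NaN cells are skipped by default
std_per_speed = wide.std(axis=1)
print(std_per_speed)
```

Here Speed 3 is computed from the two dataframes that have it, while Speed 4 has a single observation, so its sample standard deviation is undefined and shows up as NaN.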
Quang Hoang answered Dec 08 '25 17:12



You'll want to concatenate the dataframes, group by 'Speed', and take the standard deviation of 'Value'.

1) Concatenate your dataframes

list1 = [df_1, df_2, ...]
full_df = pd.concat(list1, axis=0) # stack all dataframes

2) Group by Speed and take the standard deviation

# Column names are case-sensitive: the question uses 'Speed' and 'Value'
std_per_speed_df = full_df.groupby('Speed')[['Value']].std()
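
The two steps above can be sketched end to end as follows. The sample dataframes are invented for illustration, and the extra `n` column (the number of observations per speed) is an addition not in the original answer. It is there only to show which speeds were missing from some dataframes. Because the frames are stacked rather than aligned, differing Speed values between dataframes should not need any special handling.

```python
import pandas as pd

# Hypothetical sample data: df_3 has no row for Speed 3
df_1 = pd.DataFrame({"Speed": [1, 2, 3], "Value": [12, 17, 19]})
df_2 = pd.DataFrame({"Speed": [1, 2, 3], "Value": [14, 15, 23]})
df_3 = pd.DataFrame({"Speed": [1, 2], "Value": [16, 19]})
list1 = [df_1, df_2, df_3]

# 1) Stack all dataframes into one long dataframe
full_df = pd.concat(list1, axis=0, ignore_index=True)

# 2) Group by Speed; compute the sample std and the number of rows per speed
std_per_speed_df = (
    full_df.groupby("Speed")["Value"]
    .agg(["std", "count"])
    .rename(columns={"std": "Standard Deviation", "count": "n"})
    .reset_index()
)
print(std_per_speed_df)
```

A speed that appears in only one dataframe would have `n` equal to 1 and a NaN standard deviation, since the sample standard deviation needs at least two values.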
Brandon answered Dec 08 '25 17:12
