Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Exclude Columns from pandas .mean()

Tags:

python

pandas

I have a df that contains the rank of each team in a given stat for the given year. It looks like this:

teamID  yearID  W   L   IP  WHIP    K%  BB% HR/9    ERA FIP ERA-    FIP-    K/BB+   WHIP+   K%+ BB%+    WAR
1209    Athletics   2001.0  2.0 6.0 3.0 7.0 19.0    9.0 1.0 7.0 5.0 7.0 5.0 8.0 7.0 11.0    10.0    4.0

I want to create a column with the average rank for each row, but doing df.mean(axis=1) includes the year (2001) and really throws the number off. Anybody know how to get a round this with maybe a lambda and .apply(), or is there a kwarg that can exclude certain columns? I haven't found one. I want to do this across years so that is why the yearID column is necessary.

like image 304
Charles Lyman Avatar asked Oct 12 '25 11:10

Charles Lyman


1 Answers

Simply exclude it from your calc using loc[] and a comprehension on the columns.

df = pd.read_csv(io.StringIO("""teamID  yearID  W   L   IP  WHIP    K%  BB% HR/9    ERA FIP ERA-    FIP-    K/BB+   WHIP+   K%+ BB%+    WAR
1209    Athletics   2001.0  2.0 6.0 3.0 7.0 19.0    9.0 1.0 7.0 5.0 7.0 5.0 8.0 7.0 11.0    10.0    4.0"""), sep="\s+")

df["mean"] = df.loc[:,[c for c in df.columns if c!= "yearID"]].mean(axis=1)

output

         teamID  yearID    W    L   IP  WHIP    K%  BB%  HR/9  ERA  FIP  ERA-  FIP-  K/BB+  WHIP+   K%+  BB%+  WAR    mean
1209  Athletics  2001.0  2.0  6.0  3.0   7.0  19.0  9.0   1.0  7.0  5.0   7.0   5.0    8.0    7.0  11.0  10.0  4.0  6.9375
like image 71
Rob Raymond Avatar answered Oct 16 '25 07:10

Rob Raymond