I have a df that contains the rank of each team in a given stat for the given year. It looks like this:
teamID yearID W L IP WHIP K% BB% HR/9 ERA FIP ERA- FIP- K/BB+ WHIP+ K%+ BB%+ WAR
1209 Athletics 2001.0 2.0 6.0 3.0 7.0 19.0 9.0 1.0 7.0 5.0 7.0 5.0 8.0 7.0 11.0 10.0 4.0
I want to create a column with the average rank for each row, but doing df.mean(axis=1) includes the year (2001) and really throws the number off. Anybody know how to get a round this with maybe a lambda and .apply(), or is there a kwarg that can exclude certain columns? I haven't found one. I want to do this across years so that is why the yearID column is necessary.
Simply exclude it from your calc using loc[]
and a comprehension on the columns.
df = pd.read_csv(io.StringIO("""teamID yearID W L IP WHIP K% BB% HR/9 ERA FIP ERA- FIP- K/BB+ WHIP+ K%+ BB%+ WAR
1209 Athletics 2001.0 2.0 6.0 3.0 7.0 19.0 9.0 1.0 7.0 5.0 7.0 5.0 8.0 7.0 11.0 10.0 4.0"""), sep="\s+")
df["mean"] = df.loc[:,[c for c in df.columns if c!= "yearID"]].mean(axis=1)
output
teamID yearID W L IP WHIP K% BB% HR/9 ERA FIP ERA- FIP- K/BB+ WHIP+ K%+ BB%+ WAR mean
1209 Athletics 2001.0 2.0 6.0 3.0 7.0 19.0 9.0 1.0 7.0 5.0 7.0 5.0 8.0 7.0 11.0 10.0 4.0 6.9375
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With