In R I can apply a logarithmic (or square root, etc.) transformation to all numeric columns of a data frame, by using:
logdf <- log10(df)
Is there something equivalent in Python/Pandas? I see that there is a "transform" and an (R-like) "apply" function, but could not figure out how to use them in this case.
Thanks for any hints or suggestions.
Suppose you have a DataFrame named df.
You can first make a list of the numeric dtypes you care about, then loop over the matching columns:
numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']
for c in [c for c in df.columns if df[c].dtype in numerics]:
    df[c] = np.log10(df[c])
Or, as a one-liner using a lambda and np.issubdtype:
numeric_df = df.apply(lambda x: np.log10(x) if np.issubdtype(x.dtype, np.number) else x)
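To see the apply approach end to end, here is a minimal sketch on a made-up frame (the column names and values are assumptions for illustration only). It swaps np.issubdtype for pandas' pd.api.types.is_numeric_dtype, which should also cope with pandas extension dtypes; note that it treats boolean columns as numeric too.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two numeric columns and one text column.
df = pd.DataFrame({
    "a": [1, 10, 100],
    "b": [0.1, 1.0, 10.0],
    "label": ["x", "y", "z"],
})

# Apply log10 to numeric columns and pass every other column through unchanged.
numeric_df = df.apply(
    lambda col: np.log10(col) if pd.api.types.is_numeric_dtype(col) else col
)

print(numeric_df)
```

Unlike the loop above, this builds a new DataFrame and leaves df itself untouched.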
If most columns are numeric it might make sense to just try it and skip the column if it does not work:
for column in df.columns:
    try:
        df[column] = np.log10(df[column])
    except (TypeError, ValueError, AttributeError):
        pass
If you want to you could wrap it in a function, of course.
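A possible shape for such a function is sketched below. The name log10_where_possible and the sample data are hypothetical. It works on a copy so the caller's frame is not modified, and it also catches TypeError, which is what recent NumPy versions appear to raise for string columns.

```python
import numpy as np
import pandas as pd


def log10_where_possible(frame):
    """Return a copy of `frame` with log10 applied to every column that accepts it."""
    result = frame.copy()
    for column in result.columns:
        try:
            result[column] = np.log10(result[column])
        except (TypeError, ValueError, AttributeError):
            # The column could not be transformed (e.g. it holds strings),
            # so it is left exactly as it was.
            pass
    return result


# Hypothetical usage: one numeric column, one text column.
sample = pd.DataFrame({"x": [1.0, 10.0, 1000.0], "name": ["a", "b", "c"]})
logged = log10_where_possible(sample)
print(logged)
```

The trade-off of the try/except style is that it silently skips anything that fails, so a column you expected to be numeric but that was read in as text will pass through unnoticed.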
If all columns are numeric, you can even simply do
df_log10 = np.log10(df)
You can use select_dtypes and numpy.log10:
import numpy as np
for c in df.select_dtypes(include=[np.number]).columns:
    df[c] = np.log10(df[c])
select_dtypes selects the columns whose data types match those passed to its include parameter. np.number covers all numeric data types.
numpy.log10 returns the base-10 logarithm of the input, element-wise.
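If you would rather not loop or overwrite df, the same selection can be done in one vectorized step. This is only a sketch with made-up column names; df_log10 is a new frame and the original stays as it was.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with an integer, a float and a text column.
df = pd.DataFrame({
    "count": [1, 10, 100],
    "ratio": [0.01, 1.0, 100.0],
    "tag": ["p", "q", "r"],
})

# Names of all numeric columns.
num_cols = df.select_dtypes(include=[np.number]).columns

# Work on a copy and transform every numeric column at once.
df_log10 = df.copy()
df_log10[num_cols] = np.log10(df[num_cols])

print(df_log10)
```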