Convert pandas column names from snake case to camel case

I have a pandas DataFrame whose column names are in upper snake case. I want to convert them to camel case, with the first word starting with a lowercase letter. The following code is not working for me. Please let me know how to fix it.

import pandas as pd

# Sample DataFrame with column names
data = {'RID': [1, 2, 3],
        'RUN_DATE': ['2023-01-01', '2023-01-02', '2023-01-03'],
        'PRED_VOLUME_NEXT_360': [100, 150, 200]}

df = pd.DataFrame(data)

# Convert column names to lowercase
df.columns = df.columns.str.lower()

# Convert column names to camel case with lowercase starting letter
df.columns = [col.replace('_', ' ').title().replace(' ', '').replace(col[0], col[0].lower(), 1) for col in df.columns]

# Print the DataFrame with updated column names
print(df)

I want the column names RID, RUN_DATE, PRED_VOLUME_NEXT_360 to be converted to rid, runDate, predVolumeNext360, but the code gives Rid, RunDate and PredVolumeNext360.

bunti papu asked Sep 13 '25 03:09


2 Answers

You could lowercase the names, then use a regex to replace each _x with X:

df.columns = (df.columns.str.lower()
                .str.replace('_(.)', lambda x: x.group(1).upper(),
                             regex=True)
             )
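To see what the pattern does without a DataFrame, the same substitution can be sketched on plain strings with the standard re module. The helper name snake_to_lower_camel is made up for this illustration and is not part of pandas:

```python
import re

def snake_to_lower_camel(name):
    # Hypothetical helper: lowercase the whole name, then replace each
    # underscore-plus-character pair with the uppercased character.
    return re.sub(r'_(.)', lambda m: m.group(1).upper(), name.lower())

names = ['RID', 'RUN_DATE', 'PRED_VOLUME_NEXT_360']
converted = [snake_to_lower_camel(n) for n in names]
print(converted)  # ['rid', 'runDate', 'predVolumeNext360']
```

Digits are unaffected by upper(), so the underscore before 360 is simply dropped.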

Or with a custom function:

def to_camel(s):
    parts = s.lower().split('_')
    # Keep the first word lowercase, capitalize the rest
    parts[1:] = [x.capitalize() for x in parts[1:]]
    return ''.join(parts)

df = df.rename(columns=to_camel)

Output:

   rid     runDate  predVolumeNext360
0    1  2023-01-01                100
1    2  2023-01-02                150
2    3  2023-01-03                200
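As for why the code in the question leaves the first letter capitalized, a likely explanation is that col is already lowercase when the final replace runs, so it swaps a lowercase letter for itself. A minimal sketch on a single name, using plain strings only:

```python
col = 'run_date'  # a column name after df.columns.str.lower()

# Title-case each word and drop the spaces, as in the question.
titled = col.replace('_', ' ').title().replace(' ', '')
print(titled)  # RunDate

# col[0] is already 'r', so this replaces 'r' with 'r' and changes nothing.
unchanged = titled.replace(col[0], col[0].lower(), 1)
print(unchanged)  # RunDate

# Lowercasing the first character of the title-cased string instead:
fixed = titled[:1].lower() + titled[1:]
print(fixed)  # runDate
```

Slicing with [:1] rather than indexing with [0] also avoids an IndexError on an empty name.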
mozway answered Sep 14 '25 18:09


Define the functions that convert to lower camel case separately, for clarity:


import pandas as pd

def to_camel_case(snake_str):
    return "".join(x.capitalize() for x in snake_str.lower().split("_"))

def to_lower_camel_case(snake_str):
    # to_camel_case capitalizes every component (upper camel case), so
    # lowercase only the first character of its result. Slicing with [:1]
    # also handles an empty string or a leading underscore.
    camel_string = to_camel_case(snake_str)
    return camel_string[:1].lower() + camel_string[1:]

# Sample DataFrame with column names
data = {'RID': [1, 2, 3],
        'RUN_DATE': ['2023-01-01', '2023-01-02', '2023-01-03'],
        'PRED_VOLUME_NEXT_360': [100, 150, 200]}

df = pd.DataFrame(data)

# Convert column names to camel case with lowercase starting letter
df.columns = [to_lower_camel_case(col) for col in df.columns]

# Print the DataFrame with updated column names
print(df)

Prints:

   rid     runDate  predVolumeNext360
0    1  2023-01-01                100
1    2  2023-01-02                150
2    3  2023-01-03                200

The functions are based on this answer by jbaiter.

Timur Shtatland answered Sep 14 '25 18:09