Populating column in pandas dataframe using a function call

Tags: python, pandas

How can I use .loc with .str.match() to update column values using a function call? The code I'm trying is:

df.loc[df['Col1'].str.match(r'\d\d/\d\d/\d\d\d\d', na=False), 'Col2'] = _my_func(df['Col1'])

This uses a simple regex pattern to find values in date format, and then calls _my_func():

def _my_func(data):
    for row in data.iteritems():
        day = int(row[1][:2])
        month = int(row[1][3:5])
        year = int(row[1][6:])
        fecha = datetime.datetime(year, month, day, 0, 0, 0)
        diff =  fecha - datetime.datetime.now()
        if diff.days > 0:
            return 'Yes'
        elif diff.days < 0:
            return 'No'

Is this a correct way to return values from the function into the dataframe?

Also, if I insert a print('test') into _my_func just before either return, it prints 'test' only once instead of once for each row in the data passed to the function. Does anyone know why? Thank you.

Nordle asked Mar 23 '26 11:03


2 Answers

You can try using the apply() function.

For example:

df['Col2'] = df['Col1'].apply(_my_func)

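apply() then passes each value in the column to _my_func one at a time and collects the return values into a new Series, so the function must accept a single value rather than the whole column.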
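As for why the print only fired once: a function that is called directly with the whole Series runs a single time, and a return inside its loop exits on the first iteration. Here is a minimal sketch of the difference. The function names and the sample Series are made up for illustration and are not from the question.

```python
import pandas as pd

calls = []  # records every value the loop actually visits

def whole_series(data):
    # Receives the entire Series in one call.
    for value in data:
        calls.append(value)
        return value.upper()  # exits the function on the first iteration

def per_value(value):
    # Receives a single element; Series.apply calls it once per element.
    return value.upper()

s = pd.Series(['a', 'b', 'c'])

direct = whole_series(s)      # one call, one scalar result
applied = s.apply(per_value)  # one call per element, a Series of results
```

Here direct is the single string 'A' and calls holds only 'a', while applied has one result per row. Assigning a single scalar through .loc would broadcast that one value to every selected row.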

Bhaskar answered Mar 25 '26 02:03


Following my comment:

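import datetime

def _my_func(x):
    # x is a single 'dd/mm/yyyy' string; slice out day, month and year.
    day = int(x[:2])
    month = int(x[3:5])
    year = int(x[6:])
    fecha = datetime.datetime(year, month, day, 0, 0, 0)
    diff = fecha - datetime.datetime.now()
    # diff.days is 0 for dates less than 24 hours ahead; in that case
    # neither branch runs and the function returns None.
    if diff.days > 0:
        return 'Yes'
    elif diff.days < 0:
        return 'No'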

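Followed by the assignment, with apply() restricted to the rows that match the date pattern so that non-date values are never passed to _my_func:

mask = df['Col1'].str.match(r'\d\d/\d\d/\d\d\d\d', na=False)
df.loc[mask, 'Col2'] = df.loc[mask, 'Col1'].apply(_my_func)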
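For a self-contained check of this pattern, here is a rough sketch that applies the function only to the rows matching the regex. The sample data, the is_future helper, and the fixed reference time are all assumptions added for illustration. The helper also swaps the manual slicing for datetime.strptime and compares the datetimes directly, which is a substitution on my part rather than the code above.

```python
import datetime
import pandas as pd

def is_future(text, now):
    # Hypothetical helper: parse 'dd/mm/yyyy' and compare with a reference time.
    fecha = datetime.datetime.strptime(text, '%d/%m/%Y')
    return 'Yes' if fecha > now else 'No'

# Made-up sample data: two dates, one non-date string, one missing value.
df = pd.DataFrame({'Col1': ['01/01/2000', '31/12/2999', 'n/a', None]})
df['Col2'] = None

# Fixed reference time so the result is reproducible (an assumption;
# the question uses datetime.datetime.now()).
now = datetime.datetime(2024, 1, 1)

# Boolean mask of rows that look like dates; na=False treats missing values as no match.
mask = df['Col1'].str.match(r'\d\d/\d\d/\d\d\d\d', na=False)

# Apply the function only to the matching rows; assignment aligns on the index.
df.loc[mask, 'Col2'] = df.loc[mask, 'Col1'].apply(lambda s: is_future(s, now))
```

With this data the first two rows of Col2 become 'No' and 'Yes', and the non-date rows are left untouched. Comparing fecha > now directly also sidesteps the case where diff.days is 0 for a date less than 24 hours ahead.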
Dillon answered Mar 25 '26 02:03
