How can I use .loc with .str.match() to update column values using the result of a function call? The code I'm trying looks like this:
df.loc[df['Col1'].str.match(r'\d\d/\d\d/\d\d\d\d', na=False), 'Col2'] = _my_func(df['Col1'])
That is a simple regex pattern to find values in a date format, followed by a call to _my_func():
def _my_func(data):
    for row in data.iteritems():
        day = int(row[1][:2])
        month = int(row[1][3:5])
        year = int(row[1][6:])
        fecha = datetime.datetime(year, month, day, 0, 0, 0)
        diff = fecha - datetime.datetime.now()
        if diff.days > 0:
            return 'Yes'
        elif diff.days < 0:
            return 'No'
Is this a correct way to return values from the function into the dataframe?
Also, if I insert a print('test') into _my_func just before either return, it prints 'test' only once instead of once for each row in the data passed to the function. Does anyone know why? Thank you.
You can try using the apply() function.
For example:
df['loc1'] = df['loc1'].apply(_my_func)
apply() then takes each value of that column, one at a time, and passes it as the input to the function _my_func.
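To make the difference concrete, here is a minimal sketch (the helper names describe and first_only and the sample values are made up for illustration, they are not from the question). It contrasts a function that loops over a whole Series and returns inside the loop with a function that apply() calls once per value. This is also the likely reason the print in the question shows up only once: return ends the function during the first iteration, and the single returned value is then assigned to every matching row.

```python
import pandas as pd

calls = []  # records every value that describe() receives

def describe(value):
    # Hypothetical helper: receives ONE value per call when used with apply()
    calls.append(value)
    return len(value)

def first_only(series):
    # Hypothetical helper: receives the WHOLE Series in a single call
    for _, value in series.items():
        return len(value)  # return exits the function on the first iteration

s = pd.Series(["01/02/2030", "15/06/1999", "abc"])

lengths = s.apply(describe)   # one call, and one result, per value
single = first_only(s)        # one call, one result for the first value only
```

Here lengths holds one result per row, while single is a lone scalar that would be broadcast to every selected row if assigned through .loc.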
Following my comment:
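import datetime

def _my_func(x):
    # x is a single string in dd/mm/yyyy format
    day = int(x[:2])
    month = int(x[3:5])
    year = int(x[6:])
    fecha = datetime.datetime(year, month, day, 0, 0, 0)
    diff = fecha - datetime.datetime.now()
    # Note: when diff.days == 0 neither branch runs and the function returns None
    if diff.days > 0:
        return 'Yes'
    elif diff.days < 0:
        return 'No'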
Followed by:
df.loc[df['Col1'].str.match(r'\d\d/\d\d/\d\d\d\d', na=False), 'Col2'] = df['Col1'].apply(_my_func)
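One caveat with the line above: df['Col1'].apply(_my_func) runs on every row of Col1, including rows that do not match the pattern, so a missing value or a non-date string would probably raise an error inside _my_func. A possible variation, sketched below with made-up sample data, is to compute the mask once and apply the function only to the matching rows.

```python
import datetime
import pandas as pd

def _my_func(x):
    # x is a single string in dd/mm/yyyy format
    day, month, year = int(x[:2]), int(x[3:5]), int(x[6:])
    fecha = datetime.datetime(year, month, day)
    diff = fecha - datetime.datetime.now()
    if diff.days > 0:
        return 'Yes'
    elif diff.days < 0:
        return 'No'

# Sample data (assumed for illustration): two dates, a missing value, a non-date
df = pd.DataFrame({
    'Col1': ['01/01/2999', '01/01/2000', None, 'n/a'],
    'Col2': ['', '', '', ''],
})

# Build the boolean mask once, then reuse it on both sides of the assignment
mask = df['Col1'].str.match(r'\d\d/\d\d/\d\d\d\d', na=False)
df.loc[mask, 'Col2'] = df.loc[mask, 'Col1'].apply(_my_func)
```

Because the right-hand side contains only the matching rows, _my_func never sees the missing value or the 'n/a' string, and the rows outside the mask keep their original Col2 value.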