I have the following sample data:
| | date | value |
|---|---|---|
| 0 | 2021/05 | 50 |
| 1 | 2021/06 | 60 |
| 2 | 2021/07 | 70 |
| 3 | 2021/08 | 80 |
| 4 | 2021/09 | 90 |
| 5 | 2021/10 | 100 |
I want to update the data in the 'date' column so that, for example, '2021/05' becomes '05/10/2021', '2021/06' becomes '06/12/2021', and so on (I have to choose the new date manually for every row).
Is there a better or cleverer way to do this than the following?
for i in df.index:
    if df['date'][i] == '2021/05':
        df['date'][i] = '05/10/2021'
    elif df['date'][i] == '2021/06':
        df['date'][i] = '06/12/2021'
The problem is that more than a hundred rows have to be updated, so the code above would become extremely long.
We can use the select function from NumPy like so:
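import numpy as np
# One boolean condition per value to replace
condlist = [df['date'] == '2021/05',
            df['date'] == '2021/06']
# The replacement for each condition, in the same order
choicelist = ['05/10/2021',
              '06/12/2021']
# Rows that match no condition keep their current value
df['date'] = np.select(condlist, choicelist, default=df['date'])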
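Another option, sketched here on the assumption that the manually chosen dates can be collected in a plain dict, is to pass that dict to Series.replace. The name new_dates and the shortened sample frame are illustrative only; values that are not keys in the dict are left unchanged.

```python
import pandas as pd

# Shortened copy of the sample data from the question
df = pd.DataFrame({
    "date": ["2021/05", "2021/06", "2021/07"],
    "value": [50, 60, 70],
})

# Hypothetical lookup table: old value -> manually chosen new value
new_dates = {
    "2021/05": "05/10/2021",
    "2021/06": "06/12/2021",
}

# Values that are not keys in new_dates are kept as they are
df["date"] = df["date"].replace(new_dates)
print(df)
```

With a hundred or more rows, the dict can grow to one line per date without adding any branching logic.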
I would use an interactive approach, saving the amended DataFrame to a file at the end:
import pandas as pd
dt = pd.DataFrame({"date": ["2021/05", "2021/06", "2021/07", "2021/08", "2021/09", "2021/10"], "value": [50, 60, 70, 80, 90, 100]})
for n, i in enumerate(dt.loc[:, "date"]):
    to_be_parsed = True
    # Keep asking until the entered day yields a valid date
    while to_be_parsed:
        day = input("What is the day for {:s}? ".format(i))
        date_str = "{:s}/{:0>2s}".format(i, day)
        try:
            dt.loc[n, "date"] = pd.to_datetime(date_str).strftime("%m/%d/%Y")
            to_be_parsed = False
        except ValueError:
            print("Invalid date: {:s}. Try again".format(date_str))
output_path = input("Save amended dataframe to path (no input to skip): ")
if len(output_path) > 0:
    dt.to_csv(output_path, index=False)
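The validation step in the loop above could also be pulled out into a small helper so it can be checked without typing input. This is only a sketch: the function name with_day is made up for illustration, and it uses the standard library's datetime.strptime rather than pd.to_datetime.

```python
from datetime import datetime


def with_day(year_month: str, day: str) -> str:
    """Turn 'YYYY/MM' plus a day into 'MM/DD/YYYY'.

    Raises ValueError when the combination is not a real calendar date.
    """
    parsed = datetime.strptime("{}/{}".format(year_month, day), "%Y/%m/%d")
    return parsed.strftime("%m/%d/%Y")


print(with_day("2021/05", "10"))
```

Inside the interactive loop, the call would sit in the try block, with the existing except branch handling the ValueError.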