I have two dataframes like these:
import pandas as pd
d1 = {'Domain': ['amazon.com', 'apple.com', 'amazon.com','xyz.com'], 'Pattern': ['kindle','music','subscribe-and-save',''],'Other Important Info':['a','b','c','d']}
df1 = pd.DataFrame(d1)
d2 = {'Domain': ['google.com','google.com','amazon.com','amazon.com', 'youtube.com', 'amazon.com'], 'Url': ['https://google.com/kindle','https://google.com/','https://amazon.com/subscribe-and-save','https://amazon.com/abc','https://youtube.com/music','https:amazon.com/kindle']}
df2 = pd.DataFrame(d2)
The goal is to merge the two dataframes on 'Domain', keeping only the rows where 'Pattern' occurs in 'Url'.
So the result should be the following dataframe:
{'Domain':['amazon.com','amazon.com'],'Url':['https://amazon.com/subscribe-and-save','https:amazon.com/kindle'],'Other Important Info':['c','a']}
This is how I'm doing it currently:
def lookup_table(value, df):
    # Return the first pattern that occurs in value, or None if none does
    out = None
    list_items = df['Pattern'].tolist()
    for item in list_items:
        if item in value:
            out = item
            break
    return out
df2['Pattern'] = df2['Url'].apply(lambda x: lookup_table(x, df1[df1['Pattern']!='']))
merged = pd.merge(df2[df2['Pattern'].notnull()], df1[df1['Pattern']!=''],on=['Domain','Pattern'],how='left')
However, the lookup_table function takes far too long to run because of the for loop.
How can I do this faster? I'm using Python 2 on Windows.
df1
       Domain             Pattern  Other Important Info
0  amazon.com              kindle                     a
1   apple.com               music                     b
2  amazon.com  subscribe-and-save                     c
3     xyz.com                                         d
df2
        Domain                                    Url
0   google.com              https://google.com/kindle
1   google.com                    https://google.com/
2   amazon.com  https://amazon.com/subscribe-and-save
3   amazon.com                 https://amazon.com/abc
4  youtube.com              https://youtube.com/music
5   amazon.com                https:amazon.com/kindle
To merge on 'Domain' and keep only the rows where 'Pattern' is in 'Url', merge first and then filter:
df = df1.merge(df2, on='Domain')
df.loc[df.apply(lambda x: x.Pattern in x.Url, axis=1)]
Output
       Domain             Pattern  Other Important Info                                    Url
2  amazon.com              kindle                     a                https:amazon.com/kindle
3  amazon.com  subscribe-and-save                     c  https://amazon.com/subscribe-and-save