Extracting value and creating new column out of it

Question

I would like to extract a certain section of a URL that resides in a column of a pandas DataFrame and make it a new column. This

ref = df['REFERRERURL']
ref.str.findall(r"\d\d/(.*?)(;|\?)", flags=re.IGNORECASE)

returns a Series with tuples in it. How can I take out only one part of each tuple before the Series is created, so I can simply turn that into a column? Sample data for REFERRERURL is

http://wap.blah.com/xxx/id/11/someproduct_step2;jsessionid=....

In this example I am interested in creating a column that only has 'someproduct_step2' in it.

Thanks,

Jeff · Accepted Answer

Assuming import re and from pandas import DataFrame, Series:

In [25]: df = DataFrame([['http://wap.blah.com/xxx/id/11/someproduct_step2;jsessionid=....']], columns=['A'])

In [26]: df['A'].str.findall(r"\d\d/(.*?)(;|\?)", flags=re.IGNORECASE).apply(lambda x: Series(x[0][0], index=['first']))
Out[26]: 
               first
0  someproduct_step2

In pandas 0.11.1 there is also a neat way of doing this with a regex replace:

In [34]: df.replace({'A': r"http:.+\d\d/(.*?)(;|\?).*$"}, {'A': r'\1'}, regex=True)
Out[34]: 
                   A
0  someproduct_step2
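
The tuples in the question come from the pattern having two capture groups: findall returns one tuple per match, with one element per group. A minimal sketch of an alternative, assuming a pandas version that provides Series.str.extract (it is not used in the answers here) and swapping the second group for a character class so that only one group is captured. The sample frame, the second URL and the column name path_part are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the first URL is the one from the question.
df = pd.DataFrame(
    {
        "REFERRERURL": [
            "http://wap.blah.com/xxx/id/11/someproduct_step2;jsessionid=....",
            "http://wap.blah.com/no-match-here",
        ]
    }
)

# Single capture group: two digits, a slash, then lazily up to ';' or '?'.
pattern = r"\d\d/(.*?)[;?]"

# expand=False yields a Series for a one-group pattern;
# rows without a match become NaN instead of raising.
df["path_part"] = df["REFERRERURL"].str.extract(pattern, expand=False)
```

Because the result is already a flat Series, it can be assigned as a column directly, with no apply step to unpack tuples.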

BBSysDyn · Answer

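This also worked:

import re

def extract(x):
    # Return the first capture group of the first match, or None if nothing matches.
    res = re.findall(r"\d\d/(.*?)(;|\?)", x)
    if res:
        return res[0][0]
    return None

session['RU_2'] = session['REFERRERURL'].apply(extract)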
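
A variation on the function above, as a sketch only: it uses re.search with a precompiled pattern, so only the first match is looked up, and it guards against non-string cells such as missing values, for which re.findall would raise a TypeError. The sample frame and the names PATTERN and extract_part are hypothetical.

```python
import re

import pandas as pd

# Single capture group: two digits, a slash, then lazily up to ';' or '?'.
PATTERN = re.compile(r"\d\d/(.*?)[;?]")


def extract_part(value):
    """Return the captured URL section, or None if there is nothing to extract."""
    if not isinstance(value, str):
        # Missing values (None, NaN) are passed through as None.
        return None
    match = PATTERN.search(value)
    return match.group(1) if match else None


# Hypothetical sample data: one URL from the question and one missing value.
session = pd.DataFrame(
    {
        "REFERRERURL": [
            "http://wap.blah.com/xxx/id/11/someproduct_step2;jsessionid=....",
            None,
        ]
    }
)
session["RU_2"] = session["REFERRERURL"].apply(extract_part)
```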

Tags:

pandas
