Question

I am trying to use the following regular expression to extract the domain name from a piece of text, but it produces nothing. What's wrong with it?

I don't know if it's appropriate to ask this kind of "fix my code" question; maybe I should read more.

I just want to save some time.

Thanks.

pat_url = re.compile(r'''
            (?:https?://)*
            (?:[\w]+[\-\w]+[.])*
            (?P<domain>[\w\-]*[\w.](com|net)([.](cn|jp|us))*[/]*)
            ''')

print re.findall(pat_url,"http://www.google.com/abcde")

I want the output to be google.com.

Amber · Accepted Answer

Don't use a regex for this. Use the urlparse module from the standard library instead (in Python 3 it lives in urllib.parse). It's far more straightforward and easier to read and maintain.

http://docs.python.org/library/urlparse.html
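As a rough sketch of the urlparse suggestion above, something like the following might work. It is written for Python 3, where the module is urllib.parse. The helper name extract_domain and the small TWO_LABEL_SUFFIXES set are made up for illustration and are not part of the standard library; the set only mirrors the (com|net) plus (cn|jp|us) cases from the question and is nowhere near a complete list of suffixes.

```python
from urllib.parse import urlparse  # Python 3 home of the old urlparse module

# Hypothetical, hand-written set of two-label suffixes (an assumption for
# this sketch). It mirrors the question's (com|net).(cn|jp|us) cases only.
TWO_LABEL_SUFFIXES = {"com.cn", "net.cn", "com.jp", "net.jp", "com.us", "net.us"}


def extract_domain(url):
    """Return a best-effort registered domain such as 'google.com'."""
    # urlparse only fills in the host when the URL has a scheme or starts
    # with '//', so add '//' to bare inputs like 'www.google.com/abcde'.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    # Keep three labels when the last two form a known two-label suffix,
    # otherwise keep the last two.
    keep = 3 if ".".join(labels[-2:]) in TWO_LABEL_SUFFIXES else 2
    return ".".join(labels[-keep:])


print(extract_domain("http://www.google.com/abcde"))  # prints google.com
```

Note that urlparse by itself gives www.google.com as the host; trimming it down to google.com is the extra step done by hand here. As for why the original pattern prints nothing: it is laid out across several indented lines but compiled without the re.VERBOSE flag, so the newlines and spaces are treated as literal characters that the input never contains. Also, because the pattern has capturing groups, re.findall returns tuples of groups rather than the whole match.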

Python - Regular Expression For Domain Names

Tags: regex, url, dns

yasein
