better system than regex

Question

I have built an application that extracts specific information from a specific website. For that I used a regular expression, which gives me the desired output. Is there a better or more efficient approach than regex for such a simple crawler?

Radu Simionescu · Accepted Answer

If a simple regex solves your problem, then no, there is no more efficient solution. When it comes to crawling, the alternative would be to load the entire HTML page into memory as a DOM document and search it using XPath or even XQuery. But really, if the information can be extracted easily with regex, then don't bother, especially if you are not familiar with XPath.

The power of XPath comes in when you want to make complex searches. It is also more elegant than regex for this task (at least in the W3C's opinion). But if you want a quick solution, you have already found it, and the regex approach is more efficient in terms of RAM too, since it never builds a document tree.
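To make the comparison concrete, here is a rough sketch of both approaches in Java, using only the standard library. The page fragment, the `price` class name, and the `ExtractDemo` class are made up for illustration, and the XPath half assumes the markup is well-formed XML; real-world HTML often is not and would need a lenient HTML parser first.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ExtractDemo {
    // Hypothetical, well-formed page fragment used only for illustration.
    static final String PAGE =
        "<html><body>"
        + "<div class=\"item\"><span class=\"price\">19.99</span></div>"
        + "</body></html>";

    // Regex approach: scans the raw text; no document tree is built.
    static String byRegex(String html) {
        Matcher m = Pattern
            .compile("<span class=\"price\">([^<]+)</span>")
            .matcher(html);
        return m.find() ? m.group(1) : null;
    }

    // XPath approach: parses the whole page into a DOM, then queries it.
    // Assumes the input is well-formed XML (an assumption, not a given for HTML).
    static String byXPath(String html) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(html)));
            return XPathFactory.newInstance().newXPath()
                .evaluate("//span[@class='price']/text()", doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the page", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(byRegex(PAGE));
        System.out.println(byXPath(PAGE));
    }
}
```

For a single fixed pattern like this, the regex version is shorter and avoids holding the parsed tree in memory. The XPath version starts to pay off once the query depends on structure, such as "the price inside the second item", which is awkward to express as a regex.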

Tags:

java

web-crawler

Toukir Naim
