How to extract href from a tag using ruby regex?

Question

I have this link, which I declare like this:

link = "<a href=\"https://www.congress.gov/bill/93rd-congress/house-bill/11461\">H.R.11461</a>"

How can I use a regex to extract only the href value?

Thanks!

Admin · Accepted Answer

If you want to parse HTML, you can use the Nokogiri gem instead of regular expressions. It's much easier, and it copes with markup variations that a regex would miss.

Example:

require "nokogiri"

link = "<a href=\"https://www.congress.gov/bill/93rd-congress/house-bill/11461\">H.R.11461</a>"

link_data = Nokogiri::HTML(link)

href_value = link_data.at_css("a")[:href]

puts href_value # => https://www.congress.gov/bill/93rd-congress/house-bill/11461

neuronaut · Answer

You should be able to use a regular expression like this:

href\s*=\s*"([^"]*)"

See this Rubular example of that expression.

The capture group will give you the URL, e.g.:

link = "<a href=\"https://www.congress.gov/bill/93rd-congress/house-bill/11461\">H.R.11461</a>"
match = /href\s*=\s*"([^"]*)"/.match(link)
if match
  url = match[1]
end
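
If the string might contain more than one link, the same expression can be reused with String#[] and String#scan. This is only a sketch: the `html` variable, the `HREF` constant name, and the second (example.com) URL are made up for illustration and are not part of the original question.

```ruby
# Hypothetical input with two links; the second URL is invented for illustration.
html = '<a href="https://www.congress.gov/bill/93rd-congress/house-bill/11461">H.R.11461</a> ' \
       '<a href = "https://example.com/other">Other</a>'

# Same expression as above, stored in a constant (the name is an assumption).
HREF = /href\s*=\s*"([^"]*)"/

# String#[] with a capture index returns group 1 of the first match, or nil if nothing matches.
first_url = html[HREF, 1]

# String#scan returns one array of captures per match; flatten turns that into a flat list of URLs.
all_urls = html.scan(HREF).flatten

puts first_url
puts all_urls.inspect
```

Note that this only handles double-quoted attribute values, just like the original expression.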

Explanation of the expression:

href matches the href attribute
\s* matches 0 or more whitespace characters (this is optional -- you only need it if the HTML might not be in canonical form).
= matches the equal sign
\s* again allows for optional whitespace
" matches the opening quote of the href URL
( begins a capture group for extraction of whatever is matched within
[^"]* matches 0 or more non-quote characters. Since a double quote inside a double-quoted HTML attribute value must be escaped (as &quot;), this will match all characters up to the end of the URL.
) ends the capture group
" matches the closing quote of the href attribute's value

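The breakdown above can also be kept next to the pattern itself by using Ruby's extended (x) mode, where whitespace in the pattern is ignored and # starts a comment. A minimal sketch, assuming the same input string as in the question; the constant name `HREF_PATTERN` is an illustrative choice.

```ruby
link = "<a href=\"https://www.congress.gov/bill/93rd-congress/house-bill/11461\">H.R.11461</a>"

# Same expression as above, written in extended mode so each piece carries its own comment.
HREF_PATTERN = /
  href      # the attribute name
  \s*       # optional whitespace before the equals sign
  =         # the equals sign
  \s*       # optional whitespace after the equals sign
  "         # opening quote of the attribute value
  ([^"]*)   # capture group 1: zero or more non-quote characters
  "         # closing quote of the attribute value
/x

match = HREF_PATTERN.match(link)
url = match && match[1] # nil when the string contains no double-quoted href

puts url
```

Since HTML attribute names are case-insensitive, adding the i flag (/xi) would also accept HREF=; whether that is needed depends on the input.
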
Tags:

regex

html-parsing

ruby

Ryzal Yusoff
